lunedì 20 giugno 2016

First week - Studying codes and some problems

Dear all,

In this first and half week, as written in the first post, I studied the algorithms and also their codes in Octave. I recall that the algorithms are gmres, pcg, bicg, bicgstab, qmr, cgs and pcr.

The codes that I used that are not in Octave can be found at
in the folder "First_week".

Octave codes

Studying the implementations I noticed problems in some of them:

  1. pcg - Trying to solve the simple problem 1*x = 1, with initial guess 1, there is the following output:

    octave4.1-cli:29> x = pcg(1,1,[ ],[ ],[ ],[ ],1)
    warning: division by zero
    warning: called from
        pcg at line 367 column 10
    pcg: converged in 0 iterations. pcg: the initial residual norm was reduced NaN times.
    x =  1

    We can see that in 0 iteration there is a warning for a division by zero and that the residual is reduced NaN times. Then this output text needs to be fixed.
  2. bicg - I tried to solve the small linear system A*x = b, with
    A = [7 7 8 4 4; 7 0 6 3 4; 3 2 3 7 6; 6 0 9 7 7; 1 0 0 1 7];
    b = sum(A,2);
    (I obtained this matrix with floor(10*rand(5)) so it is a simple integer matrix, and b computed as the sum of the columns of A in such a way the exact solution is the vector x = [1; 1; 1; 1; 1])
    With this configuration I discovered that the algorithm stops before the convergence, because of stagnation. Indeed the output is:

    octave4.1-cli:39> [x, flag, relres, iter, resvec] = bicg(A,b)
    bicg stopped at iteration 4 without converging to the desired tolerance 1.000000e-06
    because the method stagnated.
    The iterate returned (number 3) has relative residual 3.772464e-02
    x =


    flag =  3
    relres =  0.038052
    iter =  4
    resvec =

       0.070581   0.042765   0.037725

    As we can see the last residual (relres = 0.038052) is greater than the third (that is the third component of resvec, i.e. 0.037725). Then the algorithm detect a stagnation and stops.
    I tried to solve the same linear system with Matlab, and I noticed that also its bicg has the fourth iteration with the residual greater than the third one, but it doesn't stop and at the fifth iteration it converges at the solution.
    So I tried to comment the part of the stagnation (i.e. lines from 187 to 195 in the Octave's bicg) and tried to solve this linear system, obtaining:

    octave4.1-cli:41> [x,flag,relres,iter,resvec] = bicg_1(A,b)
    bicg stopped at iteration 5 without converging to the desired tolerance 1.000000e-06
    because the maximum number of iterations was reached. The iterate returned (number 5) has relative residual 3.371209e-14
    x =


    flag =  1
    relres =    3.3712e-14
    iter =  5
    resvec =

       0.070581   0.042765   0.037725   0.038052

    Except that an error text that says that the maximum number of iterations was reached, we can see that at the fifth iteration the algorithm converges with tolerance smaller than the default one 1e-6.
    The bicg with the "stagnation" commented is the file bigc_1.m, instead the matrix A and the test with bigc and bicg_1, can be found in example_bicg.m, in the folder First_week .
  3. cgs - I studied the algorithm from the Saad's book, and then I tried to study the codes, but I noticed that its Octave implementation was not taken from this book. Since in the help there is not the reference book I can't compare this implementation with the reference.
    I also tried to solve the same linear system discussed in the bicg section, and I noticed that the algorithm does not converge. The obtained output is:

    octave4.1-cli:56> [x,flag,relres,iter,resvec] = cgs(A,b)
    x =


    flag =  1
    relres =  0.17700
    iter =  5
    resvec =


    We can see that the solution given by the algorithm is far from the exact solution, also the flag equal to 1 indicates that the algorithm reached the maximum number of iterations without convergence.
    Then I tried to make a simple and basic implementation from scratch using the formulation in the Saad book, I tried to solve the linear system with this implementation and this implementation converges. The output is:

    octave4.1-cli:58> [x,resvec] = cgs_1(A,b)
    x =


    resvec =


    To be sure that is not a pathological case I tried to solve with the Octave implementation and mine the linear system A*x = b, with
    A = [6 6 8 7 2; 5 4 1 0 3; 4 1 2 6 3; 6 9 3 8 3; 3 5 6 9 4];
    b = sum(A,2);
    obtaining similar results.

    Then, after a discussion with the mentors, we decided that maybe it is easier to implement this method from scratch, instead to search the problems, due to the fact that there is no reference for its implementation.
    My simple and basic cgs implementation is the file cgs_1.m, and the test with the second matrix can be found in the file example_cgs.m, in the folder First_week.
  4. bicgstab - I noticed that in Matlab are displayed also the "half iteration" with the "half residual", but not in Octave. Then talking with the mentors we decided to display them also in Octave.

 Matrix-vector product in the algorithms

Since all of the algorithms of the project are solvers for linear systems, the matrix-vector product is essential in all of their implementation, but since the matrix A and the (eventual) preconditioners can be passed as matrices or as function handle, it is not easy to decide how to compute all these matrix-vector products.

For example in the gmres algorithm the strategy is to transform A, M1, M2 into function handle, i.e. (tanking A as example) to create the function handle

Ax = @(x) A*x;  if A is a matrix
Ax = A;              if A is a function handle

and then every time that there is a matrix-vector product, for example A*v, it is simply computed as Ax(v) or feval(Ax, v).

Instead, in the pcg implementation, every time that there is a matrix-vector product it is checked if A is a matrix or a function handle. For example if we need to compute A*v then there are the following lines of codes:

if (isnumeric(A))
  w = A*v;
  w = feval(A,v);

(the other algorithms use one of these two strategies, or similar variations).

Then it is natural the question: which is the best strategy?
To answer, I simulate these strategy and the simple matrix-vector product, then  I computed the times, using sparse matrices of different dimensions (n = 100,
n = 200, n = 500, n= 1000, n = 2000) in this way 

  A = sprand(n, n, rand(1));
  Ax = @(x) A*x;
  w = rand(length(A),1);

 # Estimation of the simply A*w
  for j =1:100
    u = A*w;
  t1 = toc;

 # Estimation of feval(Ax, w)
  for j = 1:100
    u = feval (Ax, w);
  t2 = toc;

# Estimation of the if block with A as matrix
  for j = 1:100
   if (isnumeric (A))
     u = A*w;
     u = feval(A,w);
  t3 = toc;

# Estimation of the if block with Ax the function handle
    for j = 1:100
   if (isnumeric (Ax))
     u = Ax*w;
     u = feval(Ax,w);
  t4 = toc;

obtaining (for example, since I used random matrices) as results:

n = 100
t1 =  0.0038838
t2 =  0.0068600
t3 =  0.0049579
t4 =  0.0076430

n = 200
t1 =  0.0051990
t2 =  0.0083339
t3 =  0.0062981
t4 =  0.0091600

n = 500
t1 =  0.0042169
t2 =  0.0071800
t3 =  0.0051830
t4 =  0.0081360

n = 1000
t1 =  0.055708
t2 =  0.059194
t3 =  0.056590
t4 =  0.060054

n = 2000
t1 =  0.15554
t2 =  0.16691
t3 =  0.16327
t4 =  0.16677

This test with different matrices can be found in the file quicktest.m in the folder First_week.

We can see that if we consider n<=1000, the best strategy is to use the simple matix-vector product (indeed t1 is the less value). Instead if we consider the two strategies of gmres and pcg, we notice that the "if block" used in pcg takes more time than the simple matrix-vector product and the evaluation of the function handle.
Instead the case n = 2000 is not a good example to see the differences of these approaches, probably because the matrix is too big and requires a lot of operations.
Then, after a discussion with the mentors, we decided that the "best strategy" is to write in every algorithm two nested function "ad hoc" to the type of the data, and if A, M1 and M2 are not of the same type, to transform all of them in function handle. To be clear, for example, as sketch of how it will be:

function [x, flag, relres, iter, resvec] = pcg (A, b, tol, maxit, M1, M2, x0)

  if A, M1, M2 are matrices
    [x, flag, relres, iter, resvec] = pcg_matrix (A, b, tol, maxit, M1, M2, x0)

  if A, M1, M2 are function handle
   [x, flag, relres, iter, resvec] = pcg_function (A, b, tol, maxit, M1, M2, x0)

  if  A, M1, M2 are not of the same type
   [x, flag, relres, iter, resvec] = pcg_mixed (A, b, tol, maxit, M1, M2, x0)


 Priority of the algorithms

The last part of this post is about the chosen priority of the improvements of the algorithms for the next weeks.
The mentors suggested me to improve the algorithms in the following order:
  1. pcg
  2. gmres
  3. bicgstab
  4. pcr
  5. qmr
  6. cgs
  7. bicg
This order is suggested because pcg, gmres and bicgstab are the most used to solve iteratively linear systems.
Then follows pcr because it is the only method already implemented in Octave that solve linear system when the input matrix A is Hermitian (bicgstab and gmres work for general matrices, instead pcg works for SPD matrices).
And in the last follows qmr, cgs and bicg because they are the less used methods, with cgs above bicg because it has to be written from scratch.

If there are something that is not clear or if there are some suggestions/advices, comment below or contact me via email at

Nessun commento:

Posta un commento