GNU Octave SOCIS16 : Improve iterative methods for sparse linear systems: luglio 2016

Hi all,

during these two weeks I tried to fix some problems in pcg and gmres.
You can find the codes at
https://bitbucket.org/CrisDorigo/socis16-octave-iter_meth/src
in the folder "pcg_gmres".

PCG

Question on my previous post

In my previous post I had some problem in pcg with the check if the matrix was or not positive definite. Indeed the check done in the Octave code was not precise, especially if involved complex numbers. (To recall: in the algorithm is computed alpha = (r' * r) / (p' * (A*p)) and the control was: if alpha <= 0, break. The problem is that also if alpha was complex the algorithm must stop, moreover for Octave every complex number is greater than 0, so pcg didn't notice if alpha has an imaginary part).

After a (brief) discussion with the Octave community I changed the control on alpha. First of all it checks separately the numerator and the denominator of alpha (because if there is a preconditioner matrix then the definition of alpha "slightly" changes and also the numerator can be negative and/or complex), then the algorithm stops if

(abs (imag (num)) >= eps (real (num))*tol/eps) or real (num) <= 0

(the same control is made for den).

In this way the algorithm stops if really the numerator (or denominator) of alpha is negative or if it has an imaginary part "too big" (this is relative to the tolerance, because we thought that it is not the case to stop the algorithm if the imaginary part is, for example, 1e-16). If this happen then the FLAG output is setted as 4.

Unfortunately, alpha can be real and positive at every iteration also if the matrix is not positive definite, so there are cases when this check doesn't works, but there are nothing that we can do.

Other changes

I made other changes to make the code more compatible with Matlab. The most relevant are:

Making some tests with Matlab I noticed that the output approximation X is not the last computed, but the one with the minimum residual and the output variable IT gives the iteration which X was computed. The Matlab help is not very explicit about this. I noticed this fact because controlling its outputs, sometimes lenght(RESVEC) is greater than IT, where RESVEC is the vector of the residuals. So at first I thought that there was something wrong (it was strange that there are 10, 20 or 30 residuals more than iterations), then I noticed this "minimum" relation. Then I modified the pcg in such a way also the Octave one has this output.
I added the check about the stagnation. Indeed there is the possibility that the algorithm stagnates (in general when the (preconditioned) matrix A is not positive definite but the algorithm doesn't notice it). So I add the control:
if (norm (x - x_pr) <= eps*norm (x)) , where x_pr is the approximation of the previous iteration, then the algorithm stagnates and the FLAG output is setted as 3.
I added a check to control if the preconditioner matrix is singular, and if this happens then FLAG = 2. To do this, I use a try-catch approach:

try

warning("error","Octave:singular-matrix");

. . .   (pcg computations)

catch
flag = 2;
end_try_catch

In this way the warning "Octave:singular-matrix" is setted as an error, and with the try-catch I control if this error appears.
I used this approach because in Matlab and Octave there are different results when we try to solve M \ b, with M singular:
In Matlab there is always a warning and the output has some NaN or Inf;
I Octave the warning appears only the first time that we try to solve it and the output is the least squares minimization.
Unfortunately, since in Octave this warning appears only the first time, if we use pcg for a second time (without quitting Octave) with the same linear system with singular M, then FLAG = 2 is not setted.
There is another case where this approach doesn't work: when the matrix is setted as "Diagonal Matrix" then this warning doesn't appear. For example:

octave-cli:3> M1 = eye(3);
octave-cli:4> M1(3,3) = 0
M1 =

Diagonal Matrix

   1   0   0
   0   1   0
   0   0   0

octave-cli:5> M2 = [0 0 0;0 1 0; 0 0 1]
M2 =

   0   0   0
   0   1   0
   0   0   1

octave-cli:6> M1 \ ones(3,1)
ans =

   1
   1
   0

octave-cli:7> M2 \ ones(3,1)
warning: matrix singular to machine precision
ans =

   0
   1
   1

In this case M1 and M2 are different singular matrices, but if I try to solve a linear system with them there is a warning only with M2.
I setted the output flag as in Matlab:
FLAG = 0 : algorithm converged
FLAG = 1 : maximum number of iterations reached
FLAG = 2 : preconditioner matrix is singular
FLAG = 3 : stagnation
FLAG = 4: (preconditioned) matrix A not positive definite

About FLAG = 2 and FLAG = 4:

"FLAG = 2": the Matlab help tell that this flag appears when M is ill-conditioned, but the only case in which I can reproduce this flag is when M is singular. But singular and ill-conditioned matrix are different concepts, (ill-conditioned when the condition number is high, singular when det(A) is zero). Indeed, for example, Hilbert matrices are ill-conditioned but not singular. Since in Matlab I had FLAG = 2 only when M singular, in the code and in my pcg help is written that this flag appears only when M is singular, and not when M ill-conditioned.

"FLAG = 4": Also with this flag Matlab is not very clear. In the help is written that this flag appears when the algorithm can not continue due to some values too big or too small. But I can reproduce it only when A is not positive definite, so in my pcg help I wrote that it appears only when the algorithm detect that A is not positive definite.
I fixed the output strings that appears when there are asked less than 2 outputs and I wrote them in such a way they are similar to the Matlab ones.
I update the pcg help, in such a way it is synchronized with these changes.

GMRES

More or less I made the same changes of the pcg algorithm also in gmres (the discussion about the positive definitiveness of the matrix doesn't care here, since the gmres algorithms must works with any matrix, so there are 3 types of flag and not 4 as in the pcg).
One little things about the variable IT. In both Matlab and Octave, there are written in the help that in the variable IT are memorized the inner and the outer iterations, but they are not so clear about what they are. For this reason in the help I added a little deep explanation about what they are and how they are related with the total number of iterations. In particular the inner iterations are the iteration before that the restart is applied, instead the outer iterations count how many times the restart is applied. Then the total number of iterations are:
(Outer Iterations)*RESTART + (Inner Iterations).

I think that the most of the work about these two algorithms is done. Up to small changes and/or bugs that I can find trying my codes I think that they are more or less definitive.

If you find some bugs, you find something not clear, or have some advices please contact me.

Dear all,

In this second and half week I worked on the pcg to try to improve it.
All the codes can be found at
https://bitbucket.org/CrisDorigo/socis16-octave-iter_meth/src
in the folder "pcg_codes".

The first thing that I made is the private function __checkandstring__ (the name is not definitive).
This function checks if the input argument A, b, M1 and M2 are consistent, i.e. if A, M1, M2 are matrices or function handle, and if A is a matrix, checks also that the number of columns of A are equal to the number of rows of b.
In addition, this function has as output two strings: is_prec and type.
The first one can be "prec" if M1 is not empty, or "no_prec" otherwise (I check only M1 because in the pcg documentation is written that if M1 is empty then no preconditioner is applied).
Instead the string type can be "matrix", "function_handle" or "mixed". If M1 empty then type is "matrix" if A is a matrix, "function handle" otherwise.
If M1 is not empty then the string type is "matrix" if A, M1, M2 matrices, "function_handle" if they are all function handle or "mixed" if not all of them are matrices or function handle.
I made this function as general as possible to use it also for the other algorithms that need an improvement in my project.
You can see the code in the function __checkandstring__.m.

After that I take the pcg Octave code and I adapted it to the different situation:
I made a switch-case that treat differently the three cases mentioned before ("matrix", "function handle" and "mixed"). Moreover the cases "matrix" and "function handle", have an if block that check if there exist the preconditioners: if they are not passed, then applies the unpreconditioned conjugate gradient, otherwise applies the preconditioned one.
For the "mixed" case instead, differently from my previous post (in which I said that if A, M1 and M2 are not of the same type, then I set them as functions handle) I check every time the type of A / M1 / M2 and if it is a matrix then I apply the simple matrix-vector product, if it is a function handle then I made the evaluation.
You can see this code in the function pcg_tmp_02.m.

I made this choice because I noticed that it is more (time) efficient than to transform all in functions handle. Indeed, before the two codes mentioned above, I made a function similar to __checkandstring__ that in case A, M1 and M2 not of the same type then transform them all in function handle, and then I made a version of pcg that works with this function (you can see these code in the files __check02__ and pcg_tmp_01 respectively). Then I tested the octave pcg, pcg_tmp_01 and pcg_tmp_02 20 times on the same example, with and without preconditioners, and passing all the variables as matrices, as function handles and not all of the same type. At the end I plotted the times and I compared it and I noticed that the better performs are given form pcg_tmp_02.
You can see this example in the file example_pcg.m.

Talking with the mentors, they told me that (using as example pcg_tmp_02) the code is too long. This because the pcg code is "repeated" 5 times:

matrix case without preconditioners
matrix case with preconditioners
function handle case without preconditioners
function handle case with preconditioners
mixed case (it is only with preconditiones because otherwise there is only A and it is or a matrix or a function handle)

They suggested me to substitute this switch-case approach with another approach, for example to use nested function or to insert some scripts in the private folder. They told me to ask to all the maintainers which is your preferred approach or to suggest me another one in case.

Complex case

Making some tests with complex matrices, I noticed strange behaviour of the pcg.
For example if I test the Octave pcg with the matrix

A = [2 , 2+1i, 4;
        2-1i , 3, 1i;
        4 , -1i , 1 ]

that is an hermitian matrix, but not positive definite (indeed its eigenvalue are not all positive: -3.16512506440744e+00   2.85307928516935e+00 6.31204577923809e+00 )
and with right-hand-side

b = [ 5.59693687377178e-01;
       2.42878791850453e-02;
         9.77795880620014e-01 ]

the Octave pcg doesn't notice that this matrix is not positive definite.
Indeed the result is:

x = pcg(A,b)
warning: pcg: maximum number of iterations (3) reached
warning: the initial residual norm was reduced 2.12736e+15 times.
x =

   2.04219909327648e-01 - 3.21781116721895e-02i
-1.11630690554651e-01 + 5.38299446813016e-02i
   1.07086298628122e-01 + 1.70817561341068e-02i

Studying the code, I notice that the only check for the positive definitiveness of the matrix is on alpha. Not going into deep details, this alpha is necessary to compute the approximant solution at every step (indeed at every iteration, x = x + alpha*p for a certain p).
If there are no preconditioners (as in our example) alpha is defined as:
alpha = < r, r > / < p, A*p >
(where < , > is the vector scalar product, and r is the residual of the previous iteration).
Then, since we suppose that A is positive definite, alpha must be real and positive (both in case A real or complex matrix).
Since the numerator is positive for every r, if alpha is negative then surely A is not positive definite. Indeed the check in the Octave pcg code is:

if (alpha <= 0.0)
      ## Negative matrix.
      matrix_positive_definite = false;
    endif

If A is not positive definite then < p, A*p> is not always negative, so if we are "lucky", if we pass a not positive definite matrix in pcg and from the computation alpha is positive at every iteration, there is nothing that we can do.
But if we use A and b definite as previous, pcg makes three iterations and the alpha are:

alpha = 2.11061741184511e-01 - 6.08511684384459e-20i
alpha = -5.59991838158184e-01 + 1.05798887633281e-17i
alpha = 1.48434184396776e-01 + 8.13133738226767e-18i

(the function pcg01.m in the folder "pcg_codes" is the octave version of pcg that print at every iteration alpha)

We notice that these three alpha have an imaginary part, but it is under the machine precision, so they are not computationally relevant.
But the second alpha is (obviously) negative, and the pcg doesn't notice it!

This because for Octave:

octave4.1-cli:29> -5.59991838158184e-01 + 1.05798887633281e-17i < 0
ans = 0

and

octave4.1-cli:30> -5.59991838158184e-01 + 1.05798887633281e-17i > 0
ans = 1

Making some tests I notice that a number with an imaginary part is greater than 0, also if this imaginary part is negative, indeed

octave4.1-cli:31> -3-1i > 0
ans = 1

I don't understand very well what Octave do to compare two complex numbers (I think that it compares the modules, i.e. if we want to verify if a < b, where at least one of a or b is complex, then it compares abs(a) and abs(b), but I'm not sure).
I compared some complex numbers also in Matlab, and I think that it compares only the real part (and "forgetting" the imaginary part).

I talked with the mentors about this fact, but there are some questions:

Is it correct to check only the real part of alpha?
Also if the imaginary part of alpha is relevant?
(if for example we use the matrix

A = [ 0.71290 + 0.59353i   0.97470 + 0.36591i   0.50060 + 0.53652i;
   0.37411 + 0.11662i   0.38904 + 0.43489i   0.03555 + 0.23431i;
   0.35482 + 0.23601i   0.44859 + 0.31402i   0.54356 + 0.72676i]

that is a matrix not symmetric, and not positive definite,
and b = ones(3,1), we obtain as alpha:

alpha = 0.47882 - 0.51052i
alpha = 1.4276 - 1.0314i
alpha = 0.43535 - 0.36971i

and the pcg doesn't give any flag of this "non-correctness" of A, but also if we check only the real part of alpha the code doesn't give any flag or any error)
Do we break the algorithm if alpha is complex? Also if the imaginary part is under the machine precision?
(it is possible that also if A is positive definite and symmetric but the alpha are complex, because of computations and the machine precision)

The mentors suggested me two things:

First of all to check both numerator and denominator of alpha: this because if there are preconditioners also the numerator can be negative, since it becomes <M*r, r>, where M is the inverse of M1*M2.
To break the algorithm if the imaginary part of (the numerator or the denominator of) alpha is relevant. But, how to decide if it is relevant?
For example, 1e-15 is relevant? And 1e-14?
So they suggested me a possible criterion to use to determine the relevancy of this imaginary part:

if real(alpha) + (eps / tol)*imag(alpha) == real(alpha)
break
endif

The motivation of this criterion is because, since we want to find a solution within a tolerance (tol), we "normalize" the relevancy of this imaginary part with this tolerance.

The mentors also told me to ask also to all the maintainers, because this is their suggestion, but can be different from the direction that wants the Octave community.
Then I please ask from the community some advices of how to proceed for this situation.

Actually (and until I'll have some advices) I'm working on the gmres, and I'll try to improve it in the same manner of the first part of this post for the pcg.

Please contact me with some feedback via e-mail at cristiano.dorigo@hotmail.it or as a comment under this post.

GNU Octave SOCIS16 : Improve iterative methods for sparse linear systems

lunedì 18 luglio 2016

Pcg and Gmres

domenica 3 luglio 2016

Pcg analysis