Tuesday, 30 August 2016

CGS, TFQMR and final revision of all the codes

Dear all,

During these weeks I worked on improving CGS, which I rewrote from scratch. Then I implemented TFQMR, which is not available in Octave. Finally, I made a revision of the codes that I improved in the past weeks.
All the revised files (gmres, pcg, bicg, bicgstab) and the new ones (cgs and tfqmr) are available here:

 https://CrisDorigo@bitbucket.org/CrisDorigo/socis16-octave-iter_meth

in the folder definitive_codes

CGS 

As written in my second post, the current Octave CGS implementation is not very clear. Indeed, I didn't understand most of the code (I was not able to recognize how the algorithm was adapted). Moreover, I tested it and the results are not correct (in my second post there is an example of this).
So I decided to rewrite it from scratch, closely following the algorithm given by Saad in the suggested reference book.
I wrote it following the pattern that I used in the other codes (i.e. writing three sub-scripts, one for each of the different cases: all matrices, all function handles, or mixed). It also has the same characteristics as the other improved methods, for example: the approximation returned is the one with the minimum residual, and there are five types of flag (0 convergence, 1 maximum number of iterations reached, 2 singular preconditioner, 3 stagnation, 4 division by zero during the process). I won't go into detail about these points here, to keep this post short, since they are already explained in my previous posts.
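As a quick illustration, here is a minimal usage sketch of the rewritten cgs, assuming it keeps the standard Octave/Matlab calling convention described above (the matrix, tolerance and preconditioner are only example values):

    n = 100;
    A = spdiags ([-ones(n,1), 4*ones(n,1), -ones(n,1)], [-1 0 1], n, n);
    b = ones (n, 1);
    M = diag (diag (A));                 # simple Jacobi preconditioner
    [x, flag, relres, iter, resvec] = cgs (A, b, 1e-8, n, M);
    ## flag = 0 means convergence; x is the iterate with the minimum residual
    ## and iter is the iteration at which it was computed.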

TFQMR

In my last post I wrote that the mentors told me to implement a new algorithm not available in Octave. Since I had no suggestion about which method to write, and the community had no particular needs, I chose the transpose-free QMR algorithm because it is the only one of those methods treated in the Saad book.
Also for this algorithm I used the same pattern as the others (i.e. the three sub-scripts), and I arranged it so that it has the same type of input and output as the others.

One remark about the number of iterations effectively performed. In the Saad book the method behaves differently depending on whether the iteration number is even or odd, but the approximation is computed at every iteration. Making some tests in Matlab with its tfqmr, I noticed that if the method converges, the vector of residuals is twice as long as the number of iterations that Matlab reports as necessary for convergence. So I think that Matlab counts the block (odd iteration + even iteration) as one iteration, i.e. Matlab_it = odd_Saad_it + even_Saad_it. In this way the Saad iterations effectively performed are two times the iterations reported by Matlab. I decided to count the iterations in this way to be compatible.
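A quick way to see this counting convention in practice (a sketch, assuming tfqmr exposes the usual [x, flag, relres, iter, resvec] outputs; the matrix and tolerance are only example values):

    n = 50;
    A = spdiags ([-ones(n,1), 4*ones(n,1), -ones(n,1)], [-1 0 1], n, n);
    b = A * ones (n, 1);
    [x, flag, relres, iter, resvec] = tfqmr (A, b, 1e-10, n);
    ## each counted iteration is one odd + one even Saad step, so resvec
    ## holds roughly two residual norms per counted iteration:
    printf ("iter = %d, length (resvec) = %d\n", iter, length (resvec));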

LAST REVISION

Finally, I checked (one more time) the codes that I improved during these weeks (i.e. pcg, gmres, bicg, bicgstab).
I updated the documentation (for example: making it uniform across the methods, giving more details about the preconditioning, adding the reference book, ...).
Then I went through the codes and tried to make them more readable and to adapt them to the Octave conventions.
I also added some tests to all the codes, mostly to check that all the sub-scripts work (i.e. I checked every combination of A matrix/function, M1 matrix/function, M2 matrix/function), but also to check that the flags work correctly (I checked only flags 0, 1 and 2, since the others are difficult to reproduce) and that these methods also solve complex linear systems.
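As an example of the kind of test added (a sketch in Octave's %! test style, with illustrative values; the actual tests in the patch may differ), this runs the same small system with A and the preconditioner given either as matrices or as function handles, and checks that both calls converge to the same solution:

    %!test
    %! A = toeplitz ([4, -1, zeros(1, 18)]);
    %! b = A * ones (20, 1);
    %! M = diag (diag (A));
    %! Afcn = @(x) A * x;
    %! Mfcn = @(x) M \ x;
    %! [x1, flag1] = pcg (A, b, 1e-8, 100, M);       # A and M as matrices
    %! [x2, flag2] = pcg (Afcn, b, 1e-8, 100, Mfcn); # A and M as function handles
    %! assert (flag1, 0);
    %! assert (flag2, 0);
    %! assert (x1, x2, 1e-6);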

The mentors suggested that I make a patch with the improved methods (pcg, gmres, bicg, bicgstab and cgs), so that reviewing them is easier and, if there is time, they can (maybe) be included in Octave 4.2.
This patch is available here

https://CrisDorigo@bitbucket.org/CrisDorigo/socis16-octave-iter_meth

and it is the file SoCiS16_Improve_iterative_methods.diff

I made this patch from the Octave changeset 22406:dc4e2203cd15 of the Octave 4.1.0+ version.


THANKS TO OCTAVE


Since the official deadline for the SoCiS is tomorrow (the 31st of August), this is my last post on this blog.

To give a brief summary: during this summer of code I improved most of the iterative methods for solving linear systems. In particular, I improved pcg, gmres, bicg, bicgstab and cgs.
For each of these methods I wrote three sub-scripts to make them more efficient, separating the particular cases "matrix", "function handle" and "mixed", in such a way that, according to the input data, the method uses the appropriate sub-script for the computations.
I tried to make these methods more compatible with Matlab (for example, noticing that Matlab returns the approximation with the minimum residual and not the last one computed, as Octave used to do, or adding the possibility of "flag = 2", i.e. singular preconditioner).
I tried to fix some unclear situations that I noticed: for example, pcg did not notice that a complex matrix was not Hermitian positive definite, and bicg, bicgstab and cgs could run into a division by zero.
Then I implemented tfqmr, which is available in Matlab but not in Octave.

Unfortunately, there are two other methods that need improvement (pcr and qmr), but due to unexpected problems with the above methods I did not have enough time to improve them as well. When my study commitments are lighter, I would like to try to improve these two methods too.

I want to thank the whole Octave community for giving me this fantastic opportunity to face a big project like this, and to learn many theoretical and practical things about these iterative solvers for linear systems.
I especially want to thank the mentors, who were always available to give suggestions, advice and ideas on how to proceed with the work, mostly when unexpected problems came up.
I hope that these improvements will be useful to Octave; please contact me if something is not clear or wrong.
I also hope to continue to help the Octave community in the future.

Once more, thanks to all of you.

Tuesday, 2 August 2016

BICG and BICGSTAB

Dear all,

In this last period I worked on the BICG and BICGSTAB algorithms.
You can find the codes in the folder bicg_bicgstab at
https://CrisDorigo@bitbucket.org/CrisDorigo/socis16-octave-iter_meth

Since the main loops of both of them are only about 30-40 lines of code, I wrote them from scratch, because in my opinion this was easier than modifying and adapting the original code to the many situations that occurred.

The strategy that I used is the same as for PCG and GMRES: for each algorithm I wrote three sub-scripts to separate the cases in which (a) the matrix and the preconditioners are all matrices, (b) they are all function handles, or (c) some are matrices and some are function handles.
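To give an idea of the structure, here is a minimal sketch of the dispatch; the sub-script names in the messages are only placeholders, not the actual names used in the repository:

    A  = sparse (toeplitz ([2, -1, 0, 0]));  # example inputs, just to run the check
    M1 = @(x) x;                             # preconditioner given as a function handle
    M2 = speye (4);
    if (isnumeric (A) && isnumeric (M1) && isnumeric (M2))
      disp ("case (a): use the 'all matrices' sub-script");
    elseif (is_function_handle (A) && is_function_handle (M1) && is_function_handle (M2))
      disp ("case (b): use the 'all function handles' sub-script");
    else
      disp ("case (c): use the 'mixed' sub-script");
    endif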

As for PCG and GMRES, I tried to make these codes compatible with Matlab.

My BICG and BICGSTAB codes do not have documentation yet, but I'll add it as soon as possible.


Main modifications on BICGSTAB and BICG

Many of the changes that I applied to BICGSTAB and BICG are quite similar to those that I made in PCG and GMRES. Here is a short list to summarize them:

  • Now the approximation of the solution X that is returned is the one with the minimum residual, and not the last one computed by the algorithm. Also, the output variable ITER contains the iteration at which X was computed, not the number of iterations performed.
  • In the original Octave BICGSTAB code the stagnation criterion is
    resvec(end) == resvec(end - 1)
    while in the BICG code it is res0 <= res1 (where res1 is the current residual and res0 the previous one).
    I replaced this criterion with:
    norm(x - x_pr) <= norm(x)*eps
    (where x_pr is the approximation at the previous iteration); a small sketch after this list illustrates the check.
    This criterion is more effective, since it checks the variation of the approximations and not the variation of the residuals, and it seems closer to the "stagnation" described in the Matlab help. Moreover, in BICGSTAB (and in BICG) the residuals fluctuate a lot (in general they are not decreasing), so it is better to check the approximations directly.
    Because of these fluctuations, the BICG criterion is wrong: with it, the Octave BICG usually stops even when it is not necessary.
  • I added the "flag = 2" criterion, the one that Matlab describes as "preconditioner matrix ill-conditioned", which in Octave was not checked. The check that I implemented tests the singularity of the preconditioner matrix M, since it seems that in Matlab this flag appears only in this situation.
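The following toy, self-contained sketch shows the mechanics of the new stagnation check (a plain Richardson iteration is used only as a stand-in for the solver's main loop; in the real codes the check runs inside that loop and sets flag = 3):

    A = [4, 1; 1, 3];
    b = [1; 2];
    x = zeros (2, 1);
    flag = 1;                              # assume "maximum iterations reached"
    for k = 1:100
      x_pr = x;                            # approximation at the previous iteration
      x = x + 0.2 * (b - A * x);           # toy update, only for the demo
      if (norm (x - x_pr) <= norm (x) * eps)
        flag = 3;                          # stagnation: iterates no longer change
        break;
      endif
    endfor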
Now I will focus on some situations regarding BICGSTAB:
  •  I added the "half iterations" as in Matlab: studying the Saad implementation, I noticed that the approximation X is built in two steps at every iteration, since it is defined as
    x = x + alpha*p + omega*s
    So, to be compatible with Matlab (since alpha*p and omega*s are not related and are computed at different points), I set x = x + alpha*p (and its residual) as the half iteration, and x = x + alpha*p + omega*s as the complete iteration; see the first sketch after this list.
    In this way the "real" total number of iterations performed (recalling that ITER contains the iteration at which X was computed) is
    (length(RESVEC) - 1) / 2.
  • There was a problem when the matrices A, M1, M2 and/or the right-hand side b are complex: many times the residual blows up and the method does not converge. In the Saad reference book (from which, I think, the Octave BICGSTAB was taken), the method is formulated only for the real case. The only difference is that, using this formulation, some inner products are swapped (e.g. it uses <v, w> instead of <w, v>). In the real case there is no problem, since the real inner product is symmetric, but in the complex case it is not: the result of <v, w> is the complex conjugate of <w, v> (see the second sketch after this list).
  • In the Matlab help I noticed that there is the possibility of "flag = 4". I tried to reproduce it and then used the same input data in the Octave BICGSTAB: the approximation X was entirely composed of NaN, and so was the vector of residuals. I then studied the Octave code and the reference book in depth, and discovered that in this situation there were divisions by zero, probably caused by the non-symmetric Lanczos biorthogonalization (which is used in both BICG and BICGSTAB).
    With this Lanczos algorithm a "serious breakdown" is possible (without going into deep detail, it happens when a certain inner product is zero instead of a non-zero number), and there are no easy examples to reproduce it. The reference book suggests two possibilities to avoid this situation: to restart the Lanczos biorthogonalization, or to use a difficult variant called "Look-Ahead Lanczos".
    I tried to study the "Look-Ahead", but the Saad book only mentions the general idea, and it is not simple to understand and implement quickly. Then I tried the restart strategy, but it is not efficient (an easy experiment to try it is: run BICGSTAB to find the last iteration without NaN, then run BICGSTAB again with that X as initial guess); indeed the method does not converge and the residual does not decrease.
    So (since Matlab does not say whether it uses the Look-Ahead or not), in the BICGSTAB implementation that I wrote I do not use any of these strategies, and I simply set "flag = 4" when the guilty inner product is zero.
    To see an example of this situation in the Octave BICGSTAB, see the file example_bicgistab.m
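First sketch: a quick check of the half-iteration counting convention (assuming bicgstab exposes the usual [x, flag, relres, iter, resvec] outputs; the matrix is only an example):

    n = 30;
    A = spdiags ([-ones(n,1), 4*ones(n,1), -2*ones(n,1)], [-1 0 1], n, n);
    b = A * ones (n, 1);
    [x, flag, relres, iter, resvec] = bicgstab (A, b, 1e-10, n);
    ## RESVEC holds one residual norm per half iteration (plus the initial one),
    ## so the number of full iterations actually performed is:
    printf ("iter = %g, full iterations = %g\n", iter, (length (resvec) - 1) / 2);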
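Second sketch: a short, self-contained illustration of the complex inner-product issue: the two orderings give complex-conjugate results, so they can be exchanged only in the real case.

    v = [1 + 2i; 3 - 1i];
    w = [2 - 1i; 1 + 4i];
    p1 = v' * w;        # <v, w>, with v conjugated
    p2 = w' * v;        # <w, v>, with w conjugated
    disp (p1);
    disp (p2);
    disp (conj (p2));   # equal to p1: the two products are complex conjugates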
Instead, during the implementation of BICG no strange situations showed up (for now). For both BICG and BICGSTAB I ran the same tests that are given in their original Octave implementations, and I added some tests to try the "flag = 2" and "flag = 4" situations.

DURING THE REMAINING TIME

I know that fixing PCG, GMRES, BICG and BICGSTAB took me a lot of time. This is because during their implementation many strange situations emerged that required a deep revision of these algorithms, and often also a discussion (with the mentors or the community) about how to solve particular problems. The mentors also told me that they expected less time/work to improve and revise the codes already implemented.
They also told me to spend the remaining time of the SoCiS implementing some methods not present in Octave. For this reason, I want to ask you whether there is one of the suggested algorithms (minres, symmlq, symmetr, tfqmr, bicgstabl or lsqr) that you would prefer me to implement.

As usual, if there is something wrong or not clear, please contact me via e-mail or write a comment.