Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Look-ahead Lanczos Biorthogonalization #321

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

platawiec
Copy link
Contributor

@platawiec platawiec commented Jul 28, 2022

Background

This PR introduces LookAheadLanczosDecomp, an iterable which performs the Look-ahead Lanczos process. There is a companion PR #322 for Quasi-minimal residual method (QMR) for solving linear systems which uses the Krylov subspace based on this technique.

The motivation for using the look-ahead process is for use in non-hermitian linear systems, particularly ones which aren't well-conditioned, where the standard Lanczos process encounters either exact or numerical breakdown. The look-ahead technique circumvents the breakdown by constructing blocks in the sequence, such that the generated sequences avoid breakdown.

Implementation Details

We follow the coupled two-term recurrence implementation of the Lanczos process, as in (Freund, 1994). This generates a V-W and a P-Q sequence. Things are kept generic and in principle this should be a GPU-capable implemention, which as far as I can tell would be a first GPU implementation of the LAL process.

Additionally, we follow the paper's implementation by using the matrix transpose to generate the companion Krylov sequence (not the matrix hermitian conjugate).

Limited memory access and allocations

Compared to the regular Lanczos process, we generate longer (perhaps arbitrarily long) vectors which describe the recurrence relations the process uses. Therefore, the values must be stored in flexible containers. We introduce limited-memory matrices which respect the allocation of memory the user provides, and these come in three flavors:

  1. LimitedMemoryMatrix, which stores columns (for instance the V and W sequence vectors), up to the memory limit
  2. LimitedMemoryUpperTriangular, which stores a recurrence relation in upper triangular form
  3. LimitedMemoryUpperHessenberg, which stores a recurrence relation in upper Hessenberg form
    For the latter two, access to values outside the memory range return 0. By construction, for a given step in the Lanczos sequence, we will only access the values within the memory of either of these matrices. If, however, the user requests too little memory, then the process will error. This should be rare.

The optimizations here could be pushed further - the matrices describing the recurrences are more structurally sparse than what is implied by the LimitedMemory matrices - but that would introduce more code complexity so I decided to keep it simpler. Likewise, in places we store larger vectors than we strictly need to, but this is not significant compared to the matrix-vector products.

Differences from paper

When encountering a block that is too large (above the user-supplied max_block_size), the process closes the block and continues. This is in contrast to Freund & Nachtigal, which recommend resetting the estimate of the operator norm and re-computing the block.

We also test breakdown of the D and E operators against a tolerances of eps()^(1/3). This is suggested in Freund 1993, though Freund 1994 uses plain eps(). In my experience eps() is too conservative, hence why I went with the former.

Tests

This implementation is tested against some contrived matrices and starting vectors which guarantee the generation of blocks in the look-ahead Lanczos sequence. These are then tested against identities derived in the Lanczos process. I welcome additional suggestions.

Future directions

  • Further micro-optimization, validation on GPU
  • A specialized implementation for complex-symmetric (or symmetric) matrices, which reduces by a factor of ~2x the memory usage and matrix-vector products.

References

  • Freund, R. W., & Nachtigal, N. M. (1994). An Implementation of the QMR Method Based on Coupled Two-Term Recurrences. SIAM Journal on Scientific Computing, 15(2), 313–337. https://doi.org/10.1137/0915022
  • Freund, R. W., Gutknecht, M. H., & Nachtigal, N. M. (1993). An Implementation of the Look-Ahead Lanczos Algorithm for Non-Hermitian Matrices. SIAM Journal on Scientific Computing, 14(1), 137–158. https://doi.org/10.1137/0914009

@platawiec
Copy link
Contributor Author

Remaining test failures are due to LOBPCG. I understand this PR is a lot but I would appreciate a comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant