Scalable s-step Preconditioned Conjugate Gradient with Chebyshev Basis and Gauss-Seidel Gram Solve
Pasqua D'Ambra, Massimo Bernaschi, Mauro G. Carrozzo, Stephen Thomas

TL;DR
This paper introduces a scalable s-step PCG method that pairs a Chebyshev-stabilized Krylov basis with Forward Gauss-Seidel (FGS) solves of the reduced Gram systems, reducing synchronization costs on GPU architectures while maintaining convergence.
Contribution
It develops an s-step PCG variant that uses a Chebyshev basis and a Forward Gauss-Seidel Gram solve, supported by theoretical analysis and large-scale GPU experiments.
Findings
Achieves convergence comparable to classical CG
Reduces synchronization overhead on GPU architectures
Provides a theoretical analysis supporting the use of a small number of Forward Gauss-Seidel (FGS) sweeps
Abstract
We present a variant of the s-step Preconditioned Conjugate Gradient (PCG) method that combines a Chebyshev-stabilized Krylov basis with a Forward Gauss-Seidel (FGS) iteration for the solution of the reduced Gram systems. In s-step Conjugate Gradient, multiple search directions are generated per outer iteration, reducing global synchronization costs but requiring the solution of small dense Gram systems whose conditioning is critical for stability. We analyze the structure of the Chebyshev Gram matrix and show that its moment-based representation is associated with favorable conditioning properties for moderate step sizes. Building on inexact Krylov theory and on the classical equivalence between FGS and Modified Gram-Schmidt (MGS), we provide a structural analysis and theoretical rationale supporting the use of a small number of FGS sweeps, while preserving the convergence behavior…
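The two ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes a dense 1D Poisson matrix as the test operator, no preconditioner, the spectral bounds 0 and 4 for the Chebyshev shift and scale, an A-weighted Gram matrix G = VᵀAV, and a right-hand side Vᵀv. All function names are hypothetical. The sketch compares the Gram conditioning of a monomial basis against a Chebyshev basis, then applies a fixed number of FGS sweeps to the Chebyshev Gram system.

```python
import numpy as np

def laplacian_1d(n):
    """Dense 1D Poisson matrix (SPD); its eigenvalues lie in (0, 4)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def monomial_basis(A, v, s):
    """Columns v, A v, ..., A^(s-1) v."""
    V = np.empty((v.size, s))
    V[:, 0] = v
    for j in range(1, s):
        V[:, j] = A @ V[:, j - 1]
    return V

def chebyshev_basis(A, v, s, lam_min, lam_max):
    """Columns T_j(B) v, where B maps [lam_min, lam_max] onto [-1, 1]."""
    c = 0.5 * (lam_max + lam_min)  # center of the assumed spectral interval
    h = 0.5 * (lam_max - lam_min)  # half-width of the assumed spectral interval

    def apply_B(x):
        return (A @ x - c * x) / h

    V = np.empty((v.size, s))
    V[:, 0] = v
    if s > 1:
        V[:, 1] = apply_B(v)
    for j in range(2, s):
        # Three-term recurrence T_j = 2 B T_{j-1} - T_{j-2}
        V[:, j] = 2.0 * apply_B(V[:, j - 1]) - V[:, j - 2]
    return V

def forward_gauss_seidel(G, b, sweeps):
    """Run a fixed number of forward Gauss-Seidel sweeps on G y = b from y = 0."""
    y = np.zeros_like(b)
    for _ in range(sweeps):
        for i in range(b.size):
            y[i] = (b[i] - G[i, :i] @ y[:i] - G[i, i + 1:] @ y[i + 1:]) / G[i, i]
    return y

def energy_error(G, b, y):
    """Error of y in the G-norm, measured against a direct dense solve."""
    e = y - np.linalg.solve(G, b)
    return float(np.sqrt(e @ G @ e))

def gram_setup(n=200, s=8, seed=0):
    """Build both bases and return Gram condition numbers plus the Chebyshev system."""
    A = laplacian_1d(n)
    v = np.random.default_rng(seed).standard_normal(n)
    v /= np.linalg.norm(v)
    V_mono = monomial_basis(A, v, s)
    V_cheb = chebyshev_basis(A, v, s, 0.0, 4.0)  # assumed bounds on the spectrum
    G_mono = V_mono.T @ A @ V_mono
    G_cheb = V_cheb.T @ A @ V_cheb
    return np.linalg.cond(G_mono), np.linalg.cond(G_cheb), G_cheb, V_cheb.T @ v

if __name__ == "__main__":
    cond_mono, cond_cheb, G, b = gram_setup()
    print(f"cond(G), monomial basis:  {cond_mono:.3e}")
    print(f"cond(G), Chebyshev basis: {cond_cheb:.3e}")
    for k in (1, 5, 20):
        y = forward_gauss_seidel(G, b, k)
        print(f"{k:2d} FGS sweeps, G-norm error: {energy_error(G, b, y):.3e}")
```

Because G is symmetric positive definite, each forward Gauss-Seidel sweep reduces the error in the G-norm, so the printed errors shrink as the number of sweeps grows. How few sweeps suffice inside an actual s-step PCG iteration is the question the paper's analysis addresses. This toy problem does not settle it.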
