Scalable s-step Preconditioned Conjugate Gradient with Chebyshev Basis and Gauss-Seidel Gram Solve

Pasqua D'Ambra; Massimo Bernaschi; Mauro G. Carrozzo; Stephen Thomas

arXiv:2603.09790·math.NA·March 30, 2026

TL;DR

This paper introduces an s-step PCG variant that pairs a Chebyshev-stabilized Krylov basis with Forward Gauss-Seidel (FGS) sweeps for the reduced Gram systems, cutting global synchronization costs on GPU architectures while preserving convergence behavior.

Contribution

The paper develops an s-step PCG variant that uses a Chebyshev basis and an FGS solve of the Gram systems, supported by theoretical analysis and large-scale GPU experiments.

Findings

01 Achieves convergence comparable to classical CG

02 Reduces synchronization overhead on GPU architectures

03 Provides theoretical analysis supporting the use of a small number of FGS sweeps

Abstract

We present a variant of the s-step Preconditioned Conjugate Gradient (PCG) method that combines a Chebyshev-stabilized Krylov basis with a Forward Gauss-Seidel (FGS) iteration for the solution of the reduced Gram systems. In s-step Conjugate Gradient, multiple search directions are generated per outer iteration, reducing global synchronization costs but requiring the solution of small dense Gram systems whose conditioning is critical for stability. We analyze the structure of the Chebyshev Gram matrix and show that its moment-based representation is associated with favorable conditioning properties for moderate step sizes. Building on inexact Krylov theory and on the classical equivalence between FGS and Modified Gram-Schmidt (MGS), we provide a structural analysis and theoretical rationale supporting the use of a small number of FGS sweeps, while preserving the convergence behavior…
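To make the Gram-solve idea concrete, the following is a minimal sketch of forward Gauss-Seidel sweeps applied to a small dense symmetric positive definite system G x = b. It is not the authors' implementation: the function names `forward_gauss_seidel` and `max_residual`, the 3x3 matrix `G`, and the right-hand side `b` are all hypothetical, chosen only to show how a fixed, small number of sweeps yields an approximate solution without a factorization.

```python
def forward_gauss_seidel(G, b, sweeps, x0=None):
    """Apply `sweeps` forward Gauss-Seidel sweeps to G x = b.

    G is a dense square matrix given as a list of rows (hypothetical layout);
    it is assumed to have nonzero diagonal entries.
    """
    n = len(b)
    x = [0.0] * n if x0 is None else list(x0)
    for _ in range(sweeps):
        for i in range(n):
            # Entries x[j] with j < i were already updated in this sweep;
            # entries with j > i still hold values from the previous sweep.
            s = sum(G[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / G[i][i]
    return x


def max_residual(G, b, x):
    """Return the largest absolute entry of the residual b - G x."""
    n = len(b)
    return max(abs(b[i] - sum(G[i][j] * x[j] for j in range(n)))
               for i in range(n))


# Illustrative SPD "Gram-like" matrix (an assumption, not data from the paper).
G = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
x_exact = [1.0, 2.0, 3.0]
# Right-hand side built so that x_exact solves G x = b.
b = [sum(G[i][j] * x_exact[j] for j in range(3)) for i in range(3)]

x_few = forward_gauss_seidel(G, b, sweeps=2)    # inexact solve, few sweeps
x_many = forward_gauss_seidel(G, b, sweeps=50)  # run to near convergence
```

Comparing `max_residual(G, b, x_few)` with `max_residual(G, b, x_many)` shows the trade-off between sweep count and accuracy of the inexact solve. Gauss-Seidel converges for any symmetric positive definite matrix, which is the property this sketch relies on.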
