Callipepla: Stream Centric Instruction Set and Mixed Precision for Accelerating Conjugate Gradient Solver
Linghao Song, Licheng Guo, Suhail Basalama, Yuze Chi, Robert F. Lucas, Jason Cong

TL;DR
Callipepla introduces a stream-centric instruction set, vector streaming reuse, and mixed precision techniques to accelerate conjugate gradient solvers on FPGA, achieving significant speedup and energy efficiency improvements.
Contribution
It presents novel FPGA acceleration methods including a stream-centric instruction set and vector streaming reuse to optimize conjugate gradient solver performance.
Findings
Achieves 3.94x speedup over the Xilinx HPC product
Provides 3.36x higher throughput than the Xilinx HPC product
Attains 2.94x better energy efficiency than the Xilinx HPC product
Abstract
The continued growth in the processing power of FPGAs, coupled with high bandwidth memories (HBM), makes systems like the Xilinx U280 credible platforms for linear solvers, which often dominate the run time of scientific and engineering applications. In this paper, we present Callipepla, an accelerator for a preconditioned conjugate gradient linear solver (CG). FPGA acceleration of CG faces three challenges: (1) how to support an arbitrary problem and terminate acceleration processing on the fly, (2) how to coordinate long-vector data flow among processing modules, and (3) how to save off-chip memory bandwidth and maintain double (FP64) precision accuracy. To tackle the three challenges, we present (1) a stream-centric instruction set for efficient streaming processing and control, (2) vector streaming reuse (VSR) and decentralized vector flow scheduling to coordinate vector data flow…
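To make the abstract's setting concrete, the following is a minimal software sketch of a preconditioned CG loop that stops as soon as the residual is small enough and keeps the matrix values in reduced precision while the vectors stay in FP64. It is not the accelerator's design: the Jacobi (diagonal) preconditioner, the choice to store only the matrix values in FP32, the squared-residual stopping rule, and all names (`to_fp32`, `spmv`, `jacobi_pcg`, `tol_sq`) are assumptions made for illustration and are not taken from the excerpt above.

```python
import struct


def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def spmv(rows, x):
    """y = A @ x, with A stored as a list of rows of (column, value) pairs."""
    return [sum(v * x[j] for j, v in row) for row in rows]


def jacobi_pcg(rows, b, tol_sq=1e-12, max_iter=1000):
    """Jacobi-preconditioned CG for a symmetric positive definite matrix.

    Returns (solution, number of iterations performed).
    """
    n = len(b)
    # Assumption: matrix values are rounded to FP32 to save memory traffic,
    # while every vector and scalar stays in FP64.
    rows32 = [[(j, to_fp32(v)) for j, v in row] for row in rows]
    diag = [next(v for j, v in rows32[i] if j == i) for i in range(n)]

    x = [0.0] * n
    r = list(b)                                  # r = b - A x with x = 0
    z = [ri / di for ri, di in zip(r, diag)]     # z = M^{-1} r
    p = list(z)
    rz = dot(r, z)

    for k in range(max_iter):
        # Terminate on the fly once the squared residual norm is small.
        if dot(r, r) <= tol_sq:
            return x, k
        ap = spmv(rows32, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    # Small SPD system: [[4, 1], [1, 3]] x = [1, 2]
    A = [[(0, 4.0), (1, 1.0)], [(0, 1.0), (1, 3.0)]]
    solution, iterations = jacobi_pcg(A, [1.0, 2.0])
    print(solution, iterations)
```

In this sketch each iteration reads `p` for the matrix-vector product and again for the updates of `x` and the next `p`. In hardware, repeated reads of long vectors like these translate into off-chip memory traffic, which is the kind of cost the abstract's vector streaming reuse and mixed precision are meant to reduce.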
Taxonomy
Topics: Matrix Theory and Algorithms · Advanced Optimization Algorithms Research · Parallel Computing and Optimization Techniques
