Accelerating AI Performance using Anderson Extrapolation on GPUs
Saleem Abdul Fattah Ahmed Al Dajani, David E. Keyes

TL;DR
This paper introduces Anderson extrapolation to accelerate AI training and inference on GPUs, reducing iterations and improving scalability while maintaining accuracy and stability.
Contribution
It presents a novel approach that applies Anderson extrapolation to optimize GPU-based AI performance by balancing iteration count, speed, memory, and stability.
Findings
Significant speedups in training and inference.
Effective reduction in iteration count for convergence.
Enhanced scalability in high-performance computing environments.
Abstract
We present a novel approach for accelerating AI performance by leveraging Anderson extrapolation, a vector-to-vector mapping technique based on a window of historical iterations. By identifying the crossover point (Fig. 1) where a mixing penalty is incurred, the method focuses on reducing the number of iterations to convergence, using fewer iterations that are more compute-intensive but generally cacheable, and balancing speed and memory usage against accuracy and algorithmic stability, respectively. We demonstrate significant improvements in both training and inference, motivated by scalability and efficiency extensions to the realm of high-performance computing (HPC).
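To make the idea of a windowed vector-to-vector mapping concrete, here is a minimal sketch of Anderson extrapolation for a generic fixed-point problem x = g(x). It assumes NumPy on a CPU and a small linear test map; the function names (`fixed_point`, `anderson`), the `window` parameter, and the test problem are illustrative assumptions, not the authors' GPU implementation.

```python
import numpy as np

def fixed_point(g, x0, tol=1e-10, max_evals=1000):
    """Plain iteration x <- g(x). Returns (x, number of g evaluations)."""
    x = np.asarray(x0, dtype=float)
    for n in range(1, max_evals + 1):
        gx = g(x)
        if np.linalg.norm(gx - x) < tol:
            return gx, n
        x = gx
    return x, max_evals

def anderson(g, x0, window=5, tol=1e-10, max_evals=1000):
    """Anderson extrapolation over the last `window` residual differences.
    Returns (x, number of g evaluations)."""
    x = np.asarray(x0, dtype=float)
    G, F = [], []  # history of g(x_k) and residuals f_k = g(x_k) - x_k
    for n in range(1, max_evals + 1):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            return gx, n
        G.append(gx)
        F.append(f)
        G, F = G[-(window + 1):], F[-(window + 1):]  # keep a sliding window
        if len(F) == 1:
            x = gx  # no history yet: take a plain fixed-point step
            continue
        # Differences of consecutive residuals and map outputs, one per column.
        dF = np.column_stack([F[i + 1] - F[i] for i in range(len(F) - 1)])
        dG = np.column_stack([G[i + 1] - G[i] for i in range(len(G) - 1)])
        # Mixing coefficients: least-squares fit of the current residual.
        gamma = np.linalg.lstsq(dF, f, rcond=None)[0]
        x = gx - dG @ gamma  # extrapolated next iterate
    return x, max_evals

# Hypothetical test problem: a linear contraction with a slow mode (0.9).
M = np.diag([0.9, 0.5, -0.3, 0.1])
b = np.ones(4)
g = lambda x: M @ x + b
x_star = np.linalg.solve(np.eye(4) - M, b)

x_fp, n_fp = fixed_point(g, np.zeros(4))
x_aa, n_aa = anderson(g, np.zeros(4))
print("plain evaluations:", n_fp, "Anderson evaluations:", n_aa)
```

Each Anderson step adds a small least-squares solve and stores up to `window + 1` past vectors, so it trades extra per-iteration work and memory for fewer iterations, which is the kind of trade-off the abstract describes.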
Taxonomy
Topics: Geophysical Methods and Applications · Microwave Imaging and Scattering Analysis · Soil Moisture and Remote Sensing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
