Tri-Accel: Curvature-Aware Precision-Adaptive and Memory-Elastic Optimization for Efficient GPU Usage

Mohsen Sheibanian; Pouya Shaeri; Alimohammad Beigi; Ryan T. Woo; Aryan Keluskar

arXiv:2508.16905 · cs.LG · September 3, 2025

Open Access

TL;DR

Tri-Accel is a unified training framework that adjusts per-layer precision, exploits sparsity in second-order information, and scales batch size to available GPU memory. It reports up to 9.9% less training time and 13.3% lower memory usage while maintaining or improving accuracy.

Contribution

The paper introduces an integrated approach that combines precision adaptation, sparsity exploitation, and memory-aware batch scaling for efficient GPU training.

Findings

01 Up to 9.9% reduction in training time

02 13.3% lower memory usage

03 Maintains 78.1% accuracy with reduced memory footprint

Abstract

Deep neural networks are increasingly bottlenecked by the cost of optimization, both in terms of GPU memory and compute time. Existing acceleration techniques, such as mixed precision, second-order methods, and batch size scaling, are typically used in isolation. We present Tri-Accel, a unified optimization framework that co-adapts three acceleration strategies along with adaptive parameters during training: (1) Precision-Adaptive Updates that dynamically assign mixed-precision levels to layers based on curvature and gradient variance; (2) Sparse Second-Order Signals that exploit Hessian/Fisher sparsity patterns to guide precision and step size decisions; and (3) Memory-Elastic Batch Scaling that adjusts batch size in real time according to VRAM availability. On CIFAR-10 with ResNet-18 and EfficientNet-B0, Tri-Accel achieves up to 9.9% reduction in training time and 13.3% lower memory…
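
Two of the ideas in the abstract, per-layer precision assignment and memory-elastic batch scaling, can be illustrated with a small sketch. This is not the authors' implementation: the names `LayerStats`, `choose_precision`, and `next_batch_size`, along with every threshold and the halve/double rule, are assumptions chosen only to make the control logic concrete.

```python
from dataclasses import dataclass


@dataclass
class LayerStats:
    """Hypothetical per-layer statistics gathered during training."""
    curvature: float      # e.g. magnitude of a diagonal Hessian/Fisher estimate
    grad_variance: float  # running variance of the layer's gradients


def choose_precision(stats: LayerStats,
                     curvature_hi: float = 1.0,
                     variance_hi: float = 0.1) -> str:
    """Assumed rule: layers with high curvature or noisy gradients keep
    fp32; all other layers are allowed to drop to fp16."""
    if stats.curvature >= curvature_hi or stats.grad_variance >= variance_hi:
        return "fp32"
    return "fp16"


def next_batch_size(batch_size: int,
                    used_bytes: int,
                    total_bytes: int,
                    low: float = 0.70,
                    high: float = 0.90,
                    min_bs: int = 16,
                    max_bs: int = 1024) -> int:
    """Assumed rule: halve the batch when VRAM utilisation exceeds `high`,
    double it when utilisation is below `low`, otherwise leave it alone.
    The result is clamped to [min_bs, max_bs]."""
    utilisation = used_bytes / total_bytes
    if utilisation > high:
        return max(min_bs, batch_size // 2)
    if utilisation < low:
        return min(max_bs, batch_size * 2)
    return batch_size


if __name__ == "__main__":
    # A sensitive layer stays in fp32, a flat low-noise layer drops to fp16.
    print(choose_precision(LayerStats(curvature=2.0, grad_variance=0.01)))
    print(choose_precision(LayerStats(curvature=0.01, grad_variance=0.001)))
    # Memory pressure shrinks the batch; spare memory grows it.
    print(next_batch_size(128, used_bytes=95, total_bytes=100))
    print(next_batch_size(128, used_bytes=50, total_bytes=100))
```

In a PyTorch training loop, `used_bytes` and `total_bytes` could come from `torch.cuda.memory_allocated()` and `torch.cuda.get_device_properties(0).total_memory`, though the abstract does not say how Tri-Accel measures VRAM availability.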

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques