Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures
Yashasvi Makin, Rahul Maliakkal

TL;DR
This paper investigates hardware-software co-design techniques across NVIDIA, AMD, and emerging GPU architectures to significantly improve energy efficiency in AI training, addressing sustainability challenges in large-scale deep learning.
Contribution
It introduces novel hardware-software co-design strategies tailored for advanced GPU architectures to enhance energy efficiency in AI training.
Findings
Energy efficiency increases through specialized tensor and matrix cores
Advanced memory optimization methods improve performance-per-watt
Real-world case studies demonstrate practical sustainability improvements
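As a rough illustration of the performance-per-watt metric named in the findings, the sketch below computes it from a training throughput and an average power draw. The function names and the example numbers are hypothetical and are not taken from the paper; the paper may define or measure the metric differently.

```python
def performance_per_watt(throughput_samples_per_s: float, avg_power_watts: float) -> float:
    """Samples processed per second for each watt of average power draw."""
    if avg_power_watts <= 0:
        raise ValueError("avg_power_watts must be positive")
    return throughput_samples_per_s / avg_power_watts


def energy_kwh(avg_power_watts: float, duration_s: float) -> float:
    """Total energy in kilowatt-hours for a run at a constant average power."""
    joules = avg_power_watts * duration_s  # 1 W sustained for 1 s is 1 J
    return joules / 3.6e6                  # 1 kWh is 3.6e6 J


if __name__ == "__main__":
    # Hypothetical numbers for illustration only (not results from the paper).
    baseline = performance_per_watt(1200.0, 400.0)
    optimized = performance_per_watt(1500.0, 375.0)
    print(f"baseline:  {baseline:.2f} samples/s/W")
    print(f"optimized: {optimized:.2f} samples/s/W")
    print(f"one hour at 400 W: {energy_kwh(400.0, 3600.0):.2f} kWh")
```

A ratio like this only compares fairly when both runs train the same model to the same quality target, since higher throughput at lower accuracy is not an efficiency gain.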
Abstract
Training large-scale deep learning and artificial intelligence models consumes substantial computational power and energy, which poses serious sustainability issues. The fast rise in model complexity has resulted in exponential increases in energy consumption, increasing the demand for techniques that maximize computational efficiency and lower environmental impact. This work explores sustainability-driven performance optimization methods designed for advanced GPU architectures from NVIDIA, AMD, and other emerging GPU architectures. Our main focus is on hardware-software co-design techniques meant to significantly improve the efficiency of memory-level and kernel-level operations, thereby improving performance-per-watt. Our thorough research encompasses evaluations of specialized tensor and matrix cores, advanced memory optimization methods, and creative integration…
