A GPU Register File using Static Data Compression
Alexandra Angerd, Erik Sintorn, Per Stenström

TL;DR
This paper proposes a GPU register file organization that uses static data compression to improve register utilization, yielding performance improvements of up to 79% (18.6% on average) at a modest loss in output quality.
Contribution
It introduces a hardware/software co-designed register file architecture leveraging static analysis for efficient register packing of narrow operands.
Findings
Performance improved by up to 79%
Average performance improvement of 18.6%
Achieved these gains with modest output-quality degradation
Abstract
GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to seek new approaches to improve their utilization. This paper introduces a new register file organization for efficient register packing of narrow integer and floating-point operands, designed to leverage advances in static analysis. We show that the hardware/software co-designed register file organization yields a performance improvement of up to 79%, and 18.6% on average, at a modest output-quality degradation.
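To make the idea of register packing concrete, here is a minimal software sketch of storing several narrow unsigned integers in one 32-bit word. It only illustrates the general concept. The names `pack`, `unpack`, and `REGISTER_BITS`, and the fixed-width slot layout, are assumptions made for this example and do not describe the paper's actual hardware organization or its static analysis.

```python
# Illustrative only: pack several narrow unsigned integers into one 32-bit word.
# The function names and the fixed-width slot layout are assumptions for this
# sketch, not the register file organization described in the paper.

REGISTER_BITS = 32


def pack(values, width):
    """Pack unsigned `values`, each `width` bits wide, into one 32-bit word."""
    if width <= 0 or len(values) * width > REGISTER_BITS:
        raise ValueError("operands do not fit in one 32-bit register")
    word = 0
    for slot, value in enumerate(values):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        # Place each operand in its own non-overlapping bit slot.
        word |= value << (slot * width)
    return word


def unpack(word, width, count):
    """Recover `count` operands of `width` bits each from a packed word."""
    mask = (1 << width) - 1
    return [(word >> (slot * width)) & mask for slot in range(count)]


# Four 8-bit operands share one 32-bit word instead of occupying four.
packed = pack([200, 17, 3, 255], width=8)
restored = unpack(packed, width=8, count=4)
```

In this toy setting, the operand width has to be known before packing. That is the role the abstract assigns to static analysis, which would determine ahead of time which operands are narrow enough to share a register.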
