Understanding Power Consumption and Reliability of High-Bandwidth Memory with Voltage Underscaling
Seyed Saber Nabavi Larimi, Behzad Salami, Osman S. Unsal, Adrian Cristal Kestelman, Hamid Sarbazi-Azad, Onur Mutlu

TL;DR
This paper experimentally investigates how voltage underscaling in High-Bandwidth Memory (HBM) affects power consumption and reliability, revealing significant power savings but also an increase in bit-flip faults.
Contribution
First experimental analysis of HBM under voltage underscaling, characterizing power savings and fault rates, and proposing a fault map for power-reliability trade-offs.
Findings
Power consumption reduces by up to 2.3X with voltage underscaling.
The voltage guardband accounts for 19% of the nominal voltage.
Reducing the supply voltage causes measurable bit-flip faults.
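
To make the arithmetic behind these findings concrete, here is a minimal sketch in Python. It assumes the textbook approximation that dynamic power scales with the square of the supply voltage at a fixed frequency; this is not the paper's measured model. The 1.2 V nominal voltage and all function names are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical nominal supply voltage in volts (an assumption, not from the summary).
V_NOMINAL = 1.2

# Guardband size as a fraction of nominal voltage (the 19% figure from the findings).
GUARDBAND_FRACTION = 0.19


def guardband_floor(v_nominal: float, guardband_fraction: float) -> float:
    """Lowest voltage that is still inside the guardband."""
    return v_nominal * (1.0 - guardband_fraction)


def power_reduction_factor(v_nominal: float, v_scaled: float) -> float:
    """Power reduction factor under the assumed P ~ V^2 model at fixed frequency."""
    if v_scaled <= 0 or v_scaled > v_nominal:
        raise ValueError("v_scaled must be in (0, v_nominal]")
    return (v_nominal / v_scaled) ** 2


def voltage_for_reduction(v_nominal: float, factor: float) -> float:
    """Voltage that would give the requested reduction factor under P ~ V^2."""
    if factor < 1.0:
        raise ValueError("factor must be >= 1")
    return v_nominal / math.sqrt(factor)


if __name__ == "__main__":
    v_floor = guardband_floor(V_NOMINAL, GUARDBAND_FRACTION)
    print(f"guardband floor: {v_floor:.3f} V")
    print(f"reduction at floor: {power_reduction_factor(V_NOMINAL, v_floor):.2f}X")
    print(f"voltage for 2.3X: {voltage_for_reduction(V_NOMINAL, 2.3):.3f} V")
```

Under this simplified model, a 2.3X reduction requires a voltage below the guardband floor, which matches the qualitative picture in the findings: the largest savings come with a reliability cost.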
Abstract
Modern computing devices employ High-Bandwidth Memory (HBM) to meet their memory bandwidth requirements. An HBM-enabled device consists of multiple DRAM layers stacked on top of one another next to a compute chip (e.g., a CPU, GPU, or FPGA) in the same package. Although such HBM structures provide high bandwidth at a small form factor, the stacked memory layers consume a substantial portion of the package's power budget. Therefore, power-saving techniques that preserve the performance of HBM are desirable. Undervolting is one such technique: it reduces the supply voltage to decrease power consumption while keeping the device's operating frequency unchanged, which avoids performance loss. Undervolting takes advantage of voltage guardbands put in place by manufacturers to ensure correct operation under all environmental conditions. However, reducing voltage without changing frequency can lead to…
