# Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs

**Authors:** Esha Choukse, Michael Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler

arXiv: 1903.02596 · 2019-04-17

## TL;DR

Buddy Compression increases effective GPU memory capacity and bandwidth by compressing memory entries and splitting each one between high-speed device memory and a slower-but-larger disaggregated memory pool, enabling larger HPC and deep learning workloads with a slowdown of only 1-2%.

## Contribution

This paper introduces Buddy Compression, a novel scheme that increases effective GPU memory capacity by compressing and splitting memory entries across device and system memory.

## Key findings

- Achieves an average compression ratio of 2.2x on HPC workloads.
- Achieves an average compression ratio of 1.5x on deep learning workloads.
- Incurs a performance slowdown of only 1-2%.

## Abstract

GPUs offer orders-of-magnitude higher memory bandwidth than traditional CPU-only systems. However, GPU device memory tends to be relatively small, and the memory capacity cannot be increased by the user. This paper describes Buddy Compression, a scheme to increase both the effective GPU memory capacity and bandwidth while avoiding the downsides of conventional memory-expanding strategies. Buddy Compression compresses GPU memory, splitting each compressed memory entry between high-speed device memory and a slower-but-larger disaggregated memory pool (or system memory). Highly compressible memory entries can thus be accessed completely from device memory, while incompressible entries source their data using both on- and off-device accesses. Increasing the effective GPU memory capacity enables us to run larger-memory-footprint HPC workloads and larger batch sizes or models for DL workloads than current memory capacities would allow. We show that our solution achieves an average compression ratio of 2.2x on HPC workloads and 1.5x on DL workloads, with a slowdown of just 1-2%.
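The split described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the 128-byte entry size, the per-allocation `target_ratio`, and the function names `split_entry` and `reassemble` are hypothetical, and `zlib` stands in for whatever compressor the hardware would use.

```python
import zlib
from typing import Tuple

ENTRY_BYTES = 128  # assumed size of one memory entry (illustrative only)


def split_entry(entry: bytes, target_ratio: int) -> Tuple[bytes, bytes]:
    """Split one entry into (device_part, buddy_part).

    The device keeps at most ENTRY_BYTES // target_ratio bytes per entry;
    anything that does not fit spills to the slower buddy memory.
    """
    if len(entry) != ENTRY_BYTES:
        raise ValueError("entry must be exactly ENTRY_BYTES long")
    device_budget = ENTRY_BYTES // target_ratio
    compressed = zlib.compress(entry)  # stand-in compressor (assumption)
    # Store the raw entry when compression does not shrink it.
    payload = compressed if len(compressed) < ENTRY_BYTES else entry
    return payload[:device_budget], payload[device_budget:]


def reassemble(device_part: bytes, buddy_part: bytes) -> bytes:
    """Rebuild the original entry from its two parts."""
    payload = device_part + buddy_part
    # A full-size payload can only be a raw (incompressible) entry.
    if len(payload) == ENTRY_BYTES:
        return payload
    return zlib.decompress(payload)


if __name__ == "__main__":
    compressible = bytes(ENTRY_BYTES)           # all zeros
    incompressible = bytes(range(ENTRY_BYTES))  # no repeated bytes
    for name, entry in (("zeros", compressible), ("ramp", incompressible)):
        dev, bud = split_entry(entry, target_ratio=2)
        assert reassemble(dev, bud) == entry
        print(name, "device bytes:", len(dev), "buddy bytes:", len(bud))
```

In this model, an entry whose buddy part is empty is served entirely from device memory, while an entry with a non-empty buddy part needs both an on-device and an off-device access, which mirrors the behaviour the abstract describes.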

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.02596/full.md

## Figures

36 figures with captions in the complete paper: https://tomesphere.com/paper/1903.02596/full.md

## References

68 references — full list in the complete paper: https://tomesphere.com/paper/1903.02596/full.md

---
Source: https://tomesphere.com/paper/1903.02596