Training Personalized Recommendation Systems from (GPU) Scratch: Look Forward not Backwards

Youngeun Kwon; Minsoo Rhu

arXiv:2205.04702 · cs.AR · May 11, 2022 · 1 cites

Open Access

TL;DR

This paper introduces ScratchPipe, a novel GPU-based embedding cache architecture for personalized recommendation systems that captures both past and future accesses, enabling faster training without relying on large CPU memory.

Contribution

It proposes a new embedding cache design that leverages RecSys training properties to keep the active working set in GPU memory, overcoming previous limitations.

Findings

01 Enables GPU memory-speed training of embeddings

02 Reduces memory bandwidth bottlenecks in RecSys training

03 Outperforms existing cache-based approaches

Abstract

Personalized recommendation models (RecSys) are one of the most popular machine learning workloads serviced by hyperscalers. A critical challenge of training RecSys is its high memory capacity requirement, with model sizes reaching hundreds of GBs to TBs. In RecSys, the so-called embedding layers account for the majority of memory usage, so current systems employ a hybrid CPU-GPU design in which the large CPU memory stores the memory-hungry embedding layers. Unfortunately, training embeddings involves several memory-bandwidth-intensive operations, which is at odds with the slow CPU memory and causes performance overheads. Prior work proposed caching frequently accessed embeddings inside GPU memory as a means to filter down the embedding layer traffic to CPU memory, but this paper observes several limitations with such a cache design. In this work, we present a fundamentally different approach in…
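The contrast the abstract draws, between a cache that reacts to past accesses and one that also uses upcoming accesses, can be illustrated with a toy simulation. The sketch below is not ScratchPipe's actual design (the abstract is truncated before the mechanism is described); it only assumes that the embedding row IDs of each training mini-batch are known before that batch runs. The function names `lru_demand_misses` and `lookahead_demand_misses`, the cache capacity, and the example batches are all hypothetical.

```python
from collections import OrderedDict


def lru_demand_misses(batches, capacity):
    """Reactive cache: rows are fetched from (slow) CPU memory only when
    a lookup misses, so every miss sits on the training critical path."""
    cache = OrderedDict()
    misses = 0
    for batch in batches:
        for row in batch:
            if row in cache:
                cache.move_to_end(row)         # refresh recency on a hit
            else:
                misses += 1                    # on-demand fetch from CPU memory
                cache[row] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used row
    return misses


def lookahead_demand_misses(batches, capacity):
    """Look-ahead cache (assumption: batch row IDs are known in advance).
    Rows for a batch are staged before the batch runs; staging is counted
    separately because it could overlap with earlier computation."""
    cache = OrderedDict()
    demand_misses = 0
    prefetched = 0
    for batch in batches:
        needed = list(dict.fromkeys(batch))    # unique rows, order preserved
        # Protect rows that are already cached and needed by this batch.
        for row in needed:
            if row in cache:
                cache.move_to_end(row)
        # Stage the missing rows ahead of time.
        for row in needed:
            if row not in cache:
                prefetched += 1
                cache[row] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)
        # The batch now runs; count lookups that still miss.
        demand_misses += sum(1 for row in batch if row not in cache)
    return demand_misses, prefetched


# Hypothetical access pattern: two working sets that alternate.
batches = [[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]]
reactive = lru_demand_misses(batches, capacity=3)
lookahead, staged = lookahead_demand_misses(batches, capacity=3)
print(reactive, lookahead, staged)
```

In this toy pattern both caches move the same number of rows, but the look-ahead variant moves them before the batch needs them rather than on a miss. Whether that transfer can be hidden in practice depends on system details that the truncated abstract does not cover.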

Taxonomy

Topics: Recommender Systems and Techniques · Stochastic Gradient Optimization Techniques · Caching and Content Delivery