RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Udit Gupta; Samuel Hsia; Jeff Zhang; Mark Wilkening; Javin Pombra; Hsien-Hsin S. Lee; Gu-Yeon Wei; Carole-Jean Wu; David Brooks

arXiv:2105.08820·cs.AR·May 25, 2021

TL;DR

RecPipe introduces a co-designed system and hardware accelerator (RPAccel) that jointly optimize recommendation quality and performance for deep learning recommendation systems. At iso-quality, RPAccel improves latency by 3x and throughput by 6x.

Contribution

The paper presents RecPipe, a novel multi-stage pipeline decomposition and hardware accelerator design that jointly optimize recommendation quality and system efficiency.

Findings

01

RPAccel improves latency by 3x and throughput by 6x at iso-quality.

02

RecPipe's multi-stage pipeline maintains quality while reducing compute complexity and exposing distinct parallelism opportunities.

03

Hardware-aware scheduling improves ranking efficiency on commodity, heterogeneous platforms (e.g., CPUs, GPUs).

Abstract

Deep learning recommendation systems must provide high quality, personalized content under strict tail-latency targets and high system loads. This paper presents RecPipe, a system to jointly optimize recommendation quality and inference performance. Central to RecPipe is decomposing recommendation models into multi-stage pipelines to maintain quality while reducing compute complexity and exposing distinct parallelism opportunities. RecPipe implements an inference scheduler to map multi-stage recommendation engines onto commodity, heterogeneous platforms (e.g., CPUs, GPUs). While the hardware-aware scheduling improves ranking efficiency, the commodity platforms suffer from many limitations requiring specialized hardware. Thus, we design RecPipeAccel (RPAccel), a custom accelerator that jointly optimizes quality, tail-latency, and system throughput. RPAccel is designed specifically to…
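
The multi-stage decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cheap_score`, `heavy_score`, `multi_stage_rank`), the toy dot-product "models", and the stage sizes (1000 candidates, then 100, then 10) are all assumptions chosen for illustration. The idea shown is only that a lightweight frontend stage filters the candidate set so that a heavier backend stage scores far fewer items.

```python
import random

def cheap_score(user, item):
    # Hypothetical lightweight frontend model: uses only the first two features.
    return user[0] * item[0] + user[1] * item[1]

def heavy_score(user, item):
    # Hypothetical heavier backend model: dot product over all features.
    return sum(u * x for u, x in zip(user, item))

def multi_stage_rank(user, items, stages):
    """Pass candidates through (score_fn, keep) stages, narrowing at each one."""
    candidates = list(range(len(items)))
    for score_fn, keep in stages:
        # Rank the surviving candidates with this stage's model, best first.
        candidates.sort(key=lambda i: score_fn(user, items[i]), reverse=True)
        # Forward only the top `keep` candidates to the next stage.
        candidates = candidates[:keep]
    return candidates

random.seed(0)
DIM = 8  # illustrative feature dimension (assumption)
user = [random.random() for _ in range(DIM)]
items = [[random.random() for _ in range(DIM)] for _ in range(1000)]

# Two-stage pipeline: the cheap model keeps 100 items, the heavy model keeps 10.
stages = [(cheap_score, 100), (heavy_score, 10)]
top_items = multi_stage_rank(user, items, stages)

# Number of items the heavy model must score in each configuration.
heavy_calls_single_stage = len(items)
heavy_calls_two_stage = stages[0][1]
print(top_items, heavy_calls_single_stage, heavy_calls_two_stage)
```

In this toy setup the heavy model scores 100 items instead of 1000. Whether final quality is preserved depends on how well the frontend model's ranking agrees with the backend's, which is the quality and performance trade-off the paper sets out to optimize.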

Code & Models

Repositories

harvard-acc/RecPipe
pytorch