MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation
Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

TL;DR
This paper introduces MP-Rec, a hardware-software co-design approach that dynamically selects embedding representations and hardware platforms to significantly improve the performance and quality of deep learning recommendation systems.
Contribution
It proposes a hybrid embedding representation and a co-design technique that exploits hardware heterogeneity for optimized recommendation system performance.
Findings
Achieves a 16.65x performance speedup on custom accelerators.
Improves prediction throughput by 2.49x and 3.76x on real hardware.
Enhances model quality with marginal accuracy improvements.
Abstract
Deep learning recommendation systems serve personalized content under diverse tail-latency targets and input-query loads. To do so, state-of-the-art recommendation models rely on terabyte-scale embedding tables to learn user preferences over large bodies of content. The reliance on a fixed embedding representation of embedding tables not only imposes significant memory capacity and bandwidth requirements but also limits the scope of compatible system solutions. This paper challenges the assumption of fixed embedding representations by showing how synergies between embedding representations and hardware platforms can lead to improvements in both algorithmic and system performance. Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher quality embeddings at the cost of increased memory and compute…
Taxonomy
Topics: Recommender Systems and Techniques · Caching and Content Delivery · Advanced Graph Neural Networks
