Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations

Ranggi Hwang; Taehun Kim; Youngeun Kwon; Minsoo Rhu

arXiv:2005.05968 · cs.DC · May 14, 2020 · 6 cites

TL;DR

Centaur is a chiplet-based accelerator designed to handle both the sparse embedding layers and the dense MLP layers of personalized recommendation workloads, achieving a 1.7-17.2x speedup and a 1.7-19.5x energy-efficiency improvement.

Contribution

It introduces a hybrid sparse-dense accelerator architecture tailored for recommendation workloads, addressing memory and compute bottlenecks.

Findings

01 Achieves 1.7-17.2x performance speedup

02 Realizes 1.7-19.5x energy-efficiency improvements

03 Effectively accelerates both embedding and MLP layers in recommendation models

Abstract

Personalized recommendations are the backbone machine learning (ML) algorithm that powers several important application domains (e.g., ads and e-commerce) serviced from cloud datacenters. Sparse embedding layers are a crucial building block in designing recommendations, yet little attention has been paid to properly accelerating this important ML algorithm. This paper first provides a detailed workload characterization of personalized recommendations and identifies two significant performance limiters: memory-intensive embedding layers and compute-intensive multi-layer perceptron (MLP) layers. We then present Centaur, a chiplet-based hybrid sparse-dense accelerator that addresses both the memory throughput challenges of embedding layers and the compute limitations of MLP layers. We implement and demonstrate our proposal on an Intel HARPv2, a package-integrated CPU+FPGA device, which…
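To make the two limiters named in the abstract concrete, here is a minimal pure-Python sketch of the two kinds of work. It is not Centaur's implementation and is not taken from the paper: the function names, the table sizes, and the choice of sum pooling for the embedding reduction are all assumptions made for illustration. The sparse step touches a few scattered rows of a large table and does very little arithmetic per byte fetched. The dense step reuses a small weight matrix for many multiply-accumulates.

```python
import random


def embedding_gather_reduce(table, indices):
    """Sparse step (illustrative): gather a few rows and sum them element-wise."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for idx in indices:
        row = table[idx]  # scattered, data-dependent memory access
        for d in range(dim):
            pooled[d] += row[d]  # one add per element fetched
    return pooled


def dense_layer(x, weights, bias):
    """Dense step (illustrative): one fully connected layer with ReLU."""
    return [
        max(0.0, sum(w * v for w, v in zip(w_row, x)) + b)
        for w_row, b in zip(weights, bias)
    ]


# Hypothetical sizes, chosen only to keep the example small.
random.seed(0)
NUM_ROWS, DIM, HIDDEN = 1000, 4, 3
table = [[random.random() for _ in range(DIM)] for _ in range(NUM_ROWS)]
weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(HIDDEN)]
bias = [0.0] * HIDDEN

# A sparse feature is a short list of row indices into the table.
pooled = embedding_gather_reduce(table, [3, 17, 256])
output = dense_layer(pooled, weights, bias)
```

In this sketch the gather reads three of the 1000 rows, so its cost is dominated by memory access, while the dense layer performs DIM multiply-accumulates per output on data that is already at hand. That contrast is the motivation the abstract gives for pairing a memory-oriented sparse unit with a compute-oriented dense unit.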

Taxonomy

Topics: Advanced Graph Neural Networks · Recommender Systems and Techniques · Stochastic Gradient Optimization Techniques