Understanding Scaling Laws for Recommendation Models
Newsha Ardalani, Carole-Jean Wu, Zeliang Chen, Bhargav Bhushanam, Adnan Aziz

TL;DR
This paper investigates empirical scaling laws for recommendation models, showing how model quality improves with model size, data size, and training compute, and identifying data scaling as the most promising path forward until better architectures emerge.
Contribution
It characterizes the scaling behavior of DLRM-style recommendation models, in particular Click-Through Rate (CTR) models, and compares data, parameter, and compute scaling to inform sustainable model development.
Findings
Model quality scales as a power law plus a constant in model size, data size, and training compute.
Parameter scaling shows diminishing returns for the studied architecture.
Data scaling remains the most promising path forward until a higher-performing architecture emerges.
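To make the "power law plus constant" finding concrete, here is a minimal sketch of that functional form, assuming the common parameterization loss(x) = alpha * x^(-beta) + gamma, where x is the scaled resource (parameters, training samples, or compute). The coefficient values and function names are hypothetical and chosen only for illustration; they are not the paper's fitted values.

```python
def power_law_plus_constant(x, alpha, beta, gamma):
    """Predicted loss at resource level x under loss = alpha * x**(-beta) + gamma.

    gamma is the irreducible floor that the loss approaches as x grows.
    """
    return alpha * x ** (-beta) + gamma


def gain_from_doubling(x, alpha, beta, gamma):
    """Loss reduction obtained by doubling the resource from x to 2 * x."""
    before = power_law_plus_constant(x, alpha, beta, gamma)
    after = power_law_plus_constant(2 * x, alpha, beta, gamma)
    return before - after


# Hypothetical coefficients, for illustration only (not taken from the paper).
ALPHA, BETA, GAMMA = 1.0, 0.5, 0.1

for x in (1e3, 1e6, 1e9):
    loss = power_law_plus_constant(x, ALPHA, BETA, GAMMA)
    gain = gain_from_doubling(x, ALPHA, BETA, GAMMA)
    print(f"x={x:.0e}  loss={loss:.6f}  gain from doubling={gain:.2e}")
```

Under this form, each doubling of the resource buys a smaller absolute improvement than the last, and the loss never drops below gamma. That is one way to read "diminishing returns": the closer a scaling axis is to its floor, the less further scaling along that axis helps.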
Abstract
Scale has been a major driving force in improving machine learning performance, and understanding scaling laws is essential for strategic planning for sustainable growth in model quality, long-term resource planning, and developing efficient system infrastructure to support large-scale models. In this paper, we study empirical scaling laws for DLRM-style recommendation models, in particular Click-Through Rate (CTR). We observe that model quality scales as a power law plus a constant in model size, data size, and the amount of compute used for training. We characterize scaling efficiency along three resource dimensions, namely data, parameters, and compute, by comparing the different scaling schemes along these axes. We show that parameter scaling is out of steam for the model architecture under study, and until a higher-performing model architecture emerges, data scaling is…
Taxonomy
Topics: Complex Network Analysis Techniques · Data Stream Mining Techniques · Stock Market Forecasting Methods
