Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data
Mao Ye, Dhruv Choudhary, Jiecao Yu, Ellie Wen, Zeliang Chen, Jiyan Yang, Jongsoo Park, Qiang Liu, Arun Kejariwal

TL;DR
This paper introduces an adaptive dense-to-sparse pruning paradigm with a novel algorithm for large-scale online recommendation systems, effectively handling non-stationary data distribution while reducing computational costs.
Contribution
It presents what the authors describe as the first in-depth analysis of pruning for online recommendation systems with non-stationary data, and introduces an automatic, layer-wise sparsity learning algorithm for heterogeneous architectures.
Findings
Effective pruning reduces model size and inference cost.
Adaptive pruning maintains accuracy under data distribution shifts.
Automatic sparsity learning eliminates manual tuning efforts.
Abstract
Large-scale deep learning provides a tremendous opportunity to improve the quality of content recommendation systems by employing both wider and deeper models, but this comes at great infrastructural cost and carbon footprint in modern data centers. Pruning is an effective technique that reduces both memory and compute demand for model inference. However, pruning for online recommendation systems is challenging due to the continuous data distribution shift (a.k.a. non-stationary data). Although incremental training on the full model is able to adapt to the non-stationary data, directly applying it to the pruned model leads to accuracy loss. This is because the sparsity pattern after pruning requires adjustment to learn new patterns. To the best of our knowledge, this is the first work to provide in-depth analysis and discussion of applying pruning to online recommendation systems with…
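The claim that a sparsity pattern needs adjustment under distribution shift can be illustrated with a minimal sketch. It uses generic magnitude pruning, which is an assumption made here for illustration and is not the paper's algorithm; the function name `magnitude_mask`, the `sparsity` parameter, and all weight values are hypothetical.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    `sparsity` is the fraction of weights to zero out (hypothetical parameter).
    """
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune get pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


# Dense weights before a distribution shift (made-up numbers).
dense_old = [0.9, -0.05, 0.4, 0.01]
mask_old = magnitude_mask(dense_old, sparsity=0.5)

# Dense weights after further training on newer data (made-up numbers):
# the relative importance of the weights has changed.
dense_new = [0.1, -0.8, 0.5, 0.02]
mask_new = magnitude_mask(dense_new, sparsity=0.5)

# A mask frozen at pruning time no longer matches the mask that the
# updated dense weights would produce.
mask_is_stale = mask_old != mask_new
```

In this toy case the frozen mask keeps positions 0 and 2, while the updated dense weights would keep positions 1 and 2, so a model trained only on the old sparse pattern cannot use the weight that has since become important.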
Taxonomy
Topics: Recommender Systems and Techniques · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Pruning
