TL;DR
StackRec is a training framework for very deep sequential recommender models: it grows a trained shallow model into a deeper one by iterative layer stacking, significantly reducing training time while maintaining recommendation accuracy.
Contribution
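It introduces a layer-stacking technique, motivated by the observation that hidden layers in a well-trained deep SR model have very similar distributions. Pre-trained layers from a shallower model are stacked to transfer knowledge to a deeper one, so the deep model trains more efficiently.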
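The layer-stacking idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: a model is represented as a plain list of blocks, `train_fn` is a hypothetical placeholder for any training routine, and the two orderings shown (`stack_adjacent`, `stack_cross`) are simply two plausible ways to arrange the copied blocks.

```python
import copy


def stack_adjacent(blocks):
    """[A, B] -> [A, A', B, B']: each block is followed by a copy of itself."""
    stacked = []
    for block in blocks:
        stacked.append(block)
        stacked.append(copy.deepcopy(block))  # copy starts from trained weights
    return stacked


def stack_cross(blocks):
    """[A, B] -> [A, B, A', B']: a copy of the whole stack is placed on top."""
    return list(blocks) + [copy.deepcopy(block) for block in blocks]


def train_by_stacking(blocks, target_depth, train_fn, stack_fn=stack_adjacent):
    """Train a shallow model, then repeatedly stack and fine-tune.

    train_fn is a hypothetical callable that takes a list of blocks and
    returns the trained list; it stands in for a real training loop.
    """
    blocks = train_fn(blocks)
    while len(blocks) < target_depth:
        # Doubling may overshoot, so cut back to the requested depth.
        blocks = stack_fn(blocks)[:target_depth]
        blocks = train_fn(blocks)  # fine-tune the deeper model
    return blocks
```

For example, `train_by_stacking(two_blocks, 8, train_fn)` would train at depths 2, 4, and 8 in turn. Each deeper model starts from copied weights instead of a random initialization, which is where the training-time saving is expected to come from.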
Findings
Achieves comparable recommendation accuracy to traditional training methods.
Reduces training time substantially for very deep models.
Applicable to multiple state-of-the-art SR architectures.
Abstract
Deep learning has brought great progress to sequential recommendation (SR) tasks. With advanced network architectures, sequential recommender models can be stacked with many hidden layers, e.g., up to 100 layers on real-world recommendation datasets. Training such a deep network is difficult because it can be computationally very expensive and take much longer, especially in situations where there are tens of billions of user-item interactions. To deal with this challenge, we present StackRec, a simple yet very effective and efficient training framework for deep SR models based on iterative layer stacking. Specifically, we first offer an important insight: hidden layers/blocks in a well-trained deep SR model have very similar distributions. Inspired by this, we propose the stacking operation on the pre-trained layers/blocks to transfer knowledge from a shallower model to…
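One way to probe the similarity claim in the abstract is to compare the outputs of neighboring blocks for the same input. The sketch below uses cosine similarity, which is an assumption chosen for illustration; the text above does not say which measure the authors use. `block_outputs` is a hypothetical list holding one flattened output vector per block.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def adjacent_block_similarity(block_outputs):
    """Similarity between each block's output and the next block's output."""
    return [
        cosine_similarity(block_outputs[i], block_outputs[i + 1])
        for i in range(len(block_outputs) - 1)
    ]
```

Values close to 1 for most neighboring pairs would be consistent with the observation that blocks in a well-trained deep model behave similarly, which is what makes copying a trained block a reasonable initialization for a new one.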
