FuXi-β: Towards a Lightweight and Fast Large-Scale Generative Recommendation Model
Yufei Ye, Wei Guo, Hao Wang, Hong Zhu, Yuyang Ye, Yong Liu, Huifeng Guo, Ruiming Tang, Defu Lian, Enhong Chen

TL;DR
FuXi-β is a lightweight, fast large-scale generative recommendation model that improves efficiency and performance by removing redundant attention components and employing novel attention mechanisms.
Contribution
The paper proposes a new framework for Transformer-like recommendation models, along with the FuXi-β model built on it, whose new attention modules improve both speed and accuracy.
Findings
FuXi-β outperforms previous models on multiple datasets.
Achieves 27-47% improvement in NDCG@10 on large-scale datasets.
Significantly accelerates training and inference while maintaining scalability.
Abstract
Scaling laws for autoregressive generative recommenders reveal potential for larger, more versatile systems but mean greater latency and training costs. To accelerate training and inference, we investigated the recent generative recommendation models HSTU and FuXi-α, identifying two efficiency bottlenecks: the indexing operations in relative temporal attention bias and the computation of the query-key attention map. Additionally, we observed that relative attention bias in self-attention mechanisms can also serve as attention maps. Previous works like Synthesizer have shown that alternative forms of attention maps can achieve similar performance, naturally raising the question of whether some attention maps are redundant. Through empirical experiments, we discovered that using the query-key attention map might degrade the model's performance in recommendation tasks. To address…
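The abstract's observation that a relative attention bias can itself act as an attention map can be illustrated with a small sketch. This is not the paper's implementation: the function names (`query_key_map`, `bias_only_map`, `apply_map`), the softmax normalization, and the distance-indexed `rel_bias` table are all assumptions chosen for illustration. The sketch only contrasts a causal map built from query-key dot products with one built from the bias alone, which needs no query or key projections.

```python
import math

NEG_INF = float("-inf")

def softmax(row):
    """Numerically stable softmax; -inf entries receive weight 0."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def query_key_map(Q, K, rel_bias):
    """Causal map from scaled query-key dot products plus a relative bias.

    rel_bias[k] is a hypothetical learned scalar for distance k = i - j.
    """
    n, d = len(Q), len(Q[0])
    rows = []
    for i in range(n):
        scores = []
        for j in range(n):
            if j > i:
                scores.append(NEG_INF)  # mask future positions
            else:
                dot = sum(q * k for q, k in zip(Q[i], K[j]))
                scores.append(dot / math.sqrt(d) + rel_bias[i - j])
        rows.append(softmax(scores))
    return rows

def bias_only_map(n, rel_bias):
    """Causal map built from the relative bias alone (no Q or K needed)."""
    rows = []
    for i in range(n):
        scores = [rel_bias[i - j] if j <= i else NEG_INF for j in range(n)]
        rows.append(softmax(scores))
    return rows

def apply_map(attn, V):
    """Mix value vectors with an attention map: out[i] = sum_j attn[i][j] * V[j]."""
    return [
        [sum(attn[i][j] * V[j][c] for j in range(len(V))) for c in range(len(V[0]))]
        for i in range(len(attn))
    ]

# Toy inputs: three positions, two feature dimensions, a flat bias table.
Q = K = V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rel_bias = [0.0, 0.0, 0.0]

qk_attn = query_key_map(Q, K, rel_bias)
bias_attn = bias_only_map(len(V), rel_bias)
bias_out = apply_map(bias_attn, V)
```

Under these assumptions, the bias-only map depends only on positions, so it can be computed once per sequence length and reused, whereas the query-key map costs a dot product for every pair of positions. Whether dropping the query-key map helps or hurts accuracy is an empirical question that the paper's experiments address; the sketch makes no claim about it.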
