LONGER: Scaling Up Long Sequence Modeling in Industrial Recommenders
Zheng Chai, Qin Ren, Xijun Xiao, Huizhi Yang, Bo Han, Sijun Zhang, Di Chen, Hui Lu, Wenlin Zhao, Lele Yu, Xionghang Xie, Shiru Ren, Xiang Sun, Yaocheng Tan, Peng Xu, Yuchao Zheng, Di Wu

TL;DR
LONGER is a scalable transformer model designed for ultra-long user behavior sequences, improving efficiency and effectiveness in industrial recommender systems through innovative attention stabilization, token merging, and engineering optimizations.
Contribution
The paper introduces LONGER, a novel long-sequence transformer with global token mechanisms, hybrid attention, and engineering optimizations, enabling efficient industrial-scale recommender systems.
Findings
Outperforms strong baselines in offline metrics
Demonstrates significant improvements in online A/B tests
Successfully deployed in over 10 scenarios serving billions of users
Abstract
Modeling ultra-long user behavior sequences is critical for capturing both long- and short-term preferences in industrial recommender systems. Existing solutions typically rely on two-stage retrieval or indirect modeling paradigms, incurring upstream-downstream inconsistency and computational inefficiency. In this paper, we present LONGER, a Long-sequence Optimized traNsformer for GPU-Efficient Recommenders. LONGER incorporates (i) a global token mechanism for stabilizing attention over long contexts, (ii) a token merge module with lightweight InnerTransformers and a hybrid attention strategy to reduce quadratic complexity, and (iii) a series of engineering optimizations, including mixed-precision training with activation recomputation, KV cache serving, and a fully synchronous model training and serving framework for unified GPU-based dense and sparse parameter updates. LONGER…
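To make the complexity argument in (ii) concrete, here is a minimal sketch of one plausible form of token merging. The abstract does not specify the merge operation, so everything here is an assumption for illustration: groups of adjacent tokens are concatenated into wider tokens, the tail is zero-padded, and global tokens are simply prepended. The function names (merge_tokens, prepend_global, attention_pairs) and the group size are hypothetical and not taken from the paper.

```python
from typing import List

Vector = List[float]


def merge_tokens(seq: List[Vector], group_size: int) -> List[Vector]:
    """Concatenate each run of `group_size` adjacent tokens into one wider token.

    Assumption: merging is plain concatenation with zero-padding of the tail.
    The paper's actual merge module may differ.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if not seq:
        return []
    dim = len(seq[0])
    # Pad so that the sequence length is a multiple of group_size.
    pad = (-len(seq)) % group_size
    padded = seq + [[0.0] * dim for _ in range(pad)]
    merged = []
    for start in range(0, len(padded), group_size):
        group = padded[start:start + group_size]
        merged.append([x for token in group for x in token])
    return merged


def prepend_global(global_tokens: List[Vector], merged: List[Vector]) -> List[Vector]:
    """Place global tokens in front of the merged sequence.

    Assumption: global tokens already have the merged token width.
    """
    if merged and any(len(g) != len(merged[0]) for g in global_tokens):
        raise ValueError("global tokens must match the merged token width")
    return global_tokens + merged


def attention_pairs(num_tokens: int) -> int:
    """Query-key pairs scored by full self-attention over `num_tokens` tokens."""
    return num_tokens * num_tokens


# A 2000-token sequence merged in groups of 4 becomes 500 wider tokens,
# so full self-attention scores 16x fewer query-key pairs.
sequence = [[0.0] * 4 for _ in range(2000)]
merged = merge_tokens(sequence, group_size=4)
reduction = attention_pairs(len(sequence)) // attention_pairs(len(merged))
```

Under these assumptions the number of attention pairs falls by the square of the group size, while each token grows wider by the same factor. That trade is the sense in which merging attacks the quadratic term.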
Taxonomy
Topics: Recommender Systems and Techniques · Information Retrieval and Search Behavior · Advanced Bandit Algorithms Research
Methods: Softmax · Attention Is All You Need
