SORT: A Systematically Optimized Ranking Transformer for Industrial-scale Recommenders
Chunqi Wang, Bingchao Wu, Taotian Pang, Jiahao Wang, Jie Yang, Jia Liu, Hao Zhang, Hai Zhu, Lei Shen, Shizhun Wang, Bing Wang, Xiaoyi Zeng

TL;DR
SORT is a Transformer-based ranking model for industrial-scale recommender systems. It targets high feature sparsity and low label density, and reports gains in business metrics along with halved latency and doubled throughput in deployment.
Contribution
The paper introduces SORT, a scalable and optimized Transformer architecture for industrial ranking, with novel techniques to handle sparsity, stabilize training, and improve hardware efficiency.
Findings
Outperforms strong baselines in scalability and accuracy.
Achieves significant business metric improvements in e-commerce.
Halves latency and doubles throughput in deployment.
Abstract
While Transformers have achieved remarkable success in LLMs through superior scalability, their application in industrial-scale ranking models remains nascent, hindered by the challenges of high feature sparsity and low label density. In this paper, we propose SORT (Systematically Optimized Ranking Transformer), a scalable model designed to bridge the gap between Transformers and industrial-scale ranking models. We address the high feature sparsity and low label density challenges through a series of optimizations, including request-centric sample organization, local attention, query pruning and generative pre-training. Furthermore, we introduce a suite of refinements to the tokenization, multi-head attention (MHA), and feed-forward network (FFN) modules, which collectively stabilize the training process and enlarge the model capacity. To maximize hardware efficiency, we optimize our…
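The abstract names local attention as one of the optimizations but does not describe its form here. As a rough illustration only, the sketch below assumes the common sliding-window variant, in which each token attends only to neighbours within a fixed distance. The function names, the `window` parameter, and the windowing rule itself are hypothetical and are not taken from the paper.

```python
import math

def local_attention_mask(seq_len, window):
    """Return a seq_len x seq_len boolean mask.

    Entry [i][j] is True when query i may attend to key j, i.e. |i - j| <= window.
    Assumption: a symmetric sliding window; SORT's actual rule may differ.
    """
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]

def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores, giving zero weight to masked positions."""
    kept = [s for s, m in zip(scores, mask_row) if m]
    peak = max(kept)  # subtract the max of the kept scores for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]

# Toy usage: 5 tokens, each attending to itself and one neighbour on each side.
mask = local_attention_mask(seq_len=5, window=1)
weights = masked_softmax([0.0, 0.0, 0.0, 0.0, 0.0], mask[2])
```

With uniform scores, the middle token spreads its attention evenly over the three positions inside its window and gives none to the two outside it. Restricting attention this way reduces the number of query-key pairs from quadratic in the sequence length to linear in it for a fixed window.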
Taxonomy
Topics: Recommender Systems and Techniques · Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining
