MixFormer: Co-Scaling Up Dense and Sequence in Industrial Recommenders
Xu Huang, Hao Zhang, Zhifang Fan, Yunwen Huang, Zhuoxing Wei, Zheng Chai, Jinan Ni, Yuchao Zheng, Qiwei Chen

TL;DR
MixFormer is a unified Transformer architecture for industrial recommender systems that jointly models sequence behaviors and feature interactions, enabling better co-scaling, expressiveness, and efficiency.
Contribution
The paper introduces MixFormer, a novel unified Transformer model that co-scales dense features and sequences, addressing the limitations of decoupled designs in recommendation systems.
Findings
Outperforms existing models in accuracy on large-scale datasets.
Reduces inference latency through user-item decoupling strategies.
Achieves significant improvements in user engagement metrics in online tests.
Abstract
As industrial recommender systems enter a scaling-driven regime, Transformer architectures have become increasingly attractive for scaling models towards larger capacity and longer sequence. However, existing Transformer-based recommendation models remain structurally fragmented, where sequence modeling and feature interaction are implemented as separate modules with independent parameterization. Such designs introduce a fundamental co-scaling challenge, as model capacity must be suboptimally allocated between dense feature interaction and sequence modeling under a limited computational budget. In this work, we propose MixFormer, a unified Transformer-style architecture tailored for recommender systems, which jointly models sequential behaviors and feature interactions within a single backbone. Through a unified parameterization, MixFormer enables effective co-scaling across both dense…
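Illustrative sketch (not from the paper): the abstract describes modeling sequential behaviors and feature interactions within a single backbone with unified parameterization. One simple way to picture that idea is to place feature tokens and behavior-sequence tokens in one token list and run them through a single self-attention layer that shares the same projection weights. Everything below is an assumption made for illustration: the token counts, the embedding size, the single-head attention, and all names (`shared_self_attention`, `feature_tokens`, `behavior_tokens`) are hypothetical and may differ from MixFormer's actual design.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random matrix standing in for learned embeddings or weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def shared_self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens.

    The same w_q, w_k, w_v are applied to every token, so feature tokens
    and behavior tokens share one set of parameters (an assumption about
    what "unified parameterization" could look like).
    """
    d = len(w_q[0])
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
dim = 4  # hypothetical embedding size

feature_tokens = rand_matrix(3, dim, rng)   # e.g. user/item/context feature embeddings
behavior_tokens = rand_matrix(5, dim, rng)  # e.g. the last 5 user behaviors

# One token list, one backbone: every token can attend to every other token.
tokens = feature_tokens + behavior_tokens
w_q, w_k, w_v = (rand_matrix(dim, dim, rng) for _ in range(3))

mixed, attn = shared_self_attention(tokens, w_q, w_k, w_v)
print(len(mixed), len(mixed[0]))
```

In this toy setup, each row of `attn` spans both token groups, so a feature token can weight behavior tokens and vice versa. A decoupled design would instead run two separately parameterized modules and merge their outputs afterwards.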
Taxonomy
Topics: Recommender Systems and Techniques · Explainable Artificial Intelligence (XAI) · Big Data and Digital Economy
