TokenFormer: Unify the Multi-Field and Sequential Recommendation Worlds
Yifeng Zhou, Yuehong Hu, Zhixiang Feng, Junwei Pan, Kaihui Wu, Hanyong Li, Shangyu Zhang, Shudong Huang, Zhangbin Zhu, Chengguo Yin, Haijie Gu, Jie Jiang

TL;DR
TokenFormer is a unified recommendation model that combines multi-field feature interaction and sequential user behavior modeling in a single architecture, while countering the Sequential Collapse Propagation failure mode that naive unification can cause.
Contribution
TokenFormer introduces a Bottom-Full-Top-Sliding (BFTS) attention scheme and a non-linear interaction method to unify the multi-field and sequential recommendation paradigms.
Findings
Achieves state-of-the-art performance on public benchmarks and Tencent's advertising platform.
Significantly improves dimensional robustness and representation discriminability.
Addresses the Sequential Collapse Propagation issue in unified models.
Abstract
Recommender systems have historically developed along two largely independent paradigms: feature interaction models for modeling correlations among multi-field categorical features, and sequential models for capturing user behavior dynamics from historical interaction sequences. Although recent trends attempt to bridge these paradigms within shared backbones, we empirically reveal that naively unifying these two branches may lead to a failure mode we call Sequential Collapse Propagation (SCP): interaction with dimensionally ill non-sequence fields leads to the dimensional collapse of the sequence features. To overcome this challenge, we propose TokenFormer, a unified recommendation architecture with the following innovations. First, we introduce a Bottom-Full-Top-Sliding (BFTS) attention scheme, which applies full self-attention in the lower layers and shrinking-window…
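The abstract is cut off before it specifies the BFTS details, so the following is only a rough sketch of the general idea (full attention in the lower layers, a progressively smaller attention window in the upper layers). The function names `layer_window` and `attention_mask`, the halving schedule, and the symmetric window are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def layer_window(layer: int, num_full: int, seq_len: int) -> int:
    """Window size for a given layer (0-indexed).

    Assumption: the first `num_full` layers use full attention, and each
    later layer halves the window. The paper's actual schedule is not
    given in the truncated abstract.
    """
    if layer < num_full:
        return seq_len  # lower layers: every token sees every token
    step = layer - num_full + 1
    return max(1, seq_len // (2 ** step))  # upper layers: shrinking window


def attention_mask(seq_len: int, window: int) -> List[List[bool]]:
    """mask[i][j] is True when token i may attend to token j.

    Assumption: a symmetric window, |i - j| < window. With
    window == seq_len this reduces to full self-attention.
    """
    return [
        [abs(i - j) < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    seq_len, num_layers, num_full = 8, 4, 2
    for layer in range(num_layers):
        w = layer_window(layer, num_full, seq_len)
        mask = attention_mask(seq_len, w)
        visible = sum(mask[0])  # how many tokens the first token can see
        print(f"layer {layer}: window={w}, token 0 attends to {visible} tokens")
```

In this sketch, a model would apply each layer's mask before the softmax in self-attention, so lower layers mix all fields and sequence items while upper layers restrict each token to a local neighborhood.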
