Mixture of Sequence: Theme-Aware Mixture-of-Experts for Long-Sequence Recommendation

Xiao Lin; Zhicheng Tang; Weilin Cong; Mengyue Hang; Kai Wang; Yajuan Wang; Zhichen Zeng; Ting-Wei Li; Hyunsik Yoo; Zhining Liu; Xuying Ning; Ruizhong Qiu; Wen-yen Chen; Shuo Chang; Rong Jin; Huayu Li; Hanghang Tong

arXiv:2604.20858 · cs.IR · April 24, 2026

TL;DR

The paper introduces MoS, a theme-aware mixture-of-experts framework for long-sequence recommendation that effectively filters irrelevant information and captures multi-scale user behaviors, achieving state-of-the-art results.

Contribution

The paper proposes a model-agnostic MoE framework that combines theme-aware routing with multi-scale fusion, so that long-sequence recommendation stays accurate despite user interest shifts and noisy behavior data.

Findings

1. MoS reaches state-of-the-art accuracy, outperforming existing models.

2. MoS reaches that accuracy with fewer FLOPs, indicating better computational efficiency.

3. The framework effectively filters irrelevant information out of long user sequences.

Abstract

Sequential recommendation has rapidly advanced in click-through rate prediction due to its ability to model dynamic user interests. A key challenge, however, lies in modeling long sequences: users often exhibit significant interest shifts, introducing substantial irrelevant or misleading information. Our empirical analysis corroborates this challenge and uncovers a recurring behavioral pattern in long sequences (session hopping): user interests remain stable within short temporal spans (sessions) but shift drastically across sessions and may reappear after multiple sessions. To address this challenge, we propose the Mixture of Sequence (MoS) framework, a model-agnostic MoE approach that achieves accurate predictions by extracting theme-specific and multi-scale subsequences from noisy raw user sequences. First, MoS employs a theme-aware routing mechanism to adaptively…
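The abstract is cut off before it describes the routing mechanism, so the following is only a toy illustration of the two ideas it names: session hopping and theme-specific subsequences. It is not the paper's method. Everything here is an assumption made for illustration: the helper names (`route_by_theme`, `theme_subsequences`, `sessions`), the fixed theme prototype vectors, and the hard top-1 dot-product assignment.

```python
from itertools import groupby


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def route_by_theme(item_vecs, theme_protos):
    """Assign each item to the theme prototype with the highest score.

    Hypothetical hard top-1 routing; a learned router would replace this.
    """
    return [
        max(range(len(theme_protos)), key=lambda k: dot(x, theme_protos[k]))
        for x in item_vecs
    ]


def theme_subsequences(items, assignments, num_themes):
    """Split one long sequence into per-theme subsequences, keeping order."""
    subseqs = [[] for _ in range(num_themes)]
    for item, k in zip(items, assignments):
        subseqs[k].append(item)
    return subseqs


def sessions(assignments):
    """Collapse consecutive equal theme ids into (theme, length) runs."""
    return [(k, len(list(group))) for k, group in groupby(assignments)]


# Toy user history: a shoe session, a phone session, then shoes again.
items = ["shoe_1", "shoe_2", "phone_1", "phone_2", "shoe_3"]
item_vecs = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8], [0.8, 0.2]]
theme_protos = [[1.0, 0.0], [0.0, 1.0]]  # assumed: theme 0 = shoes, theme 1 = phones

assignments = route_by_theme(item_vecs, theme_protos)
subseqs = theme_subsequences(items, assignments, len(theme_protos))
runs = sessions(assignments)
```

In this toy history the shoe theme reappears after the phone session, which is the "may reappear after multiple sessions" behavior the abstract describes. Grouping by theme gives each expert a shorter sequence free of items from the other theme.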

Code & Models

Repositories

xiaolin-cs/MoS (GitHub)
