Action-Aware Generative Sequence Modeling for Short Video Recommendation

Wenhao Li; Zihan Lin; Zhengxiao Guo; Jie Zhou; Shukai Liu; Yongqi Liu; Chuan Luo; Chaoyi Ma; Ruiming Tang; Han Li

arXiv:2604.25834·cs.AI·April 29, 2026

TL;DR

This paper introduces A2Gen, a sequence modeling approach that uses the timing and context of user actions to improve short video recommendations. It reports gains in online deployment and a 0.34% increase in user watch time.

Contribution

The paper proposes an action-aware generative sequence model with modules for contextual attention, hierarchical encoding, and sequence generation, which together aim to improve recommendation accuracy.

Findings

01 The model outperforms baselines on the Kuaishou and Tmall datasets.

02 It achieves a 0.34% increase in user watch time.

03 Online deployment improves user engagement metrics.

Abstract

With the rapid development of the Internet, users have increasingly higher expectations for the recommendation accuracy of online content consumption platforms. However, short videos often contain diverse segments, and users may not hold the same attitude toward all of them. Traditional binary-classification recommendation models, which treat a video as a single holistic entity, face limitations in accurately capturing such nuanced preferences. Considering that user consumption is a temporal process, this paper demonstrates that the timing of user actions can represent diverse intentions through statistical analysis and examination of action patterns. Based on this insight, we propose a novel modeling paradigm: Action-Aware Generative Sequence Network (A2Gen), which refines user actions along the temporal dimension and chains them into sequences for unified processing and prediction.…
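The idea of refining actions along the temporal dimension and chaining them into a sequence can be illustrated with a toy sketch. This is not the A2Gen architecture, which the abstract above does not specify. The `Action` type, the action labels, the `to_action_sequence` helper, and the equal-width bucketing scheme are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A user action on one video (hypothetical structure, not from the paper)."""
    kind: str          # illustrative labels such as "like" or "swipe_away"
    at_seconds: float  # playback position at which the action occurred


def to_action_sequence(actions, video_length, num_buckets=4):
    """Chain timestamped actions into an ordered list of (bucket, kind) tokens.

    Assumption: playback time is split into equal-width buckets. The paper may
    discretize time differently or not at all.
    """
    if video_length <= 0 or num_buckets <= 0:
        raise ValueError("video_length and num_buckets must be positive")
    tokens = []
    for action in sorted(actions, key=lambda a: a.at_seconds):
        # Clamp to [0, 1] so out-of-range timestamps still map to a valid bucket.
        fraction = min(max(action.at_seconds / video_length, 0.0), 1.0)
        bucket = min(int(fraction * num_buckets), num_buckets - 1)
        tokens.append((bucket, action.kind))
    return tokens


example = to_action_sequence(
    [Action("swipe_away", 28.0), Action("like", 3.0)],
    video_length=30.0,
)
```

In this sketch, a like early in playback and a swipe away near the end become two distinct tokens, so a downstream sequence model could treat them as different signals instead of collapsing the whole view into one binary label.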
