Generative Recommendation for Large-Scale Advertising

Ben Xue; Dan Liu; Lixiang Wang; Mingjie Sun; Peng Wang; Pengfei Zhang; Shaoyun Shi; Tianyu Xu; Yunhao Sha; Zhiqiang Liu; Bo Kong; Bo Wang; Hang Yang; Jieting Xue; Junhao Wang; Shengyu Wang; Shuping Hui; Wencai Ye; Xiao Lin; Yongzhi Li; Yuhang Chen; Zhihui Yin; Quan Chen; Shiyang Wen; Wenjin Wu; Han Li; Guorui Zhou; Changcheng Li; Peng Jiang; Kun Gai

arXiv:2602.22732 · cs.IR · April 3, 2026

TL;DR

GR4AD is a production-oriented generative recommender system for large-scale advertising, combining novel tokenization, efficient decoding, and value-aware optimization to improve revenue and scalability.

Contribution

The paper introduces GR4AD, a comprehensive generative recommendation framework tailored for large-scale advertising, with innovations in tokenization, decoding, and online optimization.

Findings

01. Achieved up to 4.2% ad revenue increase in large-scale online tests.
02. Deployed in Kuaishou's advertising system with over 400 million users.
03. Demonstrated effective scaling and real-time inference with reduced costs.

Abstract

Generative recommendation has recently attracted widespread attention in industry due to its potential for scaling and stronger model capacity. However, deploying real-time generative recommendation in large-scale advertising requires designs beyond large-language-model (LLM)-style training and serving recipes. We present GR4AD (Generative Recommendation for ADvertising), a production-oriented generative recommender co-designed across architecture, learning, and serving. For tokenization, GR4AD proposes UA-SID (Unified Advertisement Semantic ID) to capture complicated business information. For decoding, GR4AD introduces LazyAR, a lazy autoregressive decoder that relaxes layer-wise dependencies for short, multi-candidate generation. This preserves effectiveness while reducing inference cost, which facilitates scaling under fixed serving budgets. To align optimization with business…
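To make "short, multi-candidate generation" concrete, here is a minimal sketch of plain beam search over short semantic-ID codes. It is not LazyAR and not the paper's implementation: the scorer `toy_next_token_logprobs`, the vocabulary size, the code length, and the beam width are all hypothetical stand-ins chosen only so the example runs on its own.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

# A beam entry: (cumulative log-probability, token prefix).
Beam = Tuple[float, Tuple[int, ...]]


def toy_next_token_logprobs(prefix: Sequence[int], vocab_size: int) -> List[float]:
    """Hypothetical stand-in for a decoder step (an assumption, not from the paper).

    Returns a normalized log-probability for every token, peaked at a
    position that depends on the prefix.
    """
    peak = sum(prefix) % vocab_size
    logits = [-abs(tok - peak) for tok in range(vocab_size)]
    log_z = math.log(sum(math.exp(x) for x in logits))
    return [x - log_z for x in logits]


def beam_search(
    score_fn: Callable[[Sequence[int], int], List[float]],
    vocab_size: int,
    code_length: int,
    beam_width: int,
) -> List[Beam]:
    """Generate `beam_width` candidate codes, each `code_length` tokens long."""
    beams: List[Beam] = [(0.0, ())]
    for _ in range(code_length):
        candidates: List[Beam] = []
        for logp, prefix in beams:
            # Expand every surviving prefix by every possible next token.
            for tok, tok_logp in enumerate(score_fn(prefix, vocab_size)):
                candidates.append((logp + tok_logp, prefix + (tok,)))
        # Keep only the highest-scoring prefixes, sorted best first.
        beams = heapq.nlargest(beam_width, candidates, key=lambda b: b[0])
    return beams


if __name__ == "__main__":
    for logp, code in beam_search(toy_next_token_logprobs, 8, 3, 4):
        print(code, round(logp, 3))
```

Because each candidate is only a few tokens long while many candidates are kept alive at once, the per-step decoder cost is multiplied by the beam width. That is the serving regime in which the abstract positions a cheaper decoder.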
