On the Equivalence Between Auto-Regressive Next Token Prediction and Full-Item-Vocabulary Maximum Likelihood Estimation in Generative Recommendation--A Short Note

Yusheng Huang; Shuang Yang; Zhaojie Liu; Han Li

arXiv:2604.15739·cs.IR·April 20, 2026


TL;DR

This paper proves that k-token auto-regressive next-token prediction (AR-NTP) in generative recommendation is mathematically equivalent to full-item-vocabulary maximum likelihood estimation (FV-MLE), provided that items map bijectively to their k-token sequences. The result gives a theoretical foundation for current industrial practice.

Contribution

The paper gives a formal proof of the equivalence between AR-NTP and FV-MLE in generative recommendation and shows that it applies to common tokenization schemes.

Findings

01. Proves the strict mathematical equivalence between AR-NTP and FV-MLE.

02. Shows the equivalence holds for both cascaded and parallel tokenizations.

03. Provides a theoretical basis for optimizing industrial generative recommendation systems.

Abstract

Generative recommendation (GR) has emerged as a widely adopted paradigm in industrial sequential recommendation. Current GR systems follow a similar pipeline: tokenization for item indexing, next-token prediction as the training objective, and auto-regressive decoding for next-item generation. However, existing GR research mainly focuses on architecture design and empirical performance optimization, with few rigorous theoretical explanations for the working mechanism of auto-regressive next-token prediction in recommendation scenarios. In this work, we formally prove that the k-token auto-regressive next-token prediction (AR-NTP) paradigm is strictly mathematically equivalent to full-item-vocabulary maximum likelihood estimation (FV-MLE), under the core premise of a bijective mapping between items and their corresponding k-token sequences. We further show that this equivalence…
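The intuition behind the stated equivalence can be checked numerically on a toy model. The sketch below is not the authors' proof or code; it only illustrates the chain-rule identity P(item) = p(c1) · p(c2 | c1) for k = 2. All names and sizes (`V`, `K`, `p_first`, `p_second`, `item_of_code`) are hypothetical, the conditional distributions are random rather than learned, and the sketch additionally assumes that every one of the V^k token sequences corresponds to exactly one item, so the bijection covers the whole code space.

```python
import itertools
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
V, K = 3, 2  # hypothetical token-vocabulary size and code length

# Toy auto-regressive model: random logits stand in for a trained network.
p_first = softmax(rng.normal(size=V))        # p(c1), shape (V,)
p_second = softmax(rng.normal(size=(V, V)))  # row c1 holds p(c2 | c1)

# Assumed bijection: each k-token code is the identifier of exactly one item.
codes = list(itertools.product(range(V), repeat=K))
item_of_code = {code: item for item, code in enumerate(codes)}

# Distribution over the full item vocabulary induced by the chain rule.
item_probs = np.array([p_first[c1] * p_second[c1, c2] for c1, c2 in codes])

def ar_ntp_loss(code):
    # Sum of per-token negative log-likelihoods (next-token prediction).
    c1, c2 = code
    return -(np.log(p_first[c1]) + np.log(p_second[c1, c2]))

def fv_mle_loss(item):
    # Negative log-likelihood of the item under the full-item distribution.
    return -np.log(item_probs[item])

total = item_probs.sum()
max_gap = max(abs(ar_ntp_loss(c) - fv_mle_loss(item_of_code[c])) for c in codes)
print(f"sum of item probabilities: {total:.6f}")
print(f"largest loss difference:   {max_gap:.2e}")
```

Under these assumptions the item probabilities sum to one and the two losses agree for every item up to floating-point error. If some token sequences did not correspond to any item, part of the probability mass would fall on invalid codes; how the paper treats that case is not stated in the text above.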
