Expressiveness Limits of Autoregressive Semantic ID Generation in Generative Recommendation

Yupeng Hou; Haven Kim; Clark Mingxuan Ju; Eduardo Escoto; Neil Shah; Julian McAuley

arXiv:2605.06331 · cs.IR · May 8, 2026

TL;DR

This paper investigates the limitations of autoregressive generative recommendation models caused by their structured decoding space and proposes a modification called Latte to improve their expressiveness.

Contribution

The paper reveals how tree-structured decoding constrains model expressiveness and introduces Latte, a simple method that reshapes the decoding space to enhance recommendation performance.

Findings

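01 Item probabilities correlate with the decoding-tree structure: items close in the tree receive similar probabilities, which makes them hard to distinguish by user-specific preference.

02 Latte improves NDCG@10 by an average of 3.45%.

03 Reshaping the decoding tree relaxes the probability coupling among items.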

Abstract

Generative recommendation (GR) models generate items by autoregressively producing a sequence of discrete tokens that jointly index the target item. However, this autoregressive generation process also induces a structured decoding space whose impact on model expressiveness remains underexplored. Specifically, token-by-token generation can be viewed as traversing a decoding tree induced by semantic ID tokens, where leaf nodes correspond to candidate items. We observe that the item probabilities produced by GR models are strongly correlated with this tree structure: items that are close in the tree tend to receive similar probabilities for any given user, making it difficult to distinguish among them based on user-specific preferences. We further show theoretically that such structural correlations prevent GR models from representing even simple patterns that can be well captured by…
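
The tree view described in the abstract can be made concrete with a small toy example. The sketch below is an illustration only, not the paper's model and not Latte: the four-item catalogue, the 2-token semantic IDs, and the logits are all hypothetical values chosen for the demonstration. It computes each item's probability as the product of per-step conditional token probabilities along its path in the decoding tree, which shows why items under the same prefix are coupled.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a {token: logit} dict."""
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

# Hypothetical catalogue: each item is indexed by a 2-token semantic ID.
# item_A and item_B share the prefix (0,); item_C and item_D share (1,).
ITEMS = {
    "item_A": (0, 0),
    "item_B": (0, 1),
    "item_C": (1, 0),
    "item_D": (1, 1),
}

def item_probs(step_logits):
    """Item probability = product of conditional token probabilities
    along the item's root-to-leaf path in the decoding tree.
    step_logits maps a prefix tuple to {next_token: logit}."""
    probs = {}
    for name, sid in ITEMS.items():
        p = 1.0
        for depth, tok in enumerate(sid):
            prefix = sid[:depth]
            p *= softmax(step_logits[prefix])[tok]
        probs[name] = p
    return probs

# Hypothetical logits for one user (assumed values, not from the paper).
step_logits = {
    (): {0: 2.0, 1: 0.0},      # first token: the user leans toward branch 0
    (0,): {0: 0.3, 1: 0.0},    # second token under prefix (0,)
    (1,): {0: 0.0, 1: 1.5},    # second token under prefix (1,)
}

probs = item_probs(step_logits)
root = softmax(step_logits[()])

for name, p in probs.items():
    print(f"{name}: {p:.4f}")
print(f"P(prefix 0) = {root[0]:.4f}, P(prefix 1) = {root[1]:.4f}")
```

In this toy setting, the probabilities of item_A and item_B always sum to the probability of their shared prefix, and item_D can never exceed the probability of prefix (1,), however strongly the last step favors it. That shared factor is one simple way to see the coupling among nearby leaves that the abstract describes.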

Code & Models

Repositories

hyp1231/Latte (GitHub)
