Semantic Trimming and Auxiliary Multi-step Prediction for Generative Recommendation

Tianyu Zhan; Kairui Fu; Chengfei Lv; Zheqi Lv; Shengyu Zhang

arXiv:2604.05329 · cs.IR · April 8, 2026


TL;DR

This paper introduces STAMP, a framework that improves generative recommendation by reducing redundancy and enhancing learning signals through semantic trimming and multi-step prediction, leading to faster training and lower memory use.

Contribution

STAMP combines semantic adaptive pruning and multi-step auxiliary prediction to address semantic dilution, boosting efficiency and robustness in SID-based generative recommendation models.
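To make the two ideas concrete, here is a minimal sketch in Python. It is not the authors' implementation: the summary above does not say how pruning scores are computed or how the auxiliary objective is built, so the per-token `scores`, the `keep_ratio` parameter, the `horizon` parameter, and both function names are assumptions introduced purely for illustration. The first function drops low-scoring SID tokens from an input sequence (input-side trimming), and the second pairs each remaining position with several future tokens rather than only the next one (output-side multi-step targets).

```python
from typing import List, Sequence, Tuple


def trim_tokens(
    tokens: Sequence[int], scores: Sequence[float], keep_ratio: float
) -> List[int]:
    """Keep the highest-scoring tokens, preserving their original order.

    `scores` is a hypothetical per-token importance signal; how such a
    signal would be obtained is not specified in the source.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Rank positions by descending score, then restore sequence order.
    top = sorted(range(len(tokens)), key=lambda i: -scores[i])[:n_keep]
    return [tokens[i] for i in sorted(top)]


def multi_step_targets(
    tokens: Sequence[int], horizon: int
) -> List[Tuple[int, List[int]]]:
    """Pair each position with up to `horizon` following tokens.

    With horizon=1 this reduces to ordinary next-token targets; a larger
    horizon gives each position additional (auxiliary) prediction targets.
    """
    pairs = []
    for t in range(len(tokens) - 1):
        pairs.append((tokens[t], list(tokens[t + 1 : t + 1 + horizon])))
    return pairs


# Two items, each encoded as three SID tokens (values are made up).
sid_sequence = [10, 11, 12, 20, 21, 22]
importance = [0.9, 0.2, 0.1, 0.8, 0.7, 0.05]

trimmed = trim_tokens(sid_sequence, importance, keep_ratio=0.5)
targets = multi_step_targets(trimmed, horizon=2)
```

In this toy run the sequence shrinks from six tokens to three, which is where a training-time and memory saving would come from, while each surviving position still receives more than one supervision target.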

Findings

01

Achieves a 1.23× to 1.38× training speedup

02

Reduces VRAM usage by 17.2% to 54.7%

03

Maintains or improves recommendation performance

Abstract

Generative Recommendation (GR) has recently transitioned from atomic item-indexing to Semantic ID (SID)-based frameworks to capture intrinsic item relationships and enhance generalization. However, the adoption of high-granularity SIDs leads to two critical challenges: prohibitive training overhead due to sequence expansion and unstable performance reliability characterized by non-monotonic accuracy fluctuations. We identify that these disparate issues are fundamentally rooted in the Semantic Dilution Effect, where redundant tokens waste massive computation and dilute the already sparse learning signals in recommendation. To counteract this, we propose STAMP (Semantic Trimming and Auxiliary Multi-step Prediction), a framework utilizing a dual-end optimization strategy. We argue that effective SID learning requires simultaneously addressing low input information density and sparse output…
