GRACE: Generative Recommendation via Journey-Aware Sparse Attention on Chain-of-Thought Tokenization

Luyi Ma; Wanjia Zhang; Kai Zhao; Abhishek Kulkarni; Lalitesh Morishetti; Anjana Ganesh; Ashish Ranjan; Aashika Padmanabhan; Jianpeng Xu; Jason Cho; Praveen Kanumala; Kaushiki Nag; Sumit Dutta; Kamiya Motwani; Malay Patel; Evren Korpeoglu; Sushant Kumar; Kannan Achan

arXiv:2507.14758 · cs.CL · July 22, 2025

TL;DR

GRACE is a novel generative recommendation framework that uses journey-aware sparse attention and explicit product attribute encoding to improve multi-behavior sequence modeling, interpretability, and computational efficiency.

Contribution

GRACE introduces a hybrid Chain-of-Thought (CoT) tokenization that encodes explicit product attributes, together with a journey-aware sparse attention mechanism, for efficient and interpretable recommendation.

Findings

01

Outperforms state-of-the-art baselines by up to 106.9% in HR@10.

02

Reduces attention computation by up to 48%.

03

Achieves significant improvements in recommendation accuracy on real-world datasets.
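
For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how HR@K (hit rate at K) is commonly computed: the fraction of test cases whose held-out target item appears in the top K recommendations. The function names `hit_rate_at_k` and `relative_improvement` are illustrative and not taken from the paper. Reading the reported percentage as a relative gain over a baseline is an assumption made here, and the numbers in the usage lines are made up.

```python
from typing import Hashable, Sequence


def hit_rate_at_k(
    ranked_lists: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
    k: int = 10,
) -> float:
    """Fraction of cases whose target item is among the top-k ranked items."""
    if len(ranked_lists) != len(targets):
        raise ValueError("ranked_lists and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k]
    )
    return hits / len(targets)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, in percent (assumed reading)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Illustrative data only: four users, top-3 lists, evaluated at k=2.
ranked = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"], ["j", "k", "l"]]
targets = ["b", "x", "i", "j"]
print(hit_rate_at_k(ranked, targets, k=2))   # 0.5
print(relative_improvement(0.5, 0.25))       # 100.0
```

Under this reading, a relative gain above 100% means the metric more than doubled with respect to the baseline.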

Abstract

Generative models have recently demonstrated strong potential in multi-behavior recommendation systems, leveraging the expressive power of transformers and tokenization to generate personalized item sequences. However, their adoption is hindered by (1) the lack of explicit information for token reasoning, (2) high computational costs due to quadratic attention complexity and dense sequence representations after tokenization, and (3) limited multi-scale modeling over user history. In this work, we propose GRACE (Generative Recommendation via journey-aware sparse Attention on Chain-of-thought tokEnization), a novel generative framework for multi-behavior sequential recommendation. GRACE introduces a hybrid Chain-of-Thought (CoT) tokenization method that encodes user-item interactions with explicit attributes from product knowledge graphs (e.g., category, brand, price) over semantic…
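
To make the idea of attribute-explicit tokenization concrete, here is a minimal sketch of how one user-item interaction could be flattened into a token sequence that states behavior, category, brand, and a price bucket before any item code. This is an illustration only, not the paper's tokenizer: the `Interaction` fields, the token spelling, the price bucket edges, and the `item_codes` field are all assumptions introduced for the example.

```python
import bisect
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interaction:
    """One user-item event (hypothetical schema for illustration)."""
    behavior: str                 # e.g. "click", "add_to_cart", "purchase"
    category: str
    brand: str
    price: float
    item_codes: Tuple[int, ...]   # assumed discrete item codes


# Hypothetical price bucket edges; a real system would choose its own.
PRICE_EDGES = (10.0, 50.0, 200.0)


def price_bucket(price: float) -> int:
    """Map a raw price to a bucket index using the edges above."""
    return bisect.bisect_right(PRICE_EDGES, price)


def tokenize(event: Interaction) -> List[str]:
    """Emit explicit attribute tokens first, then the item code tokens."""
    tokens = [
        f"<beh:{event.behavior}>",
        f"<cat:{event.category}>",
        f"<brand:{event.brand}>",
        f"<price:{price_bucket(event.price)}>",
    ]
    tokens += [f"<code{i}:{c}>" for i, c in enumerate(event.item_codes)]
    return tokens


event = Interaction("click", "headphones", "acme", 59.99, (3, 17))
print(tokenize(event))
```

Placing human-readable attribute tokens ahead of opaque item codes is one way a generated sequence can be inspected attribute by attribute, which is the kind of interpretability the summary alludes to.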
