Reasoning-Based Personalized Generation for Users with Sparse Data
Bo Ni, Branislav Kveton, Samyadeep Basu, Subhojyoti Mukherjee, Leyao Wang, Franck Dernoncourt, Sungchul Kim, Seunghyun Yoon, Zichao Wang, Ruiyi Zhang, Puneet Mathur, Jihyung Kil, Jiuxiang Gu, Nedim Lipka, Yu Wang, Ryan A. Rossi, Tyler Derr

TL;DR
This paper introduces GraSPer, a graph-based framework that enhances personalized text generation for users with limited interaction data by augmenting context and aligning reasoning, leading to improved personalization performance.
Contribution
The paper proposes GraSPer, a novel reasoning-based framework that effectively addresses personalization for sparse user data by augmenting context and generating aligned personalized responses.
Findings
Significant performance improvements on benchmark datasets.
Effective augmentation of user context with synthetic interactions.
Enhanced personalization in sparse data scenarios.
Abstract
Large Language Model (LLM) personalization holds great promise for tailoring responses by leveraging personal context and history. However, real-world users often have sparse interaction histories with limited personal context, such as cold-start users on social platforms and newly registered customers on e-commerce platforms, which compromises LLM-based personalized generation. To address this challenge, we introduce GraSPer (Graph-based Sparse Personalized Reasoning), a novel framework for enhancing personalized text generation under sparse context. GraSPer first augments user context by predicting items that the user is likely to interact with in the future. With reasoning alignment, it then generates texts for these predicted interactions to enrich the augmented context. Finally, it generates personalized outputs conditioned on both the real and synthetic histories, ensuring…
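The three stages named in the abstract (predict likely items, generate text for them, condition the final output on real plus synthetic history) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the function names `predict_items`, `draft_text`, and `build_prompt`, and the overlap-based scoring, which is a simple common-neighbor heuristic standing in for whatever graph model the paper actually trains. The `draft_text` stub only marks where a reasoning-aligned LLM call would go.

```python
from collections import defaultdict

# Toy data (hypothetical): user -> {item: text the user wrote about it}.
HISTORIES = {
    "alice": {"tent": "Light and easy to pitch."},
    "bob": {"tent": "Sturdy.", "stove": "Boils fast.", "lantern": "Bright."},
    "carol": {"tent": "Roomy.", "stove": "Compact."},
    "dave": {"novel": "Slow start."},
}


def predict_items(histories, user, k=2):
    """Stage 1 (assumed heuristic): rank unseen items by how strongly the
    users who interacted with them overlap with the target user's history."""
    seen = set(histories[user])
    scores = defaultdict(int)
    for other, items in histories.items():
        if other == user:
            continue
        overlap = len(seen & set(items))
        for item in items:
            if overlap and item not in seen:
                scores[item] += overlap
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]


def draft_text(user, item, real_history):
    """Stage 2 placeholder: stands in for an LLM call that would write the
    text this user might produce for a predicted item."""
    return f"(draft) {user} on {item}, based on {len(real_history)} real interaction(s)"


def build_prompt(histories, user, query, k=2, generate=draft_text):
    """Stage 3: assemble a prompt from real and synthetic history, with each
    entry tagged so downstream steps can tell the two apart."""
    real = histories[user]
    synthetic = {
        item: generate(user, item, real)
        for item in predict_items(histories, user, k)
    }
    lines = [f"[real] {item}: {text}" for item, text in real.items()]
    lines += [f"[synthetic] {item}: {text}" for item, text in synthetic.items()]
    lines.append(f"Task: {query}")
    return "\n".join(lines)
```

Calling `build_prompt(HISTORIES, "alice", "Write a review of a sleeping bag.")` yields one real line for the tent, two synthetic lines for the predicted items, and the task line. Tagging synthetic entries separately is a design choice of this sketch, made so that noise from wrongly predicted items stays traceable.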
Peer Reviews
Decision · Submitted to ICLR 2026
1. By integrating graph-based expansion with reasoning-aligned fine-tuning, GRASPER provides a coherent pipeline that balances coverage (via graph expansion) and precision (via reasoning alignment). The modular design makes it interpretable and extendable to different backbone models. 2. Experiments cover multiple datasets, tasks, and backbones, demonstrating robustness and steady performance gains. Both automatic metrics (ROUGE, METEOR, MAE/RMSE) and LLM-as-a-Judge evaluations support the claimed
1. Limited conceptual novelty (A+B composition): The proposed framework essentially combines two established ideas (graph-based personalization for sparse users and reasoning-based generation alignment) into a single pipeline. Both components have been independently explored in prior works (e.g., Au et al., 2025; Salemi et al., 2025). While the integration is well-executed and empirically validated, the conceptual novelty is limited. The paper contributes mainly at the system level rather than
1. The paper tackles a practical problem of personalization under sparse user data, which is common in real-world applications. 2. The proposed GRASPER framework is conceptually clear, combining user history augmentation with reasoning-based generation to mitigate the potential noise from predicted items.
1. For the experimental results in Tables 1-3, it is not clear to me how GRASPER is applied with GPT-4o mini. As described in Section 3.2, GRASPER requires fine-tuning the LLM for reasoning alignment. 2. In the main paper, it is not clear how sparse the benchmarks are (only some statistics are provided in the Appendix), and there is no analysis of how robust the method is to different sparsity levels. 3. For equation (9), the loss only involves $t_{u, j}$; however, it is mentioned that the model is
1. The focus on sparse user personalization aligns well with real-world deployment scenarios. 2. GRASPER combines graph-based context expansion with explicit, trainable reasoning paths for personalized generation. 3. The experiments are conducted on three benchmarks and evaluated under conventional NLP metrics and under LLM-as-a-Judge evaluation.
1. GRASPER requires training a graph encoder and fine-tuning an LLM with multi-stage prompting. This may increase computational cost compared to prompt-only or retrieval-only baselines like LaMP or PGraph. 2. The method relies on generating faithful synthetic reviews for predicted user–item interactions. Errors in link prediction or reasoning could mislead the final generation. While ablations show robustness, the quality of synthetic data is not assessed. 3. The procedure for selecting the “gol
Taxonomy
Topics: Recommender Systems and Techniques · Topic Modeling · Advanced Graph Neural Networks
