TL;DR
ReRec is a reinforcement fine-tuning framework that enhances LLMs for complex, reasoning-driven recommendation tasks by integrating reward shaping, reasoning-aware advantage estimation, and curriculum scheduling.
Contribution
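ReRec introduces a reinforcement fine-tuning (RFT) framework whose three components (reward shaping, reasoning-aware advantage estimation, and curriculum scheduling) are designed to improve reasoning in LLM-based recommenders, and it outperforms existing methods.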
Findings
ReRec outperforms state-of-the-art baselines in recommendation tasks.
The framework maintains instruction-following and general knowledge abilities.
Experiments validate the effectiveness of the proposed components.
Abstract
With the rise of LLMs, there is an increasing need for intelligent recommendation assistants that can handle complex queries and provide personalized, reasoning-driven recommendations. LLM-based recommenders show potential but face challenges in multi-step reasoning, underscoring the need for reasoning-augmented systems. To address this gap, we propose ReRec, a novel reinforcement fine-tuning (RFT) framework designed to improve LLM reasoning in complex recommendation tasks. Our framework introduces three key components: (1) Dual-Graph Enhanced Reward Shaping, integrating recommendation metrics like NDCG@K with Query Alignment and Preference Alignment Scores to provide fine-grained reward signals for LLM optimization; (2) Reasoning-aware Advantage Estimation, which decomposes LLM outputs into reasoning segments and penalizes incorrect steps to enhance reasoning of recommendation; and (3)…
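As a concrete reference for the NDCG@K metric named in component (1), here is a minimal sketch of the standard binary-relevance definition. It is not taken from the paper: the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, the ranked list is assumed to contain no duplicates, and how ReRec combines this value with the Query Alignment and Preference Alignment Scores is not specified in the abstract above.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions (rank 1 has discount log2(2) = 1)."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked_items, relevant_items, k):
    """NDCG@K with binary relevance: 1.0 if a ranked item is relevant, else 0.0.

    Assumption: ranked_items has no duplicates; relevant_items is a set.
    """
    gains = [1.0 if item in relevant_items else 0.0 for item in ranked_items]
    # Ideal ranking places every relevant item first, capped at k positions.
    ideal_gains = [1.0] * min(len(relevant_items), k)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # One relevant item placed second out of three recommendations.
    print(ndcg_at_k(["b", "a", "c"], {"a"}, k=3))
```

A score of 1.0 means every relevant item sits at the top of the list; placing a relevant item lower shrinks the score logarithmically with its rank, which is what makes the metric usable as a graded reward signal.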
