MiniRec: Data-Efficient Reinforcement Learning for LLM-based Recommendation

Lin Wang; Yang Zhang; Jingfan Chen; Xiaoyan Zhao; Fengbin Zhu; Qing Li; Tat-Seng Chua

arXiv:2602.04278 · cs.IR · February 5, 2026

TL;DR

MiniRec introduces a reward-based, trajectory-aligned data selection method for RL-enhanced LLM recommendation systems, significantly reducing training costs while maintaining high performance.

Contribution

The paper presents MiniRec, a data selection framework that aligns sample choice with RL signals and optimization trajectories, improving the efficiency of RL-based LLM recommendation.

Findings

01 Reduces training cost by up to 50%

02 Maintains recommendation performance with fewer samples

03 Highlights importance of reward-aligned data selection

Abstract

The integration of reinforcement learning (RL) into large language models (LLMs) has opened new opportunities for recommender systems by eliciting reasoning and improving user preference modeling. However, RL-based LLM recommendation faces significant efficiency challenges, making full-data training costly. Existing data selection methods define sample value based on learnability or representativeness, yet their loss- or gradient-driven or dataset coverage-driven criteria often misalign with RL learning dynamics, resulting in suboptimal performance. To address this, we propose MiniRec, a data selection framework tailored for RL-based LLM recommendation. MiniRec evaluates sample learnability using key RL signals -- rewards -- pruning samples that are too easy (too high reward) or too difficult (consistently low reward). It assesses representativeness by aligning sample gradients with the…
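
The reward-based learnability filter described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `prune_by_reward`, the thresholds `low` and `high`, and the use of the mean rollout reward as the "too easy" / "consistently low" criterion are all assumptions made for illustration. The representativeness step is not sketched, because the abstract text above is truncated before it is fully described.

```python
from statistics import mean


def prune_by_reward(rewards_per_sample, low=0.1, high=0.9):
    """Return indices of samples whose mean rollout reward lies strictly
    between `low` and `high`.

    Hypothetical helper: the thresholds and the mean-based criterion are
    illustrative assumptions, not values taken from the paper.
    """
    kept = []
    for idx, rewards in enumerate(rewards_per_sample):
        if not rewards:
            continue  # no rollouts recorded for this sample; skip it
        avg = mean(rewards)
        if low < avg < high:
            kept.append(idx)
    return kept


# Toy rewards: one list of rollout rewards per training sample.
rollout_rewards = [
    [1.0, 1.0, 1.0, 1.0],  # always rewarded: treated as too easy
    [0.0, 0.0, 0.0, 0.0],  # never rewarded: treated as too difficult
    [1.0, 0.0, 1.0, 0.0],  # mixed outcomes: kept
    [0.0, 0.0, 1.0, 0.0],  # mostly failing but not hopeless: kept
]
selected = prune_by_reward(rollout_rewards)
```

In this toy run only the two samples with mixed outcomes survive, which matches the intuition that samples with uniformly high or uniformly low reward carry little learning signal for the policy.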

Taxonomy

Topics: Recommender Systems and Techniques · Topic Modeling · Explainable Artificial Intelligence (XAI)