IB-GRPO: Aligning LLM-based Learning Path Recommendation with Educational Objectives via Indicator-Based Group Relative Policy Optimization
Shuai Wang, Yaoming Yang, Bingdong Li, Hao Hao, Aimin Zhou

TL;DR
This paper introduces IB-GRPO, a method that aligns LLM-based learning path recommendation with educational objectives through indicator-guided optimization and hybrid expert demonstrations.
Contribution
The paper proposes IB-GRPO, which aligns LLM recommendations with pedagogical objectives. Hybrid expert demonstrations address the scarcity of expert data, and indicator-based optimization handles the trade-offs among multiple objectives.
Findings
IB-GRPO outperforms reinforcement learning (RL) and LLM baselines on the ASSIST09 and Junyi datasets.
The method improves long-term learning effect and pedagogical alignment.
It demonstrates effective multi-objective optimization without manual scalarization.
Abstract
Learning Path Recommendation (LPR) aims to generate personalized sequences of learning items that maximize long-term learning effect while respecting pedagogical principles and operational constraints. Although large language models (LLMs) offer rich semantic understanding for free-form recommendation, applying them to long-horizon LPR is challenging due to (i) misalignment with pedagogical objectives such as the Zone of Proximal Development (ZPD) under sparse, delayed feedback, (ii) scarce and costly expert demonstrations, and (iii) multi-objective interactions among learning effect, difficulty scheduling, length controllability, and trajectory diversity. To address these issues, we propose IB-GRPO (Indicator-Based Group Relative Policy Optimization), an indicator-guided alignment approach for LLM-based LPR. To mitigate data scarcity, we construct hybrid expert demonstrations via…
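The abstract names two ingredients, an indicator over multiple objectives and group-relative policy optimization, but the truncated text does not say which indicator is used or how it enters the advantage. The following is a minimal sketch under stated assumptions, not the authors' implementation: it swaps in an IBEA-style additive epsilon indicator to turn each sampled path's reward vector into one fitness score without hand-tuned weights, then standardizes those scores within the group, as GRPO does with scalar rewards. The objective names, the reward values, and the `kappa` parameter are hypothetical.

```python
import math
from statistics import mean, pstdev


def eps_indicator(a, b):
    """Additive epsilon indicator (maximization): the smallest shift added to
    every objective of `a` so that it weakly dominates `b`."""
    return max(bk - ak for ak, bk in zip(a, b))


def indicator_fitness(rewards, kappa=0.05):
    """IBEA-style fitness for each reward vector in a group (higher is better).

    Assumption: this indicator is a stand-in; the source does not specify one.
    """
    n = len(rewards)
    # pairwise[(j, i)]: how far path j must shift to weakly dominate path i
    pairwise = {
        (j, i): eps_indicator(rewards[j], rewards[i])
        for j in range(n)
        for i in range(n)
        if i != j
    }
    # Scale by the largest magnitude so the exponent stays bounded by 1 / kappa
    scale = max((abs(v) for v in pairwise.values()), default=0.0) or 1.0
    return [
        sum(-math.exp(-pairwise[(j, i)] / (scale * kappa)) for j in range(n) if j != i)
        for i in range(n)
    ]


def group_relative_advantages(scores, eps=1e-8):
    """Standardize scores within one sampled group (GRPO-style advantage)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / (sigma + eps) for s in scores]


# Hypothetical group of 4 sampled paths for one learner; each tuple is
# (learning effect, difficulty-schedule fit, diversity), all to be maximized.
group = [
    (0.9, 0.8, 0.7),  # dominates every other path
    (0.5, 0.5, 0.5),
    (0.8, 0.3, 0.6),
    (0.2, 0.2, 0.2),  # dominated by every other path
]
fitness = indicator_fitness(group)
advantages = group_relative_advantages(fitness)
```

In this sketch the path that dominates the group receives the largest advantage and the dominated path the smallest, with no objective weights chosen by hand. A different indicator, such as hypervolume contribution, could be substituted in `indicator_fitness` without touching the normalization step.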
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Recommender Systems and Techniques · Topic Modeling
