TreePS-RAG: Tree-based Process Supervision for Reinforcement Learning in Agentic RAG
Tianhua Zhang, Kun Li, Junan Li, Yunxiang Li, Hongyin Luo, Xixin Wu, James Glass, Helen Meng

TL;DR
TreePS-RAG introduces a tree-based reinforcement learning framework for agentic retrieval-augmented generation, enabling step-wise credit assignment and improved performance on QA benchmarks without requiring intermediate annotations.
Contribution
The paper proposes an online, tree-structured RL approach for agentic RAG that models reasoning as a rollout tree, improving credit assignment and performance without costly intermediate supervision.
Findings
Outperforms outcome-supervised RL methods on multiple QA benchmarks.
Efficient online tree construction preserves exploration diversity.
Keeps rollout costs comparable to those of strong baselines.
Abstract
Agentic retrieval-augmented generation (RAG) formulates question answering as a multi-step interaction between reasoning and information retrieval, and has recently been advanced by reinforcement learning (RL) with outcome-based supervision. While effective, relying solely on sparse final rewards limits step-wise credit assignment and provides weak guidance for intermediate reasoning and actions. Recent efforts explore process-level supervision, but typically depend on offline constructed training data, which risks distribution shift, or require costly intermediate annotations. We present TreePS-RAG, an online, tree-based RL framework for agentic RAG that enables step-wise credit assignment while retaining standard outcome-only rewards. Our key insight is to model agentic RAG reasoning as a rollout tree, where each reasoning step naturally maps to a node. This tree structure allows step…
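The rollout-tree idea can be illustrated with a minimal sketch of how outcome-only rewards at the leaves might be turned into step-wise signals. It rests on assumptions not confirmed by the text above: that a node's value is the mean final reward of the leaves beneath it, and that a step's advantage is its value minus its parent's value. The names `Node`, `leaf_rewards`, `value`, and `step_advantages` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One reasoning step in a hypothetical rollout tree."""
    name: str
    reward: Optional[float] = None  # outcome reward; set only on leaves
    children: List["Node"] = field(default_factory=list)


def leaf_rewards(node: Node) -> List[float]:
    """Collect the outcome rewards of all leaves below (or at) this node."""
    if not node.children:
        return [node.reward]
    rewards: List[float] = []
    for child in node.children:
        rewards.extend(leaf_rewards(child))
    return rewards


def value(node: Node) -> float:
    """Assumed estimator: mean outcome reward over descendant leaves."""
    rewards = leaf_rewards(node)
    return sum(rewards) / len(rewards)


def step_advantages(root: Node) -> Dict[str, float]:
    """Assumed step signal: value(child) - value(parent) for every non-root node."""
    advantages: Dict[str, float] = {}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.children:
            advantages[child.name] = value(child) - value(parent)
            stack.append(child)
    return advantages


# Toy tree: two first-step branches, each expanded into two final answers.
tree = Node("root", children=[
    Node("a", children=[Node("a1", reward=1.0), Node("a2", reward=0.0)]),
    Node("b", children=[Node("b1", reward=0.0), Node("b2", reward=0.0)]),
])
advantages = step_advantages(tree)
```

In this toy tree only branch `a` ever reaches a correct answer, so step `a` receives a positive signal and step `b` a negative one, even though no intermediate step was labeled directly.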
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics
