TreePS-RAG: Tree-based Process Supervision for Reinforcement Learning in Agentic RAG

Tianhua Zhang; Kun Li; Junan Li; Yunxiang Li; Hongyin Luo; Xixin Wu; James Glass; Helen Meng

arXiv:2601.06922·cs.CL·January 13, 2026

TL;DR

TreePS-RAG introduces a tree-based reinforcement learning framework for agentic retrieval-augmented generation, enabling step-wise credit assignment and improved performance on QA benchmarks without requiring intermediate annotations.

Contribution

The paper proposes an online, tree-structured RL approach for agentic RAG that models reasoning as a rollout tree, improving credit assignment and performance without costly intermediate supervision.

Findings

01. Outperforms outcome-supervised RL methods on multiple QA benchmarks.

02. Efficient online tree construction preserves exploration diversity.

03. Keeps rollout costs comparable to those of strong baselines.

Abstract

Agentic retrieval-augmented generation (RAG) formulates question answering as a multi-step interaction between reasoning and information retrieval, and has recently been advanced by reinforcement learning (RL) with outcome-based supervision. While effective, relying solely on sparse final rewards limits step-wise credit assignment and provides weak guidance for intermediate reasoning and actions. Recent efforts explore process-level supervision, but they typically either depend on offline-constructed training data, which risks distribution shift, or require costly intermediate annotations. We present TreePS-RAG, an online, tree-based RL framework for agentic RAG that enables step-wise credit assignment while retaining standard outcome-only rewards. Our key insight is to model agentic RAG reasoning as a rollout tree, where each reasoning step naturally maps to a node. This tree structure allows step…
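
To make the rollout-tree idea concrete, here is a minimal sketch of how step-wise credit could be derived from outcome-only rewards. It is an illustration under stated assumptions, not the paper's implementation: the abstract is truncated before it describes the estimator, so the choice to score a node by the mean outcome reward of its descendant leaves, and to define a step's advantage as its value minus its parent's value, is an assumption. The names `Node`, `leaf_rewards`, `value`, and `step_advantages`, and the toy tree, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One reasoning/retrieval step in a rollout tree (hypothetical structure)."""
    name: str
    reward: Optional[float] = None  # outcome reward, set only on leaves
    children: List["Node"] = field(default_factory=list)


def leaf_rewards(node: Node) -> List[float]:
    """Collect the outcome rewards of every leaf at or below this node."""
    if not node.children:
        # A leaf is a finished rollout; a missing reward is treated as 0.0.
        return [node.reward if node.reward is not None else 0.0]
    rewards: List[float] = []
    for child in node.children:
        rewards.extend(leaf_rewards(child))
    return rewards


def value(node: Node) -> float:
    """Assumed estimator: mean outcome reward over descendant leaves."""
    rewards = leaf_rewards(node)
    return sum(rewards) / len(rewards)


def step_advantages(root: Node) -> Dict[str, float]:
    """Assumed step-level signal: value(child) - value(parent) for each step."""
    advantages: Dict[str, float] = {}
    stack = [root]
    while stack:
        parent = stack.pop()
        parent_value = value(parent)
        for child in parent.children:
            advantages[child.name] = value(child) - parent_value
            stack.append(child)
    return advantages


# Toy tree: two first steps, each branching into two finished rollouts.
# Only the leaves carry a reward (1.0 = correct final answer, 0.0 = wrong).
root = Node("question", children=[
    Node("step_a", children=[Node("a1", reward=1.0), Node("a2", reward=0.0)]),
    Node("step_b", children=[Node("b1", reward=0.0), Node("b2", reward=0.0)]),
])

adv = step_advantages(root)
```

In this toy tree only one of the four rollouts is correct, yet `step_a` receives a positive signal and `step_b` a negative one, because the steps share a parent and differ in the outcomes they lead to. No intermediate labels are needed; the tree structure alone turns sparse final rewards into per-step signals.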

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics