Evaluate-as-Action: Self-Evaluated Process Rewards for Retrieval-Augmented Agents
Jiangming Shu, Yuxiang Zhang, Ye Ma, Xueyuan Lin, Jitao Sang

TL;DR
This paper introduces EvalAct, a method that improves retrieval-augmented agents by explicitly evaluating each retrieval step and using these evaluations to better optimize multi-step reasoning, leading to higher accuracy.
Contribution
The paper proposes EvalAct, a novel framework that converts retrieval quality assessment into explicit actions and introduces PCAR for improved optimization in retrieval-augmented agents.
Findings
EvalAct achieves state-of-the-art accuracy on seven QA benchmarks.
Significant improvements are observed on multi-hop question answering tasks.
Explicit evaluation loops and PCAR contribute to the performance gains.
Abstract
Retrieval-augmented agents can query external evidence, yet their reliability in multi-step reasoning remains limited: noisy retrieval may derail multi-hop question answering, while outcome-only reinforcement learning provides credit signals that are too coarse to optimize intermediate steps. We propose EvalAct (Evaluate-as-Action), which converts implicit retrieval quality assessment into an explicit action and enforces a coupled Search-to-Evaluate protocol so that each retrieval is immediately followed by a structured evaluation score, yielding process signals aligned with the interaction trajectory. To leverage these signals, we introduce Process-Calibrated Advantage Rescaling (PCAR), a GRPO-based optimization method that rescales advantages at the segment level according to evaluation scores, emphasizing reliable segments while updating uncertain ones conservatively.…
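The two mechanisms named in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: the abstract does not give the PCAR formula, so the linear weighting, the 0 to 10 score range, the `floor` parameter, and all function names below are assumptions made only for illustration. The sketch checks that every search action is immediately followed by an evaluate action, computes GRPO-style group-relative advantages, and then scales one rollout's advantage per segment by its self-evaluation score.

```python
from statistics import mean, pstdev


def follows_search_to_evaluate(actions):
    """Return True if every 'search' is immediately followed by 'evaluate'.

    The action names are hypothetical labels for this sketch.
    """
    for i, action in enumerate(actions):
        if action == "search":
            if i + 1 >= len(actions) or actions[i + 1] != "evaluate":
                return False
    return True


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize outcome rewards within one rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def rescale_segment_advantages(advantage, segment_scores, max_score=10.0, floor=0.5):
    """Scale a rollout's advantage separately for each segment.

    ASSUMPTION: weights grow linearly from `floor` (score 0) to 1.0
    (score max_score), so low-scored segments get smaller updates.
    The paper's actual rescaling rule may differ.
    """
    weights = []
    for score in segment_scores:
        clipped = min(max(score, 0.0), max_score)
        weights.append(floor + (1.0 - floor) * clipped / max_score)
    return [advantage * w for w in weights]


if __name__ == "__main__":
    rollout_actions = ["search", "evaluate", "search", "evaluate", "answer"]
    print(follows_search_to_evaluate(rollout_actions))

    advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
    # Segment scores are made-up self-evaluation values for the first rollout.
    print(rescale_segment_advantages(advantages[0], [9.0, 3.0]))
```

Under this assumed weighting, a segment whose retrieval was self-scored as unreliable still receives a gradient signal, only a damped one, which is one way to read "updating uncertain ones conservatively".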
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · AI-based Problem Solving and Planning
