Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents

Anh Ta; Junjie Zhu; Shahin Shayandeh

arXiv:2604.27233·cs.AI·May 1, 2026

TL;DR

This paper introduces an inference-time feedback mechanism for tool-using agents: a secondary reviewer agent evaluates and corrects tool calls before they are executed, improving accuracy and robustness.

Contribution

The paper proposes a multi-agent architecture in which a dedicated review agent mitigates errors proactively during inference, rather than relying on traditional post-hoc evaluation.

Findings

01. Achieved +5.5% on irrelevance detection

02. Achieved +7.1% on multi-turn tasks

03. Reviewer model choice significantly impacts performance

Abstract

Tool-calling agents are evaluated on tool selection, parameter accuracy, and scope recognition, yet LLM trajectory assessments remain inherently post-hoc. Disconnected from the active execution loop, such assessments identify errors that are usually addressed through prompt-tuning or retraining, and fundamentally cannot course-correct the agent in real time. To close this gap, we move evaluation into the execution loop at inference time: a specialized reviewer agent evaluates provisional tool calls prior to execution, shifting the paradigm from post-hoc recovery to proactive evaluation and error mitigation. In practice, this architecture establishes a clear separation of concerns between the primary execution agent and a secondary review agent. As with any multi-agent system, the reviewer can introduce new errors while correcting others, yet no prior work to our knowledge has…
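
The review-before-execute loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `ToolCall`, `Review`, and `reviewed_tool_call`, the `propose` and `review` callables (stand-ins for the primary and reviewer agents), the retry budget, and the fallback to the last proposal once the budget is exhausted are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class ToolCall:
    """A provisional tool call proposed by the primary agent."""
    name: str
    arguments: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Review:
    """The reviewer's verdict on a provisional call (hypothetical shape)."""
    approved: bool
    feedback: str = ""


# Assumed interfaces. A proposal of None means "no tool applies to this request".
Proposer = Callable[[str, Optional[str]], Optional[ToolCall]]
Reviewer = Callable[[str, Optional[ToolCall]], Review]


def reviewed_tool_call(
    task: str,
    propose: Proposer,
    review: Reviewer,
    tools: Dict[str, Callable[..., Any]],
    max_revisions: int = 2,
) -> Any:
    """Run the reviewer on each provisional call *before* anything executes."""
    feedback: Optional[str] = None
    call: Optional[ToolCall] = None
    for _ in range(max_revisions + 1):
        call = propose(task, feedback)   # primary agent drafts a call
        verdict = review(task, call)     # reviewer inspects it pre-execution
        if verdict.approved:
            break
        feedback = verdict.feedback      # feed the critique back to the primary agent
    # Assumption: when the budget runs out, fall back to the last proposal
    # so that the reviewer cannot block the agent indefinitely.
    if call is None:
        return None                      # the agent declined to call any tool
    return tools[call.name](**call.arguments)
```

Because the reviewer sits between proposal and execution, a rejected call never reaches the tool. The abstract's caveat still applies to a loop like this one: the reviewer can reject a correct call just as easily as it can catch a wrong one, so its net effect has to be measured rather than assumed.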
