Enabling Self-Improving Agents to Learn at Test Time With Human-In-The-Loop Guidance

Yufei He; Ruoyu Li; Alex Chen; Yue Liu; Yulin Chen; Yuan Sui; Cheng Chen; Yi Zhu; Luca Luo; Frank Yang; Bryan Hooi

arXiv:2507.17131 · cs.LG · October 13, 2025

TL;DR

This paper introduces ARIA, a framework enabling large language models to learn and adapt at test time through human-in-the-loop guidance, improving performance in dynamic environments.

Contribution

The paper presents ARIA, a novel self-improving agent framework that continuously updates its knowledge during operation using structured self-dialogue and human feedback.

Findings

01 ARIA outperforms baseline methods in adaptability and accuracy.

02 Effective knowledge updating reduces errors in dynamic tasks.

03 Deployed in TikTok Pay, ARIA handles over 150 million users effectively.

Abstract

Large language model (LLM) agents often struggle in environments where rules and required domain knowledge frequently change, such as regulatory compliance and user risk screening. Current approaches, like offline fine-tuning and standard prompting, are insufficient because they cannot effectively adapt to new knowledge during actual operation. To address this limitation, we propose the Adaptive Reflective Interactive Agent (ARIA), an LLM agent framework designed specifically to continuously learn updated domain knowledge at test time. ARIA assesses its own uncertainty through structured self-dialogue, proactively identifying knowledge gaps and requesting targeted explanations or corrections from human experts. It then systematically updates an internal, timestamped knowledge repository with provided human guidance, detecting and resolving conflicting or outdated knowledge through…
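
The loop the abstract describes (assess uncertainty, ask a human expert when a knowledge gap is found, then record the guidance in a timestamped repository) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: `KnowledgeRepository`, `answer`, the `self_assess` and `ask_expert` callbacks, and the 0.7 confidence threshold are illustrative assumptions. The abstract is truncated before it names how conflicts are resolved, so the sketch substitutes a simple newest-timestamp-wins rule as a stand-in, not ARIA's actual mechanism.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Optional


@dataclass
class KnowledgeEntry:
    topic: str
    rule: str
    timestamp: datetime
    active: bool = True  # False once superseded or known to be outdated


@dataclass
class KnowledgeRepository:
    """Hypothetical timestamped store; not the paper's data structure."""
    entries: list[KnowledgeEntry] = field(default_factory=list)

    def lookup(self, topic: str) -> Optional[KnowledgeEntry]:
        # Return the single active entry for a topic, if any.
        for entry in self.entries:
            if entry.topic == topic and entry.active:
                return entry
        return None

    def update(self, topic: str, rule: str, timestamp: datetime) -> bool:
        """Store guidance; return True if it superseded an existing rule."""
        current = self.lookup(topic)
        if current is not None:
            if current.rule == rule:
                return False  # nothing new to record
            if current.timestamp > timestamp:
                # Incoming guidance is older: keep it as inactive history.
                self.entries.append(
                    KnowledgeEntry(topic, rule, timestamp, active=False))
                return False
            # Assumed stand-in policy: the newer guidance wins.
            current.active = False
        self.entries.append(KnowledgeEntry(topic, rule, timestamp))
        return current is not None


def answer(topic: str,
           repo: KnowledgeRepository,
           self_assess: Callable[[str, Optional[KnowledgeEntry]], float],
           ask_expert: Callable[[str], str],
           now: datetime,
           threshold: float = 0.7) -> str:
    """Answer from the repository, asking a human when confidence is low."""
    known = repo.lookup(topic)
    confidence = self_assess(topic, known)
    if known is None or confidence < threshold:
        guidance = ask_expert(topic)  # human-in-the-loop step
        repo.update(topic, guidance, now)
        known = repo.lookup(topic)
    return known.rule
```

In a real agent, `self_assess` would wrap the model's structured self-dialogue and `ask_expert` would route the question to a reviewer. Both are plain callables here so the control flow can be run and tested without an LLM.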

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Human-Automation Interaction and Safety