Accelerating Robotic Reinforcement Learning with Agent Guidance

Haojun Chen; Zili Zou; Chengdong Ma; Yaoxiang Pu; Haotong Zhang; Yuanpei Chen; Yaodong Yang

arXiv:2602.11978 · cs.RO · March 10, 2026

Open Access

TL;DR

This paper introduces Agent-guided Policy Search (AGPS), a framework that replaces the human supervisor in robotic reinforcement learning with a multimodal agent, improving sample efficiency over human-in-the-loop methods and enabling scalable, labor-free robot training.

Contribution

AGPS automates robot training by replacing human supervisors with a multimodal agent that guides exploration, enhancing scalability and efficiency in reinforcement learning.

Findings

01. AGPS outperforms human-in-the-loop methods in sample efficiency.

02. The approach enables scalable, labor-free robot learning.

03. Validated on tasks from precision insertion to deformable object manipulation.

Abstract

Reinforcement Learning (RL) offers a powerful paradigm for autonomous robots to master generalist manipulation skills through trial-and-error. However, its real-world application is stifled by low sample efficiency. Recent Human-in-the-Loop (HIL) methods accelerate training by using human corrections, yet this approach faces a scalability barrier. Reliance on human supervisors imposes a 1:1 supervision ratio that limits scalability, suffers from operator fatigue over extended sessions, and introduces high variance due to inconsistent human proficiency. We present Agent-guided Policy Search (AGPS), a framework that automates the training pipeline by replacing human supervisors with a multimodal agent. Our key insight is that the agent can be viewed as a semantic world model, injecting intrinsic value priors to structure physical exploration. By using tools, the agent provides precise…

Taxonomy

Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Social Robot Interaction and HRI