Accelerating Robotic Reinforcement Learning with Agent Guidance
Haojun Chen, Zili Zou, Chengdong Ma, Yaoxiang Pu, Haotong Zhang, Yuanpei Chen, Yaodong Yang

TL;DR
This paper introduces AGPS, an automated framework that replaces human supervision with an intelligent agent, significantly improving sample efficiency in robotic reinforcement learning and enabling scalable, labor-free robot training.
Contribution
AGPS automates robot training by replacing human supervisors with a multimodal agent that guides exploration, enhancing scalability and efficiency in reinforcement learning.
Findings
AGPS outperforms human-in-the-loop methods in sample efficiency.
The approach enables scalable, labor-free robot learning.
Validated on tasks from precision insertion to deformable object manipulation.
Abstract
Reinforcement Learning (RL) offers a powerful paradigm for autonomous robots to master generalist manipulation skills through trial-and-error. However, its real-world application is stifled by low sample efficiency. Recent Human-in-the-Loop (HIL) methods accelerate training by using human corrections, yet this approach faces a scalability barrier. Reliance on human supervisors imposes a 1:1 supervision ratio that limits scalability, suffers from operator fatigue over extended sessions, and introduces high variance due to inconsistent human proficiency. We present Agent-guided Policy Search (AGPS), a framework that automates the training pipeline by replacing human supervisors with a multimodal agent. Our key insight is that the agent can be viewed as a semantic world model, injecting intrinsic value priors to structure physical exploration. By using tools, the agent provides precise…
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Social Robot Interaction and HRI
