Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids

Kaizhe Hu; Haochen Shi; Yao He; Weizhuo Wang; C. Karen Liu; Shuran Song

arXiv:2508.12252 · cs.RO · August 27, 2025

Open Access

TL;DR

This paper introduces Robot-Trains-Robot (RTR), a framework in which a robotic arm teacher actively guides a humanoid robot student, enabling efficient real-world reinforcement learning while easing sim-to-real transfer and reducing human intervention.

Contribution

The paper presents the RTR framework, which pairs a robotic arm teacher with a novel RL pipeline that facilitates and stabilizes sim-to-real transfer, supporting stable and efficient real-world humanoid policy learning.

Findings

01 Successful fine-tuning of a humanoid walking policy for speed tracking

02 Learning a humanoid swing-up task from scratch in the real world

03 Demonstrated improved safety and efficiency in real-world training

Abstract

Simulation-based reinforcement learning (RL) has significantly advanced humanoid locomotion tasks, yet direct real-world RL from scratch or adapting from pretrained policies remains rare, limiting the full potential of humanoid robots. Real-world learning, despite being crucial for overcoming the sim-to-real gap, faces substantial challenges related to safety, reward design, and learning efficiency. To address these limitations, we propose Robot-Trains-Robot (RTR), a novel framework where a robotic arm teacher actively supports and guides a humanoid robot student. The RTR system provides protection, learning schedule, reward, perturbation, failure detection, and automatic resets. It enables efficient long-term real-world humanoid training with minimal human intervention. Furthermore, we propose a novel RL pipeline that facilitates and stabilizes sim-to-real transfer by optimizing a…

Taxonomy

Topics: Reinforcement Learning in Robotics