GPO: Growing Policy Optimization for Legged Robot Locomotion and Whole-Body Control

Shuhao Liao; Peizhuo Li; Xinrong Yang; Linnan Chang; Zhaoxin Fan; Qing Wang; Lei Shi; Yuhong Cao; Wenjun Wu; Guillaume Sartoretti

arXiv:2601.20668·cs.RO·January 29, 2026



TL;DR

GPO is a reinforcement learning framework that restricts the effective action space early in training and then progressively expands it, leading to more effective policy learning and better real-world performance in legged robot locomotion.

Contribution

GPO introduces a time-varying action transformation that preserves PPO stability and enhances exploration, improving policy training for torque-based legged robot control.

Findings

01. GPO-trained policies outperform baselines in simulation and on hardware.

02. Zero-shot transfer of policies from simulation to real robots is successful.

03. GPO provides a general framework applicable across different legged robot platforms.

Abstract

Training reinforcement learning (RL) policies for legged robots remains challenging due to high-dimensional continuous actions, hardware constraints, and limited exploration. Existing methods for locomotion and whole-body control work well for position-based control with environment-specific heuristics (e.g., reward shaping, curriculum design, and manual initialization), but are less effective for torque-based control, where sufficiently exploring the action space and obtaining informative gradient signals for training is significantly more difficult. We introduce Growing Policy Optimization (GPO), a training framework that applies a time-varying action transformation to restrict the effective action space in the early stage, thereby encouraging more effective data collection and policy learning, and then progressively expands it to enhance exploration and achieve higher expected…
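To make the idea of a time-varying action transformation concrete, here is a minimal sketch. It is not the transformation defined in the paper, whose exact form is not given in the text above. The linear schedule, the tanh squashing, and every name and default value (`growth_scale`, `transform_action`, `start`, `end`, `torque_limit`) are assumptions chosen only for illustration.

```python
import math


def growth_scale(step, total_steps, start=0.2, end=1.0):
    """Hypothetical linear schedule: fraction of the action range in use.

    Returns `start` at step 0 and `end` once `step` reaches `total_steps`.
    The linear shape and the default values are illustrative assumptions.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def transform_action(raw_action, step, total_steps, torque_limit=1.0):
    """Squash raw policy outputs into a bound that grows with training time.

    tanh maps each raw output into (-1, 1); multiplying by the current scale
    keeps early actions inside a small range and widens it later.
    This is an assumed form, not the paper's definition.
    """
    bound = growth_scale(step, total_steps) * torque_limit
    return [bound * math.tanh(a) for a in raw_action]


if __name__ == "__main__":
    raw = [10.0, -10.0, 0.5]
    for step in (0, 50, 100):
        print(step, transform_action(raw, step, total_steps=100))
```

Under these assumptions the same raw policy output produces a small torque command early in training and a larger one later, which is the restrict-then-expand behaviour the abstract describes.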


Taxonomy

Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics · Prosthetics and Rehabilitation Robotics