TD-GRPC: Temporal Difference Learning with Group Relative Policy Constraint for Humanoid Locomotion
Khang Nguyen, Khai Nguyen, An T. Le, Jan Peters, Manfred Huber, Ngo Anh Vien, and Minh Nhat Vu

TL;DR
TD-GRPC enhances humanoid robot locomotion learning by integrating group-relative policy constraints with trust-region methods, improving stability, robustness, and efficiency in complex control tasks.
Contribution
Introduces TD-GRPC, a novel extension of TD-MPC that combines group-relative policy optimization with explicit policy constraints for improved humanoid locomotion.
Findings
Demonstrates improved stability and robustness in humanoid control tasks.
Achieves higher sampling efficiency during training.
Successfully handles complex, dynamic movements on a 26-DoF robot.
Abstract
Robot learning in high-dimensional control settings, such as humanoid locomotion, presents persistent challenges for reinforcement learning (RL) algorithms due to unstable dynamics, complex contact interactions, and sensitivity to distributional shifts during training. Model-based methods, \textit{e.g.}, Temporal-Difference Model Predictive Control (TD-MPC), have demonstrated promising results by combining short-horizon planning with value-based learning, enabling efficient solutions for basic locomotion tasks. However, these approaches remain ineffective in addressing policy mismatch and instability introduced by off-policy updates. Thus, in this work, we introduce Temporal-Difference Group Relative Policy Constraint (TD-GRPC), an extension of the TD-MPC framework that unifies Group Relative Policy Optimization (GRPO) with explicit Policy Constraints (PC). TD-GRPC applies a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHuman Pose and Action Recognition · Human Motion and Animation
