DoublyAware: Dual Planning and Policy Awareness for Temporal Difference Learning in Humanoid Locomotion

Khang Nguyen; An T. Le; Jan Peters; Minh Nhat Vu

arXiv:2506.12095 · cs.RO · June 17, 2025

Open Access

TL;DR

DoublyAware introduces a dual uncertainty decomposition in TD-MPC for humanoid locomotion, improving robustness and sample efficiency by explicitly modeling planning and policy uncertainties with conformal prediction and structured priors.

Contribution

The paper proposes a novel dual uncertainty-aware extension of TD-MPC that explicitly separates and manages planning and policy uncertainties in humanoid robot learning.

Findings

01. Enhanced sample efficiency and faster convergence.

02. Improved motion feasibility in complex locomotion tasks.

03. Robust decision-making under environmental stochasticity.

Abstract

Achieving robust robot learning for humanoid locomotion is a fundamental challenge in model-based reinforcement learning (MBRL), where environmental stochasticity and randomness can hinder efficient exploration and learning stability. The environmental, so-called aleatoric, uncertainty can be amplified in high-dimensional action spaces with complex contact dynamics, and further entangled with epistemic uncertainty in the models during learning phases. In this work, we propose DoublyAware, an uncertainty-aware extension of Temporal Difference Model Predictive Control (TD-MPC) that explicitly decomposes uncertainty into two disjoint interpretable components, i.e., planning and policy uncertainties. To handle the planning uncertainty, DoublyAware employs conformal prediction to filter candidate trajectories using quantile-calibrated risk bounds, ensuring statistical consistency and…
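As a rough illustration of what filtering candidate trajectories with quantile-calibrated bounds can look like, the sketch below uses standard split conformal prediction in plain Python. It is not the paper's implementation: the function names `conformal_quantile` and `filter_candidates`, the absolute-residual score, the return threshold, and all numbers are assumptions made for this example.

```python
import math


def conformal_quantile(scores, alpha):
    """Split-conformal quantile of calibration scores.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small to support the requested level.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def filter_candidates(predicted_returns, q_hat, min_return):
    """Keep indices whose conservative return estimate clears the threshold.

    The conservative estimate is the predicted return minus the conformal
    bound q_hat (hypothetical rule chosen for this sketch).
    """
    return [
        i
        for i, value in enumerate(predicted_returns)
        if value - q_hat >= min_return
    ]


# Hypothetical calibration residuals |observed return - predicted return|.
calibration_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
q_hat = conformal_quantile(calibration_scores, alpha=0.2)

# Hypothetical predicted returns for three candidate trajectories.
kept = filter_candidates([1.0, 2.0, 3.0], q_hat, min_return=1.0)
```

With ten calibration residuals and `alpha=0.2`, the bound is the ninth smallest residual. A candidate survives only if its predicted return minus that bound still meets the threshold, so candidates whose predicted value is within the calibrated error margin of the threshold are discarded.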

Taxonomy

Topics: Human Pose and Action Recognition · Reinforcement Learning in Robotics