PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization

Yangsong Zhang; Anujith Muraleedharan; Rikhat Akizhanov; Abdul Ahad Butt; Gül Varol; Pascal Fua; Fabio Pizzati; Ivan Laptev

arXiv:2603.13228 · cs.LG · March 17, 2026

Open Access

TL;DR

PhysMoDPO introduces a novel training framework that optimizes diffusion-based human motion generation to produce physically plausible and instruction-compliant motions, improving realism and transferability to robots.

Contribution

PhysMoDPO integrates a Whole-Body Controller (WBC) into training and applies preference optimization with physics-based rewards, improving the realism and applicability of text-conditioned motion models.

Findings

01. Enhanced physical realism in generated motions.

02. Improved task accuracy in simulated robot control.

03. Successful zero-shot transfer to a real humanoid robot.

Abstract

Recent progress in text-conditioned human motion generation has been largely driven by diffusion models trained on large-scale human motion data. Building on this progress, recent methods attempt to transfer such models to character animation and real robot control by applying a Whole-Body Controller (WBC) that converts diffusion-generated motions into executable trajectories. While WBC trajectories comply with physics, they may deviate substantially from the original motion. To address this issue, we propose PhysMoDPO, a Direct Preference Optimization framework. Unlike prior work that relies on hand-crafted physics-aware heuristics such as foot-sliding penalties, we integrate the WBC into our training pipeline and optimize the diffusion model so that the output of the WBC complies with both physics and the original text instructions. To train PhysMoDPO we deploy…
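The abstract is cut off before the training details, so the following is only a rough sketch of the general idea it describes: pass candidate motions through a controller, rank them by how closely the executed trajectory follows the generated one, and feed the resulting preferred/rejected pair to a DPO-style loss. Everything named here is an assumption for illustration: `toy_wbc` is a hypothetical stand-in for a real Whole-Body Controller, the tracking-error reward is a placeholder for the paper's physics-based and text-compliance rewards, and `dpo_loss` is the standard log-probability form of Direct Preference Optimization rather than whatever diffusion-specific variant the authors use.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Toy motion representation (assumption): a list of frames, each a pose vector.
Motion = List[List[float]]


def toy_wbc(motion: Motion, max_step: float = 0.5) -> Motion:
    """Hypothetical WBC stand-in: limits how far each pose value may move per frame."""
    executed = [list(motion[0])]
    for frame in motion[1:]:
        prev = executed[-1]
        executed.append(
            [p + max(-max_step, min(max_step, x - p)) for p, x in zip(prev, frame)]
        )
    return executed


def tracking_error(generated: Motion, executed: Motion) -> float:
    """Mean per-frame Euclidean distance between generated and executed motion."""
    dists = [math.dist(g, e) for g, e in zip(generated, executed)]
    return sum(dists) / len(dists)


def build_preference_pair(
    candidates: Sequence[Motion], wbc: Callable[[Motion], Motion]
) -> Tuple[Motion, Motion]:
    """Return (preferred, rejected): lowest and highest tracking error after the WBC."""
    ranked = sorted(candidates, key=lambda m: tracking_error(m, wbc(m)))
    return ranked[0], ranked[-1]


def dpo_loss(
    logp_w: float,
    logp_l: float,
    ref_logp_w: float,
    ref_logp_l: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return math.log1p(math.exp(-margin))


# Usage: a smooth candidate survives the toy controller unchanged, a jerky one does not.
smooth = [[0.0], [0.2], [0.4]]
jerky = [[0.0], [2.0], [0.0]]
preferred, rejected = build_preference_pair([jerky, smooth], toy_wbc)
```

In this toy setup the smooth motion becomes the preferred sample because the controller can reproduce it exactly. A policy that raises the preferred sample's log-probability relative to the frozen reference, and lowers the rejected one's, gets a loss below the `log(2)` value obtained when policy and reference agree.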

Taxonomy

Topics: Human Motion and Animation · Robot Manipulation and Learning · 3D Shape Modeling and Analysis