One Step Is Enough: Dispersive MeanFlow Policy Optimization
Guowei Zou, Haitao Wang, Hejun Wu, Yukun Qian, Yuhang Wang, Weibing Li

TL;DR
DMPO is a one-step policy optimization framework for real-time robotic control: it generates each action in a single inference step and runs faster than multi-step diffusion and flow-matching policies.
Contribution
The paper presents Dispersive MeanFlow Policy Optimization (DMPO), a unified framework that enables true one-step inference in generative policies without knowledge distillation, targeting real-time robotic control.
Findings
Achieves inference speeds above 120 Hz, 5-20x faster than multi-step baselines.
Matches or exceeds multi-step baselines on the RoboMimic manipulation and OpenAI Gym locomotion benchmarks.
Successfully deploys on a real robot, validating real-world applicability.
Abstract
Real-time robotic control demands fast action generation. However, existing generative policies based on diffusion and flow matching require multi-step sampling, fundamentally limiting deployment in time-critical scenarios. We propose Dispersive MeanFlow Policy Optimization (DMPO), a unified framework that enables true one-step generation through three key components: MeanFlow for mathematically-derived single-step inference without knowledge distillation, dispersive regularization to prevent representation collapse, and reinforcement learning (RL) fine-tuning to surpass expert demonstrations. Experiments across RoboMimic manipulation and OpenAI Gym locomotion benchmarks demonstrate competitive or superior performance compared to multi-step baselines. With our lightweight model architecture and the three key algorithmic components working in synergy, DMPO exceeds real-time…
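The one-step generation described in the abstract can be illustrated with a minimal sketch. It assumes the usual MeanFlow convention in which t = 1 is pure noise and t = 0 is data, so that a single evaluation of the average velocity u over [0, 1] maps noise to an action: a = z1 - u(z1, 0, 1). The names `one_step_action` and `toy_mean_velocity` are hypothetical, and the toy function is a closed-form stand-in for a trained network, not the authors' model.

```python
import numpy as np

def one_step_action(mean_velocity, obs, action_dim, rng):
    """Sample an action with a single evaluation of the mean-velocity function.

    Assumed convention: z1 is noise at t = 1, the action lives at t = 0,
    and mean_velocity(z, r, t, obs) returns the average velocity over [r, t].
    """
    z1 = rng.standard_normal(action_dim)      # noise sample at t = 1
    u = mean_velocity(z1, 0.0, 1.0, obs)      # one network call, no iterative solver
    return z1 - (1.0 - 0.0) * u               # jump from t = 1 to t = 0 in one step

def toy_mean_velocity(z, r, t, obs):
    """Hypothetical stand-in for a trained network.

    For a straight path z_t = (1 - t) * target + t * noise the velocity is
    constant, so the average velocity over [0, 1] is noise - target.
    """
    target = np.tanh(obs[: z.shape[0]])       # illustrative target action
    return z - target

rng = np.random.default_rng(0)
obs = np.array([0.5, -1.0, 2.0])
action = one_step_action(toy_mean_velocity, obs, action_dim=2, rng=rng)
```

Because the toy path is straight, the single step recovers the target action exactly. A learned mean-velocity network would only approximate this, and a multi-step sampler would instead call the network once per integration step.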
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms
