One Step Is Enough: Dispersive MeanFlow Policy Optimization

Guowei Zou; Haitao Wang; Hejun Wu; Yukun Qian; Yuhang Wang; Weibing Li

arXiv:2601.20701 · cs.RO · January 29, 2026

TL;DR

DMPO is a one-step policy optimization framework for real-time robotic control. It generates each action in a single inference step and runs faster than multi-step diffusion and flow-matching policies.

Contribution

The paper presents Dispersive MeanFlow Policy Optimization (DMPO), a unified framework that enables true one-step inference in generative policies without knowledge distillation, making them fast enough for real-time robotic control.
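To make "one-step inference" concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes the MeanFlow convention in which a network predicts the average velocity u(z, r, t) over an interval [r, t], so that z_r = z_t - (t - r) * u(z_t, r, t), with t = 1 as noise and t = 0 as the action. The names `TARGET_ACTION`, `mean_velocity`, `one_step_sample`, and `multi_step_sample` are hypothetical, and the "network" is replaced by a closed-form field for a data distribution that contains a single action.

```python
import numpy as np

# Hypothetical expert action; the toy data distribution is a point mass here.
TARGET_ACTION = np.array([0.3, -0.7])


def mean_velocity(z, r, t):
    """Toy stand-in for a learned average-velocity network u(z, r, t).

    On straight paths z_t = (1 - t) * x + t * noise with a single data
    point x, the average velocity over any interval [r, t] is (z - x) / t.
    """
    return (z - TARGET_ACTION) / t


def one_step_sample(noise):
    # Single evaluation: z_0 = z_1 - (1 - 0) * u(z_1, r=0, t=1).
    return noise - mean_velocity(noise, 0.0, 1.0)


def multi_step_sample(noise, num_steps):
    # Same identity applied on num_steps sub-intervals (num_steps evaluations).
    z = noise
    ts = np.linspace(1.0, 0.0, num_steps + 1)
    for t, r in zip(ts[:-1], ts[1:]):
        z = z - (t - r) * mean_velocity(z, r, t)
    return z


rng = np.random.default_rng(0)
noise = rng.normal(size=(5, 2))            # batch of 5 noise samples
one_step = one_step_sample(noise)          # 1 model evaluation per action
ten_step = multi_step_sample(noise, 10)    # 10 model evaluations per action
```

In this toy setting both samplers recover the target action, but the one-step sampler needs a single model evaluation per action, which is where the inference speedup comes from.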

Findings

01  Achieves inference above 120 Hz, 5-20x faster than multi-step baselines.

02  Matches or exceeds multi-step baselines on the RoboMimic and OpenAI Gym benchmarks.

03  Deploys on a real robot, validating real-world applicability.

Abstract

Real-time robotic control demands fast action generation. However, existing generative policies based on diffusion and flow matching require multi-step sampling, fundamentally limiting deployment in time-critical scenarios. We propose Dispersive MeanFlow Policy Optimization (DMPO), a unified framework that enables true one-step generation through three key components: MeanFlow for mathematically-derived single-step inference without knowledge distillation, dispersive regularization to prevent representation collapse, and reinforcement learning (RL) fine-tuning to surpass expert demonstrations. Experiments across RoboMimic manipulation and OpenAI Gym locomotion benchmarks demonstrate competitive or superior performance compared to multi-step baselines. With our lightweight model architecture and the three key algorithmic components working in synergy, DMPO exceeds real-time…
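The abstract does not state the form of the dispersive regularizer. As an illustration only, the sketch below assumes an InfoNCE-style repulsive term, the log of the mean of exp(-squared distance / temperature) over all pairs in a batch, which is one common way to penalize representation collapse. The function name `dispersive_loss`, the temperature value, and the choice of squared L2 distance are assumptions, not details taken from the paper.

```python
import numpy as np


def dispersive_loss(features, temperature=0.5):
    """Assumed InfoNCE-style dispersive term over a batch of feature vectors.

    Computes log(mean_{i,j} exp(-||z_i - z_j||^2 / temperature)).
    The value is 0 when all features coincide and decreases as they spread out.
    """
    diffs = features[:, None, :] - features[None, :, :]   # (B, B, D) pairwise differences
    sq_dists = np.sum(diffs ** 2, axis=-1)                # (B, B) squared L2 distances
    return float(np.log(np.mean(np.exp(-sq_dists / temperature))))


rng = np.random.default_rng(0)
collapsed = np.zeros((8, 4))            # every sample maps to the same feature
spread = rng.normal(size=(8, 4))        # features scattered in feature space

loss_collapsed = dispersive_loss(collapsed)   # exactly 0.0: maximal penalty
loss_spread = dispersive_loss(spread)         # negative: dispersed features score lower
```

Minimizing a term like this alongside the policy loss pushes intermediate representations apart, so a collapsed batch is penalized relative to a dispersed one.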

Code & Models

Models

Guowei-Zou/DMPO-checkpoints (model)

Datasets

Guowei-Zou/DMPO-datasets (dataset, 154 downloads)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms