ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation

Guanxing Lu; Zifeng Gao; Tianxing Chen; Wenxun Dai; Ziwei Wang; Wenbo Ding; Yansong Tang

arXiv:2406.01586·cs.RO·September 9, 2025

TL;DR

ManiCM introduces a one-step diffusion-based model for real-time 3D robotic manipulation, significantly speeding up inference while maintaining high success rates across diverse tasks.

Contribution

This work presents ManiCM, a novel consistency model that enables single-step inference in diffusion-based robotic manipulation, addressing runtime inefficiency in high-dimensional tasks.

Findings

01. Accelerates inference speed by 10 times on average.

02. Maintains competitive success rates across 31 manipulation tasks.

03. Demonstrates effectiveness of consistency distillation in action prediction.

Abstract

Diffusion models have proven effective at generating complex distributions, from natural images to motion trajectories. Recent diffusion-based methods show impressive performance in 3D robotic manipulation tasks, but they suffer from severe runtime inefficiency due to multiple denoising steps, especially with high-dimensional observations. To this end, we propose a real-time robotic manipulation model named ManiCM that imposes the consistency constraint on the diffusion process, so that the model can generate robot actions in a single inference step. Specifically, we formulate a consistent diffusion process in the robot action space conditioned on the point cloud input, where the original action is required to be directly denoised from any point along the ODE trajectory. To model this process, we design a consistency distillation technique to predict the action sample…
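The idea of denoising directly from any point along the ODE trajectory can be illustrated with a small sketch. This is not ManiCM's implementation: it uses the skip/output parameterization from the general consistency-model literature, and `toy_network`, `SIGMA_DATA`, `EPS`, `t_max`, and the feature vector are hypothetical placeholders chosen for illustration. The point it shows is the boundary condition (at the smallest time the function returns its input unchanged) and that sampling needs only one function evaluation.

```python
import numpy as np

SIGMA_DATA = 0.5  # assumed data scale (illustrative value)
EPS = 0.002       # assumed smallest time on the ODE trajectory


def c_skip(t):
    # Weight on the noisy input; equals 1 at t = EPS.
    return SIGMA_DATA**2 / ((t - EPS) ** 2 + SIGMA_DATA**2)


def c_out(t):
    # Weight on the network output; equals 0 at t = EPS.
    return SIGMA_DATA * (t - EPS) / np.sqrt(SIGMA_DATA**2 + t**2)


def toy_network(noisy_action, t, point_cloud_feature):
    # Hypothetical stand-in for a learned network conditioned on a
    # point-cloud feature; a real model would be trained by distillation.
    return np.full_like(noisy_action, np.tanh(point_cloud_feature.mean()))


def consistency_fn(noisy_action, t, point_cloud_feature):
    # Maps a point at time t on the trajectory to an estimate of the
    # clean action. The boundary condition f(x, EPS) = x holds by construction.
    return (c_skip(t) * noisy_action
            + c_out(t) * toy_network(noisy_action, t, point_cloud_feature))


def one_step_action(point_cloud_feature, action_dim, t_max=80.0, seed=0):
    # Single-step inference: draw noise at the largest time and apply the
    # consistency function once, with no iterative denoising loop.
    rng = np.random.default_rng(seed)
    noise = t_max * rng.standard_normal(action_dim)
    return consistency_fn(noise, t_max, point_cloud_feature)


feature = np.linspace(-1.0, 1.0, 16)  # placeholder point-cloud feature
action = one_step_action(feature, action_dim=4)
```

A multi-step diffusion sampler would call its network once per denoising step, whereas `one_step_action` calls it once in total, which is where the runtime saving described in the abstract would come from.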

Taxonomy

Topics: Reinforcement Learning in Robotics · Manufacturing Process and Optimization · Additive Manufacturing and 3D Printing Technologies

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion