Hybrid Consistency Policy: Decoupling Multi-Modal Diversity and Real-Time Efficiency in Robotic Manipulation

Qianyou Zhao; Yuliang Shen; Xuanran Zhai; Ce Hao; Duidi Wu; Jin Qi; Jie Hu; Qiaojun Yu

arXiv:2510.26670 · cs.RO · October 31, 2025

TL;DR

The paper introduces the Hybrid Consistency Policy (HCP), a method that achieves fast, multi-modal robotic manipulation by decoupling diversity and efficiency through a stochastic prefix and a one-step consistency jump.

Contribution

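HCP pairs a short stochastic prefix, run up to an adaptive switch time, with a one-step consistency jump that produces the final action. The jump is trained by time-varying consistency distillation, which combines a trajectory-consistency objective with a denoising-matching objective.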

Findings

1. HCP with 25 SDE steps plus one jump approaches the 80-step DDPM teacher in accuracy and mode coverage.

2. HCP significantly reduces latency compared to traditional diffusion methods.

3. Multi-modality can be achieved without slow inference in robotic policies.

Abstract

In visuomotor policy learning, diffusion-based imitation learning has become widely adopted for its ability to capture diverse behaviors. However, approaches built on ordinary and stochastic denoising processes struggle to jointly achieve fast sampling and strong multi-modality. To address these challenges, we propose the Hybrid Consistency Policy (HCP). HCP runs a short stochastic prefix up to an adaptive switch time, and then applies a one-step consistency jump to produce the final action. To align this one-jump generation, HCP performs time-varying consistency distillation that combines a trajectory-consistency objective to keep neighboring predictions coherent and a denoising-matching objective to improve local fidelity. In both simulation and on a real robot, HCP with 25 SDE steps plus one jump approaches the 80-step DDPM teacher in accuracy and mode coverage while significantly…
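The two-phase inference described in the abstract can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the action is one-dimensional with two modes at -1 and +1, `toy_stochastic_step` and `toy_consistency_jump` are hypothetical stand-ins for the learned networks, and the switch point is a fixed step index (55 on an 80-step schedule, so that the prefix has 25 steps) rather than the adaptive switch time the paper describes.

```python
import random

MODES = (-1.0, 1.0)  # toy 1-D action distribution with two modes (assumption)


def nearest_mode(x):
    """Return the mode on the same side of zero as x."""
    return MODES[1] if x >= 0 else MODES[0]


def toy_stochastic_step(x, t, num_steps, rng):
    """Hypothetical stand-in for one stochastic denoising step.

    Pulls x part of the way toward its nearest mode and injects noise
    whose scale shrinks as t decreases.
    """
    noise_scale = 0.5 * t / num_steps
    return x + 0.2 * (nearest_mode(x) - x) + noise_scale * rng.gauss(0.0, 1.0)


def toy_consistency_jump(x, t):
    """Hypothetical stand-in for the distilled one-step jump from x_t to an action."""
    return nearest_mode(x)


def hybrid_sample(num_steps, switch_step, rng):
    """Run a stochastic prefix from t = num_steps down to switch_step, then jump once."""
    x = rng.gauss(0.0, 1.0)  # start from pure noise
    evaluations = 0
    for t in range(num_steps, switch_step, -1):
        x = toy_stochastic_step(x, t, num_steps, rng)
        evaluations += 1
    action = toy_consistency_jump(x, switch_step)  # single call replaces the remaining steps
    return action, evaluations + 1


rng = random.Random(0)
# Fixed switch at step 55 of 80: 25 prefix steps plus 1 jump per sample.
samples = [hybrid_sample(80, 55, rng) for _ in range(200)]
```

In this sketch the noise injected during the prefix is what lets different samples settle on different modes, while the single jump accounts for the saving in network evaluations: each sample costs 26 calls instead of 80.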
