CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion

Jiahua Ma; Yiran Qin; Yixiong Li; Xuanqi Liao; Yulan Guo; Ruimao Zhang

arXiv:2506.14769 · cs.CV · August 12, 2025

Open Access

TL;DR

This paper introduces Causal Diffusion Policy (CDP), a transformer-based diffusion model that improves visuomotor policy learning by conditioning on historical actions and employing caching for efficiency, demonstrating robustness in real-world robotic tasks.

Contribution

The paper presents a novel transformer-based diffusion model with a caching mechanism that leverages historical actions for robust and efficient visuomotor policy learning in robotics.

Findings

01. CDP outperforms existing methods in accuracy across diverse manipulation tasks.

02. CDP maintains high precision under degraded observation conditions.

03. The caching mechanism significantly reduces inference computation time.

Abstract

Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints restrict model inference to instantaneous state and scene observations. These limitations seriously reduce the efficacy of learning from expert demonstrations, resulting in failures in object localization, grasp planning, and long-horizon task execution. To address these challenges, we propose Causal Diffusion Policy (CDP), a novel transformer-based diffusion model that enhances action prediction by conditioning on historical action sequences, thereby enabling more coherent and context-aware visuomotor policy learning. To further mitigate the computational cost associated with autoregressive inference, a caching mechanism is also introduced to store…
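
The text above does not spell out what the cache stores, so the following is only a generic illustration of why caching helps autoregressive inference, not CDP's actual implementation. It assumes a single attention head and a standard key-value cache, as commonly used in causal transformers. All names (`causal_attention_full`, `KVCache`, the weight matrices) are hypothetical and chosen for this sketch.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_attention_full(x, Wq, Wk, Wv):
    """Recompute causal self-attention over the whole sequence.

    x: (T, d_in) token sequence (e.g. embedded past actions; an assumption).
    Returns (T, d) outputs; token t attends only to tokens 0..t.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Mask out future positions (strictly above the diagonal).
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v


class KVCache:
    """Hypothetical cache: stores keys/values of tokens already processed."""

    def __init__(self, d):
        self.k = np.zeros((0, d))
        self.v = np.zeros((0, d))

    def step(self, x_t, Wq, Wk, Wv):
        """Process one new token, reusing cached keys and values."""
        q = x_t @ Wq
        self.k = np.vstack([self.k, x_t @ Wk])
        self.v = np.vstack([self.v, x_t @ Wv])
        scores = self.k @ q / np.sqrt(q.shape[-1])
        return softmax(scores) @ self.v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(6, 8))
    Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))

    full = causal_attention_full(x, Wq, Wk, Wv)
    cache = KVCache(4)
    incremental = np.stack([cache.step(x_t, Wq, Wk, Wv) for x_t in x])

    # Both routes give the same outputs; the cached route projects each
    # token to keys/values once instead of once per step.
    print(np.allclose(full, incremental))
```

In this sketch each new step computes projections for one token only and attends over the stored keys and values, so per-step cost grows linearly with history length rather than requiring the full sequence to be re-encoded.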

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning

Methods: Diffusion