Conditioning Matters: Training Diffusion Policies is Faster Than You Think
Zibin Dong, Yicheng Liu, Yinchuan Li, Hang Zhao, Jianye Hao

TL;DR
This paper introduces Cocos, a simple method to improve the training efficiency of diffusion policies for vision-language-action models by preventing loss collapse through condition-dependent source distributions, leading to faster convergence and better performance.
Contribution
The paper proposes Cocos, a novel approach that modifies source distributions in diffusion policy training to enhance condition integration and training efficiency.
Findings
Cocos accelerates diffusion policy training convergence.
Cocos achieves higher success rates with fewer parameters.
The method is compatible with various policy architectures.
Abstract
Diffusion policies have emerged as a mainstream paradigm for building vision-language-action (VLA) models. Although they demonstrate strong robot control capabilities, their training efficiency remains suboptimal. In this work, we identify a fundamental challenge in conditional diffusion policy training: when generative conditions are hard to distinguish, the training objective degenerates into modeling the marginal action distribution, a phenomenon we term loss collapse. To overcome this, we propose Cocos, a simple yet general solution that modifies the source distribution in conditional flow matching to be condition-dependent. By anchoring the source distribution around semantics extracted from condition inputs, Cocos encourages stronger condition integration and prevents loss collapse. We provide theoretical justification and extensive empirical results across simulation and…
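The idea of a condition-dependent source distribution can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear interpolation path between source and action samples, a Gaussian source centered on a condition embedding that has the same dimension as the action, and a hypothetical spread parameter `beta`. The function names (`sample_source`, `flow_matching_pair`, `fm_loss`) are illustrative only.

```python
import numpy as np


def sample_source(cond_emb, beta, rng, condition_dependent=True):
    """Draw source samples x0.

    Assumption: cond_emb is a (batch, action_dim) embedding of the condition,
    already projected to the action dimension. beta is a hypothetical
    standard deviation controlling the spread around that embedding.
    """
    noise = rng.standard_normal(cond_emb.shape)
    if condition_dependent:
        # Source anchored around the condition semantics: N(cond_emb, beta^2 I).
        return cond_emb + beta * noise
    # Standard condition-independent source: N(0, I).
    return noise


def flow_matching_pair(x0, x1, t):
    """Build the interpolated point and velocity target for a linear path.

    x0: source samples, x1: target actions, t: times in [0, 1], shape (batch,).
    """
    t = t[:, None]
    xt = (1.0 - t) * x0 + t * x1
    target_v = x1 - x0
    return xt, target_v


def fm_loss(pred_v, target_v):
    """Mean squared error between predicted and target velocities."""
    return float(np.mean((pred_v - target_v) ** 2))
```

In a training loop under these assumptions, a policy network would receive `xt`, `t`, and the condition, and would be trained to minimize `fm_loss` against `target_v`. Setting `condition_dependent=False` recovers the usual standard-normal source, which makes the two setups easy to compare.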
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robot Manipulation and Learning
Methods: Diffusion
