Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches

Yutong Hu; Pinhao Song; Kehan Wen; Renaud Detry

arXiv:2505.09430·cs.RO·June 6, 2025

TL;DR

Mini Diffuser introduces a two-level minibatching technique for multi-task diffusion policies, cutting training time and memory by roughly an order of magnitude while retaining about 95% of state-of-the-art performance on vision-language robotic tasks in RLBench.

Contribution

The paper proposes a two-level minibatching approach, together with architectural modifications to the diffusion transformer, that makes multi-task diffusion policy training more efficient.

Findings

01. Achieves 95% of state-of-the-art performance in RLBench simulations.

02. Uses only 5% of the training time of previous methods.

03. Requires just 7% of the memory of existing diffusion policy models.

Abstract

We present a method that reduces, by an order of magnitude, the time and memory needed to train multi-task vision-language robotic diffusion policies. This improvement arises from a previously underexplored distinction between action diffusion and the image diffusion techniques that inspired it: In image generation, the target is high-dimensional. By contrast, in action generation, the dimensionality of the target is comparatively small, and only the image condition is high-dimensional. Our approach, Mini Diffuser, exploits this asymmetry by introducing two-level minibatching, which pairs multiple noised action samples with each vision-language condition, instead of the conventional one-to-one sampling strategy. To support this batching scheme, we introduce architectural adaptations to the diffusion transformer that prevent information leakage across samples while…
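
The two-level minibatching idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`noise_action`, `two_level_batch`), the cosine noise schedule, and the dictionary layout are all hypothetical choices made for the example. It shows only the batching structure, in which each vision-language condition is paired with k independently noised copies of its action; it does not cover the transformer adaptations that prevent information leakage across samples.

```python
import math
import random


def noise_action(action, t, num_steps, rng):
    """Forward-diffuse one action vector to step t.

    Uses a simple cosine schedule (an assumption for this sketch,
    not necessarily the schedule used in the paper).
    """
    alpha_bar = math.cos(0.5 * math.pi * t / num_steps) ** 2
    eps = [rng.gauss(0.0, 1.0) for _ in action]
    noised = [
        math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e
        for a, e in zip(action, eps)
    ]
    return noised, eps


def two_level_batch(conditions, actions, k, num_steps=100, seed=0):
    """Build a two-level batch.

    Level 1: one entry per (condition, ground-truth action) pair.
    Level 2: k noised samples of the low-dimensional action per entry,
             each with its own timestep and noise target.
    """
    rng = random.Random(seed)
    batch = []
    for cond, action in zip(conditions, actions):
        samples = []
        for _ in range(k):
            t = rng.randrange(1, num_steps + 1)  # timestep in [1, num_steps]
            noised, eps = noise_action(action, t, num_steps, rng)
            samples.append({"t": t, "noised_action": noised, "target_noise": eps})
        # The high-dimensional condition is stored once and shared by all k samples.
        batch.append({"condition": cond, "samples": samples})
    return batch


if __name__ == "__main__":
    # Placeholder strings stand in for encoded vision-language conditions.
    conds = ["cond_a", "cond_b"]
    acts = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    batch = two_level_batch(conds, acts, k=4)
    total = sum(len(item["samples"]) for item in batch)
    print(len(batch), "conditions,", total, "noised action samples")
```

With B conditions and k samples per condition, the batch carries B × k denoising targets while each high-dimensional condition appears only once, which is the asymmetry the abstract describes. In a conventional one-to-one scheme, the same number of denoising targets would require B × k conditions.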

Code & Models

Repositories

utomm/mini-diffuse-actor (PyTorch, official)

Models

you2who/mini-diffusion-pusht (Hugging Face model, 32 downloads)

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning

Methods: Diffusion