Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance

Wenxuan Song; Jiayi Chen; Shuai Chen; Jingbo Wang; Pengxiang Ding; Han Zhao; Yikai Qin; Xinhu Zheng; Donglin Wang; Yan Wang; and Haoang Li

arXiv:2603.25661·cs.RO·April 8, 2026

TL;DR

This paper introduces Fast-dVLA, a method that accelerates discrete diffusion vision-language-action (VLA) models toward real-time performance by decoupling auxiliary training objectives and merging capability vectors.

Contribution

The paper proposes a parameter-decoupling approach that enhances model capabilities while reducing the computational cost of finetuning.

Findings

01 Achieves comparable performance to auxiliary finetuning with less computation.

02 Effectively improves robot task performance across diverse tasks.

03 Utilizes a lightweight regularization to enhance model capabilities.

Abstract

This paper addresses the problem that pretrained VLA models often fail to improve performance or reduce adaptation costs under standard supervised finetuning (SFT). Some advanced finetuning methods with auxiliary training objectives can improve performance and reduce the number of steps needed to converge. However, they typically incur significant computational overhead from the additional auxiliary-task losses. To combine the enhanced capabilities of auxiliary training with the simplicity of standard SFT, we decouple the two objectives of auxiliary task training within the parameter space: enhancing general capabilities and fitting task-specific action distributions. To achieve this, we only need to train the model to convergence on a small-scale task set using two distinct training strategies. The difference between…

Code & Models

Repositories

https://chris1220313648.github.io/Fast-dVLA
