LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation

Qi Wang; Zhexu Shen; Meng Chen; Guoxin Yu; Chaoxu Pang; Weifeng Zhao; Wenjiang Zhou

arXiv:2604.11052·cs.SD·April 14, 2026

TL;DR

LaDA-Band applies Discrete Masked Diffusion, a non-autoregressive generation scheme, to vocal-to-accompaniment (V2A) music generation, targeting acoustic authenticity, coherence with the vocal track, and dynamic orchestration across a full song.

Contribution

The paper formulates V2A generation as Discrete Masked Diffusion and pairs it with a dual-track architecture and curriculum training, aiming for accompaniment that stays detailed and coherent over long musical spans.

Findings

01. Outperforms existing methods in acoustic authenticity and coherence
02. Maintains high-quality accompaniment without auxiliary references
03. Effective on both academic and real-world benchmarks

Abstract

Vocal-to-accompaniment (V2A) generation, which aims to transform a raw vocal recording into a fully arranged accompaniment, inherently requires jointly addressing an accompaniment trilemma: preserving acoustic authenticity, maintaining global coherence with the vocal track, and producing dynamic orchestration across a full song. Existing open-source approaches typically make compromises among these goals. Continuous-latent generation models can capture long musical spans but often struggle to preserve fine-grained acoustic detail. In contrast, discrete autoregressive models retain local fidelity but suffer from unidirectional generation and error accumulation in extended contexts. We present LaDA-Band, an end-to-end framework that introduces Discrete Masked Diffusion to the V2A task. Our approach formulates V2A generation as Discrete Masked Diffusion, i.e., a global, non-autoregressive…
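To make the contrast with autoregressive decoding concrete, here is a minimal, generic sketch of masked discrete diffusion over token sequences. It is not LaDA-Band's implementation: the `MASK` sentinel, the function names, the equal-share unmasking schedule, and the `toy_denoiser` stand-in (which simply copies the vocal token so the result is easy to check) are all assumptions made for illustration. The forward process hides tokens at random; the reverse process starts from a fully masked sequence and commits the most confident proposals in parallel, at any position, rather than strictly left to right.

```python
import random

MASK = -1  # hypothetical sentinel id for a masked token


def forward_mask(tokens, t, rng):
    """Forward process: replace each token with MASK independently with probability t."""
    return [MASK if rng.random() < t else tok for tok in tokens]


def toy_denoiser(vocal, noisy, rng):
    """Stand-in for a trained network (hypothetical).

    For every masked position, return a (proposed token, confidence) pair.
    The proposal just copies the vocal token; the confidence is random.
    """
    return {i: (vocal[i], rng.random()) for i, tok in enumerate(noisy) if tok == MASK}


def generate(vocal, steps, rng):
    """Reverse process: start fully masked and unmask in parallel over `steps` rounds."""
    seq = [MASK] * len(vocal)
    for step in range(steps):
        proposals = toy_denoiser(vocal, seq, rng)
        if not proposals:
            break
        # Commit an equal share of the still-masked positions each round
        # (ceiling division, so the final round unmasks everything left).
        k = -(-len(proposals) // (steps - step))
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            seq[i] = proposals[i][0]
    return seq
```

In a real system the denoiser would be a bidirectional model conditioned on the vocal track, and the confidence would come from its predicted token probabilities; the sketch only shows the control flow of masking and iterative parallel unmasking.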

Code & Models

Repositories

Duoluoluos/TME-LaDA-Band (GitHub)
