Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators

Zachary Novack; Stephen Brade; Haven Kim; Hugo Flores García; Nithya Shikarpur; Chinmay Talegaonkar; Suwan Kim; Valerie K. Chen; Julian McAuley; Taylor Berg-Kirkpatrick; Cheng-Zhi Anna Huang

arXiv:2605.22717·cs.SD·May 22, 2026

TL;DR

This paper introduces Live Music Diffusion Models (LMDMs), an efficient approach to real-time, interactive music generation on consumer hardware. LMDMs outperform discrete-AR models in inference efficiency and enable creative applications such as live performance.

Contribution

The authors propose LMDMs with block-wise KV Caching and ARC-Forcing, improving inference efficiency and enabling stable post-training alignment for interactive diffusion music generation.

Findings

01. LMDMs outperform discrete-AR models in inference efficiency.

02. LMDMs enable stable post-training alignment without RL.

03. LMDMs support diverse creative applications including live performance.

Abstract

Interactive streaming music generation promises the use of generative models for live performance and co-creation that is impossible with offline models. However, SOTA models exist in the discrete-AR regime, requiring industrial levels of compute for both training and inference. In this work, we investigate whether audio diffusion models, with their wide support in the open-source community but non-streaming bidirectional nature, can be repurposed efficiently into interactive models accessible on consumer hardware. By taking a critical look at the modern pipeline for block-wise outpainting diffusion, we identify critical inefficiencies during inference that result in strictly worse computational efficiency than their discrete-AR counterparts. We propose Live Music Diffusion Models (LMDMs), a simple modification of the generative diffusion process that recovers, and then outperforms, the…

Code & Models

Repositories

zacharynovack/live-music-diffusion-models (GitHub)
