Improving Music Source Separation with Diffusion and Consistency Refinement

Tornike Karchkhadze, Mohammad Rasool Izadi, Shuo Zhang, and Shlomo Dubnov

arXiv:2412.06965·cs.SD·April 28, 2026

TL;DR

This paper adds a diffusion-based refinement stage on top of a deterministic music source separator to improve separation quality, then applies consistency distillation to cut inference from many denoising steps to a single step. The refinement works across different separator architectures.

Contribution

The paper combines diffusion refinement with consistency distillation to improve both the quality and the inference cost of source separation, and shows that the refinement generalizes across backbone architectures.

Findings

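01

Diffusion refinement on top of a deterministic separator yields measurable quality gains.

02

Consistency distillation reduces inference to a single step while maintaining quality; with two or more steps, the distilled model surpasses the diffusion-based refinement.

03

The method achieves state-of-the-art results on Slakh2100 (with a custom U-Net-based separator) and on MUSDB18 (with BS-RoFormer).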

Abstract

In this work, we propose an approach to music source separation that uses a generative diffusion model as a last-stage refinement on top of a deterministic separator, progressively enhancing the separated sources through iterative denoising. While the diffusion refinement yields measurable quality gains, it requires iterative steps at inference, increasing computational cost. To speed up the inference process, we apply consistency distillation, reducing inference to a single step while maintaining quality; with two or more steps, the distilled model even surpasses the diffusion-based approach. Crucially, our method is architecture-agnostic: we demonstrate state-of-the-art results when applied to both a custom U-Net-based separator on Slakh2100 and the state-of-the-art BS-RoFormer model on MUSDB18, showing that the refinement generalizes across backbone architectures. Sound examples are…
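
The two inference modes described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the trained networks are replaced by closed-form stand-ins that assume the clean source is Gaussian around the separator's estimate, and every name here (`PRIOR_STD`, `toy_denoiser`, `toy_consistency`, `diffusion_refine`, `consistency_refine`) as well as the noise schedule is hypothetical. It only shows the mechanics: iterative denoising takes many small steps down a noise schedule, while a consistency function jumps from a noisy point to a clean estimate in one call and can optionally be re-applied after re-noising.

```python
import math
import random

# Assumed spread of the clean source around the separator estimate (toy value).
PRIOR_STD = 0.05

def toy_denoiser(x, sigma, cond):
    """Stand-in for a trained denoiser conditioned on the separator output.
    Posterior mean of x0 given x = x0 + sigma * noise, if x0 ~ N(cond, PRIOR_STD^2)."""
    w = PRIOR_STD ** 2 / (PRIOR_STD ** 2 + sigma ** 2)
    return [w * xi + (1.0 - w) * ci for xi, ci in zip(x, cond)]

def toy_consistency(x, sigma, cond):
    """Stand-in for a distilled consistency model: maps a point at noise level
    sigma directly to the sigma = 0 end of the toy denoising trajectory."""
    scale = PRIOR_STD / math.sqrt(PRIOR_STD ** 2 + sigma ** 2)
    return [ci + scale * (xi - ci) for xi, ci in zip(x, cond)]

def add_noise(x, sigma, rng):
    return [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]

def diffusion_refine(cond, sigmas, rng):
    """Iterative refinement: Euler steps along a decreasing noise schedule.
    `sigmas` must be decreasing and end at 0; one denoiser call per step."""
    x = add_noise(cond, sigmas[0], rng)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = toy_denoiser(x, sigma, cond)
        x = [xi + (sigma_next - sigma) * (xi - di) / sigma
             for xi, di in zip(x, denoised)]
    return x

def consistency_refine(cond, sigmas, rng):
    """Consistency-style sampling: one call per entry of `sigmas`.
    A single entry is single-step inference; more entries re-noise and refine again."""
    x0 = cond
    for sigma in sigmas:
        x0 = toy_consistency(add_noise(x0, sigma, rng), sigma, cond)
    return x0

rng = random.Random(0)
estimate = [math.sin(0.1 * n) for n in range(200)]   # stand-in separator output
many_steps = diffusion_refine(estimate, [1.0 - k / 50 for k in range(51)], rng)
one_step = consistency_refine(estimate, [1.0], rng)
two_steps = consistency_refine(estimate, [1.0, 0.3], rng)
```

In this toy setting the iterative sampler makes 50 denoiser calls, while the consistency variants make one or two. Because the stand-ins have no knowledge of real audio, the outputs simply stay close to the conditioning estimate; the quality gains reported in the paper come from trained models.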

Code & Models

Repositories

https://consistency-separation.github.io