Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation

Alexander Fichtinger; Jan Schlüter; Gerhard Widmer

arXiv:2507.04864·cs.SD·July 8, 2025

TL;DR

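This paper adapts Boomerang sampling, a technique for generating output close to an existing example with any pretrained diffusion model, from images to audio. Implemented for Stable Audio Open, it is used to augment training data for a beat tracker, which helps only when training data is limited, and to replace instruments via text prompts, which works on monophonic inputs.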

Contribution

It applies Boomerang sampling, previously proposed for the image domain, to audio diffusion models by implementing it for Stable Audio Open, and evaluates it on data augmentation for beat tracking and on instrument replacement.

Findings

01

Improves beat tracker performance, but only when training data is limited

02

Mostly preserves the rhythmic structure of the input audio

03

Enables text-based instrument replacement on monophonic inputs

Abstract

Generative models of music audio are typically used to generate output based solely on a text prompt or melody. Boomerang sampling, recently proposed for the image domain, allows generating output close to an existing example, using any pretrained diffusion model. In this work, we explore its application in the audio domain as a tool for data augmentation or content manipulation. Specifically, implementing Boomerang sampling for Stable Audio Open, we augment training data for a state-of-the-art beat tracker, and attempt to replace musical instruments in recordings. Our results show that the rhythmic structure of existing examples is mostly preserved, that it improves performance of the beat tracker, but only in scenarios of limited training data, and that it can accomplish text-based instrument replacement on monophonic inputs. We publish our implementation to invite experiments on data…
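
The core idea of Boomerang sampling can be illustrated with a minimal sketch: partially noise an existing example up to an intermediate step, then run the reverse diffusion process from there, so the result stays near the input. This is not the authors' implementation. The linear noise schedule, the deterministic DDIM-style reverse step, the toy `gaussian_prior_denoiser` (a stand-in for a pretrained network such as Stable Audio Open), and all function names are assumptions made for illustration only.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal level alpha_bar[t] for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def gaussian_prior_denoiser(alpha_bar):
    """Toy stand-in for a pretrained network (hypothetical).

    If clean data is standard normal, the exact noise predictor is
    E[eps | x_t] = sqrt(1 - alpha_bar[t]) * x_t.
    """
    def predict_noise(x_t, t):
        return np.sqrt(1.0 - alpha_bar[t]) * x_t
    return predict_noise


def forward_noise(x0, t, alpha_bar, rng):
    """Jump directly from the clean example x0 to noise level t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps


def reverse_ddim(x_t, t_start, alpha_bar, predict_noise, num_steps=20):
    """Deterministic DDIM-style denoising from t_start down to step 0."""
    ts = np.linspace(t_start, 0, num_steps + 1).round().astype(int)
    x = x_t
    for t, t_prev in zip(ts[:-1], ts[1:]):
        eps_hat = predict_noise(x, t)
        # Estimate the clean signal, then re-noise it to the previous level.
        x0_hat = (x - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])
        x = np.sqrt(alpha_bar[t_prev]) * x0_hat + np.sqrt(1.0 - alpha_bar[t_prev]) * eps_hat
    return x


def boomerang(x0, t_boom, alpha_bar, predict_noise, rng, num_steps=20):
    """Partially noise x0 up to t_boom, then denoise back to a nearby sample."""
    x_t = forward_noise(x0, t_boom, alpha_bar, rng)
    return reverse_ddim(x_t, t_boom, alpha_bar, predict_noise, num_steps)


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
denoiser = gaussian_prior_denoiser(alpha_bar)
x0 = rng.standard_normal(1024)  # placeholder for an encoded audio example

near = boomerang(x0, 50, alpha_bar, denoiser, rng)   # little noise: stays close to x0
far = boomerang(x0, 900, alpha_bar, denoiser, rng)   # heavy noise: drifts far from x0
```

The intermediate step `t_boom` controls how far the output may move from the input: a small value yields a light variation, while a value near the end of the schedule approaches unconditional generation. In a real setting the toy denoiser would be replaced by the pretrained model's noise prediction, optionally conditioned on a text prompt.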
