AutoMashup: Automatic Music Mashups Creation

Marine Delabaere (IMT Atlantique); Léa Miqueu (IMT Atlantique); Michael Moreno (IMT Atlantique); Gautier Bigois (IMT Atlantique); Hoang Duong (IMT Atlantique); Ella Fernandez (IMT Atlantique); Flavie Manent (IMT Atlantique); Maria Salgado-Herrera (IMT Atlantique); Bastien Pasdeloup (Lab_STICC_BRAIn; IMT Atlantique - MEE; IMT Atlantique); Nicolas Farrugia (Lab_STICC_BRAIn; IMT Atlantique - MEE; IMT Atlantique); Axel Marmoret (Lab_STICC_BRAIn; IMT Atlantique - MEE; IMT Atlantique)

arXiv:2508.06516 · cs.SD · August 12, 2025

Open Access

TL;DR

AutoMashup is a system that automates music mashup creation by leveraging source separation, music analysis, and compatibility estimation, revealing limitations of current audio embeddings in capturing perceptual coherence.

Contribution

The paper introduces AutoMashup and evaluates the effectiveness of pretrained audio models for compatibility estimation in mashup creation, highlighting their limitations.

Findings

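01 Mashup compatibility is asymmetric: it depends on the role assigned to each track (vocals or accompaniment).

02 Current embeddings (CLAP and MERT) fail to reproduce the perceptual coherence measured by COCOLA.

03 General-purpose audio representations are of limited use for compatibility estimation in mashup creation.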

Abstract

We introduce AutoMashup, a system for automatic mashup creation based on source separation, music analysis, and compatibility estimation. We propose using COCOLA to assess compatibility between separated stems and investigate whether general-purpose pretrained audio models (CLAP and MERT) can support zero-shot estimation of track pair compatibility. Our results show that mashup compatibility is asymmetric, in that it depends on the role assigned to each track (vocals or accompaniment), and that current embeddings fail to reproduce the perceptual coherence measured by COCOLA. These findings underline the limitations of general-purpose audio representations for compatibility estimation in mashup creation.
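The asymmetry mentioned in the abstract can be made concrete with a small sketch. Two tracks give two possible mashups (vocals from the first over the accompaniment of the second, or the reverse), so a compatibility score is naturally defined on ordered pairs. The code below is an illustration only, not the paper's evaluation protocol. The embeddings are random placeholder vectors, not CLAP or MERT outputs, and `mashup_candidates` and `cosine_similarity` are hypothetical helper names. It shows one reason a naive zero-shot score could miss role effects: cosine similarity between one embedding per track is symmetric by construction, so it assigns the same value to both role assignments.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def mashup_candidates(track_ids):
    """All ordered (vocals_from, accompaniment_from) pairs of distinct tracks."""
    return [(v, a) for v in track_ids for a in track_ids if v != a]


# Placeholder vectors standing in for one embedding per track.
# ASSUMPTION: these are random numbers, not real CLAP or MERT embeddings.
rng = np.random.default_rng(0)
embeddings = {name: rng.normal(size=8) for name in ("track_a", "track_b", "track_c")}

pairs = mashup_candidates(list(embeddings))

# Naive zero-shot score: similarity between the two track-level embeddings.
scores = {(v, a): cosine_similarity(embeddings[v], embeddings[a]) for v, a in pairs}

# Swapping the roles never changes this score, so it cannot express asymmetry.
symmetric = all(np.isclose(scores[(v, a)], scores[(a, v)]) for v, a in pairs)
print(len(pairs), "ordered pairs; score symmetric under role swap:", symmetric)
```

A role-aware score would instead have to take the vocal stem of one track and the accompaniment stem of the other as separate inputs, which is the kind of stem-level comparison the abstract attributes to COCOLA.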

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies