Exploring single-song autoencoding schemes for audio-based music structure analysis

Axel Marmoret; Jérémy E. Cohen; Frédéric Bimbot

arXiv:2110.14437·cs.SD·March 9, 2022

TL;DR

This paper introduces a song-specific autoencoding approach for music structure analysis. The autoencoder is trained on a single song without any annotations, and on the RWC-Pop dataset it reaches the performance level of supervised state-of-the-art methods at a 3-second tolerance.

Contribution

It proposes an unsupervised, piece-specific autoencoding scheme: a low-dimensional autoencoder is trained on a single song, and the resulting latent representation is used to infer that song's structure, with no annotations required.

Findings

01

Reaches the performance level of supervised state-of-the-art methods at a 3-second tolerance on the RWC-Pop dataset, using a Log Mel spectrogram representation.

02

Does not rely on supervision or annotations, which are tedious to collect and often ambiguous in Music Structure Analysis.

03

Performs comparably to supervised methods in music structure analysis.
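
The "3 seconds tolerance" mentioned above presumably refers to the usual boundary-retrieval evaluation in music structure analysis, where an estimated section boundary counts as correct if it falls within a fixed time window of an annotated one. The page does not spell out the evaluation protocol, so the following is only a minimal sketch under that assumption. The function name `boundary_f_measure`, the greedy matching, and the boundary times are all illustrative and not taken from the paper.

```python
def boundary_f_measure(estimated, reference, tolerance=3.0):
    """Precision, recall and F-measure of boundary times (in seconds).

    Illustrative helper: each estimated boundary may match at most one
    reference boundary, and a match requires |est - ref| <= tolerance.
    """
    est = sorted(estimated)
    ref = sorted(reference)
    i = j = hits = 0
    # Two-pointer sweep over both sorted lists.
    while i < len(est) and j < len(ref):
        if abs(est[i] - ref[j]) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif est[i] < ref[j]:
            i += 1  # this estimate is too early to match anything left
        else:
            j += 1  # this reference boundary was missed
    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up boundary times, for illustration only.
reference = [0.0, 15.2, 31.0, 62.5]
estimated = [0.4, 17.9, 45.0, 61.0]

loose = boundary_f_measure(estimated, reference, tolerance=3.0)
strict = boundary_f_measure(estimated, reference, tolerance=0.5)
print(loose, strict)
```

With these made-up times, three of the four estimates fall within 3 seconds of an annotation, but only one falls within 0.5 seconds, which shows how strongly the reported score depends on the chosen tolerance.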

Abstract

The ability of deep neural networks to learn complex data relations and representations is established nowadays, but it generally relies on large sets of training data. This work explores a "piece-specific" autoencoding scheme, in which a low-dimensional autoencoder is trained to learn a latent/compressed representation specific to a given song, which can then be used to infer the song structure. Such a model does not rely on supervision nor annotations, which are well-known to be tedious to collect and often ambiguous in Music Structure Analysis. We report that the proposed unsupervised auto-encoding scheme achieves the level of performance of supervised state-of-the-art methods with 3 seconds tolerance when using a Log Mel spectrogram representation on the RWC-Pop dataset.
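
To make the "piece-specific" idea concrete, the sketch below trains a tiny autoencoder on the frames of one song only, then compares the latent codes with each other. It is a toy illustration and not the authors' implementation: the abstract does not describe the architecture, so the single dense tanh layer, the latent size, the learning rate, and the function names are all assumptions, and random data with a repeating A/B pattern stands in for bar-wise Log Mel spectrogram features. Turning the similarity matrix into section boundaries is left out.

```python
import numpy as np


def train_single_song_autoencoder(song, latent_dim=8, epochs=300, lr=0.1, seed=0):
    """Fit a one-hidden-layer autoencoder to a single song (rows = bars).

    Hypothetical toy model: z = tanh(x W_enc + b_enc), x_hat = z W_dec + b_dec,
    trained by full-batch gradient descent on the mean squared error.
    """
    rng = np.random.default_rng(seed)
    n_features = song.shape[1]
    W_enc = rng.normal(0.0, 0.1, (n_features, latent_dim))
    b_enc = np.zeros(latent_dim)
    W_dec = rng.normal(0.0, 0.1, (latent_dim, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        z = np.tanh(song @ W_enc + b_enc)          # latent codes
        x_hat = z @ W_dec + b_dec                  # reconstruction
        losses.append(float(np.mean((x_hat - song) ** 2)))
        grad_out = 2.0 * (x_hat - song) / song.size
        grad_pre = (grad_out @ W_dec.T) * (1.0 - z ** 2)  # back through tanh
        W_dec -= lr * (z.T @ grad_out)
        b_dec -= lr * grad_out.sum(axis=0)
        W_enc -= lr * (song.T @ grad_pre)
        b_enc -= lr * grad_pre.sum(axis=0)
    latents = np.tanh(song @ W_enc + b_enc)
    return latents, losses


def cosine_self_similarity(latents, eps=1e-12):
    """Cosine similarity between every pair of latent codes."""
    norms = np.linalg.norm(latents, axis=1, keepdims=True)
    unit = latents / (norms + eps)
    return unit @ unit.T


# Synthetic stand-in for one song: 40 bars following an A A B B pattern.
rng = np.random.default_rng(42)
prototypes = rng.random((2, 64))                   # one vector per section type
labels = np.array([0] * 10 + [1] * 10 + [0] * 10 + [1] * 10)
song = prototypes[labels] + 0.05 * rng.normal(size=(40, 64))

latents, losses = train_single_song_autoencoder(song)
similarity = cosine_self_similarity(latents)
print(losses[0], losses[-1], similarity.shape)
```

The point of the sketch is that the network never sees another song or any label: all training data comes from the piece being analysed, and the structure is read off from how the latent codes relate to one another.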

Code & Models

Repositories

https://gitlab.inria.fr/amarmore/musicae
Official

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing