SSM-Net: feature learning for Music Structure Analysis using a Self-Similarity-Matrix based loss

Geoffroy Peeters; Florian Angulo

arXiv:2211.08141·cs.SD·November 16, 2022

Open Access

TL;DR

This paper introduces SSM-Net, a deep learning approach that learns audio features for Music Structure Analysis by training an encoder so that the self-similarity matrix of its learned features approximates a ground-truth matrix. The approach is evaluated with AUC on the RWC-Pop dataset.

Contribution

The paper presents a training paradigm for audio feature learning in which a deep encoder is trained by minimizing a differentiable loss between the SSM of its output features and a ground-truth SSM.
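An SSM-based loss of this kind can be illustrated with a minimal sketch. It assumes cosine similarity rescaled to [0, 1] for the predicted SSM, a binary ground-truth SSM built from per-frame segment labels, and a mean squared error between the two. These choices, and the function names `cosine_ssm`, `ground_truth_ssm`, and `ssm_loss`, are illustrative assumptions and are not taken from the paper.

```python
import math

def cosine_ssm(features):
    """Self-similarity matrix with entries in [0, 1] (assumed: rescaled cosine similarity)."""
    # Guard against zero vectors by falling back to a norm of 1.0.
    norms = [math.sqrt(sum(x * x for x in f)) or 1.0 for f in features]
    n = len(features)
    ssm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            cos = dot / (norms[i] * norms[j])
            ssm[i][j] = 0.5 * (1.0 + cos)  # map [-1, 1] to [0, 1]
    return ssm

def ground_truth_ssm(labels):
    """Binary SSM: 1 where two frames share a segment label, 0 otherwise."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def ssm_loss(features, labels):
    """Mean squared error between predicted and ground-truth SSMs (assumed loss)."""
    pred = cosine_ssm(features)
    target = ground_truth_ssm(labels)
    n = len(labels)
    total = sum((pred[i][j] - target[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)

# Hypothetical toy data: four frames, two segments.
labels = ["A", "A", "B", "B"]
separated = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
collapsed = [[1.0, 0.0]] * 4
print(ssm_loss(separated, labels))  # features match the structure
print(ssm_loss(collapsed, labels))  # all frames look alike
```

Every operation here (dot products, norms, squared differences) is differentiable with respect to the features, so in a training setup the same computation would typically be written in an automatic differentiation framework to let gradients reach the encoder. Pure Python is used here only to show the computation.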

Findings

01

Achieved high AUC scores on the RWC-Pop dataset

02

Demonstrated the effectiveness of SSM-based loss in feature learning

03

Showed improved music structure analysis performance

Abstract

In this paper, we propose a new paradigm to learn audio features for Music Structure Analysis (MSA). We train a deep encoder to learn features such that the Self-Similarity-Matrix (SSM) computed from them approximates a ground-truth SSM. This is done by minimizing a loss between the two SSMs. Since this loss is differentiable with respect to its input features, we can train the encoder in a straightforward way. We demonstrate this training paradigm on the RWC-Pop dataset, evaluating with the Area Under the ROC Curve (AUC).
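As an illustration of AUC-based evaluation of an SSM, the sketch below treats each off-diagonal frame pair as a binary classification example: the predicted similarity is the score and the ground-truth SSM entry is the label. The abstract does not say how the AUC is computed, so this pairing scheme and the helper names `roc_auc` and `ssm_auc` are assumptions.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative entry")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ssm_auc(pred_ssm, target_ssm):
    """AUC over the upper triangle of the SSM, excluding the trivial diagonal."""
    n = len(target_ssm)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    scores = [pred_ssm[i][j] for i, j in pairs]
    labels = [int(target_ssm[i][j]) for i, j in pairs]
    return roc_auc(scores, labels)

# Hypothetical 4-frame example with two segments (frames 0-1 and frames 2-3).
target = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
pred = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.8],
    [0.2, 0.3, 1.0, 0.7],
    [0.1, 0.8, 0.7, 1.0],
]
print(ssm_auc(pred, target))  # → 0.875
```

The pairwise formulation is quadratic in the number of entries, which is fine for a small example. For full-length tracks a rank-based implementation would be the more practical choice.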

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing