Self-Similarity-Based and Novelty-based loss for music structure analysis
Geoffroy Peeters

TL;DR
This paper introduces a supervised method for music boundary detection that jointly learns features and convolution kernels by combining a self-similarity (SSM) loss with a novelty loss, and uses self-attention to learn relative features.
Contribution
The approach jointly optimizes self-similarity and novelty losses, incorporating self-attention for enhanced feature learning in music boundary detection.
Findings
Outperforms previous methods on RWC-Pop and SALAMI datasets.
Demonstrates the effectiveness of combined SSM and novelty losses.
Shows benefits of self-attention in music structure analysis.
Abstract
Music Structure Analysis (MSA) is the task of identifying the musical segments that compose a music track and, possibly, labeling them based on their similarity. In this paper, we propose a supervised approach to music boundary detection. In our approach, we simultaneously learn features and convolution kernels. To do so, we jointly optimize two losses: (1) a loss based on the Self-Similarity Matrix (SSM) obtained with the learned features, denoted SSM-loss, and (2) a loss based on the novelty score obtained by applying the learned kernels to the estimated SSM, denoted novelty-loss. We also demonstrate that relative feature learning, through self-attention, is beneficial for MSA. Finally, we compare the performance of our approach to that of previously proposed approaches on the standard RWC-Pop dataset and on various subsets of SALAMI.
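To make the two quantities named in the abstract concrete, here is a minimal NumPy sketch of how an SSM and a novelty score can be computed from a feature sequence. It is an illustration under stated assumptions, not the paper's implementation: the features are toy vectors rather than learned ones, the similarity is assumed to be cosine, and a fixed checkerboard kernel stands in for the learned convolution kernels. The function names are hypothetical.

```python
import numpy as np


def self_similarity_matrix(features):
    """Cosine self-similarity of a (T, D) feature sequence -> (T, T) matrix.

    Assumption: cosine similarity; the paper's similarity measure may differ.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # avoid division by zero
    return unit @ unit.T


def checkerboard_kernel(half_width):
    """Fixed (2L, 2L) checkerboard kernel: +1 within past/future blocks, -1 across.

    Assumption: stands in for the kernels that the paper learns.
    """
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)


def novelty_curve(ssm, kernel):
    """Correlate the kernel with the SSM along its main diagonal."""
    half = kernel.shape[0] // 2
    padded = np.pad(ssm, half, mode="constant")  # zero-pad the borders
    n = ssm.shape[0]
    novelty = np.empty(n)
    for t in range(n):
        # Window covers original frames t - half .. t + half - 1 on both axes.
        window = padded[t:t + 2 * half, t:t + 2 * half]
        novelty[t] = np.sum(window * kernel)
    return novelty


# Toy track: 20 frames of one "section" followed by 20 frames of another.
features = np.vstack([np.tile([1.0, 0.0], (20, 1)),
                      np.tile([0.0, 1.0], (20, 1))])
ssm = self_similarity_matrix(features)
novelty = novelty_curve(ssm, checkerboard_kernel(4))
boundary = int(np.argmax(novelty))  # frame with the strongest novelty peak
```

On this toy input the novelty curve peaks at frame 20, where the second section starts. In the approach described in the abstract, both the features feeding the SSM and the kernels producing the novelty score are trained, so each stage would receive its own loss rather than being fixed as here.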
Taxonomy
Topics: Music and Audio Processing · Diverse Musicological Studies · Music Technology and Sound Studies
Methods: Convolution
