STONE: Self-supervised Tonality Estimator

Yuexuan Kong; Vincent Lostanlen; Gabriel Meseguer-Brocal; Stella Wong; Mathieu Lagrange; Romain Hennequin

arXiv:2407.07408 · cs.SD · April 2, 2025

TL;DR

STONE is a self-supervised approach to musical key estimation. A convolutional neural network is trained to regress pitch transpositions between unlabeled excerpts, which reduces annotation effort; its semi-supervised variant matches supervised accuracy with less supervision.

Contribution

STONE is the first self-supervised method for tonality estimation. It uses pitch transposition regression as a pretext task to learn key signatures without extensive labeled data.

Findings

01

Semi-TONE matches supervised accuracy with less supervision

02

Self-supervised training correlates KSP with tonal key signature

03

Method outperforms supervised models with equal supervision

Abstract

Although deep neural networks can estimate the key of a musical piece, their supervision incurs a massive annotation effort. Against this shortcoming, we present STONE, the first self-supervised tonality estimator. The architecture behind STONE, named ChromaNet, is a convnet with octave equivalence which outputs a key signature profile (KSP) of 12 structured logits. First, we train ChromaNet to regress artificial pitch transpositions between any two unlabeled musical excerpts from the same audio track, as measured by the cross-power spectral density (CPSD) within the circle of fifths (CoF). We observe that this self-supervised pretext task leads KSP to correlate with tonal key signature. Based on this observation, we extend STONE to output a structured KSP of 24 logits, and introduce supervision so as to disambiguate major versus minor keys sharing the same key signature. Applying different…
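The pretext task in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the softmax-normalized 12-bin input, and the half squared-distance loss are all assumptions made for illustration. The idea shown is that a transposition by one semitone corresponds to 7 steps along the circle of fifths, so the Fourier coefficient at frequency 7/12 of the circular cross-correlation between two KSPs (their CPSD at that frequency) encodes the transposition as a phase.

```python
import numpy as np

N_PITCH_CLASSES = 12
OMEGA = 7 / 12  # one semitone = 7 steps on the circle of fifths (7 * 7 = 49 ≡ 1 mod 12)


def cof_coefficient(ksp):
    """Fourier coefficient of a 12-bin profile at the circle-of-fifths frequency."""
    q = np.arange(N_PITCH_CLASSES)
    return np.sum(ksp * np.exp(-2j * np.pi * OMEGA * q))


def cpsd_on_cof(ksp_a, ksp_b):
    """CPSD at frequency OMEGA.

    By the cross-correlation theorem, the DFT of the circular
    cross-correlation of two real sequences is conj(F_a) * F_b.
    """
    return np.conj(cof_coefficient(ksp_a)) * cof_coefficient(ksp_b)


def transposition_loss(ksp_a, ksp_b, shift_semitones):
    """Hypothetical loss: half squared distance between the CPSD and the
    unit phasor expected for a transposition of `shift_semitones`."""
    target = np.exp(-2j * np.pi * OMEGA * shift_semitones)
    return 0.5 * np.abs(cpsd_on_cof(ksp_a, ksp_b) - target) ** 2


# Toy check: a one-hot profile and its copy transposed by 3 semitones.
ksp_a = np.zeros(N_PITCH_CLASSES)
ksp_a[2] = 1.0
ksp_b = np.roll(ksp_a, 3)

loss_correct = transposition_loss(ksp_a, ksp_b, 3)  # matches the true shift
loss_wrong = transposition_loss(ksp_a, ksp_b, 4)    # mismatched shift
```

In this toy setting the loss vanishes only when the declared shift matches the actual transposition, while a flat (uniform) profile has a zero coefficient and cannot reach zero loss. This suggests why minimizing such an objective would push the 12 outputs toward a peaked, transposition-equivariant profile.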

Code & Models

Repositories

deezer/stone (PyTorch, official)
