PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation

Francesco Foscarin (CNAM); Nicolas Audebert; Raphaël Fournier-S'Niehotta

arXiv:2107.14009·cs.SD·July 30, 2021·1 cites

TL;DR

PKSpell is a data-driven deep learning system that jointly estimates pitch spelling and key signatures from MIDI files. Both are needed to produce a full musical score, and both support music information retrieval tasks such as harmonic analysis and melodic similarity.

Contribution

It introduces a deep recurrent neural network for joint pitch spelling and key signature estimation from MIDI data, a data augmentation procedure that helps retraining on small datasets, and a new state of the art in pitch spelling on MuseData.

Findings

01 Achieves high accuracy in key signature estimation.

02 Sets new state-of-the-art in pitch spelling on MuseData.

03 Effective with limited datasets due to data augmentation.

Abstract

We present PKSpell: a data-driven approach for the joint estimation of pitch spelling and key signatures from MIDI files. Both elements are fundamental for the production of a full-fledged musical score and facilitate many MIR tasks such as harmonic analysis, section identification, melodic similarity, and search in a digital music library. We design a deep recurrent neural network model that only requires information readily available in all kinds of MIDI files, including performances, or other symbolic encodings. We release a model trained on the ASAP dataset. Our system can be used with these pre-trained parameters and is easy to integrate into a MIR pipeline. We also propose a data augmentation procedure that helps retraining on small datasets. PKSpell achieves strong key signature estimation performance on a challenging dataset. Most importantly, this model establishes a new…
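To make the pitch spelling problem concrete: a MIDI pitch number identifies only a key on the keyboard, so several note names can correspond to it, and a speller has to choose among them from context. The following is a minimal sketch of that ambiguity only. It is not the PKSpell model, and the names `NATURAL_PITCH_CLASSES`, `ACCIDENTALS`, and `candidate_spellings` are hypothetical, introduced here for illustration.

```python
# Illustration of the ambiguity that pitch spelling resolves.
# This is NOT the PKSpell model; all names here are hypothetical.

# Pitch class (semitones above C) of each natural note name.
NATURAL_PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Semitone shift applied by each accidental, up to double flats and sharps.
ACCIDENTALS = {"bb": -2, "b": -1, "": 0, "#": 1, "##": 2}


def candidate_spellings(midi_pitch):
    """Return every note name (without octave) that sounds as midi_pitch."""
    pitch_class = midi_pitch % 12
    names = []
    for letter, base in NATURAL_PITCH_CLASSES.items():
        for accidental, shift in ACCIDENTALS.items():
            if (base + shift) % 12 == pitch_class:
                names.append(letter + accidental)
    return names


if __name__ == "__main__":
    # MIDI 61 is the black key just above middle C.
    print(candidate_spellings(61))
```

For MIDI pitch 61 the candidates are C#, Db, and B##. Which one is correct depends on the musical context, including the key signature, which is why estimating the two jointly is a natural formulation.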

Code & Models

Repositories

fosfrancesco/pkspell (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception