Note-Level Singing Melody Transcription for Time-Aligned Musical Score Generation

Leekyung Kim; Sungwook Jeon; Wan Heo; Jonghun Park

arXiv:2502.12438·cs.SD·February 19, 2025

TL;DR

This paper presents an end-to-end framework for note-level singing melody transcription that recognizes pitch, onset, offset, and note value, enabling accurate generation of time-aligned musical scores from audio recordings.

Contribution

It introduces a novel integrated model with tokenized representations and pseudo-labeling for note value extraction, improving transcription accuracy over existing methods.

Findings

01

Outperforms state-of-the-art in note-level transcription accuracy.

02

Introduces new metrics for evaluating temporal and note value accuracy.

03

Qualitative analysis confirms effective note value capture.

Abstract

Automatic music transcription converts audio recordings into symbolic representations, facilitating music analysis, retrieval, and generation. In the audio domain, a musical note is characterized by pitch, onset, and offset, whereas in the musical score domain it is defined by pitch and note value. A time-aligned score, derived from timing information together with pitch and note value, allows a part of the score to be matched with the corresponding part of the music audio, enabling various applications. In this paper, we extend the traditional note-level transcription task, which recognizes onset, offset, and pitch, to also extract note value, so that a time-aligned score can be generated from an audio input. To address this new challenge, we propose an end-to-end framework that integrates recognition of the note value, pitch, and temporal information.…
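To make the two note representations in the abstract concrete, the following is a minimal Python sketch of an audio-domain note (pitch, onset, offset) and a time-aligned score note that additionally carries a note value. It is an illustration only and not the paper's method: the paper proposes a learned end-to-end model, whereas this sketch assumes a known, fixed tempo and simply rounds each duration to the nearest entry in a hypothetical note-value vocabulary. The names `AudioNote`, `TimeAlignedNote`, `NOTE_VALUES`, `nearest_note_value`, and `to_time_aligned` are all invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioNote:
    """A note as described in the audio domain."""
    pitch: int      # MIDI pitch number (assumed encoding)
    onset: float    # start time in seconds
    offset: float   # end time in seconds


@dataclass
class TimeAlignedNote:
    """A score note that keeps its timing in the audio."""
    pitch: int
    onset: float
    offset: float
    note_value: float  # length in quarter-note beats (1.0 = quarter note)


# Hypothetical vocabulary of note values, in quarter-note beats.
NOTE_VALUES = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0, 4.0]


def nearest_note_value(note: AudioNote, bpm: float) -> float:
    """Round the note's duration to the closest vocabulary entry.

    Assumes a constant tempo `bpm` in quarter notes per minute.
    """
    beats = (note.offset - note.onset) * bpm / 60.0
    return min(NOTE_VALUES, key=lambda v: abs(v - beats))


def to_time_aligned(notes: List[AudioNote], bpm: float) -> List[TimeAlignedNote]:
    """Attach a note value to each audio-domain note."""
    return [
        TimeAlignedNote(n.pitch, n.onset, n.offset, nearest_note_value(n, bpm))
        for n in notes
    ]


if __name__ == "__main__":
    melody = [
        AudioNote(pitch=60, onset=0.00, offset=0.50),
        AudioNote(pitch=62, onset=0.50, offset=0.74),
        AudioNote(pitch=64, onset=1.00, offset=2.00),
    ]
    for note in to_time_aligned(melody, bpm=120.0):
        print(note)
```

Because each output note keeps its onset and offset alongside its note value, any position in the resulting score can be mapped back to a time span in the recording, which is the property the abstract refers to as time alignment. Real singing rarely follows a fixed tempo, which is one reason a naive rounding rule like this one would not be expected to match a learned model.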

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Human Motion and Animation