Cluster and Separate: a GNN Approach to Voice and Staff Prediction for Score Engraving

Francesco Foscarin; Emmanouil Karystinaios; Eita Nakamura; Gerhard Widmer

arXiv:2407.21030·eess.AS·August 1, 2024

TL;DR

This paper introduces a graph neural network-based method for separating notes into voices and staves in symbolic piano music, improving score readability and supporting visualization and export functionalities.

Contribution

It presents a novel end-to-end GNN approach for voice and staff separation in piano scores, addressing challenging cross-staff and homophonic voice tasks.

Findings

1. Outperforms a previous approach on two datasets of different styles
2. Supports visualization and export of the separated voices in symbolic music formats
3. Handles homophonic and cross-staff voices

Abstract

This paper approaches the problem of separating the notes from a quantized symbolic music piece (e.g., a MIDI file) into multiple voices and staves. This is a fundamental part of the larger task of music score engraving (or score typesetting), which aims to produce readable musical scores for human performers. We focus on piano music and support homophonic voices, i.e., voices that can contain chords, and cross-staff voices, which are notably difficult tasks that have often been overlooked in previous research. We propose an end-to-end system based on graph neural networks that clusters notes that belong to the same chord and connects them with edges if they are part of a voice. Our results show clear and consistent improvements over a previous approach on two datasets of different styles. To aid the qualitative analysis of our results, we support the export in symbolic music formats…
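
The abstract describes the system's output as chord clusters plus voice edges between notes. As a rough illustration of how such predictions could be turned into voice groups, the sketch below merges notes with a union-find structure. It is a hypothetical post-processing step written for this summary, not the authors' implementation: the function name `group_voices`, the integer note indices, and the toy inputs are all assumptions, and the GNN that would produce the clusters and edges is not shown.

```python
from collections import defaultdict


def group_voices(num_notes, chord_clusters, voice_edges):
    """Group note indices into voices (hypothetical helper, not from the paper).

    chord_clusters: lists of note indices assumed to form one chord.
    voice_edges: (src, dst) pairs assumed to link consecutive notes of a voice.
    """
    parent = list(range(num_notes))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller index as root so results are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Notes in the same chord always share a voice.
    for cluster in chord_clusters:
        for note in cluster[1:]:
            union(cluster[0], note)

    # Notes joined by a voice edge share a voice.
    for src, dst in voice_edges:
        union(src, dst)

    voices = defaultdict(list)
    for note in range(num_notes):
        voices[find(note)].append(note)
    return sorted(voices.values())


# Toy input: six notes, two chords, two voice edges (all values made up).
chord_clusters = [[0, 1], [3, 4]]
voice_edges = [(0, 3), (2, 5)]
voices = group_voices(6, chord_clusters, voice_edges)
print(voices)
```

In this toy input the two chords are linked by the edge (0, 3), so their four notes end up in one homophonic voice, while notes 2 and 5 form a second, monophonic voice.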

Code & Models

Repositories

cpjku/piano_svsep (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis

Methods: Focus