Rethinking Evaluation Methodology for Audio-to-Score Alignment
John Thickstun, Jennifer Brennan, Harsh Verma

TL;DR
This paper gives a formal definition of audio-to-score alignment, introduces new evaluation metrics, and studies how the new and standard metrics behave on several classical alignment algorithms, using a dataset derived from pairs of KernScores and MAESTRO performances.
Contribution
It offers a precise formalization of audio-to-score alignment and, motivated by that formalization, proposes new evaluation metrics for alignment algorithms.
Findings
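The new metrics and the standard metrics are studied side by side on several classical alignment algorithms
The evaluation dataset is derived from pairs of KernScores and MAESTRO performances
A formal definition of alignment gives new insight into how alignment algorithms are evaluated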
Abstract
This paper offers a precise, formal definition of an audio-to-score alignment. While the concept of an alignment is intuitively grasped, this precision affords us new insight into the evaluation of audio-to-score alignment algorithms. Motivated by these insights, we introduce new evaluation metrics for audio-to-score alignment. Using an alignment evaluation dataset derived from pairs of KernScores and MAESTRO performances, we study the behavior of our new metrics and the standard metrics on several classical alignment algorithms.
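To make the idea of evaluating an alignment concrete, here is a minimal sketch, not the paper's formalization or its new metrics. It assumes an alignment can be represented as a monotone piecewise-linear map from score time to performance time, and it scores a predicted alignment by the mean absolute timing error at note onsets, which is one common style of alignment metric. The names `make_alignment` and `mean_abs_onset_error` are hypothetical, chosen for this illustration.

```python
from bisect import bisect_right


def make_alignment(score_times, perf_times):
    """Return a function mapping score time to performance time.

    Assumption for this sketch: the alignment is piecewise linear through
    the given anchor points, and score_times is strictly increasing.
    """
    if len(score_times) != len(perf_times) or len(score_times) < 2:
        raise ValueError("need at least two matching anchor points")

    def align(t):
        # Clamp to the anchored range, then interpolate within one segment.
        t = min(max(t, score_times[0]), score_times[-1])
        i = min(bisect_right(score_times, t), len(score_times) - 1)
        i = max(i, 1)
        s0, s1 = score_times[i - 1], score_times[i]
        p0, p1 = perf_times[i - 1], perf_times[i]
        return p0 + (p1 - p0) * (t - s0) / (s1 - s0)

    return align


def mean_abs_onset_error(predicted, reference, onsets):
    """Average |predicted(t) - reference(t)| over score onset times."""
    if not onsets:
        raise ValueError("need at least one onset")
    return sum(abs(predicted(t) - reference(t)) for t in onsets) / len(onsets)


# Toy data (illustrative only): a ground-truth alignment and a prediction
# that places the middle anchor half a second late.
reference = make_alignment([0.0, 1.0, 2.0], [0.0, 2.0, 4.0])
predicted = make_alignment([0.0, 1.0, 2.0], [0.0, 2.5, 4.0])
error = mean_abs_onset_error(predicted, reference, [0.0, 0.5, 1.0, 2.0])
print(error)
```

A metric of this kind only samples the alignment at note onsets, so two predictions that agree at every onset but differ in between receive the same score. Whether that matters depends on how an alignment is defined in the first place.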
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
