TL;DR
This paper introduces a perceptual measure of how well audio resynthesized from automatic music transcription (AMT) output preserves a performer's artistic interpretation when the context, such as room acoustics or instrument, changes. It shows the limits of MIDI-based evaluation and proposes a new measure that correlates with listener judgments.
Contribution
The paper distinguishes performance from interpretation, evaluates how a change of context alters the perceived interpretation, and proposes both a novel evaluation measure and a new score-informed AMT method.
Findings
MIDI format alone does not fully capture artistic intention.
Objective MIDI-based measures have low correlation with subjective perception.
The proposed measure is meaningfully correlated with listener evaluations.
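The findings above rest on correlating an objective score with listener ratings. The following is a minimal sketch of such a comparison, not the paper's procedure: the source does not say which correlation coefficient was used, so both Pearson and Spearman are shown, and the function names and the numbers in the usage lines are hypothetical placeholders rather than data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder values, NOT results from the paper:
# one objective score and one mean listener rating per excerpt.
objective_scores = [0.91, 0.85, 0.78, 0.66, 0.60]
listener_ratings = [3.1, 4.0, 2.5, 3.6, 2.2]
rho = spearman(objective_scores, listener_ratings)
r = pearson(objective_scores, listener_ratings)
```

A rank-based coefficient is often preferred for subjective ratings because it only assumes a monotonic relation between the objective score and the listener scale, not a linear one.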
Abstract
This study focuses on how music performances are perceived when contextual factors, such as room acoustics and instrument, change. We propose to distinguish the concept of "performance" from that of "interpretation", which expresses the "artistic intention". To assess this distinction, we carried out an experimental evaluation in which 91 subjects listened to audio recordings created by resynthesizing MIDI data obtained through Automatic Music Transcription (AMT) systems and a sensorized acoustic piano. During the resynthesis, we simulated different contexts and asked listeners to evaluate how much the interpretation changes when the context changes. Results show that: (1) the MIDI format alone cannot fully capture the artistic intention of a music performance; (2) usual objective evaluation measures based on MIDI data present low correlations with…
