Creating A Multi-track Classical Musical Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications
Bochen Li, Xinzhao Liu, Karthik Dinesh, Zhiyao Duan, Gaurav Sharma

TL;DR
This paper introduces a dataset of 44 multi-instrument classical music pieces, assembled from coordinated but separately recorded individual tracks and annotated at the frame and note level, to support audio-visual music analysis and both existing and novel music information retrieval (MIR) tasks.
Contribution
The paper presents a new multi-track classical music dataset with synchronized audio and video recordings, ground-truth annotations, and benchmark results, and describes how the recording methodology maintains synchronization and expressiveness across separately recorded tracks.
Findings
High synchronization quality, demonstrated by comparison with existing widely-used music audio datasets
Benchmark results for multi-pitch analysis and score-informed source separation
Baseline systems for novel multimodal tasks like visually informed pitch analysis
Abstract
We introduce a dataset for facilitating audio-visual analysis of music performances. The dataset comprises 44 simple multi-instrument classical music pieces assembled from coordinated but separately recorded performances of individual tracks. For each piece, we provide the musical score in MIDI format, the audio recordings of the individual tracks, the audio and video recording of the assembled mixture, and ground-truth annotation files including frame-level and note-level transcriptions. We describe our methodology for the creation of the dataset, particularly highlighting our approaches for addressing the challenges involved in maintaining synchronization and expressiveness. We demonstrate the high quality of synchronization achieved with our proposed approach by comparing the dataset with existing widely-used music audio datasets. We anticipate that the dataset will be useful for…
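The abstract mentions both note-level and frame-level transcriptions. As a minimal sketch of how the two views relate, the snippet below expands a list of notes into a per-frame pitch track. The `Note` fields, the 10 ms hop, and the use of 0.0 for silent frames are assumptions made for illustration; the summary above does not specify the dataset's actual annotation file format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Note:
    """Hypothetical note-level annotation (field names are assumptions)."""
    onset: float     # note start time in seconds
    duration: float  # note length in seconds
    pitch_hz: float  # fundamental frequency in Hz


def notes_to_frames(notes: List[Note], hop: float = 0.01,
                    total: Optional[float] = None) -> List[float]:
    """Expand note-level annotations into a frame-level pitch track.

    Returns one pitch value (Hz) per frame of length `hop` seconds,
    with 0.0 marking frames in which no note is sounding.
    """
    if total is None:
        # Default to the end time of the last sounding note.
        total = max((n.onset + n.duration for n in notes), default=0.0)
    n_frames = int(round(total / hop))
    frames = [0.0] * n_frames
    for n in notes:
        start = int(round(n.onset / hop))
        end = min(int(round((n.onset + n.duration) / hop)), n_frames)
        for i in range(start, end):
            frames[i] = n.pitch_hz
    return frames


# Two short notes on a single track, separated by a 10 ms gap.
track = [Note(0.00, 0.02, 440.0), Note(0.03, 0.02, 220.0)]
frame_pitches = notes_to_frames(track)
```

Because each instrument is recorded on its own track, a per-track conversion like this assumes one pitch per frame; a multi-pitch reference for the mixture would be obtained by stacking the per-track results.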
