CSyMR: Benchmarking Compositional Music Information Retrieval in Symbolic Music Reasoning
Boyang Wang, Yash Vishe, Xin Xu, Zachary Novack, Xunyi Jiang, Julian McAuley, Junda Wu

TL;DR
This paper introduces CSyMR-Bench, a new benchmark for compositional music information retrieval in symbolic music, highlighting the challenges for language models and demonstrating the effectiveness of tool-augmented reasoning methods.
Contribution
It presents a comprehensive benchmark with real-world questions and a tool-augmented framework that improves retrieval accuracy over traditional language models.
Findings
Tool-grounded approaches outperform LLM-only methods by 5-7% in accuracy.
The benchmark includes 126 multiple-choice questions drawn from real user scenarios and exams.
Analysis-heavy categories see the largest performance improvements.
Abstract
Natural language information needs over symbolic music scores rarely reduce to a single-step lookup. Many queries require compositional Music Information Retrieval (MIR) that extracts multiple pieces of evidence from structured notation and aggregates them to answer the question. This setting remains challenging for Large Language Models due to the mismatch between natural language intents and symbolic representations, as well as the difficulty of reliably handling long structured contexts. Existing benchmarks only partially capture these retrieval demands, often emphasizing isolated theoretical knowledge or simplified settings. We introduce CSyMR-Bench, a benchmark for compositional MIR in symbolic music reasoning grounded in authentic user scenarios. It contains 126 multiple-choice questions curated from community discussions and professional examinations, where each item requires…
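To make the idea of a compositional query concrete, here is a minimal sketch in Python. It is not the paper's tool framework or data format: the toy score, the tuple layout, and every function name below are assumptions made purely for illustration. It answers a hypothetical question ("among measures that contain the pitch class C, which has the widest pitch range?") by chaining three steps (retrieve, filter, aggregate) rather than a single lookup.

```python
from collections import defaultdict

# Hypothetical toy score (not from the benchmark):
# each event is (measure number, MIDI pitch, duration in quarter notes).
SCORE = [
    (1, 60, 1.0), (1, 64, 1.0), (1, 67, 2.0),
    (2, 62, 2.0), (2, 65, 2.0),
    (3, 60, 0.5), (3, 72, 0.5), (3, 76, 3.0),
]


def notes_by_measure(score):
    """Step 1 (retrieval): group note events by measure."""
    grouped = defaultdict(list)
    for measure, pitch, duration in score:
        grouped[measure].append((pitch, duration))
    return dict(grouped)


def measures_containing_pitch_class(grouped, pitch_class):
    """Step 2 (filter): keep measures containing the pitch class (0 = C)."""
    return {
        measure: notes
        for measure, notes in grouped.items()
        if any(pitch % 12 == pitch_class for pitch, _ in notes)
    }


def widest_measure(grouped):
    """Step 3 (aggregation): measure with the largest pitch range in semitones."""
    ranges = {
        measure: max(p for p, _ in notes) - min(p for p, _ in notes)
        for measure, notes in grouped.items()
    }
    best = max(ranges, key=ranges.get)
    return best, ranges[best]


def answer(score):
    """Compose the three steps into one answer: (measure, range in semitones)."""
    grouped = notes_by_measure(score)
    with_c = measures_containing_pitch_class(grouped, 0)
    return widest_measure(with_c)


if __name__ == "__main__":
    print(answer(SCORE))
```

Each step produces evidence that the next one consumes, so an error in any single step changes the final answer. This is one way to read the abstract's description of extracting multiple pieces of evidence and aggregating them.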
Taxonomy
Topics: Music and Audio Processing · Machine Learning in Materials Science · Music Technology and Sound Studies
