LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment
Huan Zhang, Vincent Cheung, Hayato Nishioka, Simon Dixon, Shinichi Furuya

TL;DR
LLaQo is a large language query-based music coach that uses audio language modeling to assess musical performances and give detailed feedback, focusing on underexplored performance aspects such as stylistic expression and technique.
Contribution
Introduces LLaQo, a new model combining audio encoding and large language models for detailed, performance-specific music assessment and feedback.
Findings
Achieved state-of-the-art performance in predicting teacher ratings.
Successfully identified piece difficulty and playing techniques.
Received higher user ratings for textual responses in a user study.
Abstract
Research in music understanding has extensively explored composition-level attributes such as key, genre, and instrumentation through advanced representations, leading to cross-modal applications using large language models. However, aspects of musical performance such as stylistic expression and technique remain underexplored, along with the potential of using large language models to enhance educational outcomes with customized feedback. To bridge this gap, we introduce LLaQo, a Large Language Query-based music coach that leverages audio language modeling to provide detailed and formative assessments of music performances. We also introduce instruction-tuned query-response datasets that cover a variety of performance dimensions from pitch accuracy to articulation, as well as contextual performance understanding (such as difficulty and performance techniques). Utilizing AudioMAE…
Taxonomy
Topics: Music and Audio Processing · Diverse Music Education Insights · Music Technology and Sound Studies
