MusT-RAG: MUSical Text Question Answering with Retrieval Augmented Generation

Daeyong Kwon; SeungHeon Doh; Juhan Nam

arXiv:2507.23334 · cs.CL · December 9, 2025

Open Access

TL;DR

MusT-RAG enhances large language models for music question answering by integrating a music-specific retrieval system, significantly improving domain adaptation and outperforming traditional fine-tuning methods.

Contribution

The paper introduces MusT-RAG, a novel retrieval-augmented framework with a specialized music database, improving LLMs' performance in music question answering tasks.

Findings

01  MusT-RAG outperforms fine-tuning in music domain adaptation.

02  MusWikiDB is more effective than Wikipedia for music retrieval.

03  MusT-RAG yields significant improvements on both in-domain and out-of-domain benchmarks.

Abstract

Recent advancements in large language models (LLMs) have demonstrated remarkable capabilities across diverse domains. While they exhibit strong zero-shot performance on various tasks, LLMs' effectiveness in music-related applications remains limited due to the relatively small proportion of music-specific knowledge in their training data. To address this limitation, we propose MusT-RAG, a comprehensive framework based on Retrieval Augmented Generation (RAG) to adapt general-purpose LLMs for text-only music question answering (MQA) tasks. RAG is a technique that provides external knowledge to LLMs by retrieving relevant context information when generating answers to questions. To optimize RAG for the music domain, we (1) propose MusWikiDB, a music-specialized vector database for the retrieval stage, and (2) utilize context information during both inference and fine-tuning processes to…
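
The retrieve-then-generate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three passages are hypothetical stand-ins rather than MusWikiDB entries, the bag-of-words `embed` function replaces whatever learned embedding model a real vector database would use, and the names `retrieve` and `build_prompt` are invented for this example. The sketch stops at prompt construction and does not call an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in passages for illustration only; NOT taken from MusWikiDB.
PASSAGES = [
    "Ludwig van Beethoven composed nine symphonies.",
    "A standard piano has 88 keys.",
    "The Beatles were formed in Liverpool in 1960.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy bag-of-words vector (assumption: a real system uses a learned embedder)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q_vec, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, contexts):
    """Prepend the retrieved context to the question, as a RAG prompt would."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"


question = "How many keys does a standard piano have?"
top = retrieve(question, PASSAGES, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a full pipeline the resulting prompt would be passed to a general-purpose LLM, so the model answers with the retrieved passage in view instead of relying only on what it memorized during training.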

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications