MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing
Michael Clemens, Ana Marasović

TL;DR
MixAssist introduces a novel audio-language dataset capturing real-world collaborative music mixing dialogues, enabling AI models to better understand and assist in co-creative music production processes.
Contribution
The paper presents MixAssist, the first dataset of multi-turn audio-language dialogues in music mixing, facilitating the development of AI assistants that support collaborative and instructional music production.
Findings
Fine-tuned models such as Qwen-Audio outperform the other evaluated models at generating relevant mixing advice.
Automated and human evaluations show promising results for AI assistance in music mixing.
MixAssist enables training models to understand complex, real-world music production dialogues.
Abstract
While AI presents significant potential for enhancing music mixing and mastering workflows, current research predominantly emphasizes end-to-end automation or generation, often overlooking the collaborative and instructional dimensions vital for co-creative processes. This gap leaves artists, particularly amateurs seeking to develop expertise, underserved. To bridge this, we introduce MixAssist, a novel audio-language dataset capturing the situated, multi-turn dialogue between expert and amateur music producers during collaborative mixing sessions. Comprising 431 audio-grounded conversational turns derived from 7 in-depth sessions involving 12 producers, MixAssist provides a unique resource for training and evaluating audio-language models that can comprehend and respond to the complexities of real-world music production dialogues. Our evaluations, including automated LLM-as-a-judge…
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · AI in Service Interactions
