Predicting User Intents and Musical Attributes from Music Discovery Conversations
Daeyong Kwon, SeungHeon Doh, Juhan Nam

TL;DR
This paper explores intent and musical attribute classification in music discovery conversations using pre-trained language models, introducing context-aware input methods that enhance classification accuracy over baseline models.
Contribution
It introduces a novel approach combining chat history with user queries for better context understanding in intent and musical attribute classification.
Findings
Significant improvement in F1 scores for both user intent and musical attribute classification.
Outperforms the zero-shot and few-shot performance of the Llama 3 model.
Demonstrates the effectiveness of context-aware input in conversational music understanding.
Abstract
Intent classification is a text understanding task that identifies user needs from input text queries. While intent classification has been extensively studied in various domains, it has received little attention in the music domain. In this paper, we investigate intent classification models for music discovery conversations, focusing on pre-trained language models. In addition to predicting functional needs (intent classification), we include a task for classifying musical needs (musical attribute classification). We also propose concatenating the previous chat history with the single-turn user query in the input text, allowing the model to better understand the overall conversation context. Our proposed model significantly improves the F1 score for both user intent and musical attribute classification, and surpasses the zero-shot and few-shot performance of…
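The context-aware input described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_context_input`, the `[SEP]` separator, the `role: text` formatting, and the `max_turns` cutoff are all assumptions made for illustration, since the abstract only states that previous chat history is concatenated with the current user query.

```python
from typing import List, Tuple

# Assumed separator string; the delimiter actually used in the paper is not stated here.
SEP = " [SEP] "


def build_context_input(
    history: List[Tuple[str, str]],
    query: str,
    max_turns: int = 3,
) -> str:
    """Concatenate recent chat history with the current user query.

    `history` is a list of (role, text) pairs, oldest first.
    `max_turns` (hypothetical parameter) limits how many past turns are kept.
    """
    # Guard against max_turns == 0, because history[-0:] would return everything.
    recent = history[-max_turns:] if max_turns > 0 else []
    parts = [f"{role}: {text}" for role, text in recent]
    # The current single-turn query always comes last.
    parts.append(f"user: {query}")
    return SEP.join(parts)


if __name__ == "__main__":
    chat = [
        ("user", "I want something calm for studying"),
        ("system", "How about some ambient piano?"),
    ]
    print(build_context_input(chat, "Maybe something with more guitar"))
```

The resulting string would then be tokenized and passed to a pre-trained language model classifier in place of the single-turn query alone. Setting `max_turns=0` recovers the single-turn baseline input.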
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Advanced Text Analysis Techniques
Methods: Softmax · Attention Is All You Need · LLaMA
