MuseChat: A Conversational Music Recommendation System for Videos

Zhikang Dong; Bin Chen; Xiulong Liu; Pawel Polak; Peng Zhang

arXiv:2310.06282 · cs.LG · March 12, 2024 · 1 citation

TL;DR

MuseChat is a dialogue-based system that personalizes music recommendations for videos. It incorporates user preferences, explains its suggestions through multi-modal reasoning, and outperforms existing video-based music retrieval methods.

Contribution

The paper introduces a conversational music recommendation system, described by the authors as the first of its kind, that integrates reasoning and explanation capabilities using a large language model and multi-modal inputs.

Findings

01. MuseChat outperforms existing video-based music retrieval methods.

02. The system offers strong interpretability and user interaction.

03. A large-scale dataset was created for evaluation.

Abstract

Music recommendation for videos attracts growing interest in multi-modal research. However, existing systems focus primarily on content compatibility, often ignoring users' preferences. Their inability to interact with users for further refinement or to provide explanations leads to a less satisfying experience. We address these issues with MuseChat, a first-of-its-kind dialogue-based recommendation system that personalizes music suggestions for videos. Our system consists of two key functionalities with associated modules: recommendation and reasoning. The recommendation module takes a video, along with optional information including previously suggested music and the user's preferences, as input and retrieves an appropriate piece of music matching the context. The reasoning module, equipped with the power of a Large Language Model (Vicuna-7B) and extended to multi-modal inputs, is able to provide…
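To make the recommendation module's inputs concrete, the following is a minimal sketch of embedding-based retrieval with optional conditioning. It is not the authors' implementation: the averaging fusion, the cosine-similarity scoring, and the names `build_query` and `retrieve` are all assumptions chosen for illustration, and the toy three-dimensional vectors stand in for learned video, music, and text embeddings.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def build_query(video_emb, prev_music_emb=None, preference_emb=None):
    """Fuse the video embedding with whichever optional inputs are present.

    Assumption: fusion is a plain average of unit vectors. The paper's
    actual fusion mechanism is not described in the abstract.
    """
    parts = [normalize(e)
             for e in (video_emb, prev_music_emb, preference_emb)
             if e is not None]
    dim = len(parts[0])
    mean = [sum(p[i] for p in parts) / len(parts) for i in range(dim)]
    return normalize(mean)


def retrieve(query, catalog, top_k=3):
    """Rank catalog tracks by cosine similarity to the query embedding."""
    scored = [
        (track_id, sum(q * c for q, c in zip(query, normalize(emb))))
        for track_id, emb in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy catalog; real embeddings would come from trained encoders.
catalog = {"calm": [1, 0, 0], "dark": [0, 1, 0], "upbeat": [0, 0, 1]}

# First turn: video only.
first = retrieve(build_query([1, 0, 0]), catalog, top_k=1)

# Follow-up turn: the user's stated preference and the previously
# suggested track are folded into the query.
refined_query = build_query([1, 0, 0],
                            prev_music_emb=[0, 0, 1],
                            preference_emb=[0, 0, 1])
refined = retrieve(refined_query, catalog, top_k=2)
```

In this toy setup the video-only query ranks "calm" first, while adding the two optional inputs shifts the top result to "upbeat", which illustrates how conversational feedback could steer retrieval under these assumptions.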

Code & Models

Repositories

Dongzhikang/MuseChat-dataset (Official)

Taxonomy

Topics: Music and Audio Processing · Neuroscience and Music Perception · Music Technology and Sound Studies

Methods: Focus