MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

Dingyao Yu; Kaitao Song; Peiling Lu; Tianyu He; Xu Tan; Wei Ye; Shikun Zhang; Jiang Bian

arXiv:2310.11954·cs.CL·October 26, 2023·2 cites

Open Access · 1 Repo

TL;DR

MusicAgent is an AI system that leverages large language models to organize, automate, and simplify diverse music processing tasks, enabling users to focus on creativity without managing complex tools.

Contribution

The paper introduces MusicAgent, a novel system that integrates multiple music tools and uses LLMs to automate task decomposition and tool invocation for music understanding and generation.
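
The decompose-then-invoke pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Task` type, the `TOOLS` registry, and the `plan` and `run` functions are hypothetical names, the tools are string-returning stubs, and `plan` uses keyword matching as a stand-in for the LLM call. It is not the MusicAgent implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    tool: str  # name of the registered tool to call
    arg: str   # input passed to that tool


# Hypothetical tool registry. Real tools would wrap models or external APIs;
# these stubs only return a labeled string.
TOOLS: Dict[str, Callable[[str], str]] = {
    "classify": lambda x: f"genre({x})",
    "synthesize": lambda x: f"audio({x})",
}


def plan(request: str) -> List[Task]:
    """Stand-in for the LLM planner: keyword matching, not a model call."""
    text = request.lower()
    tasks: List[Task] = []
    if "classify" in text:
        tasks.append(Task("classify", request))
    if "generate" in text or "synthesize" in text:
        tasks.append(Task("synthesize", request))
    return tasks


def run(request: str) -> List[str]:
    """Decompose the request into tasks, then invoke each task's tool in order."""
    return [TOOLS[task.tool](task.arg) for task in plan(request)]
```

Keeping the planner separate from the registry means a tool can be added by registering one more callable, without touching the dispatch loop.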

Findings

01

Successfully integrates tools from multiple sources.

02

Automates task decomposition and tool invocation.

03

Enhances user experience by simplifying AI-music interactions.

Abstract

AI-empowered music processing is a diverse field that encompasses dozens of tasks, ranging from generation tasks (e.g., timbre synthesis) to comprehension tasks (e.g., music classification). For developers and amateurs, it is very difficult to grasp all of these tasks to satisfy their requirements in music processing, especially considering the huge differences in the representations of music data and the model applicability across platforms among various tasks. Consequently, it is necessary to build a system to organize and integrate these tasks, and thus help practitioners to automatically analyze their demand and call suitable tools as solutions to fulfill their requirements. Inspired by the recent success of large language models (LLMs) in task automation, we develop a system, named MusicAgent, which integrates numerous music-related tools and an autonomous workflow to address user…

Code & Models

Repositories

microsoft/muzic
pytorch · Official

Taxonomy

Topics: Music and Audio Processing · Topic Modeling · Speech Recognition and Synthesis