Generative AI for Music and Audio

Hao-Wen Dong

arXiv:2411.14627·cs.SD·November 25, 2024

Open Access

TL;DR

This dissertation explores how generative AI can support music and audio creation, developing tools and methods that lower the barrier to entry for composition and help both amateurs and professionals create content.

Contribution

It introduces three main research directions: multitrack music generation, assistive music creation tools, and multimodal learning for audio and music.

Findings

01. Development of models for multitrack music generation

02. Creation of assistive AI tools for music composition

03. Insights into AI learning processes for music creation

Abstract

Generative AI has been transforming the way we interact with technology and consume content. In the next decade, AI technology will reshape how we create audio content in various media, including music, theater, films, games, podcasts, and short videos. In this dissertation, I introduce the three main directions of my research centered around generative AI for music and audio: 1) multitrack music generation, 2) assistive music creation tools, and 3) multimodal learning for audio and music. Through my research, I aim to answer the following two fundamental questions: 1) How can AI help professionals or amateurs create music and audio content? 2) Can AI learn to create music in a way similar to how humans learn music? My long-term goal is to lower the barrier of entry for music composition and democratize audio content creation.

Taxonomy

Topics: Music Technology and Sound Studies · Music and Audio Processing