A Demand-Driven Perspective on Generative Audio AI

Sangshin Oh; Minsung Kang; Hyeongi Moon; Keunwoo Choi; Ben Sangbae Chon

arXiv:2307.04292 · eess.AS · July 11, 2023 · 1 citation

Open Access

TL;DR

This paper surveys professional audio engineers to identify industry demands and research priorities in generative audio AI, highlighting dataset availability as a key bottleneck and proposing potential solutions.

Contribution

It provides a demand-driven analysis of generative audio AI, emphasizing industry needs, challenges, and potential solutions based on survey data.

Findings

01

Dataset availability is the main bottleneck for high-quality audio generation.

02

Industry demands prioritize controllability and audio quality.

03

Proposed solutions are supported by empirical evidence.

Abstract

To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and define various research tasks. We also summarize the current challenges in audio quality and controllability based on the survey. Our analysis emphasizes that the availability of datasets is currently the main bottleneck for achieving high-quality audio generation. Finally, we suggest potential solutions for some revealed issues with empirical evidence.

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing