A Demand-Driven Perspective on Generative Audio AI
Sangshin Oh, Minsung Kang, Hyeongi Moon, Keunwoo Choi, Ben Sangbae Chon

TL;DR
This paper surveys professional audio engineers to identify industry demands and research priorities in generative audio AI, highlighting dataset availability as a key bottleneck and proposing potential solutions.
Contribution
It provides a demand-driven analysis of generative audio AI, emphasizing industry needs, challenges, and potential solutions based on survey data.
Findings
Dataset availability is the main bottleneck for high-quality audio generation.
Industry demands prioritize controllability and audio quality.
Proposed solutions are supported by empirical evidence.
Abstract
To achieve successful deployment of AI research, it is crucial to understand the demands of the industry. In this paper, we present the results of a survey conducted with professional audio engineers, in order to determine research priorities and define various research tasks. We also summarize the current challenges in audio quality and controllability based on the survey. Our analysis emphasizes that the availability of datasets is currently the main bottleneck for achieving high-quality audio generation. Finally, we suggest potential solutions for some revealed issues with empirical evidence.
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
