Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling

Lonce Wyse; Purnima Kamath; Chitralekha Gupta

arXiv:2206.13085·cs.SD·June 28, 2022

Open Access

TL;DR

This paper presents a novel integrated system combining GAN and RNN architectures to generate controllable, high-quality audio models that interpolate smoothly between sounds, expanding capabilities beyond traditional musical sounds.

Contribution

The system integrates a GAN and an RNN to enable interactive sound modeling with smooth interpolation between sounds, a combination of objectives that neither architecture addresses alone.

Findings

01. Effective GAN-RNN integration for sound synthesis

02. User studies validate perceptually smooth sound interpolation

03. Expanded audio modeling to textures beyond pitch and percussion

Abstract

We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network (GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve the system objectives that neither is capable of addressing alone. The objective of the system is to generate interactively controllable sound models given (a) a range of sounds the model should be able to synthesize, and (b) a specification of the parametric controls for navigating that space of sounds. The range of sounds is defined by a dataset provided by the designer, while the means of navigation is defined by a combination of data labels and the selection of a sub-manifold from the latent space learned by the GAN. Our proposed system takes advantage of the rich latent space of a GAN that consists of sounds that…
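The abstract mentions selecting a sub-manifold of the GAN latent space as a means of navigation, but the excerpt here does not say how that selection is made. As a rough illustration only, one simple way to define a two-parameter control surface is bilinear interpolation between four chosen latent vectors. Everything in this sketch (the function names `lerp` and `latent_point`, the toy 3-dimensional corner vectors, the control parameters `u` and `v`) is a hypothetical assumption, not the authors' method.

```python
def lerp(a, b, t):
    """Linearly interpolate between two equal-length vectors a and b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def latent_point(corners, u, v):
    """Map control parameters (u, v) in [0, 1] to a latent vector.

    corners is a 2x2 grid of latent vectors (hypothetical anchor sounds).
    The result lies on the bilinear surface spanned by those four vectors.
    """
    top = lerp(corners[0][0], corners[0][1], u)
    bottom = lerp(corners[1][0], corners[1][1], u)
    return lerp(top, bottom, v)


# Toy 3-dimensional latent vectors standing in for GAN latent codes.
corners = [
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]],
]

# The centre of the control surface is the average of the four corners.
z_mid = latent_point(corners, 0.5, 0.5)
```

In a setup like this, each latent vector returned by `latent_point` would be passed to a trained generator to produce a sound, and the pair `(u, v)` would serve as the parametric controls.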

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing