Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling
Lonce Wyse, Purnima Kamath, Chitralekha Gupta

TL;DR
This paper presents an integrated system that combines GAN and RNN architectures to generate controllable, high-quality sound models that interpolate smoothly between sounds, extending coverage beyond traditional musical sounds.
Contribution
The system integrates a GAN and an RNN to enable interactive sound modeling with smooth interpolation between sounds, meeting objectives that neither architecture addresses alone.
Findings
Effective GAN-RNN integration for sound synthesis
User studies validate perceptually smooth sound interpolation
Expanded audio modeling to textures beyond pitch and percussion
Abstract
We introduce a new system for data-driven audio sound model design built around two different neural network architectures, a Generative Adversarial Network (GAN) and a Recurrent Neural Network (RNN), that takes advantage of the unique characteristics of each to achieve system objectives that neither is capable of addressing alone. The objective of the system is to generate interactively controllable sound models given (a) a range of sounds the model should be able to synthesize, and (b) a specification of the parametric controls for navigating that space of sounds. The range of sounds is defined by a dataset provided by the designer, while the means of navigation is defined by a combination of data labels and the selection of a sub-manifold from the latent space learned by the GAN. Our proposed system takes advantage of the rich latent space of a GAN that consists of sounds that…
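To make the idea of selecting a sub-manifold from a latent space concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' method: the excerpt does not say how the sub-manifold is chosen, so the sketch simply takes a straight line between two latent points and treats the position along that line as a single control parameter. The function name `latent_path`, the latent size of 8, and the random anchor points are all hypothetical.

```python
import random


def latent_path(z_start, z_end, num_steps):
    """Sample points on the straight line between two latent vectors.

    Returns (params, path): params[i] is a control value in [0, 1] and
    path[i] is the latent vector at that position. Linear interpolation
    is an assumption made for this sketch.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same length")
    params = [i / (num_steps - 1) for i in range(num_steps)]
    path = [
        [(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)]
        for t in params
    ]
    return params, path


# Hypothetical anchor points standing in for two sounds in a GAN latent space.
LATENT_DIM = 8  # assumed size, chosen only to keep the example small
rng = random.Random(0)
z_a = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

params, path = latent_path(z_a, z_b, 5)
for t, z in zip(params, path):
    print(f"control={t:.2f} first latent coords={[round(v, 3) for v in z[:3]]}")
```

In a full system, each latent vector on the path would be passed to a trained generator to produce audio; that step is omitted here because no generator is specified in this excerpt.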
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
