Universal audio synthesizer control with normalizing flows

Philippe Esling; Naotake Masuda; Adrien Bardet; Romeo Despres; Axel Chemla--Romeu-Santos

arXiv:1907.00971 · cs.LG · July 3, 2019 · 34 cites

TL;DR

This paper introduces a novel control method for audio synthesizers using normalizing flows, enabling automatic parameter inference, macro-control learning, and preset exploration within a unified model, improving interpretability and usability.

Contribution

It proposes a new formulation of synthesizer control with invertible mappings using VAEs and normalizing flows, including disentangling flows for better organization of latent spaces.
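
To make the idea of an invertible mapping concrete, here is a minimal sketch of one affine coupling layer, a common building block of normalizing flows. It is not the authors' architecture: the class `AffineCoupling`, the helper `make_conditioner`, the fixed random weights, and the variable names `z` (a latent audio code) and `v` (a synthesizer parameter vector) are all hypothetical, chosen only to show that such a layer can be inverted exactly and has a cheap log-determinant.

```python
import math
import random


def make_conditioner(n_in, n_out, seed):
    """Fixed random linear map followed by tanh.

    Hypothetical stand-in for the small neural network a real flow would learn.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

    def conditioner(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return conditioner


class AffineCoupling:
    """One affine coupling layer: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1)."""

    def __init__(self, dim, seed=0):
        self.split = dim // 2
        rest = dim - self.split
        self.scale = make_conditioner(self.split, rest, seed)
        self.shift = make_conditioner(self.split, rest, seed + 1)

    def forward(self, x):
        x1, x2 = x[:self.split], x[self.split:]
        s, t = self.scale(x1), self.shift(x1)
        y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
        # The Jacobian is triangular, so its log-determinant is just sum(s).
        return x1 + y2, sum(s)

    def inverse(self, y):
        y1, y2 = y[:self.split], y[self.split:]
        # y1 equals x1, so s and t can be recomputed exactly from the output.
        s, t = self.scale(y1), self.shift(y1)
        x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
        return y1 + x2


layer = AffineCoupling(dim=4)
z = [0.3, -1.2, 0.8, 0.05]       # hypothetical latent audio code
v, log_det = layer.forward(z)    # hypothetical synthesizer parameters
z_back = layer.inverse(v)        # recovers z up to floating-point error
```

Because the first half of the input passes through unchanged, the scale and shift can be recomputed from the output alone, which is what makes the inverse exact. A practical flow would stack several such layers with permutations between them and learn the conditioners from data.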

Findings

01. Superior parameter inference and audio reconstruction compared to baselines
02. Disentangles major audio variation factors as latent dimensions
03. Enables semantic control and smooth parameter mapping

Abstract

The ubiquity of sound synthesizers has reshaped music production and even entirely defined new music genres. However, the increasing complexity and number of parameters in modern synthesizers make them harder to master. Hence, the development of methods allowing to easily create and explore with synthesizers is a crucial need. Here, we introduce a novel formulation of audio synthesizer control. We formalize it as finding an organized latent audio space that represents the capabilities of a synthesizer, while constructing an invertible mapping to the space of its parameters. By using this formulation, we show that we can address simultaneously automatic parameter inference, macro-control learning and audio-based preset exploration within a single model. To solve this new formulation, we rely on Variational Auto-Encoders (VAE) and Normalizing Flows (NF) to organize and map the respective…

Code & Models

Repositories

acids-ircam/flow_synthesizer (PyTorch, official)

Taxonomy

Topics: Music Technology and Sound Studies · Music and Audio Processing · Model Reduction and Neural Networks

Methods: Normalizing Flows