Interpretable Timbre Synthesis using Variational Autoencoders Regularized on Timbre Descriptors

Anastasia Natsiou; Luca Longo; Sean O'Leary

arXiv:2307.10283 · cs.SD · July 21, 2023

Open Access

TL;DR

This paper introduces a regularized VAE model for timbre synthesis that incorporates timbre descriptors and harmonic content to improve interpretability and control over sound generation.

Contribution

It proposes a novel regularization technique for VAEs that integrates timbre descriptors and harmonic content for more interpretable and concise timbre synthesis.

Findings

01 Enhanced interpretability of latent space
02 Reduced dimensionality of sound representation
03 Improved control over timbre synthesis

Abstract

Controllable timbre synthesis has been a subject of research for several decades, and deep neural networks have been the most successful in this area. Deep generative models such as Variational Autoencoders (VAEs) have the ability to generate a high-level representation of audio while providing a structured latent space. Despite their advantages, the interpretability of these latent spaces in terms of human perception is often limited. To address this limitation and enhance the control over timbre generation, we propose a regularized VAE-based latent space that incorporates timbre descriptors. Moreover, we suggest a more concise representation of sound by utilizing its harmonic content, in order to minimize the dimensionality of the latent space.
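
The abstract does not give the exact form of the regularization term, so the following is only a minimal sketch of how a VAE loss could tie one latent dimension to a timbre descriptor computed from harmonic content. It assumes a pairwise-ordering (attribute-regularization style) penalty, which is a swapped-in technique and not confirmed as the authors' method. The function names, the choice of spectral centroid as the descriptor, the use of latent dimension 0, and the weights `beta`, `gamma`, and `delta` are all hypothetical.

```python
import numpy as np

def spectral_centroid(amps, f0):
    """Amplitude-weighted mean frequency (Hz) of a harmonic spectrum.

    amps: (batch, K) harmonic amplitudes; f0: (batch,) fundamental in Hz.
    """
    k = np.arange(1, amps.shape[-1] + 1)      # harmonic numbers 1..K
    freqs = f0[..., None] * k                  # frequency of each harmonic
    return (freqs * amps).sum(-1) / (amps.sum(-1) + 1e-12)

def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), averaged over the batch."""
    per_item = -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=-1)
    return np.mean(per_item)

def attribute_regularizer(z_dim, attr, delta=1.0):
    """Assumed penalty: one latent dimension should order the batch
    the same way the descriptor does (pairwise comparison)."""
    dz = z_dim[:, None] - z_dim[None, :]       # pairwise latent differences
    da = attr[:, None] - attr[None, :]         # pairwise descriptor differences
    return np.mean(np.abs(np.tanh(delta * dz) - np.sign(da)))

def regularized_vae_loss(x, x_hat, mu, logvar, z, attr, beta=1.0, gamma=1.0):
    """Reconstruction + KL + descriptor regularization on latent dim 0."""
    recon = np.mean((x - x_hat) ** 2)
    reg = attribute_regularizer(z[:, 0], attr)
    return recon + beta * kl_divergence(mu, logvar) + gamma * reg
```

In this sketch the input `x` would be a vector of harmonic amplitudes rather than a full spectrogram, which is one way to read the abstract's "more concise representation"; the encoder and decoder networks themselves are left out.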

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing