One Billion Audio Sounds from GPU-enabled Modular Synthesis

Joseph Turian, Jordie Shier, George Tzanetakis, Kirk McNally, and Max Henry

arXiv:2104.12922 · cs.SD · July 21, 2021 · 1 citation

TL;DR

This paper introduces synth1B1, a corpus of 1 billion 4-second synthesized sounds paired with their synthesis parameters, generated on the fly by torchsynth, a GPU-enabled modular synthesizer. It also demonstrates rank-based evaluation criteria for audio representations and proposes an approach to synthesizer hyperparameter optimization.

Contribution

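The paper presents synth1B1, a corpus 100x larger than any audio dataset in the literature, along with torchsynth, an open-source GPU-enabled modular synthesizer. It also releases two smaller datasets (FM synth timbre and subtractive synth pitch), rank-based evaluation criteria for existing audio representations, and an approach to synthesizer hyperparameter optimization.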

Findings

01

synth1B1 is 100x larger than any audio dataset in the literature.

02

torchsynth generates synth1B1 samples on the fly at 16200x faster than real-time (714MHz) on a single GPU.

03

New rank-based evaluation criteria for existing audio representations are demonstrated on two new datasets: FM synth timbre and subtractive synth pitch.

Abstract

We release synth1B1, a multi-modal audio corpus consisting of 1 billion 4-second synthesized sounds, paired with the synthesis parameters used to generate them. The dataset is 100x larger than any audio dataset in the literature. We also introduce torchsynth, an open source modular synthesizer that generates the synth1B1 samples on-the-fly at 16200x faster than real-time (714MHz) on a single GPU. Finally, we release two new audio datasets: FM synth timbre and subtractive synth pitch. Using these datasets, we demonstrate new rank-based evaluation criteria for existing audio representations. Finally, we propose a novel approach to synthesizer hyperparameter optimization.
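The throughput and scale figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the 44.1 kHz sample rate is an assumption (the abstract does not state it), and the function names are hypothetical helpers, not part of torchsynth.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
SAMPLE_RATE_HZ = 44_100      # ASSUMPTION: sample rate is not stated in the abstract
REALTIME_SPEEDUP = 16_200    # from the abstract
NUM_SOUNDS = 1_000_000_000   # from the abstract
SOUND_SECONDS = 4            # from the abstract


def throughput_mhz(speedup: float, sample_rate_hz: float) -> float:
    """Audio samples generated per second, in millions (MHz)."""
    return speedup * sample_rate_hz / 1e6


def corpus_years(num_sounds: int, seconds_each: float) -> float:
    """Total audio duration of the corpus, in years."""
    seconds_per_year = 365.25 * 24 * 3600
    return num_sounds * seconds_each / seconds_per_year


def generation_hours(num_sounds: int, seconds_each: float, speedup: float) -> float:
    """Wall-clock hours to render the whole corpus at the given speedup."""
    return num_sounds * seconds_each / speedup / 3600


if __name__ == "__main__":
    print(throughput_mhz(REALTIME_SPEEDUP, SAMPLE_RATE_HZ))
    print(corpus_years(NUM_SOUNDS, SOUND_SECONDS))
    print(generation_hours(NUM_SOUNDS, SOUND_SECONDS, REALTIME_SPEEDUP))
```

Under the 44.1 kHz assumption, 16200x real-time works out to about 714 million samples per second, which matches the 714MHz figure. The same numbers imply roughly 127 years of audio in the full corpus and on the order of 69 hours to render it on a single GPU at the quoted rate.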

Code & Models

Repositories

torchsynth/torchsynth (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies