Exploring Quality and Generalizability in Parameterized Neural Audio Effects

William Mitchell; Scott H. Hawley

arXiv:2006.05584 · eess.AS · June 11, 2020

TL;DR

This paper improves the architecture and training strategies of neural audio effect models to raise quality, efficiency, and generalizability, particularly when training on instrument-specific datasets, with professional-grade audio processing as the goal.

Contribution

It introduces methods to improve neural network modeling of nonlinear audio effects, focusing on dataset manipulation and architectural changes for better accuracy and efficiency.

Findings

1. Dataset manipulation with instrument-specific data improves accuracy.
2. Architectural and optimization changes enhance computational efficiency.
3. Model generalizability extends to a larger variety of nonlinear effects.

Abstract

Deep neural networks have shown promise for music audio signal processing applications, often surpassing prior approaches, particularly as end-to-end models in the waveform domain. Yet results to date have tended to be constrained by low sample rates, noise, narrow domains of signal types, and/or a lack of parameterized controls (i.e. "knobs"), leaving them not yet suitable for professional audio engineering workflows. This work expands on previously published research on modeling nonlinear time-dependent signal processing effects associated with music production by means of a deep neural network, one which includes the ability to emulate the parameterized settings you would see on an analog piece of equipment, with the goal of eventually producing commercially viable, high-quality audio, i.e. 44.1 kHz sampling rate at 16-bit resolution. The results in this paper highlight progress in…
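
To make the abstract's terms concrete, the following is a minimal sketch of what a "nonlinear time-dependent" effect with "knobs" can look like at 44.1 kHz and 16-bit resolution. It is not the paper's network and not a model of any specific hardware unit: the generic feed-forward compressor, the function names `toy_compressor` and `quantize_16bit`, and all parameter defaults are assumptions chosen here for illustration. A neural effect model of the kind described would be trained to reproduce a mapping like this one from (input audio, knob settings) to output audio.

```python
import math

SAMPLE_RATE = 44100  # Hz, the target rate named in the abstract
BIT_DEPTH = 16       # bits per sample, also from the abstract


def quantize_16bit(x):
    """Clip a float sample to [-1, 1] and round it to the 16-bit integer grid."""
    x = max(-1.0, min(1.0, x))
    full_scale = 2 ** (BIT_DEPTH - 1) - 1  # 32767
    return round(x * full_scale) / full_scale


def toy_compressor(signal, threshold_db=-20.0, ratio=4.0,
                   attack_ms=1.0, release_ms=100.0):
    """Hypothetical compressor: the four keyword arguments play the role of knobs."""
    # One-pole smoothing coefficients derived from the attack/release times
    attack = math.exp(-1.0 / (SAMPLE_RATE * attack_ms / 1000.0))
    release = math.exp(-1.0 / (SAMPLE_RATE * release_ms / 1000.0))
    envelope = 0.0
    out = []
    for x in signal:
        level = abs(x)
        # Time dependence: the envelope remembers past samples
        coeff = attack if level > envelope else release
        envelope = coeff * envelope + (1.0 - coeff) * level
        # Nonlinearity: gain reduction applies only above the threshold
        level_db = 20.0 * math.log10(max(envelope, 1e-9))
        over_db = max(0.0, level_db - threshold_db)
        gain_db = -over_db * (1.0 - 1.0 / ratio)
        out.append(quantize_16bit(x * 10.0 ** (gain_db / 20.0)))
    return out


# One second of a 440 Hz sine: quiet first half, loud second half
n = SAMPLE_RATE
dry = [(0.05 if i < n // 2 else 0.9) * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
       for i in range(n)]
wet = toy_compressor(dry, threshold_db=-20.0, ratio=4.0)
```

In this sketch the quiet half stays below the threshold and passes through essentially unchanged, while the loud half is attenuated. Changing `threshold_db` or `ratio` changes the output for the same input, which is the sense in which the effect is parameterized.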

Code & Models

Repositories

drscotthawley/signaltrain
PyTorch · Official

Datasets

drscotthawley/SignalTrain-LA2A
dataset · 3 dl

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing