Style Transfer for Non-differentiable Audio Effects
Kieran Grant

TL;DR
This paper introduces a deep learning method for style transfer in audio effects that works with non-differentiable effects and various effect classes, enabling more flexible audio production and creative control.
Contribution
It presents a novel audio embedding approach for style matching that accommodates effects in widely used frameworks without requiring auto-differentiation.
Findings
Successfully style matched a multi-band compressor effect.
Created logical timbral encodings for downstream tasks.
Demonstrated effectiveness through a listening test.
Abstract
Digital audio effects are widely used by audio engineers to alter the acoustic and temporal qualities of audio data. However, these effects can have a large number of parameters which can make them difficult to learn for beginners and hamper creativity for professionals. Recently, there have been a number of efforts to employ progress in deep learning to acquire the low-level parameter configurations of audio effects by minimising an objective function between an input and reference track, commonly referred to as style transfer. However, current approaches use inflexible black-box techniques or require that the effects under consideration are implemented in an auto-differentiation framework. In this work, we propose a deep learning approach to audio production style matching which can be used with effects implemented in some of the most widely used frameworks, requiring only that the…
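To make the idea of "minimising an objective function between an input and reference track" concrete, here is a minimal sketch of parameter matching for an effect treated as a black box. It is not the paper's method: the paper uses a learned audio embedding, whereas this sketch swaps in a plain gradient-free grid search. The toy effect (`hard_clip`), the loss (`spectral_distance`), and the helper `match_threshold` are all hypothetical names chosen for illustration, and the reference is assumed to be the same signal processed with a known setting.

```python
import numpy as np

def hard_clip(x, threshold):
    """Toy black-box effect (illustrative assumption); no gradients are used."""
    return np.clip(x, -threshold, threshold)

def spectral_distance(a, b):
    """Mean absolute difference between log-magnitude spectra."""
    spec_a = np.log1p(np.abs(np.fft.rfft(a)))
    spec_b = np.log1p(np.abs(np.fft.rfft(b)))
    return float(np.mean(np.abs(spec_a - spec_b)))

def match_threshold(x, reference, candidates):
    """Gradient-free search: run the effect at each candidate setting
    and keep the one whose output is closest to the reference."""
    losses = [spectral_distance(hard_clip(x, c), reference) for c in candidates]
    return float(candidates[int(np.argmin(losses))])

# One second of a 220 Hz sine as the input track (assumed test signal).
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220.0 * t)

# Reference: the same signal processed with a setting the search must recover.
reference = hard_clip(x, 0.4)

candidates = np.linspace(0.1, 1.0, 10)
best = match_threshold(x, reference, candidates)
```

Because the search only calls the effect and compares outputs, it needs no auto-differentiation support from the effect itself. Exhaustive search scales poorly with the number of parameters, which is one reason learned approaches are of interest for effects with many controls.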
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
