A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis

Ben Hayes; Jordie Shier; György Fazekas; Andrew McPherson; Charalampos Saitis

arXiv:2308.15422 · cs.SD · August 30, 2023

Open Access

TL;DR

This survey reviews differentiable digital signal processing techniques in music and speech synthesis, highlighting applications, implemented operations, challenges, and future research directions.

Contribution

The survey provides a comprehensive overview of differentiable DSP methods in audio synthesis. It catalogues applications and the operations implemented differentiably, and discusses open challenges and future research avenues.

Findings

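01 Differentiable DSP allows loss function gradients to be backpropagated through digital signal processors, so they can be integrated into neural networks.

02 Applications include music performance rendering, sound matching, and voice transformation.

03 Open challenges include optimisation pathologies, robustness to real-world conditions, and design trade-offs.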

Abstract

The term "differentiable digital signal processing" describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music & speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably. Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research.
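To make the core idea concrete, the following is a minimal sketch in JAX of backpropagating a loss gradient through a signal processor. It is an illustration only and is not taken from the survey: the sample rate, signal length, fixed 440 Hz frequency, mean squared error loss, learning rate, and the names `oscillator` and `loss_fn` are all assumptions chosen for the example. A sine oscillator is written using differentiable operations, and gradient descent recovers its amplitude from a target signal.

```python
import jax
import jax.numpy as jnp

SAMPLE_RATE = 16000  # assumed sample rate in Hz
N_SAMPLES = 1024     # assumed signal length in samples
FREQUENCY_HZ = 440.0 # assumed fixed oscillator frequency


def oscillator(amplitude, frequency_hz):
    """Sine oscillator built only from differentiable operations."""
    t = jnp.arange(N_SAMPLES) / SAMPLE_RATE
    return amplitude * jnp.sin(2.0 * jnp.pi * frequency_hz * t)


def loss_fn(amplitude, target):
    """Mean squared error between the synthesised and target signals."""
    prediction = oscillator(amplitude, FREQUENCY_HZ)
    return jnp.mean((prediction - target) ** 2)


# Target signal with a "hidden" amplitude that we try to recover.
target = oscillator(0.8, FREQUENCY_HZ)

# jax.grad differentiates the loss with respect to its first argument,
# which propagates the gradient back through the oscillator.
grad_fn = jax.grad(loss_fn)

amplitude = jnp.float32(0.1)  # deliberately wrong initial guess
learning_rate = 0.5
for _ in range(100):
    amplitude = amplitude - learning_rate * grad_fn(amplitude, target)

print(float(amplitude))
```

Amplitude is used as the learnable parameter here because the loss is quadratic in it, so plain gradient descent behaves well. In a neural network setting, the same gradient would continue flowing into whatever layers produce the oscillator's parameters.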

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing