A Fast Solver for Interpolating Stochastic Differential Equation Diffusion Models for Speech Restoration
Bunlong Lay, Timo Gerkmann

TL;DR
This paper introduces a fast sampling solver for interpolating stochastic differential equations (iSDEs), a class that includes models such as SGMSE+ for speech restoration, and significantly reduces the number of neural network evaluations needed for sampling.
Contribution
It develops a formalism for iSDEs that encompasses models such as SGMSE+ and proposes a new solver that enables fast sampling in speech restoration tasks.
Findings
Achieves fast sampling with as few as 10 neural network evaluations.
Applies to multiple speech restoration tasks.
Reduces computational cost compared to traditional diffusion model sampling.
Abstract
Diffusion Probabilistic Models (DPMs) are a well-established class of diffusion models for unconditional image generation, while SGMSE+ is a well-established conditional diffusion model for speech enhancement. One of the downsides of diffusion models is that solving the reverse process requires many evaluations of a large Neural Network. Although advanced fast sampling solvers have been developed for DPMs, they are not directly applicable to models such as SGMSE+ due to differences in their diffusion processes. Specifically, DPMs transform between the data distribution and a standard Gaussian distribution, whereas SGMSE+ interpolates between the target distribution and a noisy observation. This work first develops a formalism of interpolating Stochastic Differential Equations (iSDEs) that includes SGMSE+, and second proposes a solver for iSDEs. The proposed solver enables fast sampling…
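The contrast the abstract draws between the two kinds of diffusion process can be illustrated with a small sketch. It compares the mean of a process that decays toward a zero-mean Gaussian with the mean of a process that interpolates from the clean signal toward the noisy observation. The function names, the exponential weighting, and the constants `beta` and `gamma` are assumptions made for illustration only; they are not taken from the paper and are not claimed to match its iSDE formalism or its solver.

```python
import numpy as np

def dpm_mean(x0, t, beta=5.0):
    """Mean of a process that shrinks the clean signal toward zero.

    Hypothetical variance-preserving-style decay; `beta` is an
    illustrative constant, not a value from the paper.
    """
    return np.exp(-0.5 * beta * t) * x0

def interpolating_mean(x0, y, t, gamma=1.5):
    """Mean of a process that moves from clean x0 toward observation y.

    Assumes an exponential interpolation weight; `gamma` is an
    illustrative constant, not a value from the paper.
    """
    w = np.exp(-gamma * t)
    return w * x0 + (1.0 - w) * y

# Toy example: a two-sample "clean" signal and its noisy observation.
x0 = np.array([1.0, -2.0])
y = np.array([0.5, 0.5])
for t in (0.0, 0.5, 1.0):
    print(t, dpm_mean(x0, t), interpolating_mean(x0, y, t))
```

At t = 0 both means equal the clean signal. As t grows, the first mean approaches zero regardless of the observation, while the second approaches y. This difference in the endpoint is why, according to the abstract, solvers designed for DPMs do not carry over directly.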
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Generative Adversarial Networks and Image Synthesis
