Posterior sampling algorithms for unsupervised speech enhancement with recurrent variational autoencoder
Mostafa Sadeghi (MULTISPEECH), Romain Serizel (MULTISPEECH)

TL;DR
This paper introduces efficient posterior sampling algorithms for unsupervised speech enhancement with a recurrent variational autoencoder (RVAE), improving computational efficiency and robustness over variational inference and over a supervised approach.
Contribution
It proposes Langevin dynamics and Metropolis-Hastings sampling techniques to replace variational inference in RVAE-based speech enhancement, reducing test-time complexity and improving performance.
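To make the first of these samplers concrete, here is a minimal sketch of unadjusted Langevin dynamics in plain Python. It is not the paper's algorithm: the target is a toy one-dimensional Gaussian standing in for the intractable RVAE posterior, and the names `grad_log_target` and `langevin_samples`, the step size, and the burn-in length are all illustrative assumptions.

```python
import math
import random


def grad_log_target(z, mean=2.0, var=0.5):
    """Gradient of log N(z; mean, var) w.r.t. z (toy target, an assumption)."""
    return -(z - mean) / var


def langevin_samples(n_steps=20000, step=0.05, z0=0.0, seed=0):
    """Unadjusted Langevin dynamics:
    z <- z + (step / 2) * grad log p(z) + sqrt(step) * standard normal noise.
    """
    rng = random.Random(seed)
    z = z0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        z = z + 0.5 * step * grad_log_target(z) + math.sqrt(step) * noise
        samples.append(z)
    return samples


samples = langevin_samples()
kept = samples[2000:]  # discard an (arbitrary) burn-in period
est_mean = sum(kept) / len(kept)
est_var = sum((s - est_mean) ** 2 for s in kept) / len(kept)
print(f"estimated mean {est_mean:.2f}, estimated variance {est_var:.2f}")
```

The update only needs the gradient of the log-density, which is why this family of samplers is attractive when the posterior can be evaluated up to a constant but not normalized. Because no accept/reject correction is applied, the finite step size introduces a small bias in the sampled distribution.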
Findings
Sampling methods outperform VEM in efficiency and accuracy
Proposed algorithms generalize well to mismatched conditions
Proposed algorithms outperform a supervised diffusion-based approach
Abstract
In this paper, we address the unsupervised speech enhancement problem based on a recurrent variational autoencoder (RVAE). This approach offers promising generalization performance over the supervised counterpart. Nevertheless, the involved iterative variational expectation-maximization (VEM) process at test time, which relies on a variational inference method, results in high computational complexity. To tackle this issue, we present efficient sampling techniques based on Langevin dynamics and Metropolis-Hastings algorithms, adapted to the EM-based speech enhancement with RVAE. By directly sampling from the intractable posterior distribution within the EM process, we circumvent the intricacies of variational inference. We conduct a series of experiments, comparing the proposed methods with VEM and a state-of-the-art supervised speech enhancement approach based on diffusion models. The…
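As a rough illustration of sampling directly from a posterior that is known only up to a normalizing constant, the following is a sketch of a random-walk Metropolis-Hastings sampler. It is a generic textbook version, not the adaptation described in the paper: the two-component Gaussian mixture target, the proposal standard deviation, and the function names are assumptions chosen for the example.

```python
import math
import random


def log_target(z):
    """Unnormalized log-density of a toy mixture 0.5*N(-2, 1) + 0.5*N(2, 1)."""
    la = -0.5 * (z + 2.0) ** 2
    lb = -0.5 * (z - 2.0) ** 2
    m = max(la, lb)  # log-sum-exp trick for numerical stability
    return m + math.log(math.exp(la - m) + math.exp(lb - m))


def metropolis_hastings(n_steps=20000, proposal_std=1.5, z0=0.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    z = z0
    log_p = log_target(z)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        cand = z + rng.gauss(0.0, proposal_std)
        log_p_cand = log_target(cand)
        # Symmetric proposal: the acceptance probability is min(1, p(cand) / p(z)).
        accept_prob = math.exp(min(0.0, log_p_cand - log_p))
        if rng.random() < accept_prob:
            z, log_p = cand, log_p_cand
            accepted += 1
        samples.append(z)
    return samples, accepted / n_steps


mh_samples, accept_rate = metropolis_hastings()
second_moment = sum(s * s for s in mh_samples) / len(mh_samples)
print(f"acceptance rate {accept_rate:.2f}, E[z^2] estimate {second_moment:.2f}")
```

Only the ratio of target densities enters the acceptance test, so the normalizing constant never has to be computed. For this toy mixture the true second moment is 5 (unit variance plus squared mode offset of 4), which gives a quick sanity check on the chain.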
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Adaptive Filtering Techniques
Methods: Variational Inference · Diffusion
