Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement
Mostafa Sadeghi, Xavier Alameda-Pineda

TL;DR
This paper introduces MIN-VAE, an unsupervised audio-visual speech enhancement model that uses a mixture of inference networks to improve the initialization of the latent variables and the fusion of audio and visual data, leading to better speech enhancement performance than audio-only models.
Contribution
The paper proposes MIN-VAE, a VAE with a mixture of inference networks that fuses audio and visual data for speech enhancement. The balance between the two modalities is learned without supervision, and the mixture provides a better initialization of the latent variables at test time.
Findings
MIN-VAE outperforms standard audio-only models.
The mixture inference approach improves initialization.
The model adaptively fuses audio and visual data.
Abstract
In this paper, we are interested in unsupervised (unknown noise) audio-visual speech enhancement based on variational autoencoders (VAEs), where the probability distribution of clean speech spectra is simulated using an encoder-decoder architecture. The trained generative model (decoder) is then combined with a noise model at test time to estimate the clean speech. In the speech enhancement phase (test time), the initialization of the latent variables, which describe the generative process of clean speech via the decoder, is crucial, as the overall inference problem is non-convex. This initialization is usually obtained from the output of the trained encoder, which takes the noisy audio and clean visual data as input. Current audio-visual VAE models do not provide an effective initialization because the two modalities are tightly coupled (concatenated) in the associated architectures. To overcome this…
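As a rough illustration of the idea named in the title, the sketch below draws a latent vector from a two-component mixture of inference networks, one fed by audio features and one by visual features. It is not the authors' implementation: the linear encoders, the parameter dictionary, the function names, and the fixed mixing weight `pi_audio` are all assumptions made for this example, whereas the paper is described as learning the modality balance.

```python
import math
import random


def linear_gaussian_encoder(x, w_mean, w_logvar):
    """Toy inference network (hypothetical): maps an input vector to the
    mean and log-variance of a diagonal Gaussian over the latent vector."""
    mean = [sum(w * xi for w, xi in zip(row, x)) for row in w_mean]
    logvar = [sum(w * xi for w, xi in zip(row, x)) for row in w_logvar]
    return mean, logvar


def sample_mixture_posterior(audio, visual, params, pi_audio, rng):
    """Draw z from pi_audio * q_audio(z | audio) + (1 - pi_audio) * q_visual(z | visual).

    pi_audio is a fixed mixing weight here (an assumption for illustration).
    Returns the sample and a flag telling which component was used.
    """
    # Pick the mixture component first, then sample from its Gaussian.
    use_audio = rng.random() < pi_audio
    if use_audio:
        mean, logvar = linear_gaussian_encoder(
            audio, params["audio_mean"], params["audio_logvar"])
    else:
        mean, logvar = linear_gaussian_encoder(
            visual, params["visual_mean"], params["visual_logvar"])
    # Reparameterized draw: z = mean + std * eps, with eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mean, logvar)]
    return z, use_audio


# Hypothetical sizes: 3 audio features, 2 visual features, 2 latent dimensions.
params = {
    "audio_mean": [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2]],
    "audio_logvar": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    "visual_mean": [[0.5, -0.5], [0.25, 0.25]],
    "visual_logvar": [[0.0, 0.0], [0.0, 0.0]],
}
rng = random.Random(0)
z, used_audio = sample_mixture_posterior(
    [1.0, 2.0, 3.0], [0.5, 1.5], params, pi_audio=0.5, rng=rng)
```

Keeping the two encoders separate means that a sample drawn from the visual component depends only on the visual input, which is one way to read the abstract's concern that concatenating the two modalities at the encoder input hampers initialization when the audio is noisy.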
