Semi-blind source separation with multichannel variational autoencoder
Hirokazu Kameoka, Li Li, Shota Inoue, Shoji Makino

TL;DR
This paper introduces the MVAE method, a multichannel source separation technique that uses a conditional variational autoencoder (CVAE), trained on spectrograms with source-class labels, to perform semi-blind separation. In the authors' experiments it separated sources better than a baseline method.
Contribution
The paper presents a semi-blind source separation algorithm with guaranteed convergence, in which a CVAE models the power spectrograms of the sources conditioned on class labels.
Findings
In the reported experiments, MVAE produced better separation performance than a baseline method.
The approach effectively models source spectrograms conditioned on class labels.
The iterative estimation of the source power spectrograms and the separation matrices is guaranteed to converge.
Abstract
This paper proposes a multichannel source separation technique called the multichannel variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model and estimate the power spectrograms of the sources in a mixture. By training the CVAE using the spectrograms of training examples with source-class labels, we can use the trained decoder distribution as a universal generative model capable of generating spectrograms conditioned on a specified class label. By treating the latent space variables and the class label as the unknown parameters of this generative model, we can develop a convergence-guaranteed semi-blind source separation algorithm that consists of iteratively estimating the power spectrograms of the underlying sources as well as the separation matrices. In experimental evaluations, our MVAE produced better separation performance than a baseline method.
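The abstract names two alternating estimates: the source power spectrograms and the separation matrices. It does not state the update rule for the separation matrices, so the following is only a minimal sketch under an assumption: it uses the iterative projection update that is common in determined multichannel separation, applied to a single frequency bin. The names `ip_update` and `neg_log_lik` are hypothetical, and the array `V` merely stands in for the power spectrograms that the CVAE decoder would supply; the CVAE itself and the update of its latent variables and class label are not shown.

```python
import numpy as np

def neg_log_lik(W, X, V):
    """Per-frame negative log-likelihood (up to a constant) for one frequency bin.

    Assumes each separated source is zero-mean complex Gaussian with variance V.
    W: (I, I) separation matrix, X: (I, N) mixture frames, V: (I, N) variances.
    """
    Y = W @ X                      # separated signals, one row per source
    N = X.shape[1]
    data_term = np.sum(np.abs(Y) ** 2 / V + np.log(V)) / N
    return data_term - 2.0 * np.log(np.abs(np.linalg.det(W)))

def ip_update(W, X, V):
    """One sweep of iterative projection over all sources (assumed update rule).

    Row j of W holds the conjugate-transposed separation filter of source j.
    """
    I, N = X.shape
    W = W.astype(complex)          # astype returns a copy, so the input is untouched
    for j in range(I):
        # Mixture covariance weighted by the inverse variance of source j
        Sigma = (X / V[j]) @ X.conj().T / N
        # Solve (W Sigma) w = e_j for the new filter of source j
        w = np.linalg.solve(W @ Sigma, np.eye(I)[:, j])
        # Rescale so that w^H Sigma w = 1
        w /= np.sqrt(np.real(w.conj() @ Sigma @ w))
        W[j] = w.conj()
    return W

# Toy usage with random data in place of real STFT frames and decoder outputs
rng = np.random.default_rng(0)
I, N = 2, 200
X = rng.standard_normal((I, N)) + 1j * rng.standard_normal((I, N))
V = rng.uniform(0.5, 2.0, size=(I, N))
W0 = np.eye(I, dtype=complex)
W1 = ip_update(W0, X, V)
print(neg_log_lik(W0, X, V), neg_log_lik(W1, X, V))
```

Each row update is the exact minimizer of this objective with the other rows held fixed, so the printed value does not increase from one sweep to the next. In a full method of this kind, such a sweep would alternate with a step that refines `V`.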
Taxonomy
Topics: Speech and Audio Processing · Blind Source Separation Techniques · Speech Recognition and Synthesis
Methods: Conditional Variational Autoencoder
