Semi-blind source separation with multichannel variational autoencoder

Hirokazu Kameoka; Li Li; Shota Inoue; Shoji Makino

arXiv:1808.00892 · stat.ML · August 28, 2018 · 36 cites

TL;DR

This paper introduces the multichannel variational autoencoder (MVAE) method, which trains a conditional VAE on source spectrograms with class labels and uses the trained decoder for semi-blind multichannel source separation. In the reported experiments it separated sources better than a baseline method.

Contribution

The paper presents a convergence-guaranteed semi-blind source separation algorithm in which a CVAE models source power spectrograms conditioned on class labels, and the latent variables and class labels are treated as unknown parameters to be estimated.

Findings

01

In the paper's experimental evaluation, MVAE achieved better separation performance than a baseline method.

02

A CVAE trained on labeled spectrograms serves as a generative model of source spectrograms conditioned on a specified class label.

03

The iterative estimation of the source power spectrograms and the separation matrices is guaranteed to converge.

Abstract

This paper proposes a multichannel source separation technique called the multichannel variational autoencoder (MVAE) method, which uses a conditional VAE (CVAE) to model and estimate the power spectrograms of the sources in a mixture. By training the CVAE using the spectrograms of training examples with source-class labels, we can use the trained decoder distribution as a universal generative model capable of generating spectrograms conditioned on a specified class label. By treating the latent space variables and the class label as the unknown parameters of this generative model, we can develop a convergence-guaranteed semi-blind source separation algorithm that consists of iteratively estimating the power spectrograms of the underlying sources as well as the separation matrices. In experimental evaluations, our MVAE produced better separation performance than a baseline method.
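The alternating structure described in the abstract (estimate source power spectrograms, then update the separation matrices) can be illustrated with a small NumPy sketch. The abstract does not state the update rules, so this sketch makes two loud assumptions: the objective is the negative log-likelihood of a local Gaussian source model, and the separation matrices are updated with the iterative-projection rule known from the IVA/ILRMA literature. The source variances `V`, which in the MVAE method would come from the trained CVAE decoder, are replaced here by random placeholder values, and the function names are hypothetical.

```python
import numpy as np

def neg_log_likelihood(X, W, V):
    """Local Gaussian objective (up to constants), an assumed form.

    X: (F, N, M) complex mixture STFT, W: (F, M, M) separation matrices,
    V: (F, N, M) positive source variances (power spectrograms).
    """
    N = X.shape[1]
    Y = np.einsum('fjm,fnm->fnj', W, X)          # separated signals y = W x
    _, logabsdet = np.linalg.slogdet(W)          # log|det W(f)| per frequency
    return np.sum(np.log(V) + np.abs(Y) ** 2 / V) - 2 * N * np.sum(logabsdet)

def ip_update(X, W, V):
    """One sweep of iterative projection over all sources (assumed update)."""
    F, N, M = X.shape
    W = W.copy()
    for j in range(M):
        # Variance-weighted spatial covariance of the mixture, shape (F, M, M)
        U = np.einsum('fn,fnm,fnk->fmk', 1.0 / V[:, :, j], X, X.conj()) / N
        # Solve (W U) w = e_j for every frequency bin at once
        e = np.zeros((F, M, 1), dtype=complex)
        e[:, j, 0] = 1.0
        w = np.linalg.solve(W @ U, e)[:, :, 0]
        # Normalize so that w^H U w = 1
        scale = np.sqrt(np.real(np.einsum('fm,fmk,fk->f', w.conj(), U, w)))
        w = w / scale[:, None]
        W[:, j, :] = w.conj()                    # row j of W holds w^H
    return W

# Toy data: placeholder variances stand in for the CVAE decoder output.
rng = np.random.default_rng(0)
F, N, M = 4, 50, 2
X = rng.standard_normal((F, N, M)) + 1j * rng.standard_normal((F, N, M))
V = rng.uniform(0.5, 2.0, size=(F, N, M))
W0 = np.tile(np.eye(M, dtype=complex), (F, 1, 1))

before = neg_log_likelihood(X, W0, V)
W_new = ip_update(X, W0, V)
after = neg_log_likelihood(X, W_new, V)
```

Under these assumptions each row update minimizes the objective with respect to that row while the others are held fixed, so `after` should not exceed `before`. In a full method the variances would then be re-estimated before the next sweep.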

Code & Models

Repositories

mori97/MVAE
pytorch

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Speech Recognition and Synthesis

Methods: Conditional Variational Autoencoder