# Fast MVAE: Joint separation and classification of mixed sources based on multichannel variational autoencoder with auxiliary classifier

**Authors:** Li Li, Hirokazu Kameoka, Shoji Makino

arXiv: 1812.06391 · 2019-02-14

## TL;DR

Fast MVAE (fMVAE) is an alternative algorithm for MVAE-based multichannel source separation. Using an auxiliary classifier VAE, it cuts computational time by about 93% and reaches about 80% source classification accuracy, while keeping separation performance comparable to MVAE.

## Contribution

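The paper introduces fMVAE, an alternative algorithm for MVAE. It learns the generative model of the source spectrograms with an auxiliary classifier VAE, then uses the trained classifier in a new optimization algorithm that reduces computational time and improves source classification accuracy.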

## Key findings

- Achieved about 80% source classification accuracy.
- Reduced computational time by about 93%.
- Maintained source separation performance comparable to MVAE.
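
To put the reported time reduction in more familiar terms, the short sketch below converts a fractional reduction in runtime into a speedup factor. The function name `speedup_from_reduction` is made up for this illustration, and the calculation assumes the roughly 93% figure applies to total runtime; the summary does not report absolute timings.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional time reduction (e.g. 0.93) into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    # If the new runtime is (1 - reduction) of the old one,
    # the speedup is old / new = 1 / (1 - reduction).
    return 1.0 / (1.0 - reduction)


speedup = speedup_from_reduction(0.93)
print(f"{speedup:.1f}x")  # prints "14.3x"
```

Under that assumption, a reduction of about 93% corresponds to a run that is roughly 14 times faster.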

## Abstract

This paper proposes an alternative algorithm for multichannel variational autoencoder (MVAE), a recently proposed multichannel source separation approach. While MVAE is notable for its impressive source separation performance, its convergence-guaranteed optimization algorithm, and its ability to estimate source-class labels simultaneously with source separation, it still has two major drawbacks: high computational complexity and unsatisfactory source classification accuracy. To overcome these drawbacks, the proposed method employs an auxiliary classifier VAE, an information-theoretic extension of the conditional VAE, for learning the generative model of the source spectrograms. Furthermore, with the trained auxiliary classifier, we introduce a novel optimization algorithm that not only reduces the computational time but also improves the source classification performance. We call the proposed method "fast MVAE (fMVAE)". Experimental evaluations revealed that fMVAE achieved source separation performance comparable to MVAE and a source classification accuracy of about 80%, while reducing computational time by about 93%.
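
The abstract says the trained auxiliary classifier is reused during optimization but gives no details. One plausible reading is that a class label for each separated source can be obtained with a single forward pass of the classifier rather than by an iterative search. The sketch below illustrates only that general idea with a toy linear-softmax classifier. Everything in it (the `classify_source` function, the random weights, the dimensions `F`, `T`, `C`, and the time-averaged log-spectrum feature) is a hypothetical stand-in, not the authors' network or algorithm.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_source(power_spec, weights, bias):
    """Toy stand-in for a trained auxiliary classifier (hypothetical).

    power_spec: F rows (frequency bins), each a list of T frame powers.
    Returns class probabilities from one forward pass.
    """
    # Feature: time-averaged log power per frequency bin (an assumption).
    feat = [sum(math.log(p + 1e-8) for p in row) / len(row) for row in power_spec]
    scores = [
        sum(w * f for w, f in zip(w_row, feat)) + b
        for w_row, b in zip(weights, bias)
    ]
    return softmax(scores)


random.seed(0)
F, T, C = 8, 20, 4  # frequency bins, frames, source classes (illustrative)
weights = [[random.gauss(0, 1) for _ in range(F)] for _ in range(C)]
bias = [0.0] * C
spec = [[random.random() + 0.1 for _ in range(T)] for _ in range(F)]

probs = classify_source(spec, weights, bias)
label = max(range(C), key=lambda c: probs[c])  # most probable class index
```

In a separation loop, a call like this would replace a per-source label search with one classifier evaluation per iteration, which is the kind of saving the abstract attributes to the trained classifier. The full paper is the authority on how fMVAE actually performs the update.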

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.06391/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1812.06391/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1812.06391/full.md

---
Source: https://tomesphere.com/paper/1812.06391