High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder

Kazi Nazmul Haque; Rajib Rana; Björn W Schuller

arXiv:2006.00877·eess.AS·October 20, 2020

TL;DR

The paper introduces Guided Adversarial Autoencoder (GAAE), a novel model that learns both task-specific and general audio representations from unlabeled data, while generating high-fidelity audio indistinguishable from real samples.

Contribution

The GAAE model combines unsupervised and semi-supervised learning to produce high-quality audio and versatile representations suitable for multiple downstream tasks.

Findings

01

GAAE achieves high-fidelity audio generation comparable to real samples.

02

It learns effective representations with minimal labeled data.

03

The model demonstrates improved generalization across related tasks.

Abstract

Unsupervised disentangled representation learning from unlabelled audio data and high-fidelity audio generation have become two linchpins of machine learning research. However, a representation learned in an unsupervised setting is not guaranteed to be usable for a given downstream task, which wastes resources if the training was conducted for that particular task. Conversely, if the model is highly biased towards the downstream task during representation learning, it loses its generalisation capability: this directly benefits the downstream task, but the ability to scale to other related tasks is lost. To fill this gap, we propose a new autoencoder-based model named "Guided Adversarial Autoencoder (GAAE)", which can learn both post-task-specific representations and the general representation capturing the factors of…
