Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization

Jen-Yu Liu; Yu-Hua Chen; Yin-Cheng Yeh; Yi-Hsuan Yang

arXiv:2005.08526·eess.AS·May 13, 2021

TL;DR

This paper presents an improved GAN-based model for unconditional audio generation. It adds a hierarchical generator architecture and cycle regularization, and it improves the quality of generated singing, speech, and musical instrument sounds.

Contribution

It introduces a hierarchical generator architecture and cycle regularization to enhance audio quality and prevent mode collapse in unconditional GAN audio generation.
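The summary here does not define cycle regularization, so the following is only a minimal sketch under a stated assumption: that the regularizer penalizes the error when an encoder tries to recover the input noise from the generator's output. The toy linear `generator` and `encoder`, the weight matrices, and the mean-squared-error form of `cycle_loss` are all hypothetical illustrations, not the authors' implementation. The point is to show why such a penalty discourages mode collapse: a generator that maps every noise vector to the same output makes the noise unrecoverable.

```python
import random

def generator(z_seq, weights):
    """Toy linear 'generator' (hypothetical): one output frame per noise vector."""
    return [[sum(w * z for w, z in zip(row, z_vec)) for row in weights]
            for z_vec in z_seq]

def encoder(x_seq, weights):
    """Toy linear 'encoder' (hypothetical): maps each frame back to noise space."""
    return [[sum(w * x for w, x in zip(row, x_vec)) for row in weights]
            for x_vec in x_seq]

def cycle_loss(z_seq, z_rec):
    """Mean squared error between the input noise and its reconstruction."""
    diffs = [(a - b) ** 2
             for z_vec, r_vec in zip(z_seq, z_rec)
             for a, b in zip(z_vec, r_vec)]
    return sum(diffs) / len(diffs)

random.seed(0)
noise_dim, steps = 2, 5
z = [[random.gauss(0.0, 1.0) for _ in range(noise_dim)] for _ in range(steps)]

# Invertible generator: distinct noise gives distinct frames, so an encoder
# that inverts it recovers z and the penalty is (numerically) zero.
g_good = [[2.0, 0.0], [0.0, 0.5]]
e_good = [[0.5, 0.0], [0.0, 2.0]]
loss_good = cycle_loss(z, encoder(generator(z, g_good), e_good))

# Collapsed generator: every noise vector maps to the same all-zero frame,
# so the encoder cannot recover z and the penalty stays large.
g_collapsed = [[0.0, 0.0], [0.0, 0.0]]
loss_collapsed = cycle_loss(z, encoder(generator(z, g_collapsed), e_good))

print(loss_good, loss_collapsed)
```

In a real GAN this term would be added to the generator's adversarial loss with some weight, and both networks would be learned; those details are not given in this summary.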

Findings

1. Outperforms previous models in quality metrics
2. Effective in generating singing, speech, and musical instrument sounds
3. Cycle regularization reduces mode collapse

Abstract

In a recent paper, we presented a generative adversarial network (GAN)-based model for unconditional generation of the mel-spectrograms of singing voices. Because the generator takes a variable-length sequence of noise vectors as input, it can generate mel-spectrograms of variable length. However, our previous listening test shows that the quality of the generated audio leaves room for improvement. The present paper extends that work in the following aspects. First, we employ a hierarchical architecture in the generator to induce some structure in the temporal dimension. Second, we introduce a cycle regularization mechanism to the generator to avoid mode collapse. Third, we evaluate the performance of the new model not only for generating singing voices, but also for generating speech. Evaluation result shows that new model…
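To illustrate the variable-length property described in the abstract, here is a rough sketch in which the number of output frames is determined by the number of input noise vectors. Everything concrete in it is an assumption made for illustration: the function names, the random linear `toy_layer` standing in for learned layers, the two coarse-to-fine stages with upsampling factor 2, and the choice of 80 mel bins. It is not the architecture from the paper, only one way a hierarchical, time-upsampling generator could be organized.

```python
import random

def upsample_time(seq, factor):
    """Repeat each time step `factor` times (nearest-neighbor upsampling)."""
    return [list(vec) for vec in seq for _ in range(factor)]

def toy_layer(seq, out_dim, rng):
    """Hypothetical stand-in for a learned layer: one random linear map per call."""
    in_dim = len(seq[0])
    w = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    return [[sum(wi * xi for wi, xi in zip(row, vec)) for row in w]
            for vec in seq]

def generate(num_noise_steps, noise_dim=4, n_mels=80, factors=(2, 2), seed=0):
    """Map a noise sequence to a mel-spectrogram-shaped list of frames."""
    rng = random.Random(seed)
    seq = [[rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
           for _ in range(num_noise_steps)]
    # Coarse-to-fine: each stage transforms features, then raises time resolution.
    for factor in factors:
        seq = toy_layer(seq, out_dim=16, rng=rng)
        seq = upsample_time(seq, factor)
    # Final projection to the assumed number of mel bins.
    return toy_layer(seq, out_dim=n_mels, rng=rng)

shorter = generate(num_noise_steps=3)
longer = generate(num_noise_steps=10)
print(len(shorter), len(shorter[0]))  # number of frames, number of mel bins
print(len(longer), len(longer[0]))
```

With these assumed factors, each noise vector yields four output frames, so a longer noise sequence gives a proportionally longer spectrogram without changing the model.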

Code & Models

Repositories

ciaua/unagan (PyTorch, official)
