CycleFlow: Purify Information Factors by Cycle Loss

Haoran Sun; Chen Chen; Lantian Li; Dong Wang

arXiv:2110.09928 · eess.AS · October 22, 2021 · 1 citation

TL;DR

CycleFlow enhances speech factor disentanglement by combining cycle loss and random substitution, leading to improved voice conversion and speech editing capabilities over previous IB-based models.

Contribution

The paper introduces CycleFlow, a model that reduces mutual information among speech factors and improves on SpeechFlow for speech disentanglement and editing.

Findings

01 Better voice conversion performance than SpeechFlow

02 Effective reduction of mutual information among factors

03 Demonstrated utility in speech editing and emotion perception

Abstract

SpeechFlow is a powerful factorization model based on information bottleneck (IB), and its effectiveness has been reported by several studies. A potential problem of SpeechFlow, however, is that if the IB channels are not well designed, the resultant factors cannot be well disentangled. In this study, we propose a CycleFlow model that combines random factor substitution and cycle loss to solve this problem. Experiments on voice conversion tasks demonstrate that this simple technique can effectively reduce mutual information among individual factors, and produce clearly better conversion than the IB-based SpeechFlow. CycleFlow can also be used as a powerful tool for speech editing. We demonstrate this usage by an emotion perception experiment.
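
The idea of combining random factor substitution with a cycle loss can be illustrated with a toy sketch. This is not the authors' implementation: the two factor names, the slicing "encoders", and the squared-error form of the loss are all assumptions made for illustration, and the abstract does not specify the actual loss. The sketch shows why reconstruction alone cannot detect a badly designed bottleneck, while a cycle check after substitution can: both encoder/decoder pairs below reconstruct their input perfectly, but only the disentangled pair keeps its codes consistent once a factor is swapped.

```python
import random

FACTORS = ("content", "timbre")  # hypothetical factor names, for illustration only


def clean_encode(x):
    """Disentangled toy encoder: each code holds only its own half of x."""
    return {"content": x[:2], "timbre": x[2:]}


def clean_decode(f):
    """Decoder matching clean_encode."""
    return f["content"] + f["timbre"]


def leaky_encode(x):
    """Entangled toy encoder: the timbre channel is too wide and also copies content."""
    return {"content": x[:2], "timbre": x[2:] + x[:2]}


def leaky_decode(f):
    """Decoder that reads content from the leaked copy and ignores the content code."""
    return f["timbre"][2:] + f["timbre"][:2]


def substitute(f_a, f_b, swapped=None, rng=random):
    """Return a copy of f_a with one factor (random unless given) taken from f_b."""
    name = swapped if swapped is not None else rng.choice(FACTORS)
    mixed = dict(f_a)
    mixed[name] = f_b[name]
    return mixed


def cycle_loss(x_a, x_b, encode, decode, swapped=None):
    """Squared error between the substituted codes and the codes obtained by
    decoding them and encoding the result again (assumed loss form)."""
    mixed = substitute(encode(x_a), encode(x_b), swapped)
    recoded = encode(decode(mixed))
    return sum((u - v) ** 2
               for name in FACTORS
               for u, v in zip(mixed[name], recoded[name]))


# Toy "utterances": the first two values stand for content, the last two for timbre.
x_a = [1, 2, 3, 4]
x_b = [5, 6, 7, 8]

print(cycle_loss(x_a, x_b, clean_encode, clean_decode, "content"))  # 0
print(cycle_loss(x_a, x_b, leaky_encode, leaky_decode, "content"))  # 32
```

In the leaky pair the decoder ignores the substituted content code, so the re-encoded content no longer matches what was substituted in and the loss becomes nonzero. In a trainable model, minimizing such a term would presumably push each factor's information into its own channel, which is the effect the abstract attributes to the cycle loss.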

Taxonomy

Topics: Topic Modeling · Speech and Dialogue Systems · Speech Recognition and Synthesis