Cycle-Consistent Speech Enhancement
Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang (Fred) Juang

TL;DR
This paper introduces a cycle-consistent speech enhancement (CSE) method that uses an inverse mapping network and adversarial training to preserve speech structure while reducing noise, including when no paired noisy and clean data is available.
Contribution
The paper proposes a novel cycle-consistent framework for speech enhancement that maintains speech structure and enhances generalization to unseen data.
Findings
Achieves a 19.60% relative word error rate (WER) reduction when trained with paired noisy and clean data.
Achieves a 6.69% relative WER reduction when trained without paired data.
Preserves speech structure in the enhanced features while reducing noise.
Abstract
Feature mapping using deep neural networks is an effective approach for single-channel speech enhancement. Noisy features are transformed to the enhanced ones through a mapping network and the mean square errors between the enhanced and clean features are minimized. In this paper, we propose a cycle-consistent speech enhancement (CSE) in which an additional inverse mapping network is introduced to reconstruct the noisy features from the enhanced ones. A cycle-consistent constraint is enforced to minimize the reconstruction loss. Similarly, a backward cycle of mappings is performed in the opposite direction with the same networks and losses. With cycle-consistency, the speech structure is well preserved in the enhanced features while noise is effectively reduced such that the feature-mapping network generalizes better to unseen data. In cases where only unparalleled noisy and clean data…
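The paired-data objective described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the linear maps stand in for the deep networks, and the names `linear_map`, `cse_paired_loss`, and the `cycle_weight` parameter are hypothetical. The adversarial losses used for unpaired data are not shown.

```python
import numpy as np


def mse(a, b):
    """Mean square error between two feature arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def linear_map(weights, bias):
    """Hypothetical stand-in for a mapping network: an affine transform."""
    def apply(features):
        return features @ weights + bias
    return apply


def cse_paired_loss(enhance, corrupt, noisy, clean, cycle_weight=1.0):
    """Forward and backward cycles with paired noisy/clean features.

    enhance: maps noisy features to enhanced features.
    corrupt: inverse map, reconstructs noisy features from enhanced ones.
    cycle_weight: assumed weighting between mapping and cycle terms.
    """
    # Forward cycle: noisy -> enhanced -> reconstructed noisy.
    enhanced = enhance(noisy)
    noisy_reconstructed = corrupt(enhanced)

    # Backward cycle: clean -> noised -> reconstructed clean.
    noised = corrupt(clean)
    clean_reconstructed = enhance(noised)

    # Feature-mapping losses against the paired targets.
    mapping_loss = mse(enhanced, clean) + mse(noised, noisy)
    # Cycle-consistency (reconstruction) losses.
    cycle_loss = mse(noisy_reconstructed, noisy) + mse(clean_reconstructed, clean)
    return mapping_loss + cycle_weight * cycle_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 4
    clean = rng.normal(size=(8, dim))
    noisy = clean + 0.1 * rng.normal(size=(8, dim))
    # Untrained identity maps, used only to exercise the loss.
    enhance = linear_map(np.eye(dim), np.zeros(dim))
    corrupt = linear_map(np.eye(dim), np.zeros(dim))
    print(cse_paired_loss(enhance, corrupt, noisy, clean))
```

In a real system both maps would be trained jointly to minimize this loss; the cycle terms penalize an enhancement map that discards information needed to recover the original features.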
