CycleGAN with Dual Adversarial Loss for Bone-Conducted Speech Enhancement

Qing Pan; Teng Gao; Jian Zhou; Huabin Wang; Liang Tao; Hon Keung Kwan

arXiv:2111.01430 · cs.SD · November 3, 2021 · 1 citation

TL;DR

This paper introduces a novel CycleGAN-based method with dual adversarial loss for enhancing bone-conducted speech, improving quality and intelligibility without needing air-conducted speech data.

Contribution

It proposes a dual adversarial loss mechanism within CycleGAN for bone-conducted speech enhancement, eliminating the need for air-conducted speech and reducing oversmoothness.

Findings

01 Outperforms baseline methods such as CycleGAN, GMM, and BLSTM

02 Effectively enhances speech quality and intelligibility

03 Avoids oversmoothness common in statistical models

Abstract

Compared with air-conducted speech, bone-conducted speech has the unique advantage of shielding background noise. Enhancing bone-conducted speech improves its quality and intelligibility. In this paper, a novel CycleGAN with dual adversarial loss (CycleGAN-DAL) is proposed for bone-conducted speech enhancement. The proposed method uses an adversarial loss and a cycle-consistency loss simultaneously to learn the forward and cyclic mappings; the adversarial loss is replaced with a classification adversarial loss and a defect adversarial loss to consolidate the forward mapping. Compared with conventional baseline methods, it can learn the feature mapping between bone-conducted speech and target speech without additional air-conducted speech assistance. Moreover, the proposed method also avoids the oversmoothing problem that commonly occurs in conventional statistical…
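As a rough illustration of how an objective of this shape can be assembled, the sketch below combines two adversarial terms on the forward mapping with a cycle-consistency term. It is not the paper's implementation: the abstract does not define the classification or defect adversarial losses, so both are stood in for here by a generic least-squares generator loss, and the function names, the L1 cycle term, and the weight `lambda_cyc = 10.0` are assumptions chosen for illustration. The adversarial term for the backward mapping is omitted for brevity.

```python
import numpy as np


def lsgan_generator_loss(d_fake):
    """Least-squares generator loss: pushes discriminator scores toward 1.

    Assumption: a stand-in for each adversarial term; the abstract does not
    give the actual form of the classification or defect losses.
    """
    return float(np.mean((np.asarray(d_fake) - 1.0) ** 2))


def cycle_consistency_loss(x, x_cycled):
    """Mean absolute error between an input and its round-trip reconstruction."""
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(x_cycled))))


def total_generator_loss(d1_fake, d2_fake, x, x_cycled, y, y_cycled,
                         lambda_cyc=10.0):
    """Two adversarial terms on the forward mapping plus a weighted cycle term.

    d1_fake, d2_fake: scores from two hypothetical discriminators on G(x).
    x, x_cycled:      source features and their reconstruction F(G(x)).
    y, y_cycled:      target features and their reconstruction G(F(y)).
    """
    adversarial = lsgan_generator_loss(d1_fake) + lsgan_generator_loss(d2_fake)
    cycle = cycle_consistency_loss(x, x_cycled) + cycle_consistency_loss(y, y_cycled)
    return adversarial + lambda_cyc * cycle


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 24))  # hypothetical batch of source feature frames
    y = rng.normal(size=(4, 24))  # hypothetical batch of target feature frames
    loss = total_generator_loss(
        d1_fake=rng.uniform(size=4),
        d2_fake=rng.uniform(size=4),
        x=x, x_cycled=x + 0.1 * rng.normal(size=x.shape),
        y=y, y_cycled=y + 0.1 * rng.normal(size=y.shape),
    )
    print(f"generator loss: {loss:.4f}")
```

In this sketch the cycle term is what lets the mappings be trained without paired examples, while the two adversarial terms each penalize the forward output from a different discriminator's point of view.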

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing