NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling

Chi-Chang Lee; Cheng-Hung Hu; Yu-Chen Lin; Chu-Song Chen; Hsin-Min Wang; Yu Tsao

arXiv:2206.09058 · eess.AS · June 22, 2022

Open Access

TL;DR

NASTAR introduces a one-shot noise adaptation method for speech enhancement that uses noise extraction and retrieval to improve performance in target environments with minimal data.

Contribution

This work is the first to combine noise extraction and retrieval for one-shot noise adaptation in speech enhancement models.

Findings

01. NASTAR effectively adapts to target noise with only one noisy sample.

02. Both the noise extractor and the retrieval model improve adaptation performance.

03. Experimental results demonstrate significant enhancement improvements.

Abstract

For deep learning-based speech enhancement (SE) systems, the training-test acoustic mismatch can cause notable performance degradation. To address the mismatch issue, numerous noise adaptation strategies have been derived. In this paper, we propose a novel method, called noise adaptive speech enhancement with target-conditional resampling (NASTAR), which reduces mismatches with only one sample (one-shot) of noisy speech in the target environment. NASTAR uses a feedback mechanism to simulate adaptive training data via a noise extractor and a retrieval model. The noise extractor estimates the target noise from the noisy speech, called pseudo-noise. The noise retrieval model retrieves relevant noise samples from a pool of noise signals according to the noisy speech, called relevant-cohort. The pseudo-noise and the relevant-cohort set are jointly sampled and mixed with the source speech…
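The data-simulation step described in the abstract (retrieve a relevant cohort from a noise pool, then mix the pseudo-noise and cohort noises with source speech) can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-similarity retrieval over precomputed embeddings, the uniform sampling over the pseudo-noise and cohort, the fixed SNR, and all function and parameter names (`retrieve_cohort`, `mix_at_snr`, `simulate_adaptive_data`, `snr_db`) are assumptions made for illustration only.

```python
import numpy as np


def retrieve_cohort(query_emb, pool_embs, k):
    """Return indices of the k pool embeddings closest to the query.

    Assumption: relevance is measured by cosine similarity between
    precomputed embeddings; the paper's retrieval model may differ.
    """
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    scores = p @ q
    return np.argsort(-scores)[:k]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db."""
    # Loop or crop the noise so it matches the speech length.
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against silent noise
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def simulate_adaptive_data(source_speech, pseudo_noise, cohort, rng, snr_db=5.0):
    """Build (noisy, clean) training pairs for adaptation.

    Assumption: each utterance gets one noise drawn uniformly from the
    union of the pseudo-noise and the relevant cohort.
    """
    candidates = [pseudo_noise] + list(cohort)
    pairs = []
    for clean in source_speech:
        noise = candidates[rng.integers(len(candidates))]
        pairs.append((mix_at_snr(clean, noise, snr_db), clean))
    return pairs
```

In this sketch, the pseudo-noise would come from a separate noise extractor applied to the single noisy target utterance, and the resulting pairs would be used to fine-tune the SE model; neither step is shown here.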

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Acoustic Wave Phenomena Research