Dynamic Acoustic Compensation and Adaptive Focal Training for Personalized Speech Enhancement

Xiaofeng Ge; Jiangyu Han; Haixin Guan; Yanhua Long

arXiv:2211.12097 · eess.AS · November 23, 2022 · 1 citation

TL;DR

This paper introduces dynamic acoustic compensation and adaptive focal training to improve personalized speech enhancement, addressing environment mismatch and hard sample learning, resulting in significant performance gains on the DNS4 dataset.

Contribution

It proposes novel methods for acoustic environment adaptation and hard sample learning, enhancing the generalization and effectiveness of personalized speech enhancement systems.

Findings

01. DAC significantly improves evaluation metrics
02. AFT reduces hard sample rate
03. MOS scores are notably increased

Abstract

Recently, a growing number of personalized speech enhancement (PSE) systems with excellent performance have been proposed. However, two critical issues still limit the performance and generalization ability of these models: 1) acoustic environment mismatch between the noisy test speech and the target speaker's enrollment speech; 2) hard sample mining and learning. In this paper, dynamic acoustic compensation (DAC) is proposed to alleviate the environment mismatch by intercepting noise or environmental acoustic segments from the noisy speech and mixing them with the clean enrollment speech. To better exploit the hard samples in the training data, we propose an adaptive focal training (AFT) strategy that assigns adaptive loss weights to hard and non-hard samples during training. Time-frequency multi-loss training is further introduced to improve and generalize our previous work, sDPCCN, for PSE. The…
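The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how noise segments are located or how the loss weights adapt, so the lowest-energy-frame heuristic, the fixed `hard_threshold`, the `hard_weight` value, and every function name below are assumptions made purely for illustration.

```python
def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def pick_noise_segment(noisy, frame_len):
    """Assumption: treat the lowest-energy frame as a noise-only segment."""
    energies = frame_energies(noisy, frame_len)
    if not energies:
        raise ValueError("noisy signal is shorter than one frame")
    k = min(range(len(energies)), key=energies.__getitem__)
    return noisy[k * frame_len:(k + 1) * frame_len]


def compensate_enrollment(enrollment, noise_segment, gain=1.0):
    """DAC-style mixing: tile the noise segment over the clean
    enrollment speech and add it sample by sample."""
    n = len(noise_segment)
    return [x + gain * noise_segment[i % n] for i, x in enumerate(enrollment)]


def focal_weights(losses, hard_threshold, hard_weight=2.0):
    """AFT-style weighting (hypothetical rule): samples whose loss exceeds
    hard_threshold count as hard and get a larger weight. Weights are
    normalized so that their mean is 1."""
    raw = [hard_weight if loss > hard_threshold else 1.0 for loss in losses]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def weighted_loss(losses, weights):
    """Batch loss as the weighted mean of per-sample losses."""
    return sum(l * w for l, w in zip(losses, weights)) / len(losses)


# Usage: the quiet first frame is taken as the noise estimate.
noisy = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
segment = pick_noise_segment(noisy, frame_len=4)
compensated = compensate_enrollment([0.0] * 6, segment)

per_sample = [1.0, 3.0]
weights = focal_weights(per_sample, hard_threshold=2.0)
batch_loss = weighted_loss(per_sample, weights)
```

In this sketch the compensated enrollment carries the same background as the test utterance, and the second sample, being above the threshold, contributes twice the weight of the first to the batch loss. A real system would operate on waveforms or spectrograms with a learned or statistics-driven notion of hardness.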

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing

Methods: Test · Dynamic Algorithm Configuration · Adaptive Robust Loss