Dynamic Acoustic Compensation and Adaptive Focal Training for Personalized Speech Enhancement
Xiaofeng Ge, Jiangyu Han, Haixin Guan, Yanhua Long

TL;DR
This paper introduces dynamic acoustic compensation and adaptive focal training to improve personalized speech enhancement, addressing environment mismatch and hard sample learning, resulting in significant performance gains on the DNS4 dataset.
Contribution
It proposes novel methods for acoustic environment adaptation and hard sample learning, enhancing the generalization and effectiveness of personalized speech enhancement systems.
Findings
DAC significantly improves evaluation metrics
AFT reduces hard sample rate
MOS scores are notably increased
Abstract
Recently, a growing number of personalized speech enhancement (PSE) systems with excellent performance have been proposed. However, two critical issues still limit model performance and generalization: 1) acoustic environment mismatch between the noisy test speech and the target speaker's enrollment speech; 2) hard sample mining and learning. In this paper, dynamic acoustic compensation (DAC) is proposed to alleviate the environment mismatch by extracting noise or environmental acoustic segments from the noisy speech and mixing them with the clean enrollment speech. To better exploit the hard samples in the training data, we propose an adaptive focal training (AFT) strategy that assigns adaptive loss weights to hard and non-hard samples during training. A time-frequency multi-loss training is further introduced to improve and generalize our previous work sDPCCN for PSE. The…
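The two ideas in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not say how noise segments are located or how the loss weights adapt. The sketch assumes that the lowest-energy window of the noisy signal is mostly noise, that the segment is mixed in at a fixed target SNR, and that a "hard" sample is one whose loss exceeds the batch mean. The function names, `seg_len`, `snr_db`, and `hard_weight` are all hypothetical, and the fixed two-level weighting stands in for the paper's adaptive rule.

```python
import numpy as np


def lowest_energy_segment(noisy, seg_len):
    """Return the seg_len-sample window of `noisy` with the lowest energy.

    Assumption: a low-energy window is dominated by noise or environment
    sound rather than speech. The paper may select segments differently.
    """
    # Sliding-window energy via convolution with a box filter.
    energy = np.convolve(noisy ** 2, np.ones(seg_len), mode="valid")
    start = int(np.argmin(energy))
    return noisy[start:start + seg_len]


def compensate_enrollment(enroll, noisy, seg_len, snr_db=10.0):
    """Mix a noise-like segment from `noisy` into the clean enrollment speech."""
    seg = lowest_energy_segment(noisy, seg_len)
    # Tile the segment so it covers the whole enrollment utterance.
    reps = int(np.ceil(len(enroll) / len(seg)))
    noise = np.tile(seg, reps)[:len(enroll)]
    # Scale the noise to reach the (hypothetical) target SNR.
    speech_power = np.mean(enroll ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return enroll + gain * noise


def focal_weighted_loss(losses, hard_weight=2.0):
    """Weighted mean of per-sample losses, up-weighting 'hard' samples.

    Assumption: a sample is hard when its loss exceeds the batch mean, and
    hard samples get a fixed weight (not the adaptive rule from the paper).
    """
    losses = np.asarray(losses, dtype=float)
    hard = losses > losses.mean()
    weights = np.where(hard, hard_weight, 1.0)
    return float(np.sum(weights * losses) / np.sum(weights)), hard
```

In a training loop, `compensate_enrollment` would be applied to each enrollment utterance using its paired noisy input before the speaker embedding is computed, and `focal_weighted_loss` would replace a plain batch mean of per-sample losses.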
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
Methods: Test · Dynamic Algorithm Configuration · Adaptive Robust Loss
