The NPU-Elevoc Personalized Speech Enhancement System for ICASSP2023 DNS Challenge

Xiaopeng Yan; Yindi Yang; Zhihao Guo; Liangliang Peng; Lei Xie

arXiv:2303.06811·eess.AS·March 16, 2023·ICASSP·1 cites

TL;DR

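This paper presents NPU-Elevoc's personalized speech enhancement system (NAPSE) for the ICASSP 2023 DNS Challenge. Building on TEA-PSE 2.0, it refines speaker embedding fusion, optimizes the training pipeline, and adds adversarial training and a multi-scale loss. It tied for 1st place in the headset track and ranked 2nd in the speakerphone track.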

Contribution

The paper builds on the two-stage TEA-PSE 2.0 model: it explores a better strategy for speaker embedding fusion, optimizes the model training pipeline, and adds adversarial training and a multi-scale loss.

Findings

01

Tied for 1st place in the headset track (track 1) of the ICASSP 2023 DNS Challenge.

02

Ranked 2nd in the speakerphone track (track 2) of the ICASSP 2023 DNS Challenge.

03

Leverages adversarial training and a multi-scale loss on top of TEA-PSE 2.0.

Abstract

This paper describes our NPU-Elevoc personalized speech enhancement system (NAPSE) for the 5th Deep Noise Suppression Challenge at ICASSP 2023. Based on the superior two-stage model TEA-PSE 2.0, our system explores a better strategy for speaker embedding fusion, optimizes the model training pipeline, and leverages adversarial training and a multi-scale loss. According to the results, our system tied for 1st place in the headset track (track 1) and ranked 2nd in the speakerphone track (track 2).
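
The abstract names a multi-scale loss without defining it on this page. As a purely illustrative sketch, one common form of such a loss compares magnitude spectrograms of the estimate and the target at several STFT frame sizes and averages the errors. Everything below is an assumption made for illustration: the function names `stft_magnitudes` and `multi_scale_spectral_loss`, the frame lengths, the hop size, and the toy signals are hypothetical and are not taken from the authors' system.

```python
import cmath
import math


def stft_magnitudes(signal, frame_len, hop):
    """Return per-frame magnitude spectra using a Hann window and a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Keep only the non-redundant bins 0..frame_len/2 of a real signal.
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def multi_scale_spectral_loss(estimate, target, frame_lens=(16, 32, 64)):
    """Average the mean absolute magnitude error over several STFT resolutions.

    The frame lengths are illustrative assumptions, not the paper's settings.
    """
    total = 0.0
    for frame_len in frame_lens:
        hop = frame_len // 4  # assumed 75% overlap
        est = stft_magnitudes(estimate, frame_len, hop)
        ref = stft_magnitudes(target, frame_len, hop)
        errors = [abs(e - r)
                  for est_frame, ref_frame in zip(est, ref)
                  for e, r in zip(est_frame, ref_frame)]
        total += sum(errors) / len(errors)
    return total / len(frame_lens)


# Toy signals: a clean sinusoid and a copy with a small added tone.
clean = [math.sin(2 * math.pi * 5 * n / 256) for n in range(256)]
noisy = [c + 0.1 * math.sin(2 * math.pi * 40 * n / 256)
         for n, c in enumerate(clean)]

loss_same = multi_scale_spectral_loss(clean, clean)
loss_noisy = multi_scale_spectral_loss(noisy, clean)
```

In this sketch the loss is zero when the estimate equals the target and grows as the spectra diverge. Using several frame sizes trades time resolution against frequency resolution, which is the usual motivation for combining scales. A training setup would normally compute the same quantity with a differentiable STFT in a deep learning framework rather than a naive DFT.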

Code & Models

Repositories

microsoft/DNS-Challenge
none · Official

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing