An Exploration of Task-decoupling on Two-stage Neural Post Filter for Real-time Personalized Acoustic Echo Cancellation

Zihan Zhang; Jiayao Sun; Xianjun Xia; Ziqian Wang; Xiaopeng Yan; Yijian Xiao; Lei Xie

arXiv:2310.04715 · eess.AS · October 10, 2023

TL;DR

This paper introduces a two-stage task-decoupling post-filter for personalized acoustic echo cancellation, utilizing multi-scale speaker representations to improve performance over joint models in real-time applications.

Contribution

It proposes a two-stage task-decoupling post-filter (TDPF) with a multi-scale local-global speaker representation, which outperforms a single joint network on PAEC.

Findings

01 A task-decoupled two-stage model outperforms a single joint network.

02 The best split decouples echo cancellation from noise and interfering-speech suppression.

03 Optimal training strategies for the two-stage model are identified for that decoupling sequence.

Abstract

Deep learning based techniques have been popularly adopted in acoustic echo cancellation (AEC). Utilization of speaker representation has extended the frontier of AEC, thus attracting many researchers' interest in personalized acoustic echo cancellation (PAEC). Meanwhile, task-decoupling strategies are widely adopted in speech enhancement. To further explore the task-decoupling approach, we propose to use a two-stage task-decoupling post-filter (TDPF) in PAEC. Furthermore, a multi-scale local-global speaker representation is applied to improve speaker extraction in PAEC. Experimental results indicate that the task-decoupling model can yield better performance than a single joint network. The optimal approach is to decouple the echo cancellation from noise and interference speech suppression. Based on the task-decoupling sequence, optimal training strategies for the two-stage model are…
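
The data flow implied by the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' model: it assumes echo cancellation runs as the first stage and that noise and interfering-speech suppression, conditioned on a speaker representation, runs as the second. The names `stage1_echo_mask`, `stage2_speaker_mask`, `two_stage_post_filter`, and the list-of-magnitudes frame format are hypothetical, and both mask functions are simple hand-written stand-ins for what would be neural networks in the paper.

```python
from typing import List

# Hypothetical representation: one frame of spectral magnitudes per list.
Frame = List[float]


def apply_mask(frame: Frame, mask: Frame) -> Frame:
    """Multiply each bin by its gain, clamping the gain to [0, 1]."""
    return [x * min(max(m, 0.0), 1.0) for x, m in zip(frame, mask)]


def stage1_echo_mask(mic: Frame, far_end: Frame) -> Frame:
    """Stand-in for the first-stage network (assumption, not the paper's model).

    Attenuates bins in proportion to how much of the microphone
    magnitude is explained by the far-end reference.
    """
    eps = 1e-8
    return [max(m - f, 0.0) / (m + eps) for m, f in zip(mic, far_end)]


def stage2_speaker_mask(frame: Frame, speaker_profile: Frame) -> Frame:
    """Stand-in for the second-stage network (assumption, not the paper's model).

    Keeps bins where a hypothetical enrolled-speaker profile is strong,
    which plays the role of the speaker representation here.
    """
    peak = max(speaker_profile) or 1.0
    return [p / peak for p in speaker_profile]


def two_stage_post_filter(mic: Frame, far_end: Frame, speaker_profile: Frame) -> Frame:
    """Run the two decoupled stages in sequence on a single frame."""
    # Stage 1: remove residual echo only.
    echo_free = apply_mask(mic, stage1_echo_mask(mic, far_end))
    # Stage 2: suppress noise and interfering speech using the speaker cue.
    return apply_mask(echo_free, stage2_speaker_mask(echo_free, speaker_profile))


mic = [1.0, 2.0, 4.0]
far_end = [1.0, 0.0, 2.0]
speaker_profile = [1.0, 1.0, 0.0]
out = two_stage_post_filter(mic, far_end, speaker_profile)
```

The point of the sketch is the structure: each stage receives a narrower task than a joint network would, and the second stage only ever sees the output of the first. A joint model would instead map `mic`, `far_end`, and `speaker_profile` to the final output in a single step.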

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Speech Recognition and Synthesis