PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement

Jiahui Pan; Shulin He; Tianci Wu; Hui Zhang; Xueliang Zhang

arXiv:2309.10379 · cs.SD · September 20, 2023

TL;DR

This paper introduces PDPCRN, a novel multi-channel speech enhancement model with parallel branches and bi-directional interactions, improving performance and efficiency over existing dual-path networks.

Contribution

The paper proposes a parallel dual-path architecture with bi-directional inter-branch interactions for better multi-channel speech enhancement.

Findings

01 Outperforms baseline models on the PESQ and STOI metrics.

02 Achieves higher enhancement quality with fewer parameters.

03 Demonstrates effectiveness on the TIMIT dataset.

Abstract

Multi-channel speech enhancement seeks to utilize spatial information to distinguish target speech from interfering signals. While deep learning approaches like the dual-path convolutional recurrent network (DPCRN) have made strides, challenges persist in effectively modeling inter-channel correlations and amalgamating multi-level information. In response, we introduce the Parallel Dual-Path Convolutional Recurrent Network (PDPCRN). This acoustic modeling architecture has two key innovations. First, a parallel design with separate branches extracts complementary features. Second, bi-directional modules enable cross-branch communication. Together, these facilitate diverse representation fusion and enhanced modeling. Experimental validation on TIMIT datasets underscores the prowess of PDPCRN. Notably, against baseline models like the standard DPCRN, PDPCRN not only outperforms in PESQ and…
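The two ideas named in the abstract (parallel branches, and bi-directional exchange between them) can be illustrated with a toy sketch. This is not the authors' implementation: the branch transforms, the sigmoid gating, the summation used for fusion, and every function name below are assumptions chosen only to show the data flow, and plain Python lists stand in for the convolutional and recurrent layers of a real model.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a hypothetical interaction gate."""
    return 1.0 / (1.0 + math.exp(-x))


def branch_a(features):
    """Hypothetical branch A: smooth each value with its neighbors."""
    n = len(features)
    return [
        (features[max(i - 1, 0)] + features[i] + features[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def branch_b(features):
    """Hypothetical branch B: deviation of each value from the global mean."""
    mean = sum(features) / len(features)
    return [x - mean for x in features]


def interact(a, b):
    """Bi-directional exchange: each branch adds a gated copy of the other.

    The gating form is an assumption; the abstract only states that
    information flows in both directions between the branches.
    """
    new_a = [x + sigmoid(y) * y for x, y in zip(a, b)]
    new_b = [y + sigmoid(x) * x for x, y in zip(a, b)]
    return new_a, new_b


def parallel_dual_branch(features, num_blocks=2):
    """Run both branches side by side, exchanging features after each block."""
    a, b = list(features), list(features)
    for _ in range(num_blocks):
        a, b = branch_a(a), branch_b(b)  # branches process independently
        a, b = interact(a, b)            # then share information both ways
    return [x + y for x, y in zip(a, b)]  # fuse by summation (assumed)


if __name__ == "__main__":
    print(parallel_dual_branch([0.2, -0.1, 0.4, 0.0]))
```

The point of the sketch is the ordering: each block first lets the branches compute separately, then applies a symmetric exchange so that neither branch is purely a consumer of the other.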

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research