TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor's Approximation Theory

Andong Li; Guochen Yu; Chengshi Zheng; Xiaodong Li

arXiv:2203.07195·cs.SD·March 17, 2022

Open Access

TL;DR

TaylorBeamformer introduces a novel neural beamformer inspired by Taylor's approximation theory, decomposing the speech enhancement process into interpretable components and learning them end-to-end for improved performance.

Contribution

The paper proposes a neural beamformer based on Taylor's approximation theory, aiming to improve both interpretability and performance in multi-channel speech enhancement.

Findings

01

Outperforms previous advanced baselines on a LibriSpeech-based dataset.

02

Effectively decomposes the enhancement process into interpretable components.

03

Enables end-to-end training by replacing the derivative operations with trainable networks.
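
For orientation, the decomposition named in these findings mirrors the shape of a truncated Taylor series. The block below is the generic textbook expansion only, not the paper's own equation: the symbols f, x, x_0, Q, and R_Q are illustrative assumptions, with the first term playing the role of the 0th-order non-derivative term and the sum playing the role of the high-order derivative terms.

```latex
f(x) = \underbrace{f(x_0)}_{\text{0th-order term}}
     + \underbrace{\sum_{q=1}^{Q} \frac{f^{(q)}(x_0)}{q!}\,(x - x_0)^{q}}_{\text{high-order derivative terms}}
     + R_Q(x)
```

Here R_Q(x) is the remainder left after truncating at order Q.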

Abstract

While existing end-to-end beamformers achieve impressive performance in various front-end speech processing tasks, they usually encapsulate the whole process into a black box and thus lack adequate interpretability. As an attempt to fill the blank, we propose a novel neural beamformer inspired by Taylor's approximation theory called TaylorBeamformer for multi-channel speech enhancement. The core idea is that the recovery process can be formulated as the spatial filtering in the neighborhood of the input mixture. Based on that, we decompose it into the superimposition of the 0th-order non-derivative and high-order derivative terms, where the former serves as the spatial filter and the latter is viewed as the residual noise canceller to further improve the speech quality. To enable end-to-end training, we replace the derivative operations with trainable networks and thus can learn from…
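
The superimposition described in the abstract can be sketched numerically. The following is a minimal toy illustration, not the authors' implementation: the function names `spatial_filter` and `taylor_superimpose`, the array shapes, the channel-averaging weights, and the 1/q! scaling (borrowed from the generic Taylor series) are all assumptions made for the example. In the proposed model, trainable networks would produce the filter weights and the high-order terms; here fixed placeholder arrays stand in for them.

```python
import math
import numpy as np


def spatial_filter(mixture, weights):
    """0th-order term: filter-and-sum across the channel axis.

    mixture, weights: complex arrays of shape (channels, frames, freqs).
    Returns a single-channel array of shape (frames, freqs).
    """
    return np.sum(np.conj(weights) * mixture, axis=0)


def taylor_superimpose(zeroth, high_order_terms):
    """Add the high-order terms to the 0th-order output.

    Each term of order q is scaled by 1/q! (an assumption taken from the
    generic Taylor series, not confirmed by the abstract).
    """
    out = zeroth.copy()
    for q, term in enumerate(high_order_terms, start=1):
        out = out + term / math.factorial(q)
    return out


# Toy multi-channel complex spectrum (hypothetical sizes).
rng = np.random.default_rng(0)
channels, frames, freqs = 4, 10, 129
shape = (channels, frames, freqs)
mixture = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Placeholder weights: plain channel averaging. A learned beamformer
# would predict these per time-frequency bin instead.
weights = np.full(shape, 1.0 / channels, dtype=complex)
zeroth = spatial_filter(mixture, weights)

# Placeholder high-order terms standing in for network outputs.
high_order = [0.1 * zeroth, -0.05 * zeroth]
enhanced = taylor_superimpose(zeroth, high_order)
```

The sketch keeps the two roles separate on purpose: the 0th-order path does all the spatial (cross-channel) processing, while the high-order path only adds single-channel corrections, which matches the abstract's description of a spatial filter followed by a residual noise canceller.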

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Indoor and Outdoor Localization Technologies