Lightweight Speech Enhancement in Unseen Noisy and Reverberant Conditions using KISS-GEV Beamforming

Thomas Bernard; François Grondin

arXiv:2110.03103·eess.AS·October 12, 2021

TL;DR

This paper presents KISS-GEV, a simple and computationally efficient beamforming method for speech enhancement that relies on direction of arrival, working effectively in unseen noisy and reverberant conditions without neural network training.

Contribution

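The paper introduces KISS-GEV, a signal processing approach to GEV beamforming for speech enhancement. It replaces neural network mask estimation with processing based on the target's direction of arrival, which reduces computation at test time and removes the need to train a network on noisy speech, so the method applies to unseen conditions.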

Findings

01. Outperforms traditional Delay-and-Sum beamforming.

02. Works effectively in unseen noisy and reverberant environments.

03. Requires only the same minimal DoA assumption as Delay-and-Sum beamforming.

Abstract

This paper introduces a new method referred to as KISS-GEV (for Keep It Super Simple Generalized Eigenvalue) beamforming. While GEV beamforming usually relies on a deep neural network to estimate target and noise time-frequency masks, this method uses a signal processing approach based on the direction of arrival (DoA) of the target. This considerably reduces the amount of computation involved at test time, and works for speech enhancement in unseen conditions as there is no need to train a neural network with noisy speech. The proposed method can also be used to separate speech from a mixture, provided the speech sources come from different directions. Results also show that the proposed method uses the same minimal DoA assumption as Delay-and-Sum beamforming, yet outperforms this traditional approach.
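
To make the comparison between GEV and Delay-and-Sum beamforming concrete, here is a minimal sketch of the generic GEV step for a single frequency bin. It is not the paper's implementation: it assumes the target and noise spatial covariance matrices are already known, whereas KISS-GEV derives them from the target DoA in a way the abstract does not detail. The array geometry, frequency, angles, and the names `steering_vector`, `gev_weights`, and `output_snr` are all hypothetical, chosen only for illustration.

```python
import numpy as np

def steering_vector(n_mics, spacing_m, freq_hz, angle_rad, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delays = np.arange(n_mics) * spacing_m * np.cos(angle_rad) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def gev_weights(phi_xx, phi_nn):
    """Principal generalized eigenvector of phi_xx w = lambda * phi_nn w."""
    # Whiten with the Cholesky factor of the noise covariance: phi_nn = L L^H.
    chol = np.linalg.cholesky(phi_nn)
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ phi_xx @ chol_inv.conj().T
    whitened = 0.5 * (whitened + whitened.conj().T)  # enforce Hermitian symmetry
    _, eigvecs = np.linalg.eigh(whitened)            # eigenvalues in ascending order
    # Map the top eigenvector back to the unwhitened domain: w = L^{-H} v.
    return chol_inv.conj().T @ eigvecs[:, -1]

def output_snr(w, phi_xx, phi_nn):
    """Ratio of target power to noise power at the beamformer output."""
    target_power = np.real(w.conj() @ phi_xx @ w)
    noise_power = np.real(w.conj() @ phi_nn @ w)
    return target_power / noise_power

# Synthetic scene (all values are illustrative assumptions).
n_mics, spacing_m, freq_hz = 4, 0.05, 1000.0
d_target = steering_vector(n_mics, spacing_m, freq_hz, np.deg2rad(60.0))
d_interf = steering_vector(n_mics, spacing_m, freq_hz, np.deg2rad(120.0))

# Oracle covariances: rank-1 target, directional interferer plus sensor noise.
phi_xx = np.outer(d_target, d_target.conj())
phi_nn = 2.0 * np.outer(d_interf, d_interf.conj()) + 0.1 * np.eye(n_mics)

w_gev = gev_weights(phi_xx, phi_nn)
w_ds = d_target / n_mics  # Delay-and-Sum: phase-align on the target DoA, then average

snr_gev = output_snr(w_gev, phi_xx, phi_nn)
snr_ds = output_snr(w_ds, phi_xx, phi_nn)
print(f"GEV output SNR: {10 * np.log10(snr_gev):.2f} dB")
print(f"DS  output SNR: {10 * np.log10(snr_ds):.2f} dB")
```

Because the GEV weights maximize the ratio of target to noise output power, their output SNR can never fall below that of the Delay-and-Sum weights for the same pair of covariance matrices. How much is gained in practice depends on how well those matrices are estimated, which is the part the paper addresses with DoA information.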

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Speech Recognition and Synthesis

Methods: Test