Lightweight Speech Enhancement in Unseen Noisy and Reverberant Conditions using KISS-GEV Beamforming
Thomas Bernard, François Grondin

TL;DR
This paper presents KISS-GEV, a simple and computationally efficient beamforming method for speech enhancement that relies on direction of arrival, working effectively in unseen noisy and reverberant conditions without neural network training.
Contribution
The paper introduces KISS-GEV, a signal processing approach to speech enhancement that reduces test-time computation and, because it needs no neural network training, applies to unseen conditions.
Findings
Outperforms traditional Delay-and-Sum beamforming.
Works effectively in unseen noisy and reverberant environments.
Requires only the same minimal DoA assumption as Delay-and-Sum beamforming.
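
For context on the Delay-and-Sum baseline named in these findings, here is a minimal sketch for a single frequency bin. It assumes a far-field source in free field and known microphone positions; the function names, the 343 m/s speed of sound, and the array geometry are illustrative assumptions, not details from the paper.

```python
import numpy as np

def steering_vector(mic_positions, doa_unit, freq_hz, c=343.0):
    """Far-field, free-field steering vector at one frequency.

    mic_positions: (M, 3) array of microphone coordinates in meters.
    doa_unit: (3,) unit vector pointing from the array toward the source.
    """
    # A plane wave from direction doa_unit reaches a microphone at position p
    # earlier than the origin by (p . doa_unit) / c seconds.
    delays = -(mic_positions @ doa_unit) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def delay_and_sum(X, d):
    """Apply Delay-and-Sum at one frequency.

    X: (M, T) complex STFT coefficients (M microphones, T frames).
    d: (M,) steering vector for the target DoA.
    """
    w = d / len(d)        # phase-align each channel, then average
    return w.conj() @ X   # (T,) enhanced coefficients
```

The only input this beamformer needs besides the recording is the target DoA, which is the assumption the findings refer to.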
Abstract
This paper introduces a new method referred to as KISS-GEV (for Keep It Super Simple Generalized Eigenvalue) beamforming. While GEV beamforming usually relies on a deep neural network to estimate target and noise time-frequency masks, this method uses a signal processing approach based on the direction of arrival (DoA) of the target. This considerably reduces the amount of computation involved at test time, and works for speech enhancement in unseen conditions, as there is no need to train a neural network with noisy speech. The proposed method can also be used to separate speech from a mixture, provided the speech sources come from different directions. Results also show that the proposed method uses the same minimal DoA assumption as Delay-and-Sum beamforming, yet outperforms this traditional approach.
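
The abstract does not spell out how KISS-GEV derives target and noise statistics from the DoA, so the sketch below covers only the generic GEV step that any such method shares: given a target and a noise spatial covariance matrix for one frequency bin, take the principal generalized eigenvector as the beamformer weights. The function name, the diagonal loading, and the Cholesky-based solver are assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def gev_weights(phi_target, phi_noise, eps=1e-9):
    """Principal generalized eigenvector of (phi_target, phi_noise).

    phi_target, phi_noise: (M, M) Hermitian spatial covariance matrices
    for one frequency bin. Returns the (M,) weight vector w that maximizes
    (w^H phi_target w) / (w^H phi_noise w).
    """
    M = phi_noise.shape[0]
    # Small diagonal loading (an assumption) keeps phi_noise positive definite.
    load = eps * np.real(np.trace(phi_noise)) / M
    phi_noise = phi_noise + load * np.eye(M)

    # Whiten with the Cholesky factor: phi_noise = L L^H.
    L = np.linalg.cholesky(phi_noise)
    L_inv = np.linalg.inv(L)
    C = L_inv @ phi_target @ L_inv.conj().T
    C = 0.5 * (C + C.conj().T)  # enforce Hermitian symmetry numerically

    # eigh returns eigenvalues in ascending order; keep the largest.
    _, eigvecs = np.linalg.eigh(C)
    v = eigvecs[:, -1]

    # Map back from the whitened space to beamformer weights.
    return L_inv.conj().T @ v
```

The weights are applied to the multichannel STFT at that bin as `w.conj() @ X`, with `X` of shape (M, T). GEV weights are defined only up to a complex scale per frequency, so practical systems usually add a normalization step afterwards; that step is omitted here.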
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Speech Recognition and Synthesis
Methods: Test
