Linearly Constrained Deep Beamformer for Multi-Speaker Scenarios
Ilai Zaidel, Ori Engel, Bar Engel, Sharon Gannot

TL;DR
This paper introduces a deep neural network-based beamforming method that enhances target speakers in multi-speaker environments while satisfying linear spatial constraints, outperforming the classical LCMV beamformer in overall enhancement and background noise attenuation.
Contribution
The paper presents a deep beamforming framework in which a DNN directly estimates spatially constrained beamforming weights, guided by the target relative transfer function (RTF) and the estimated interference subspace, to improve target enhancement in multi-speaker scenes.
Findings
Achieves superior overall enhancement compared with the classical LCMV beamformer built from the same estimated spatial signatures.
Produces more controlled sidelobes and better background noise attenuation.
Effectively directs nulls toward interfering sources while focusing on the target speaker.
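
For context, the classical LCMV baseline referred to above can be sketched in a few lines. This is a minimal single-frequency-bin illustration using the textbook closed form w = Φ⁻¹C(CᴴΦ⁻¹C)⁻¹g, not the paper's implementation. The function name `lcmv_weights`, the random steering vectors, and the synthetic noise covariance are all assumptions made for the example.

```python
import numpy as np

def lcmv_weights(noise_cov, constraints, desired):
    """Textbook LCMV solution w = Phi^{-1} C (C^H Phi^{-1} C)^{-1} g, so that C^H w = g."""
    phi_inv_c = np.linalg.solve(noise_cov, constraints)   # Phi^{-1} C
    gram = constraints.conj().T @ phi_inv_c               # C^H Phi^{-1} C
    return phi_inv_c @ np.linalg.solve(gram, desired)

rng = np.random.default_rng(0)
M = 4  # number of microphones (assumed for the example)

# Hypothetical spatial signatures: a target RTF and one interferer direction.
h_target = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h_target = h_target / h_target[0]          # RTF normalised to the reference microphone
h_interf = rng.standard_normal(M) + 1j * rng.standard_normal(M)

C = np.stack([h_target, h_interf], axis=1)  # constraint matrix, shape (M, 2)
g = np.array([1.0, 0.0], dtype=complex)     # unit gain on target, null on interferer

# Synthetic Hermitian positive-definite noise covariance.
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
noise_cov = A @ A.conj().T + M * np.eye(M)

w = lcmv_weights(noise_cov, C, g)
response = C.conj().T @ w                   # should equal g up to rounding
print(np.round(np.abs(response), 6))
```

The first entry of `response` is the beamformer's gain toward the target (distortionless, magnitude one) and the second is its gain toward the interferer (a null).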
Abstract
We propose a deep beamforming framework for enhancing target speaker(s) in multi-speaker environments. A deep neural network (DNN) is trained to estimate beamforming weights directly from noisy multichannel inputs while satisfying linear spatial constraints through an adaptive multi-term loss inspired by the augmented Lagrangian framework. The loss combines signal reconstruction with penalties that enforce a distortionless response toward the target and suppress the interference subspace. The model is further guided by the target relative transfer function (RTF) and the estimated interference subspace. The proposed model can direct a beam toward the target speaker while directing nulls toward the interfering sources, achieving superior overall enhancement performance compared with the classical LCMV beamformer constructed from the same estimated spatial signatures. Furthermore, compared…
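
The multi-term loss described in the abstract can be illustrated with a generic augmented-Lagrangian-style objective. The sketch below is an assumption about the general shape of such a loss (reconstruction error plus multiplier and quadratic penalty terms on the two constraint residuals), written in NumPy for a single frequency bin; the exact terms, weighting, and adaptation rule used in the paper are not given in the abstract. The names `constrained_loss`, `lam_d`, `lam_i`, and `rho` are hypothetical.

```python
import numpy as np

def constrained_loss(w, x, s_ref, h_target, interf_basis, lam_d, lam_i, rho):
    """Reconstruction error plus augmented-Lagrangian-style constraint penalties.

    w: (M,) weights, x: (M, T) mixture, s_ref: (T,) target at the reference mic,
    h_target: (M,) target RTF, interf_basis: (M, K) interference subspace basis.
    """
    y = w.conj() @ x                              # beamformer output w^H x, shape (T,)
    recon = np.mean(np.abs(y - s_ref) ** 2)       # signal reconstruction term
    c_d = w.conj() @ h_target - 1.0               # distortionless residual: w^H h - 1
    c_i = w.conj() @ interf_basis                 # null residual: w^H B, shape (K,)
    linear = np.real(np.conj(lam_d) * c_d) + np.real(np.vdot(lam_i, c_i))
    quadratic = 0.5 * rho * (np.abs(c_d) ** 2 + np.sum(np.abs(c_i) ** 2))
    return recon + linear + quadratic, c_d, c_i

rng = np.random.default_rng(1)
M, T, K = 4, 64, 1
h_target = rng.standard_normal(M) + 1j * rng.standard_normal(M)
h_target = h_target / h_target[0]
B = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))

s = rng.standard_normal(T) + 1j * rng.standard_normal(T)
x = h_target[:, None] * s[None, :]                # target-only mixture for the check

# Minimum-norm weights that satisfy both constraints exactly: C^H w = g.
C = np.concatenate([h_target[:, None], B], axis=1)
g = np.zeros(K + 1, dtype=complex)
g[0] = 1.0
w = C @ np.linalg.solve(C.conj().T @ C, g)

lam_d, lam_i, rho = 0.3 + 0.1j, np.full(K, 0.2 + 0.0j), 10.0
loss, c_d, c_i = constrained_loss(w, x, s, h_target, B, lam_d, lam_i, rho)

# Standard augmented-Lagrangian multiplier update (one possible "adaptive" rule).
lam_d_next = lam_d + rho * c_d
lam_i_next = lam_i + rho * c_i
```

When the weights already satisfy both constraints, the residuals vanish and the loss reduces to the reconstruction term, which is zero here because the mixture contains only the target.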
