Jointly optimal dereverberation and beamforming
Christoph Boeddeker, Tomohiro Nakatani, Keisuke Kinoshita, Reinhold Haeb-Umbach

TL;DR
This paper analyzes a convolutional beamformer that performs denoising and dereverberation jointly. A new derivation factorizes it into a WPE dereverberation filter followed by a wMPDR beamformer, and experiments show that its advantage over the conventional cascade of WPE and MPDR comes mainly from the wMPDR part.
Contribution
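The paper provides a new derivation of the convolutional beamformer, showing that it can be factorized, without loss of optimality, into a WPE (weighted prediction error) dereverberation filter and a wMPDR (weighted minimum power distortionless response) beamformer. This factorization makes it possible to identify which component is responsible for the beamformer's effectiveness.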
Findings
Superiority of the convolutional beamformer is primarily due to its wMPDR component.
The new derivation allows factorization into WPE and wMPDR components without loss of optimality.
Experimental results confirm the importance of the wMPDR part in performance gains.
Abstract
We previously proposed an optimal (in the maximum likelihood sense) convolutional beamformer that can perform simultaneous denoising and dereverberation, and showed its superiority over the widely used cascade of a WPE dereverberation filter and a conventional MPDR beamformer. However, it has not been fully investigated which components of the convolutional beamformer yield this superiority. To this end, this paper presents a new derivation of the convolutional beamformer that allows us to factorize it, without loss of optimality, into a WPE dereverberation filter and a special type of (non-convolutional) beamformer, referred to as a wMPDR beamformer. With experiments, we show that the superiority of the convolutional beamformer in fact comes from its wMPDR part.
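The factorization described in the abstract can be illustrated with a minimal NumPy sketch for a single frequency bin. It is not the authors' implementation. It assumes the time-varying source variance `lam` and the steering vector `v` are already available (in practice both have to be estimated, typically iteratively), and the function names and the `delay`, `taps`, and `loading` defaults are hypothetical choices made for illustration. Both stages minimize signal power weighted by `1 / lam`; the wMPDR stage does so under the distortionless constraint `w^H v = 1`.

```python
import numpy as np


def stack_delayed(X, delay, taps):
    """Stack past frames [x_{t-delay}, ..., x_{t-delay-taps+1}] for each t.

    X has shape (T, M): T STFT frames, M microphones, one frequency bin.
    """
    T, M = X.shape
    Xbar = np.zeros((T, M * taps), dtype=X.dtype)
    for k in range(taps):
        shift = delay + k
        Xbar[shift:, k * M:(k + 1) * M] = X[:T - shift]
    return Xbar


def wpe_dereverb(X, lam, delay=2, taps=4, loading=1e-10):
    """Multi-output WPE: subtract late reverberation predicted from the past."""
    Xbar = stack_delayed(X, delay, taps)
    Xw = Xbar / lam[:, None]                 # weight each frame by 1 / lam_t
    R = Xw.T @ Xbar.conj()                   # sum_t xbar_t xbar_t^H / lam_t
    P = Xw.T @ X.conj()                      # sum_t xbar_t x_t^H / lam_t
    # Small diagonal loading for numerical stability (an assumption).
    R = R + loading * np.trace(R).real / R.shape[0] * np.eye(R.shape[0])
    G = np.linalg.solve(R, P)                # prediction filter
    return X - Xbar @ G.conj()               # z_t = x_t - G^H xbar_t


def wmpdr_weights(Z, lam, v, loading=1e-10):
    """wMPDR: minimize sum_t |w^H z_t|^2 / lam_t subject to w^H v = 1."""
    R = (Z / lam[:, None]).T @ Z.conj()      # sum_t z_t z_t^H / lam_t
    R = R + loading * np.trace(R).real / R.shape[0] * np.eye(R.shape[0])
    Rinv_v = np.linalg.solve(R, v)
    return Rinv_v / np.vdot(v, Rinv_v)       # R^{-1} v / (v^H R^{-1} v)


def wpe_then_wmpdr(X, lam, v, delay=2, taps=4):
    """Cascade of the two stages; returns the enhanced signal and the weights."""
    Z = wpe_dereverb(X, lam, delay, taps)
    w = wmpdr_weights(Z, lam, v)
    return Z @ w.conj(), w                   # y_t = w^H z_t
```

A caller would apply `wpe_then_wmpdr` independently to each frequency bin of a multichannel STFT. A common crude starting point for `lam` is the channel-averaged power of the observation at each frame, refined over a few iterations using the enhanced output.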
