Signal-Aware Direction-of-Arrival Estimation Using Attention Mechanisms
Wolfgang Mack, Julian Wechsler, Emanuël A. P. Habets

TL;DR
This paper investigates the use of attention mechanisms in DOA estimation, demonstrating that combining attention with signal processing achieves comparable accuracy to fully neural network-based methods but with lower computational complexity.
Contribution
The paper evaluates different attention-based DOA systems and introduces end-to-end training strategies optimized for DOA estimation.
Findings
Attention improves DOA estimation in noisy and reverberant environments.
Signal-aware attention-based methods have lower computational complexity than fully neural approaches.
Proposed training strategies enhance the performance of attention-based DOA estimators.
Abstract
The direction-of-arrival (DOA) of sound sources is an essential acoustic parameter used, e.g., for multi-channel speech enhancement or source tracking. Complex acoustic scenarios consisting of sources-of-interest, interfering sources, reverberation, and noise make the estimation of the DOAs corresponding to the sources-of-interest a challenging task. Recently proposed attention mechanisms allow DOA estimators to focus on the sources-of-interest and disregard interference and noise, i.e., they are signal-aware. The attention is typically obtained by a deep neural network (DNN) from a short-time Fourier transform (STFT) based representation of a single microphone signal. Subsequently, attention has been applied as binary or ratio weighting to STFT-based microphone signal representations to reduce the impact of frequency bins dominated by noise, interference, or reverberation. The impact…
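The binary and ratio weighting described in the abstract can be illustrated with a minimal sketch. It assumes the attention is already available as a real-valued time-frequency map in [0, 1]; in the paper it is produced by a DNN, which is not modeled here. The function name `apply_attention`, the array layout, and the 0.5 threshold are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def apply_attention(stft, attention, mode="ratio", threshold=0.5):
    """Weight a multi-channel STFT with a time-frequency attention map.

    stft:      complex array, shape (channels, frames, bins)
    attention: real array in [0, 1], shape (frames, bins)
    mode:      "ratio" keeps the soft weights, "binary" thresholds them
    threshold: cut-off for the binary case (assumed value, not from the paper)
    """
    if attention.shape != stft.shape[1:]:
        raise ValueError("attention must have shape (frames, bins)")
    if mode == "ratio":
        weights = attention
    elif mode == "binary":
        weights = (attention >= threshold).astype(stft.real.dtype)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # The same real, non-negative weight is applied to every channel, so the
    # inter-channel phase differences of the retained bins are unchanged.
    return stft * weights[np.newaxis, :, :]
```

With ratio weighting, bins judged to be dominated by noise or interference are attenuated; with binary weighting they are set to zero. In both cases the weighted representation would then be passed on to the DOA estimator.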
