Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach
Adrian S. Roman, Iran R. Roman, Juan P. Bello

TL;DR
This paper introduces Latent Acoustic Mapping (LAM), a self-supervised model that pairs the interpretability of traditional acoustic mapping with the adaptability of deep learning for direction of arrival estimation (DoAE) in spatial audio, reaching localization accuracy comparable to or better than supervised methods.
Contribution
The paper presents LAM, a self-supervised framework that produces high-resolution acoustic maps, adapts to diverse acoustic conditions, and improves direction of arrival estimation (DoAE) performance across different acoustic environments and microphone arrays.
Findings
LAM achieves comparable or better localization accuracy than supervised methods.
LAM's acoustic maps improve supervised DoAE models when used as features.
The approach demonstrates robustness across different acoustic environments and microphone arrays.
Abstract
Acoustic mapping techniques have long been used in spatial audio processing for direction of arrival estimation (DoAE). Traditional beamforming methods for acoustic mapping, while interpretable, often rely on iterative solvers that can be computationally intensive and sensitive to acoustic variability. On the other hand, recent supervised deep learning approaches offer feedforward speed and robustness but require large labeled datasets and lack interpretability. Despite their strengths, both methods struggle to consistently generalize across diverse acoustic setups and array configurations, limiting their broader applicability. We introduce the Latent Acoustic Mapping (LAM) model, a self-supervised framework that bridges the interpretability of traditional methods with the adaptability and efficiency of deep learning methods. LAM generates high-resolution acoustic maps, adapts to…
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Direction-of-Arrival Estimation Techniques
