C-SL: Contrastive Sound Localization with Inertial-Acoustic Sensors
Majid Mirbagheri, Bardia Doosti

TL;DR
C-SL introduces a self-supervised contrastive learning approach for sound localization using inertial-acoustic sensors, eliminating the need for calibration and enabling personalized augmented hearing.
Contribution
The paper presents a novel contrastive learning method for direction-of-arrival (DOA) estimation that is calibration-free and adaptable to arbitrary sensor array geometries.
Findings
Performs well across various conditions in quantitative evaluations.
Operates in linear time with respect to input size.
Does not require prior knowledge of array geometry or source locations.
Abstract
The human brain employs perceptual information about head and eye movements to update the spatial relationship between the individual and the surrounding environment. Based on this cognitive process, known as spatial updating, we introduce contrastive sound localization (C-SL) with mobile inertial-acoustic sensor arrays of arbitrary geometry. C-SL uses unlabeled multi-channel audio recordings and inertial measurement unit (IMU) readings collected during free rotational movements of the array to learn mappings from acoustical measurements to an array-centered direction-of-arrival (DOA) in a self-supervised manner. Contrary to conventional DOA estimation methods that require knowledge of either the array geometry or the source locations in the calibration stage, C-SL is agnostic to both and can be trained on data collected in minimally constrained settings. To achieve this capability, our…
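The spatial-updating idea in the abstract can be illustrated with a small geometric sketch. This is not the C-SL training objective, which the truncated abstract does not specify; it only shows the constraint that IMU readings impose on array-centered DOA for a static source, reduced to azimuth and a pure yaw rotation. All function names here are hypothetical and chosen for illustration.

```python
import math


def wrap_angle(theta):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


def updated_azimuth(azimuth_before, yaw_change):
    """Array-centered azimuth of a static source after the array yaws.

    Assumption: angles are counterclockwise in radians, and yaw_change is
    the rotation of the array reported by the IMU between two recordings.
    Rotating the array by +yaw shifts the source by -yaw in the array frame.
    """
    return wrap_angle(azimuth_before - yaw_change)


def consistency_error(azimuth_before, azimuth_after, yaw_change):
    """Angular mismatch between an estimated post-rotation azimuth and the
    azimuth implied by spatially updating the pre-rotation estimate."""
    expected = updated_azimuth(azimuth_before, yaw_change)
    return abs(wrap_angle(azimuth_after - expected))
```

A pair of DOA estimates taken before and after a rotation is consistent with the IMU reading when `consistency_error` is close to zero. A self-supervised method could in principle use this kind of agreement as a training signal without labeled source positions, though the form C-SL actually uses may differ.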
Taxonomy
Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Underwater Acoustics Research
