TL;DR
This paper introduces a monaural sound localization method that uses a single microphone with simple LEGO scatterers. Learned non-negative dictionaries let it identify speaker directions accurately without a complex scattering structure.
Contribution
It presents a new approach for monaural source localization using basic scatterers and NMF, enabling accurate speech localization without speaker-specific training.
Findings
Accurate localization of speech sources with a single microphone and simple LEGO scatterers.
The method does not require learning speaker-specific dictionaries.
Multi-source localization is also discussed, along with its identified limitations.
Abstract
Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural…
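The two cases contrasted in the abstract can be illustrated with a small synthetic sketch. It is not the authors' implementation: the signatures `H`, the dictionary `W`, the function names, and all sizes are hypothetical, and the data are random numbers rather than measurements. For white noise, the observed power spectrum is assumed to be a scaled copy of one direction's signature, so matching against the known signatures is enough. For a speech-like source, the spectrogram is assumed to be a non-negative combination of dictionary atoms filtered by the direction's signature, and each candidate direction is scored by how well a non-negative fit (standard multiplicative updates for the Euclidean cost) explains the observation.

```python
import numpy as np

def localize_white_noise(y, H):
    """Pick the direction whose signature best matches the power spectrum y."""
    # Cosine similarity is insensitive to the unknown source power.
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    return int(np.argmax(Hn @ (y / np.linalg.norm(y))))

def localize_with_dictionary(Y, H, W, n_iter=300, eps=1e-12):
    """Pick the direction d minimizing ||Y - diag(H[d]) W X||_F^2 over X >= 0."""
    costs = []
    for h in H:
        A = h[:, None] * W                      # dictionary filtered by this direction
        X = np.ones((W.shape[1], Y.shape[1]))   # positive initial activations
        AtY, AtA = A.T @ Y, A.T @ A
        for _ in range(n_iter):                 # multiplicative updates keep X >= 0
            X *= AtY / (AtA @ X + eps)
        costs.append(np.linalg.norm(Y - A @ X) ** 2)
    return int(np.argmin(costs))

# Synthetic setup (all values hypothetical).
rng = np.random.default_rng(0)
D, F, K, T = 12, 64, 8, 40                      # directions, freq bins, atoms, frames
H = rng.uniform(0.2, 1.0, size=(D, F))          # direction-dependent signatures
W = rng.random((F, K)) ** 3                     # stand-in for a learned dictionary
true_dir = 7

# White noise: flat source spectrum, so the observation is a scaled signature.
y_white = 2.5 * H[true_dir] + 0.01 * rng.standard_normal(F)
est_white = localize_white_noise(y_white, H)

# Speech-like source: spectrum unknown, but representable in the dictionary.
X_true = rng.uniform(0.1, 1.0, size=(K, T))
Y_speech = H[true_dir][:, None] * (W @ X_true)
est_speech = localize_with_dictionary(Y_speech, H, W)

print(est_white, est_speech, true_dir)
```

The second function shows why the dictionary acts as a regularizer: without `W`, any direction could explain the observation by attributing the spectral shape to the source itself, whereas restricting the source to non-negative combinations of known atoms makes the candidate directions distinguishable by their residuals.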
