Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization

Dalia El Badawy; Ivan Dokmanić

arXiv:1801.03740·eess.AS·August 29, 2018

TL;DR

This paper introduces a monaural sound localization method that uses a single microphone surrounded by simple LEGO scatterers. Learned non-negative dictionaries (non-negative matrix factorization) let it identify speaker directions accurately without a carefully engineered scattering structure.

Contribution

It presents a new approach for monaural source localization using basic scatterers and NMF, enabling accurate speech localization without speaker-specific training.

Findings

01

Accurate localization of speech sources with a single microphone and simple LEGO scatterers.

02

The method does not require learning speaker-specific dictionaries.

03

Multi-source localization is also discussed, together with the limitations of the approach.

Abstract

Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way, giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural…
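
The white-noise case described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the signatures here are random synthetic stand-ins for measured ones, the function name `localize_white_noise` is hypothetical, and cosine similarity is just one simple way to match a spectrum against a set of signatures.

```python
import numpy as np

def localize_white_noise(avg_power, signatures):
    """Return the best-matching direction index and all match scores.

    avg_power:  (n_freqs,) time-averaged power spectrum of the recording.
    signatures: (n_dirs, n_freqs) squared-magnitude response per direction
                (hypothetical calibration data, not the paper's measurements).
    """
    # Normalize so the unknown source power does not affect the match.
    y = avg_power / np.linalg.norm(avg_power)
    h = signatures / np.linalg.norm(signatures, axis=1, keepdims=True)
    scores = h @ y  # cosine similarity with each direction signature
    return int(np.argmax(scores)), scores

rng = np.random.default_rng(0)
n_dirs, n_freqs, n_frames = 36, 257, 200

# Synthetic stand-in for measured direction signatures (assumption).
signatures = rng.uniform(0.1, 1.0, size=(n_dirs, n_freqs))

# White noise has a flat expected power spectrum; exponential draws
# model the frame-to-frame fluctuation of each frequency bin's power.
true_dir = 7
source_power = rng.exponential(scale=2.0, size=(n_freqs, n_frames))
observed = signatures[true_dir][:, None] * source_power

est_dir, scores = localize_white_noise(observed.mean(axis=1), signatures)
print("estimated direction index:", est_dir)
```

Averaging over frames flattens the source spectrum, so the averaged recording is close to a scaled copy of one signature. Speech has no such flat spectrum, which is why the abstract turns to learned non-negative dictionaries as a prior.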

Code & Models

Repositories

swing-research/scatsense
Official