Rectified binaural ratio: A complex T-distributed feature for robust sound localization
Antoine Deleforge (PANAMA), Florence Forbes (MISTIS)

TL;DR
This paper introduces the rectified binaural ratio, a new complex t-distributed feature for robust sound localization, providing a statistically sound method to improve accuracy under noisy conditions.
Contribution
The paper proposes a novel feature based on a complex t-distribution for binaural sound localization, enhancing robustness in noisy environments.
Findings
For Gaussian-process point-source signals corrupted by stationary Gaussian noise, the rectified binaural ratio follows a complex t-distribution with explicit parameters.
The proposed methods outperform existing techniques in heavily noisy conditions.
Experiments on heavily corrupted simulated and speech signals confirm the robustness of the proposed scheme.
Abstract
Most existing methods in binaural sound source localization rely on some kind of aggregation of phase- and level-difference cues in the time-frequency plane. While different aggregation schemes exist, they are often heuristic and suffer in adverse noise conditions. In this paper, we introduce the rectified binaural ratio as a new feature for sound source localization. We show that for Gaussian-process point source signals corrupted by stationary Gaussian noise, this ratio follows a complex t-distribution with explicit parameters. This new formulation provides a principled and statistically sound way to aggregate binaural features in the presence of noise. We subsequently derive two simple and efficient methods for robust relative transfer function and time-delay estimation. Experiments on heavily corrupted simulated and speech signals demonstrate the robustness of the proposed scheme.
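To make the noise problem described in the abstract concrete, here is a minimal simulation sketch. It does not implement the paper's rectified binaural ratio, whose exact definition is not given in this summary. It only draws a plain right/left ratio at a single time-frequency bin under assumed conditions: a circular complex Gaussian source, independent Gaussian noise of equal variance on both channels, and a hypothetical relative transfer function `true_rtf`. All names and parameter values are illustrative assumptions.

```python
import numpy as np


def complex_gaussian(var, size, rng):
    """Circular complex Gaussian samples with total variance `var`."""
    scale = np.sqrt(var / 2.0)
    return scale * (rng.standard_normal(size) + 1j * rng.standard_normal(size))


def simulate_plain_ratio(rtf, n_frames, source_var, noise_var, rng):
    """Plain (unrectified) ratio right/left for one frequency bin.

    Assumed model (illustrative, not taken from the paper):
        left  = s + n_left
        right = rtf * s + n_right
    with the source and both noise terms independent and Gaussian.
    """
    s = complex_gaussian(source_var, n_frames, rng)
    left = s + complex_gaussian(noise_var, n_frames, rng)
    right = rtf * s + complex_gaussian(noise_var, n_frames, rng)
    return right / left


rng = np.random.default_rng(0)
true_rtf = 0.8 * np.exp(1j * 0.6)   # hypothetical relative transfer function
source_var, noise_var = 1.0, 1.0    # 0 dB signal-to-noise ratio

ratios = simulate_plain_ratio(true_rtf, 20000, source_var, noise_var, rng)

# Two simple aggregation schemes over frames.
mean_est = ratios.mean()
median_est = np.median(ratios.real) + 1j * np.median(ratios.imag)

# Under this model the plain ratio is symmetric around a shrunk RTF.
expected_centre = true_rtf * source_var / (source_var + noise_var)

print("true RTF        :", true_rtf)
print("expected centre :", expected_centre)
print("mean of ratios  :", mean_est)
print("median of ratios:", median_est)
print("largest |ratio| :", np.abs(ratios).max())
```

Under these assumptions the plain ratio is heavy-tailed, because the denominator can fall arbitrarily close to zero, and it is centred on a noise-shrunk version of the transfer function rather than on `true_rtf` itself. Both effects illustrate why heuristic aggregation of raw binaural cues degrades in noise, which is the situation the paper's statistically grounded formulation is meant to address.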
