Estimation of the Direct-Path Relative Transfer Function for Supervised Sound-Source Localization
Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot

TL;DR
This paper introduces a novel method for estimating the direct-path relative transfer function (DP-RTF) from noisy and reverberant microphone signals, improving sound-source localization accuracy in challenging environments.
Contribution
The paper presents a new approach to estimating the DP-RTF from auto- and cross-power spectral densities combined with an inter-frame spectral subtraction, which improves localization robustness in noisy, reverberant conditions.
Findings
Outperforms existing localization methods in adverse acoustic environments
Noise reduction through inter-frame spectral subtraction improves DP-RTF estimation
Method validated on both simulated and real-world data
Abstract
This paper addresses the problem of binaural localization of a single speech source in noisy and reverberant environments. For a given binaural microphone setup, the binaural response corresponding to the direct-path propagation of a single source is a function of the source direction. In practice, this response is contaminated by noise and reverberation. The direct-path relative transfer function (DP-RTF) is defined as the ratio between the direct-path acoustic transfer functions of the two channels. We propose a method to estimate the DP-RTF from the noisy and reverberant microphone signals in the short-time Fourier transform (STFT) domain. First, the convolutive transfer function (CTF) approximation is adopted to accurately represent the impulse response of the sensors in the STFT domain. Second, the DP-RTF is estimated by using the auto- and cross-power spectral densities at each frequency and…
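To make the two steps in the abstract more concrete, here is a minimal noise-free sketch at a single frequency bin. It is an illustration under stated assumptions, not the authors' implementation: the synthetic source, the tap count, the window length, and the helper names `simulate_ctf_pair` and `estimate_dp_rtf` are all hypothetical. It models each channel as a convolution over STFT frames (the CTF approximation), takes the DP-RTF to be the ratio of the first (direct-path) CTF coefficients, and recovers it by least squares from windowed auto- and cross-PSD estimates. The specific regression form used here is an assumption that the truncated abstract does not spell out, and the noise-handling inter-frame subtraction is omitted.

```python
import numpy as np


def simulate_ctf_pair(num_frames, num_taps, dp_rtf, rng):
    """Synthetic two-channel STFT coefficients at one frequency (hypothetical data)."""
    # Non-stationary source: a slowly varying envelope makes the short-window
    # PSD estimates differ from one window to the next (assumption of this sketch).
    envelope = 1.0 + 0.9 * np.sin(2 * np.pi * np.arange(num_frames) / 50.0)
    s = envelope * (rng.standard_normal(num_frames)
                    + 1j * rng.standard_normal(num_frames))
    # CTF taps: tap 0 stands for the direct path, later taps decay like reverberation.
    decay = 0.5 ** np.arange(num_taps)
    a = decay * (rng.standard_normal(num_taps) + 1j * rng.standard_normal(num_taps))
    b = decay * (rng.standard_normal(num_taps) + 1j * rng.standard_normal(num_taps))
    a[0] = 1.0
    b[0] = dp_rtf  # so that b[0] / a[0] equals the chosen DP-RTF
    # CTF model: each channel is the source convolved with its taps across frames.
    x = np.convolve(s, a)[:num_frames]
    y = np.convolve(s, b)[:num_frames]
    return x, y


def estimate_dp_rtf(x, y, num_taps, win=8):
    """Least-squares DP-RTF estimate from windowed PSD estimates (noise-free sketch)."""
    Q = num_taps
    rows, rhs = [], []
    for start in range(Q - 1, len(x) - win + 1, win):
        frames = np.arange(start, start + win)
        # Regressors z(p) = [x(p), ..., x(p-Q+1), y(p-1), ..., y(p-Q+1)].
        z = np.column_stack(
            [x[frames - q] for q in range(Q)]
            + [y[frames - q] for q in range(1, Q)]
        )
        y_conj = np.conj(y[frames])
        rows.append((z * y_conj[:, None]).mean(axis=0))  # cross-PSD estimates
        rhs.append((y[frames] * y_conj).mean())          # auto-PSD estimate of y
    g, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return g[0]  # first unknown is b[0] / a[0], the DP-RTF


rng = np.random.default_rng(0)
true_dp_rtf = 0.8 * np.exp(-0.5j)
x, y = simulate_ctf_pair(num_frames=400, num_taps=3, dp_rtf=true_dp_rtf, rng=rng)
estimate = estimate_dp_rtf(x, y, num_taps=3)
print("true DP-RTF:     ", true_dp_rtf)
print("estimated DP-RTF:", estimate)
```

In this noise-free setting the estimate should agree closely with the chosen value. With real recordings, additive noise biases the PSD estimates, which is the problem the abstract's noise-robust estimation addresses.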
