Towards a Generalization of Relative Transfer Functions to More Than One Source
Antoine Deleforge, Sharon Gannot, Walter Kellermann

TL;DR
This paper introduces a novel transform for multichannel spectrograms that generalizes relative transfer functions to multiple sources, enabling localization of simultaneous sounds without source separation.
Contribution
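It proves that RTFs cannot be generalized to more than one source from a single multichannel spectro-temporal observation, regardless of the number of microphones, and proposes a new transform for multichannel multi-frame spectrograms that makes such a generalization possible.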
Findings
Localizes multiple simultaneously active sound sources in simulated experiments
Does not rely on source separation
Works with short spectro-temporal windows
Abstract
We propose a natural way to generalize relative transfer functions (RTFs) to more than one source. We first prove that such a generalization is not possible using a single multichannel spectro-temporal observation, regardless of the number of microphones. We then introduce a new transform for multichannel multi-frame spectrograms, i.e., spectrograms containing several channels and time frames in each time-frequency bin. This transform allows a natural generalization which satisfies the three key properties of RTFs: they can be directly estimated from observed signals, they capture spatial properties of the sources, and they do not depend on emitted signals. Through simulated experiments, we show how this new method can localize multiple simultaneously active sound sources using short spectro-temporal windows, without relying on source separation.
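To make the three RTF properties concrete, the following is a minimal sketch of the classical single-source case only. It is not the transform proposed in the paper. It assumes a noise-free narrowband model in which each microphone observes the emitted signal multiplied by a per-channel transfer function. The names `h1`, `h2`, `ratio`, and `mix` are hypothetical, and the transfer functions are random placeholders rather than simulated room responses. With one source, the ratio between two channels equals the RTF whatever the emitted signal is. With two active sources, the same ratio changes with the emitted signals, which is the difficulty that motivates a generalization.

```python
import numpy as np

rng = np.random.default_rng(0)


def complex_normal(n):
    """Draw n random complex values (placeholder data, not real acoustics)."""
    return rng.standard_normal(n) + 1j * rng.standard_normal(n)


# Hypothetical transfer functions from two sources to two microphones,
# at a single frequency bin.
h1 = complex_normal(2)
h2 = complex_normal(2)


def ratio(x):
    """Ratio of channel 2 to the reference channel 1 for one observation."""
    return x[1] / x[0]


# Single source: x = h1 * s. The emitted signal s cancels in the ratio.
s_a, s_b = complex_normal(2)
single_a = ratio(h1 * s_a)
single_b = ratio(h1 * s_b)
rtf = h1[1] / h1[0]  # the relative transfer function of source 1


# Two sources: x = h1 * s1 + h2 * s2. The ratio now depends on s1 and s2.
def mix(s1, s2):
    return h1 * s1 + h2 * s2


two_a = ratio(mix(*complex_normal(2)))
two_b = ratio(mix(*complex_normal(2)))

print("single source, matches RTF:", np.isclose(single_a, rtf), np.isclose(single_b, rtf))
print("two sources, ratio stable across emitted signals:", np.isclose(two_a, two_b))
```

In this sketch the single-source ratio can be read directly off the observation and depends only on `h1`, while the two-source ratio differs from one pair of emitted signals to the next.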
