Design and Analysis of Binaural Signal Matching with Arbitrary Microphone Arrays and Listener Head Rotations
Lior Madmoni, Zamir Ben-Hur, Jacob Donley, Vladimir Tourbabin, Boaz Rafaely

TL;DR
This paper extends binaural signal matching (BSM) to arbitrary microphone arrays and listener head rotations, aiming at realistic binaural reproduction on devices such as VR headsets. It combines a detailed analysis, a new perceptually motivated extension, and extensive simulations and listening tests.
Contribution
It introduces a design framework and an extension based on magnitude least-squares (MagLS) to improve BSM accuracy for arbitrary array configurations, especially at high frequencies and during head rotations.
Findings
BSM-MagLS achieves high binaural signal quality in simulations.
The method effectively compensates for head rotations.
Listening tests confirm perceptual improvements.
Abstract
Binaural reproduction is rapidly becoming a topic of great interest in the research community, especially with the surge of new and popular devices, such as virtual reality headsets, smart glasses, and head-tracked headphones. In order to immerse the listener in a virtual or remote environment with such devices, it is essential to generate realistic and accurate binaural signals. This is challenging, especially since the microphone arrays mounted on these devices are typically composed of an arbitrarily-arranged small number of microphones, which impedes the use of standard audio formats like Ambisonics, and provides limited spatial resolution. The binaural signal matching (BSM) method was developed recently to overcome these challenges. While it produced binaural signals with low error using relatively simple arrays, its performance degraded significantly when head rotation was…
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis
