Gaussian Process Models for HRTF based Sound-Source Localization and Active-Learning

Yuancheng Luo; Dmitry N. Zotkin; Ramani Duraiswami

arXiv:1502.03163 · cs.SD · February 12, 2015 · 2 cites

Open Access

TL;DR

This paper introduces Gaussian process-based models for sound-source localization using HRTFs, optimizing measurement sampling and enabling personalized HRTF inference through active learning, achieving high localization accuracy with minimal data.

Contribution

It develops a Gaussian process regression framework for SSL, incorporating active learning to efficiently infer individual HRTFs and improve localization accuracy.

Findings

01 High localization accuracy with only a small subset of HRTFs

02 Learned HRTFs are closer to intended directions than non-individualized ones

03 Active learning effectively updates SSL models online

Abstract

From a machine learning perspective, the human ability to localize sounds can be modeled as a non-parametric and non-linear regression problem between binaural spectral features of sound received at the ears (input) and their sound-source directions (output). The input features can be summarized in terms of the individual's head-related transfer functions (HRTFs), which measure the spectral response between the listener's eardrum and an external point in 3D. Based on these viewpoints, two related problems are considered: how can one achieve an optimal sampling of measurements for training sound-source localization (SSL) models, and how can SSL models be used to infer the subject's HRTFs in listening tests. First, we develop a class of binaural SSL models based on Gaussian process regression and solve a \emph{forward selection} problem that finds a subset of input-output samples that best…
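The regression view described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-dimensional toy "binaural features", the restriction to a single azimuth angle, the squared-exponential kernel with fixed hyperparameters, and the greedy selection criterion (mean squared prediction error over the full candidate pool). The helper names `rbf_kernel`, `gp_predict`, and `forward_select` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.5):
    # Squared-exponential kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    # Posterior mean of a zero-mean GP with Gaussian observation noise.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, y_train)

def forward_select(X, y, n_select):
    # Greedy forward selection (assumed criterion): at each step, add the
    # sample whose inclusion gives the lowest mean squared error when the
    # GP trained on the subset predicts the whole candidate pool.
    chosen, remaining = [], list(range(len(X)))
    for _ in range(n_select):
        best_idx, best_err = None, np.inf
        for i in remaining:
            trial = chosen + [i]
            err = np.mean((gp_predict(X[trial], y[trial], X) - y) ** 2)
            if err < best_err:
                best_idx, best_err = i, err
        chosen.append(best_idx)
        remaining.remove(best_idx)
    return chosen

# Toy data: azimuth in radians (output) and synthetic features (input).
rng = np.random.default_rng(0)
azimuth = np.linspace(-np.pi / 2, np.pi / 2, 60)
features = np.column_stack([np.sin(azimuth), np.sin(2 * azimuth)])
features = features + 0.01 * rng.standard_normal(features.shape)

subset = forward_select(features, azimuth, n_select=8)
pred = gp_predict(features[subset], azimuth[subset], features)
rmse_subset = float(np.sqrt(np.mean((pred - azimuth) ** 2)))
print("selected indices:", subset)
print("RMSE using 8 of 60 samples (rad):", rmse_subset)
```

The sketch only shows the general shape of the idea: a GP maps features to direction, and a greedy loop picks a small training subset that still predicts the remaining directions well. Real HRTF features are high-dimensional and directions lie on a sphere, so a practical model would need a different output representation and learned kernel hyperparameters.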

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Music and Audio Processing · Speech and Audio Processing