Estimating speaker direction on a humanoid robot with binaural acoustic signals

Pranav Barot; Katja Mombaur; Ewen MacDonald

arXiv:2307.12129·cs.RO·July 25, 2023

Open Access

TL;DR

This paper introduces an optimized binaural sound source localization method for humanoid robots to accurately estimate human talker direction in real-time interactions, validated with real data and latency analysis.

Contribution

It presents a novel parameter optimization approach for DOA estimation tailored for humanoid robots, enhancing real-time speech interaction capabilities.

Findings

01 Optimized DOA parameters improve localization accuracy.

02 Bayesian optimization outperforms brute force methods.

03 Latency considerations are addressed for real-time deployment.

Abstract

To achieve human-like behaviour during speech interactions, it is necessary for a humanoid robot to estimate the location of a human talker. Here, we present a method to optimize the parameters used for direction of arrival (DOA) estimation, while also considering real-time applications for human-robot interaction scenarios. This method is applied to a binaural sound source localization framework on a humanoid robotic head. Real data is collected and annotated for this work. Optimizations are performed via a brute-force method and a Bayesian model-based method. Results are validated and discussed, and effects on latency for real-time use are also explored.
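To make the idea of brute-force parameter optimization for binaural DOA estimation concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not say which estimator or parameters are tuned, so this sketch assumes a plain time-domain cross-correlation delay estimator, a far-field two-microphone model, and a single tunable parameter (the analysis frame length). The microphone spacing, sample rate, synthetic test signal, and all function names are hypothetical choices made for illustration.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
SPEED_OF_SOUND = 343.0   # m/s
MIC_DISTANCE = 0.18      # m, hypothetical spacing between the two "ears"
SAMPLE_RATE = 16000      # Hz, hypothetical
MAX_LAG = int(MIC_DISTANCE / SPEED_OF_SOUND * SAMPLE_RATE)  # largest physical lag


def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means the right channel lags the left channel.
    """
    n = len(left)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag_samples):
    """Far-field model: sin(theta) = c * tau / d, returned in degrees."""
    s = SPEED_OF_SOUND * lag_samples / SAMPLE_RATE / MIC_DISTANCE
    s = max(-1.0, min(1.0, s))  # clamp against rounding beyond +/-1
    return math.degrees(math.asin(s))


def make_binaural(n, delay, noise_std, rng):
    """Synthetic white-noise source; the right channel is delayed by `delay` samples."""
    src = [rng.gauss(0.0, 1.0) for _ in range(n + delay)]
    left = [src[i + delay] + rng.gauss(0.0, noise_std) for i in range(n)]
    right = [src[i] + rng.gauss(0.0, noise_std) for i in range(n)]
    return left, right


def mean_abs_error(frame_len, left, right, true_angle):
    """Mean absolute DOA error (degrees) over non-overlapping frames."""
    errors = []
    for start in range(0, len(left) - frame_len + 1, frame_len):
        lag = estimate_delay(left[start:start + frame_len],
                             right[start:start + frame_len], MAX_LAG)
        errors.append(abs(delay_to_angle(lag) - true_angle))
    return sum(errors) / len(errors)


def brute_force_search(frame_lengths, left, right, true_angle):
    """Evaluate every candidate frame length and return the lowest-error one."""
    results = {}
    for frame_len in frame_lengths:
        results[frame_len] = {
            "error_deg": mean_abs_error(frame_len, left, right, true_angle),
            # Buffering one frame before estimating adds this much latency.
            "latency_ms": 1000.0 * frame_len / SAMPLE_RATE,
        }
    best = min(results, key=lambda fl: results[fl]["error_deg"])
    return best, results


if __name__ == "__main__":
    rng = random.Random(0)
    left, right = make_binaural(2048, 4, 1.0, rng)
    best, results = brute_force_search([64, 256, 1024], left, right, delay_to_angle(4))
    for frame_len, r in results.items():
        print(frame_len, r)
    print("best frame length:", best)
```

The `latency_ms` field reflects the trade-off the abstract mentions: longer frames tend to give steadier delay estimates but force the robot to wait longer before it can react. A Bayesian model-based search would replace the exhaustive loop in `brute_force_search` with a surrogate model that proposes which parameter values to evaluate next; that variant is not shown here.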

Taxonomy

Topics: Speech and Audio Processing · Structural Health Monitoring Techniques · Gait Recognition and Analysis