Audio-Driven Reinforcement Learning for Head-Orientation in Naturalistic Environments
Wessel Ledder, Yuzhen Qin, Kiki van der Heijden

TL;DR
This paper introduces an audio-driven deep reinforcement learning framework for head-orientation in naturalistic environments, demonstrating high performance in anechoic conditions and analyzing generalization across reverberant settings.
Contribution
It presents a novel DRL approach for head-orientation control using stereo speech, highlighting its performance and generalization capabilities in reverberant environments.
Findings
High accuracy in anechoic environments
Performance drops with reverberation but remains better than baseline
Generalization varies depending on training environment
Abstract
Although deep reinforcement learning (DRL) approaches in audio signal processing have seen substantial progress in recent years, audio-driven DRL for tasks such as navigation, gaze control and head-orientation control in the context of human-robot interaction has received little attention. Here, we propose an audio-driven DRL framework in which we utilise deep Q-learning to develop an autonomous agent that orients towards a talker in the acoustic environment based on stereo speech recordings. Our results show that the agent learned to perform the task at a near-perfect level when trained on speech segments in anechoic environments (that is, without reverberation). The presence of reverberation in naturalistic acoustic environments affected the agent's performance, although the agent still substantially outperformed a baseline, randomly acting agent. Finally, we quantified the degree of…
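To make the Q-learning idea in the abstract concrete, here is a minimal toy sketch. It is not the authors' method: the paper uses a deep Q-network driven by stereo speech, whereas this example is tabular and assumes the agent directly observes its angular offset from the talker. The 12-bin azimuth grid, the three-action set, the reward values and all function names are hypothetical choices made for illustration only.

```python
import random

N_ANGLES = 12          # hypothetical: azimuth discretised into 30-degree bins
ACTIONS = (-1, 0, 1)   # hypothetical action set: rotate left, hold, rotate right


def step(offset, action):
    """Apply one rotation and return (new_offset, reward, done)."""
    new_offset = (offset + action) % N_ANGLES
    done = new_offset == 0              # offset 0 means the head faces the talker
    reward = 1.0 if done else -0.1      # assumed reward shaping, not from the paper
    return new_offset, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_ANGLES)]
    for _ in range(episodes):
        offset = rng.randrange(1, N_ANGLES)   # start facing away from the talker
        for _ in range(50):                   # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                           # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[offset][i])  # exploit
            nxt, r, done = step(offset, ACTIONS[a])
            # Q-learning target: bootstrap from the best next-state value,
            # except at terminal states where only the reward counts.
            target = r if done else r + gamma * max(q[nxt])
            q[offset][a] += alpha * (target - q[offset][a])
            offset = nxt
            if done:
                break
    return q


def greedy_steps(q, offset, limit=50):
    """Count rotations the greedy policy needs to face the talker."""
    steps = 0
    while offset != 0 and steps < limit:
        a = max(range(len(ACTIONS)), key=lambda i: q[offset][i])
        offset, _, _ = step(offset, ACTIONS[a])
        steps += 1
    return steps


if __name__ == "__main__":
    q_table = train()
    for start in range(1, N_ANGLES):
        print(start, greedy_steps(q_table, start))
```

In a deep Q-learning setting such as the one the abstract describes, the table `q` would be replaced by a neural network that maps audio features to one value per action, while the update target keeps the same form.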
Taxonomy
Topics: Tactile and Sensory Interactions · Multisensory perception and integration · Color perception and design
Methods: Q-Learning
