Imitation of human motion achieves natural head movements for humanoid robots in an active-speaker detection task
Bosong Ding, Murat Kirtay, Giacomo Spigler

TL;DR
This paper presents a generative AI approach for producing natural, human-like head movements in a Nao humanoid robot and tests it on a real-time active-speaker tracking task in group conversations.
Contribution
It introduces a novel AI pipeline for realistic head movement imitation in robots, applied to real-time active-speaker tracking during social interactions.
Findings
The Nao robot imitates human head movements in a natural manner.
The system tracks the active speaker in real time during group conversations.
Code and data are publicly available for further research.
Abstract
Head movements are crucial for social human-human interaction. They can transmit important cues (e.g., joint attention, speaker detection) that cannot be achieved with verbal interaction alone. This advantage also holds for human-robot interaction. Even though modeling human motions through generative AI models has become an active research area within robotics in recent years, the use of these methods for producing head movements in human-robot interaction remains underexplored. In this work, we employed a generative AI pipeline to produce human-like head movements for a Nao humanoid robot. In addition, we tested the system on a real-time active-speaker tracking task in a group conversation setting. Overall, the results show that the Nao robot successfully imitates human head movements in a natural manner while actively tracking the speakers during the conversation. Code and data from…
Taxonomy
Topics: Robotics and Automated Systems · Social Robot Interaction and HRI · Robotic Locomotion and Control
