Initiation of Interaction Detection Framework using a Nonverbal Cue for Human-Robot Interaction

Guhnoo Yun, Juhan Yoo, Kijung Kim, Dong Hwan Kim

arXiv:2605.10087·cs.CV·May 12, 2026

TL;DR

This paper presents a nonverbal cue-based framework for detecting initiation of interaction in human-robot interaction, utilizing audio-visual sensor fusion and a state transition model in a ROS environment.

Contribution

It introduces a novel IoI detection framework that does not rely on keywords, integrating audio-visual cues and a state model for improved human-robot interaction.

Findings

01

The framework successfully detects IoI through speech and gaze cues.

02

Experimental verification on a mobile robot demonstrates effective performance.

03

All components are integrated within the ROS environment.

Abstract

This paper describes a keyword-free initiation of interaction (IoI) detection framework for human-robot interaction (HRI), based on audio and vision sensor fusion in a domestic environment. In the proposed framework, the robot has its own audio and vision sensors and can also employ an external vision sensor for stable human detection and tracking. When the user starts to speak while looking at the robot, the robot localizes the user's position by combining sound source localization with human tracking information. The robot then detects the IoI if it perceives that the speaker's face is turned toward the robot. If the user does not speak, the robot can still detect the IoI when he or she looks at the robot for longer than a predefined period of time. A state transition model for the proposed IoI detection framework is designed and verified by experiments with a mobile robot. In…
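
The two detection paths described in the abstract (speech while facing the robot, or a sustained gaze without speech) can be illustrated with a small state machine. This is only a minimal sketch under stated assumptions, not the authors' state transition model: the state names, the `Observation` fields, the `IoIDetector` class, and the 2-second dwell threshold are all hypothetical, and the perception inputs (tracking, sound source localization, face orientation) are assumed to arrive as precomputed booleans.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    # Hypothetical state names; the paper's actual states are not given here.
    IDLE = auto()          # no person is being tracked
    TRACKING = auto()      # a person is tracked but shows no cue yet
    GAZE_HOLD = auto()     # person faces the robot, dwell timer running
    IOI_DETECTED = auto()  # initiation of interaction detected


@dataclass
class Observation:
    t: float              # timestamp in seconds
    person_tracked: bool  # human detection and tracking result
    speaking: bool        # sound source localized at the tracked person
    facing_robot: bool    # face orientation toward the robot


class IoIDetector:
    def __init__(self, gaze_dwell_s: float = 2.0):
        # Assumed threshold; the source only says "predefined".
        self.gaze_dwell_s = gaze_dwell_s
        self.state = State.IDLE
        self._gaze_start: Optional[float] = None

    def step(self, obs: Observation) -> State:
        # Losing the person always resets the detector.
        if not obs.person_tracked:
            self.state = State.IDLE
            self._gaze_start = None
            return self.state

        # Once detected, stay latched until the person is lost.
        if self.state is State.IOI_DETECTED:
            return self.state

        # Without a face turned toward the robot, no cue counts.
        if not obs.facing_robot:
            self.state = State.TRACKING
            self._gaze_start = None
            return self.state

        # Path 1: speaking while facing the robot.
        if obs.speaking:
            self.state = State.IOI_DETECTED
            return self.state

        # Path 2: silent gaze held longer than the dwell threshold.
        if self._gaze_start is None:
            self._gaze_start = obs.t
        if obs.t - self._gaze_start > self.gaze_dwell_s:
            self.state = State.IOI_DETECTED
        else:
            self.state = State.GAZE_HOLD
        return self.state
```

In this sketch, speech alone (without the face turned toward the robot) leaves the detector in `TRACKING`, which mirrors the abstract's requirement that the speaker's face be turned toward the robot. In a ROS setting, `step` would presumably be called from a callback that fuses the audio and vision topics, but that wiring is outside what the source specifies.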
