Starting engagement detection towards a companion robot using multimodal features

Dominique Vaufreydaz (INRIA Grenoble Rhône-Alpes / LIG Laboratoire d'Informatique de Grenoble); Wafa Johal (LIG); Claudine Combe (INRIA Grenoble Rhône-Alpes / LIG Laboratoire d'Informatique de Grenoble)

arXiv:1503.03732 · cs.RO · March 13, 2015

TL;DR

This paper presents a multimodal, feature-based method, inspired by the social and cognitive sciences, for detecting a person's intention to start interacting with a robot. It reports improved accuracy over classical spatial features in real-world conditions.

Contribution

It introduces a novel multimodal feature set for engagement detection, validates it on spontaneous interaction data, and highlights both the importance of feature selection and the difficulty of reducing the feature space.

Findings

01

Multimodal features outperform spatial features in detection accuracy.

02

Seven features are sufficient for effective engagement detection.

03

Reducing the feature space remains a difficult problem.
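
The second finding concerns keeping only a small subset of features. This summary does not say how the authors selected their seven, so the following is only a minimal sketch of the general idea, under stated assumptions: it ranks features by absolute Pearson correlation with the engagement label and keeps the top k. The ranking criterion, the function names (`pearson`, `top_k_features`), and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def top_k_features(columns: Dict[str, List[float]],
                   labels: List[float], k: int) -> List[str]:
    """Rank feature columns by |correlation with the label|, keep the best k.

    Assumption: correlation ranking is a stand-in criterion, not the
    selection procedure used in the paper.
    """
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], labels)),
                    reverse=True)
    return ranked[:k]


# Toy data (hypothetical): four frames, label 1 = person is about to engage.
labels = [0.0, 0.0, 1.0, 1.0]
columns = {
    "distance": [3.0, 2.5, 1.0, 0.8],  # shrinks as the person approaches
    "speed": [0.1, 0.3, 0.2, 0.8],
    "noise": [1.0, 1.0, 1.0, 1.0],     # constant, carries no information
}
selected = top_k_features(columns, labels, k=2)
```

A univariate ranking like this ignores redundancy between features, which is one reason reducing a feature space is harder than it looks.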

Abstract

Recognition of intentions is a subconscious cognitive process vital to human communication. This skill enables anticipation and increases the quality of interactions between humans. Within the context of engagement, non-verbal signals are used to communicate the intention of starting an interaction with a partner. In this paper, we investigate methods to detect these signals so that a robot can know when it is about to be addressed. The originality of our approach resides in taking inspiration from social and cognitive sciences to perform our perception task. We investigate meaningful features, i.e. human-readable features, and elicit which of these are important for recognizing someone's intention of starting an interaction. Classically, spatial information such as the human's position and speed and the human-robot distance is used to detect engagement. Our approach integrates…
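
The classical spatial features named in the abstract (human-robot distance and human speed) can be derived from a tracked position. The sketch below is illustrative only: it assumes 2D positions sampled at a fixed interval `dt`, and the function name `spatial_features` and the sample track are hypothetical rather than taken from the paper.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def spatial_features(track: List[Point], robot: Point,
                     dt: float) -> List[Dict[str, float]]:
    """Per-frame human-robot distance and human speed from a 2D track.

    Assumptions: positions are in metres, frames are `dt` seconds apart,
    and the speed of the first frame is set to 0.0 for lack of history.
    """
    feats = []
    for i, (x, y) in enumerate(track):
        distance = math.hypot(x - robot[0], y - robot[1])
        if i == 0:
            speed = 0.0
        else:
            px, py = track[i - 1]
            speed = math.hypot(x - px, y - py) / dt
        feats.append({"distance": distance, "speed": speed})
    return feats


# A person walking towards a robot placed at the origin (hypothetical data).
track = [(3.0, 4.0), (3.0, 3.0), (3.0, 2.0)]
features = spatial_features(track, robot=(0.0, 0.0), dt=0.5)
```

Each frame yields one small feature vector; a multimodal approach would add further columns from other sensors alongside these.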
