iCub! Do you recognize what I am doing?: multimodal human action recognition on multisensory-enabled iCub robot

Kas Kniesmeijer; Murat Kirtay

arXiv:2212.08859·cs.RO·December 20, 2022

Open Access

TL;DR

This paper presents a multimodal action recognition system for the iCub robot that combines color and depth data to improve accuracy in human-robot interaction scenarios.

Contribution

It introduces an ensemble learning approach that leverages multiple sensory modalities on the iCub robot for enhanced human action recognition.

Findings

01

Multimodal ensemble learning improves recognition accuracy.

02

Combining color and depth sensors yields better performance than single modalities.

03

Models can be deployed on iCub for social and contextual interaction tasks.

Abstract

This study uses multisensory data (i.e., color and depth) to recognize human actions in the context of multimodal human-robot interaction. We employed the iCub robot to observe human partners performing predefined actions with four different tools on 20 objects. We show that the proposed multimodal ensemble learning leverages the complementary characteristics of three color cameras and one depth sensor and, in most cases, improves recognition accuracy compared to models trained on a single modality. The results indicate that the proposed models can be deployed on the iCub robot for tasks that require multimodal action recognition, including social tasks such as partner-specific adaptation and contextual behavior understanding.
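To make the idea of a multimodal ensemble concrete, here is a minimal sketch of one common fusion rule, soft voting, in which each per-sensor model outputs class probabilities and the ensemble averages them. The abstract does not state which combination rule the authors use, so the averaging rule, the function names (`soft_vote`, `predict`), the modality keys, the four-class setup, and all numbers below are illustrative assumptions, not the paper's method or data.

```python
from typing import Dict, List


def soft_vote(probs_by_modality: Dict[str, List[float]]) -> List[float]:
    """Average class-probability vectors across modalities (soft voting).

    Assumption: every modality-specific model scores the same classes
    in the same order.
    """
    vectors = list(probs_by_modality.values())
    if not vectors:
        raise ValueError("need at least one modality")
    n_classes = len(vectors[0])
    if any(len(v) != n_classes for v in vectors):
        raise ValueError("all modalities must score the same classes")
    return [sum(v[c] for v in vectors) / len(vectors) for c in range(n_classes)]


def predict(probs_by_modality: Dict[str, List[float]]) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = soft_vote(probs_by_modality)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of four single-modality models (three color cameras
# and one depth sensor) over four made-up action classes.
example = {
    "color_cam_1": [0.40, 0.35, 0.15, 0.10],
    "color_cam_2": [0.30, 0.45, 0.15, 0.10],
    "color_cam_3": [0.25, 0.50, 0.15, 0.10],
    "depth": [0.45, 0.30, 0.15, 0.10],
}

fused = soft_vote(example)
label = predict(example)
print(label)
```

In this made-up example, `color_cam_1` and `depth` would each pick class 0 on their own, while the averaged scores favor class 1. This illustrates how fusing modalities can change the decision a single sensor would make; it says nothing about which decision is correct.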

Taxonomy

Topics: Social Robot Interaction and HRI · Human Pose and Action Recognition · Multimodal Machine Learning Applications