iCub! Do you recognize what I am doing?: multimodal human action recognition on multisensory-enabled iCub robot
Kas Kniesmeijer, Murat Kirtay

TL;DR
This paper presents a multimodal action recognition system for the iCub robot that combines color and depth data to improve accuracy in human-robot interaction scenarios.
Contribution
It introduces an ensemble learning approach that leverages multiple sensory modalities on the iCub robot for enhanced human action recognition.
Findings
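Multimodal ensemble learning improves recognition accuracy in most cases compared with models trained on a single modality.
The gain comes from the complementary characteristics of three color cameras and one depth sensor.
The results indicate that the models can be deployed on iCub for tasks such as partner-specific adaptation and contextual behavior understanding.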
Abstract
This study uses multisensory data (i.e., color and depth) to recognize human actions in the context of multimodal human-robot interaction. We employed the iCub robot to observe human partners performing predefined actions with four different tools on 20 objects. We show that the proposed multimodal ensemble learning leverages the complementary characteristics of three color cameras and one depth sensor and, in most cases, improves recognition accuracy compared with models trained on a single modality. The results indicate that the proposed models can be deployed on the iCub robot for tasks that require multimodal action recognition, including social tasks such as partner-specific adaptation and contextual behavior understanding, to mention a few.
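The idea of combining per-sensor models can be illustrated with a minimal late-fusion sketch. This is not the authors' implementation: the abstract does not state the fusion rule, so the weighted averaging of class probabilities used here is an assumption, and the function names, modality names, and probability values are all hypothetical.

```python
from typing import Dict, List, Optional


def fuse_predictions(
    probs_by_modality: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of per-modality class probabilities (late fusion).

    Assumption: each single-modality model outputs a probability per action
    class. The averaging rule is illustrative, not taken from the paper.
    """
    if not probs_by_modality:
        raise ValueError("need at least one modality")
    names = list(probs_by_modality)
    n_classes = len(probs_by_modality[names[0]])
    if any(len(p) != n_classes for p in probs_by_modality.values()):
        raise ValueError("all modalities must score the same classes")
    if weights is None:
        # Equal weight for every sensor unless told otherwise.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = [0.0] * n_classes
    for name in names:
        w = weights[name] / total
        for i, p in enumerate(probs_by_modality[name]):
            fused[i] += w * p
    return fused


def predict_action(
    probs_by_modality: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> int:
    """Index of the action class with the highest fused probability."""
    fused = fuse_predictions(probs_by_modality, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of four single-modality models over three action classes.
example = {
    "color_cam_1": [0.6, 0.3, 0.1],
    "color_cam_2": [0.2, 0.5, 0.3],
    "color_cam_3": [0.3, 0.4, 0.3],
    "depth": [0.1, 0.7, 0.2],
}

if __name__ == "__main__":
    print(predict_action(example))
```

In this made-up example the first color camera alone would pick class 0, while the fused scores favor class 1, which is the kind of disagreement an ensemble is meant to resolve. Passing a `weights` dictionary lets a more reliable sensor count for more.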
Taxonomy
Topics: Social Robot Interaction and HRI · Human Pose and Action Recognition · Multimodal Machine Learning Applications
