Safe Multimodal Communication in Human-Robot Collaboration

Davide Ferrari; Andrea Pupa; Alberto Signoretti; Cristian Secchi

arXiv:2308.03690 · cs.RO · September 12, 2024

TL;DR

This paper presents a framework for safe, multimodal human-robot communication using voice and gesture fusion, ensuring safety compliance and improved collaboration efficiency in industrial settings.

Contribution

It introduces a novel multimodal communication framework that combines voice and gesture inputs with safety regulation adherence for human-robot collaboration.

Findings

01. Multimodal communication improves information extraction for robot tasks.

02. The safety layer allows robots to adjust speed for operator safety.

03. Experimental validation shows enhanced collaboration efficiency.

Abstract

New industrial settings are characterized by humans and robots working in close proximity and cooperating to perform the required job. Such collaboration, however, requires attention to many aspects. First, communication between these two actors must be natural and efficient. Second, the robot's behavior must always comply with safety regulations, so that the collaboration is always safe. In this paper, we propose a framework that enables multi-channel communication between humans and robots by leveraging multimodal fusion of voice and gesture commands while always respecting safety regulations. The framework is validated through a comparative experiment, demonstrating that, thanks to multimodal communication, the robot can extract valuable information for performing the required task and additionally, with the safety layer,…

Taxonomy

Topics: Speech and dialogue systems · Social Robot Interaction and HRI · Robotics and Automated Systems

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings