Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks

P.C. McGuire; J. Fritsch; J. J. Steil; F. Roethling; G. A. Fink; S. Wachsmuth; G. Sagerer; H. Ritter

arXiv:cs/0505064·cs.HC·November 17, 2016

TL;DR

This paper presents a hybrid multi-modal system for intuitive human-robot communication when instructing grasping tasks. It integrates visual, speech, and gestural inputs to make robots easier to program.

Contribution

It introduces a novel hybrid architecture combining statistical methods, neural networks, and finite state machines for multi-modal human-robot interaction in grasping tasks.

Findings

1. Successful integration of visual, speech, and gesture modalities
2. Enhanced robot understanding of human instructions
3. Improved efficiency in teaching grasping tasks

Abstract

A major challenge for the realization of intelligent robots is to supply them with cognitive abilities in order to allow ordinary users to program them easily and intuitively. One approach to such programming is teaching work tasks by interactive demonstration. To make this effective and convenient for the user, the machine must be capable of establishing a common focus of attention and be able to use and integrate spoken instructions, visual perceptions, and non-verbal cues like gestural commands. We report progress in building a hybrid architecture that combines statistical methods, neural networks, and finite state machines into an integrated system for instructing grasping tasks by man-machine interaction. The system combines the GRAVIS-robot for visual attention and gestural instruction with an intelligent interface for speech recognition and linguistic interpretation, and a modality…
