# Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction

**Authors:** Susanne Trick, Dorothea Koert, Jan Peters, Constantin Rothkopf

arXiv: 1907.02426 · 2019-07-05

## TL;DR

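This paper introduces a multimodal intention recognition method for human-robot interaction. It trains a separate intention classifier for each of four modalities (speech, gestures, gaze directions, and scene objects) and fuses their output distributions with the Bayesian method Independent Opinion Pool, which reduces uncertainty about the intention and improves accuracy.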

## Contribution

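A multimodal intention recognition approach that, in contrast to existing methods, focuses on uncertainty reduction: per-modality classifiers each output a probability distribution over all possible intentions, and these distributions are combined using Independent Opinion Pool.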

## Key findings

- Fused classifiers that combine multiple modalities are more accurate than the respective individual base classifiers.
- Fusion increases robustness against the failure of individual modalities.
- Fusion reduces the uncertainty about the intention to be predicted.

## Abstract

Assistive robots can potentially improve the quality of life and personal independence of elderly people by supporting everyday life activities. To guarantee a safe and intuitive interaction between human and robot, human intentions need to be recognized automatically. As humans communicate their intentions multimodally, the use of multiple modalities for intention recognition may not just increase the robustness against failure of individual modalities but especially reduce the uncertainty about the intention to be predicted. This is desirable as particularly in direct interaction between robots and potentially vulnerable humans a minimal uncertainty about the situation as well as knowledge about this actual uncertainty is necessary. Thus, in contrast to existing methods, in this work a new approach for multimodal intention recognition is introduced that focuses on uncertainty reduction through classifier fusion. For the four considered modalities speech, gestures, gaze directions and scene objects individual intention classifiers are trained, all of which output a probability distribution over all possible intentions. By combining these output distributions using the Bayesian method Independent Opinion Pool the uncertainty about the intention to be recognized can be decreased. The approach is evaluated in a collaborative human-robot interaction task with a 7-DoF robot arm. The results show that fused classifiers which combine multiple modalities outperform the respective individual base classifiers with respect to increased accuracy, robustness, and reduced uncertainty.
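The fusion step described in the abstract can be illustrated with a minimal sketch. Assuming a uniform prior over intentions and conditionally independent modalities, Independent Opinion Pool reduces to a normalized elementwise product of the classifiers' output distributions. The function names, the three-intention setup, and the probability values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def independent_opinion_pool(distributions):
    """Fuse categorical distributions by a normalized elementwise product.

    Assumes a uniform prior over classes and conditionally independent
    modalities (an assumption of this sketch, not a detail from the paper).
    """
    n_classes = len(distributions[0])
    fused = [1.0] * n_classes
    for dist in distributions:
        if len(dist) != n_classes:
            raise ValueError("all distributions must cover the same classes")
        fused = [f * p for f, p in zip(fused, dist)]
    total = sum(fused)
    if total == 0.0:
        raise ValueError("distributions have no common support")
    return [f / total for f in fused]


def entropy(dist):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


# Hypothetical outputs of two base classifiers over three intentions.
speech = [0.6, 0.3, 0.1]
gaze = [0.5, 0.2, 0.3]
fused = independent_opinion_pool([speech, gaze])

# A modality that carries no information (uniform output) leaves the
# fused result unchanged.
uniform = [1.0 / 3.0] * 3
fused_with_uninformative = independent_opinion_pool([speech, uniform])

print("fused:", [round(p, 3) for p in fused])
print("entropy speech/gaze/fused:",
      round(entropy(speech), 3), round(entropy(gaze), 3), round(entropy(fused), 3))
```

In this example both modalities favor the same intention, so the fused distribution is more peaked and has lower entropy than either input. That is not guaranteed in general: strongly conflicting inputs such as `[0.9, 0.1]` and `[0.1, 0.9]` fuse to `[0.5, 0.5]`, which is more uncertain than either input.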

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.02426/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1907.02426/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/1907.02426/full.md

---
Source: https://tomesphere.com/paper/1907.02426