# Speech-Gesture Mapping and Engagement Evaluation in Human Robot Interaction

**Authors:** Bishal Ghosh, Abhinav Dhall, Ekta Singla

arXiv: 1812.03484 · 2024-10-01

## TL;DR

This paper presents an end-to-end system for robot-human interaction that maps gestures used prominently by TED speakers to their speech context, modulates the robot's speech according to audience attention, and evaluates engagement through a social survey and head-pose detection.

## Contribution

The paper introduces a gesture-to-speech mapping learned from the dominant gestures of TED speakers, detected with a Convolutional Pose Machine, and implements a robot that monitors audience attention through its camera and adjusts its speech pattern accordingly.

## Key findings

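- Dominant gestures of TED speakers were mapped to their corresponding speech context
- The robot adjusted its speech pattern according to an audience attention level computed from camera feedback
- Interaction effectiveness and the robot's improvisation decisions were evaluated using head-pose detection and an interaction survey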

## Abstract

A robot needs contextual awareness, effective speech production, and complementary non-verbal gestures to communicate successfully in society. In this paper, we present an end-to-end system that aims to enhance the effectiveness of non-verbal gestures. To achieve this, we identified gestures used prominently by TED speakers in their performances, mapped them to the corresponding speech context, and modulated speech based on the attention of the listener. The proposed method uses a Convolutional Pose Machine [4] to detect human gestures. The dominant gestures of TED speakers were used to learn the gesture-to-speech mapping, and their speeches were used to train the model. We also evaluated the robot's engagement with people by conducting a social survey. The robot monitored the effectiveness of its performance and improvised its speech pattern according to the attention level of the audience, which was calculated using visual feedback from the camera. The effectiveness of the interaction, as well as the decisions made during improvisation, was further evaluated using head-pose detection and an interaction survey.
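The abstract describes an attention level computed from camera feedback that drives changes in the robot's speech pattern. The sketch below illustrates one way such a loop could look. It is not the authors' implementation: the `HeadPose` type, the angle limits, the attention thresholds, and the style names are all hypothetical choices made for illustration, and the head-pose angles are assumed to come from some upstream estimator.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HeadPose:
    """Hypothetical per-listener head pose, in degrees relative to the robot."""
    yaw_deg: float
    pitch_deg: float


def attention_level(
    poses: Sequence[HeadPose],
    max_yaw_deg: float = 30.0,    # assumed limit, not from the paper
    max_pitch_deg: float = 20.0,  # assumed limit, not from the paper
) -> float:
    """Return the fraction of listeners whose head is roughly facing the robot."""
    if not poses:
        return 0.0
    attentive = sum(
        1
        for p in poses
        if abs(p.yaw_deg) <= max_yaw_deg and abs(p.pitch_deg) <= max_pitch_deg
    )
    return attentive / len(poses)


def choose_speech_style(level: float, low: float = 0.4, high: float = 0.7) -> str:
    """Map an attention level in [0, 1] to a hypothetical speech style label."""
    if level < low:
        return "emphatic"  # attention is low: change delivery strongly
    if level < high:
        return "varied"    # attention is moderate: vary pace and pitch
    return "neutral"       # attention is high: keep the current delivery


if __name__ == "__main__":
    audience = [
        HeadPose(5.0, 0.0),
        HeadPose(45.0, 0.0),    # looking away sideways
        HeadPose(-10.0, 10.0),
        HeadPose(0.0, -35.0),   # looking down
    ]
    level = attention_level(audience)
    print(level, choose_speech_style(level))
```

Treating attention as the share of listeners facing the robot keeps the signal easy to interpret, but a real system would likely smooth it over time before changing the speech pattern.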

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.03484/full.md

## Figures

29 figures with captions in the complete paper: https://tomesphere.com/paper/1812.03484/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1812.03484/full.md

---
Source: https://tomesphere.com/paper/1812.03484