Towards More Realistic Human-Robot Conversation: A Seq2Seq-based Body Gesture Interaction System
Minjie Hua, Fuyuan Shi, Yibing Nan, Kai Wang, Hao Chen, and Shiguo Lian

TL;DR
This paper introduces a seq2seq-based system that lets robots produce realistic upper-body gestures during conversation. Its listening and speaking models are trained on 3D keypoint motion extracted from videos of human conversations.
Contribution
The system pairs a listening model with a speaking model, both adapted from the seq2seq architecture, to generate body gestures in the corresponding conversational phases and make robot communication more realistic.
Findings
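The listening and speaking models achieve low MSE and high cosine similarity on the test dataset.
The tuned system drives conversational gestures on both a virtual avatar and a physical robot (Pepper).
Gesture synthesis learned from keypoints extracted from conversation videos is shown to be effective.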
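The two metrics named above can be sketched in a few lines. This is a minimal illustration in plain Python, assuming each predicted and ground-truth gesture frame has been flattened into a list of coordinates. The function names are hypothetical, and how the paper averages scores over frames or keypoints is not stated here.

```python
import math


def mean_squared_error(pred, target):
    """Average squared difference between two equal-length coordinate lists."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cosine_similarity(pred, target):
    """Cosine of the angle between two equal-length coordinate lists."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    dot = sum(p * t for p, t in zip(pred, target))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_target = math.sqrt(sum(t * t for t in target))
    if norm_pred == 0.0 or norm_target == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_pred * norm_target)
```

Lower MSE means the predicted coordinates sit closer to the ground truth, while a cosine similarity near 1 means the predicted and true vectors point in nearly the same direction regardless of magnitude.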
Abstract
This paper presents a novel system that enables intelligent robots to exhibit realistic body gestures while communicating with humans. The proposed system consists of a listening model and a speaking model used in corresponding conversational phases. Both models are adapted from the sequence-to-sequence (seq2seq) architecture to synthesize body gestures represented by the movements of twelve upper-body keypoints. All the extracted 2D keypoints are first transformed to 3D, then rotated and normalized to discard irrelevant information. A substantial number of videos of human conversations from YouTube are collected and preprocessed to train the listening and speaking models separately, after which the two models are evaluated using metrics of mean squared error (MSE) and cosine similarity on the test dataset. The tuned system is implemented to drive a virtual avatar as well as Pepper, a physical…
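The abstract does not spell out how the 3D keypoints are rotated and normalized, so the following is only one plausible sketch of that kind of preprocessing, not the authors' procedure. It assumes keypoints are (x, y, z) tuples with y as the vertical axis, centers the pose on the shoulder midpoint, rotates about the vertical axis so the shoulder line has no depth component, and scales by shoulder width. The function name and the shoulder index parameters are hypothetical.

```python
import math


def normalize_pose(keypoints, left_shoulder, right_shoulder):
    """Center, rotate about the y axis, and scale a list of (x, y, z) keypoints.

    Assumed convention (not from the paper): after normalization the shoulder
    midpoint is the origin, the shoulder line has zero z component, and the
    shoulder width is 1.
    """
    lx, ly, lz = keypoints[left_shoulder]
    rx, ry, rz = keypoints[right_shoulder]
    # Translation: shoulder midpoint becomes the origin.
    mx, my, mz = (lx + rx) / 2.0, (ly + ry) / 2.0, (lz + rz) / 2.0
    # Shoulder vector, used for both the rotation angle and the scale.
    dx, dy, dz = rx - lx, ry - ly, rz - lz
    width = math.sqrt(dx * dx + dy * dy + dz * dz)
    if width == 0.0:
        raise ValueError("shoulders coincide; cannot normalize")
    # Angle of the shoulder line in the horizontal (x, z) plane.
    theta = math.atan2(dz, dx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    normalized = []
    for x, y, z in keypoints:
        x, y, z = x - mx, y - my, z - mz
        # Rotate about the y axis so the shoulder line lies in the x-y plane.
        x_rot = x * cos_t + z * sin_t
        z_rot = -x * sin_t + z * cos_t
        normalized.append((x_rot / width, y / width, z_rot / width))
    return normalized
```

Removing global position, facing direction, and body size in this way leaves only the relative motion of the keypoints, which is the sort of "irrelevant information" such a step is meant to discard.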
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Multimodal Machine Learning Applications
