MRecGen: Multimodal Appropriate Reaction Generator

Jiaqi Xu; Cheng Luo; Weicheng Xie; Linlin Shen; Xiaofeng Liu; Lu Liu; Hatice Gunes; Siyang Song

arXiv:2307.02609·cs.CV·July 7, 2023

Open Access

TL;DR

This paper introduces MRecGen, a novel multimodal framework that generates synchronized verbal and non-verbal human reactions in response to user behavior, enhancing human-computer interaction realism.

Contribution

It presents the first multimodal reaction generation framework capable of producing synchronized text, audio, and video responses for human-like interactions.

Findings

01 Generates realistic, synchronized multimodal reactions

02 Applicable to virtual agents and robots

03 Demonstrates improved interaction naturalness

Abstract

Verbal and non-verbal human reaction generation is a challenging task, as different reactions could be appropriate for responding to the same behaviour. This paper proposes the first multiple and multimodal (verbal and nonverbal) appropriate human reaction generation framework that can generate appropriate and realistic human-style reactions (displayed in the form of synchronised text, audio and video streams) in response to an input user behaviour. This novel technique can be applied to various human-computer interaction scenarios by generating appropriate virtual agent/robot behaviours. Our demo is available at https://github.com/SSYSteve/MRecGen.

Taxonomy

Topics: Speech and dialogue systems · Social Robot Interaction and HRI · AI in Service Interactions