Large Body Language Models
Saif Punjwani, Larry Heck

TL;DR
This paper introduces Large Body Language Models (LBLMs) and a novel architecture, LBLM-AVA, which combines Transformer-XL and diffusion models to generate realistic, context-aware gestures for virtual agents using multimodal inputs.
Contribution
The paper presents LBLM-AVA, a new architecture integrating multimodal inputs with advanced components for improved gesture generation in human-computer interaction.
Findings
Achieves 30% reduction in Fréchet Gesture Distance
Improves Fréchet Inception Distance by 25%
State-of-the-art performance in gesture realism
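Both Fréchet metrics above compare the feature statistics of generated samples against those of real samples: a Gaussian is fitted to each feature set and the distance between the two Gaussians is reported, with lower values being better. The snippet below is a minimal sketch of that computation, not the paper's evaluation code. The function name `frechet_distance` and the random stand-in features are assumptions for illustration, and the feature extractor (a gesture encoder for FGD, an Inception network for FID) is outside its scope.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each argument is an array of shape (num_samples, feature_dim).
    Hypothetical helper for illustration; not taken from the paper.
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    # Sample covariances; atleast_2d keeps the 1-D feature case working.
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))
    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_g) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)

# Stand-in features: random vectors instead of real encoder outputs.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
generated = real + 2.0  # same spread, mean shifted by 2 in each of 4 dims
print(frechet_distance(real, generated))
```

Because the shifted copy has the same covariance as the original, only the mean term contributes, so the distance here is essentially the squared length of the shift vector.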
Abstract
As virtual agents become increasingly prevalent in human-computer interaction, generating realistic and contextually appropriate gestures in real-time remains a significant challenge. While neural rendering techniques have made substantial progress with static scripts, their applicability to human-computer interactions remains limited. To address this, we introduce Large Body Language Models (LBLMs) and present LBLM-AVA, a novel LBLM architecture that combines a Transformer-XL large language model with a parallelized diffusion model to generate human-like gestures from multimodal inputs (text, audio, and video). LBLM-AVA incorporates several key components enhancing its gesture generation capabilities, such as multimodal-to-pose embeddings, enhanced sequence-to-sequence mapping with redefined attention mechanisms, a temporal smoothing module for gesture sequence coherence, and an…
Taxonomy
Topics: Social Robot Interaction and HRI
Methods: Attention Is All You Need · Dropout · Adaptive Input Representations · Dense Connections · Layer Normalization · Residual Connection · Cosine Annealing · Diffusion · Adaptive Softmax
