B2F: End-to-End Body-to-Face Motion Generation with Style Reference
Bokyung Jang, Eunho Jung, and Yoonsang Lee

TL;DR
B2F is an end-to-end model that generates facial motions aligned with body movements and style references, enhancing virtual character realism and cohesion.
Contribution
B2F learns a disentangled representation of style and content, with style encoded as discrete latent codes learned via the Gumbel-Softmax trick, enabling diverse, style-reflective facial animation generation.
Findings
Generates expressive facial animations synchronized with body movements.
Maintains style consistency and diversity through structured latent codes.
Generalizes well across different characters and styles.
Abstract
Human motion naturally integrates body movements and facial expressions, forming a unified perception. If a virtual character's facial expression does not align well with its body movements, it may weaken the perception of the character as a cohesive whole. Motivated by this, we propose B2F, a model that generates facial motions aligned with body movements. B2F takes a facial style reference as input, generating facial animations that reflect the provided style while maintaining consistency with the associated body motion. To achieve this, B2F learns a disentangled representation of content and style, using alignment and consistency-based objectives. We represent style using discrete latent codes learned via the Gumbel-Softmax trick, enabling diverse expression generation with a structured latent representation. B2F outputs facial motion in the FLAME format, making it compatible with…
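As a rough illustration of the Gumbel-Softmax trick mentioned in the abstract, the sketch below draws a relaxed one-hot sample over a small set of style codes. It is not the authors' implementation: the function name `gumbel_softmax`, the three-entry logits, and the temperature `tau` are all assumptions chosen for illustration, and only the forward sampling step is shown (no gradients or training).

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, hard=False, rng=random):
    """Draw a relaxed one-hot sample over len(logits) categories.

    Hypothetical helper for illustration; not taken from the B2F paper.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / tau)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    probs = [e / total for e in exps]

    if hard:
        # Forward pass of the "hard" variant: snap to the arg-max one-hot vector.
        # (In an autodiff framework this is usually paired with a
        # straight-through gradient, which this sketch does not model.)
        k = max(range(len(probs)), key=probs.__getitem__)
        return [1.0 if i == k else 0.0 for i in range(len(probs))]
    return probs


# Example: sample a soft assignment over three assumed style codes.
sample = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5, rng=random.Random(0))
```

Lower values of `tau` push the sample closer to a one-hot vector, while higher values spread it more evenly across codes; repeated calls with the same logits give different samples, which is what allows varied outputs from one style distribution.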
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation
