Expressive Gaussian Human Avatars from Monocular RGB Video

Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, Zhangyang Wang

arXiv:2407.03204 · cs.CV · July 4, 2024 · 1 citation

Open Access

TL;DR

This paper introduces EVA, a novel framework for creating highly expressive 3D human avatars from monocular RGB videos, emphasizing fine-grained facial and hand details through improved alignment, adaptive density control, and confidence-guided learning.

Contribution

EVA advances avatar expressiveness by integrating a plug-and-play alignment module, adaptive density control, and a confidence prediction mechanism for detailed 3D Gaussian modeling.
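As a rough illustration of what confidence-guided learning can look like, the sketch below weights a per-pixel L1 photometric error by a confidence map and adds a log penalty so the confidence cannot collapse to zero. This is a generic uncertainty-style loss, not the formulation from the paper: the function name `confidence_weighted_l1`, the `reg_weight` parameter, and the exact form of the penalty are all assumptions made for this example.

```python
import numpy as np

def confidence_weighted_l1(pred, target, conf, reg_weight=0.1, eps=1e-6):
    """Hypothetical confidence-weighted photometric loss (illustrative only).

    pred, target: (H, W, 3) rendered and ground-truth images.
    conf:         (H, W) per-pixel confidence in (0, 1].
    """
    # Clamp confidence so the log term stays finite.
    conf = np.clip(conf, eps, 1.0)
    # Per-pixel L1 error, averaged over the color channels.
    residual = np.abs(pred - target).mean(axis=-1)
    # Low confidence down-weights pixels the supervision is unsure about.
    data_term = conf * residual
    # Penalize low confidence so the trivial solution conf -> 0 is not free.
    reg_term = -reg_weight * np.log(conf)
    return float((data_term + reg_term).mean())
```

With full confidence everywhere the loss reduces to a plain mean L1 error; lowering the confidence on a badly mismatched pixel trades a smaller data term for a small regularization cost.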

Findings

01. Outperforms existing methods in capturing fine-grained facial and hand details.

02. Improves alignment accuracy of SMPL-X models in wild video conditions.

03. Demonstrates superior qualitative and quantitative results on benchmark datasets.

Abstract

Nuanced expressiveness, particularly through fine-grained hand and facial expressions, is pivotal for enhancing the realism and vitality of digital human representations. In this work, we focus on investigating the expressiveness of human avatars when learned from monocular RGB video, a setting that introduces new challenges in capturing and animating fine-grained details. To this end, we introduce EVA, a drivable human model that meticulously sculpts fine details based on 3D Gaussians and SMPL-X, an expressive parametric human model. Focused on enhancing expressiveness, our work makes three key contributions. First, we highlight the critical importance of aligning the SMPL-X model with RGB frames for effective avatar learning. Recognizing the limitations of current SMPL-X prediction methods for in-the-wild videos, we introduce a plug-and-play module that significantly ameliorates…

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Social Robot Interaction and HRI
