GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh

Jing Wen; Xiaoming Zhao; Zhongzheng Ren; Alexander G. Schwing; Shenlong Wang

arXiv:2404.07991·cs.CV·April 12, 2024·2 cites

TL;DR

GoMAvatar is a real-time, memory-efficient method for creating animatable 3D human avatars from monocular videos, combining Gaussian splatting with mesh deformation for high-quality rendering and pose re-articulation.

Contribution

The paper introduces the Gaussians-on-Mesh representation, a hybrid of Gaussian splatting and deformable meshes that enables high-quality, real-time human avatar creation from a single monocular video while remaining compatible with rasterization-based graphics pipelines.

Findings

01. Renders at 43 FPS.

02. Uses only 3.63 MB of memory per subject.

03. Matches or surpasses existing monocular human modeling methods in rendering quality and significantly outperforms them in computational efficiency.

Abstract

We introduce GoMAvatar, a novel approach for real-time, memory-efficient, high-quality animatable human modeling. GoMAvatar takes as input a single monocular video to create a digital avatar capable of re-articulation in new poses and real-time rendering from novel viewpoints, while seamlessly integrating with rasterization-based graphics pipelines. Central to our method is the Gaussians-on-Mesh representation, a hybrid 3D model combining rendering quality and speed of Gaussian splatting with geometry modeling and compatibility of deformable meshes. We assess GoMAvatar on ZJU-MoCap data and various YouTube videos. GoMAvatar matches or surpasses current monocular human modeling algorithms in rendering quality and significantly outperforms them in computational efficiency (43 FPS) while being memory-efficient (3.63 MB per subject).
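The abstract does not spell out how Gaussians are attached to the mesh, so the following is only a minimal sketch of one plausible reading of a "Gaussians-on-Mesh" idea: one Gaussian anchored to each triangle face, with its mean at the face centroid and its orientation taken from a local face frame. The one-Gaussian-per-face layout, the frame construction, and every name here (`face_frame`, `gaussians_on_mesh`) are illustrative assumptions, not the authors' implementation. The point it illustrates is that Gaussians tied to faces follow the mesh automatically when the mesh is re-posed.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(a):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)


def face_frame(v0, v1, v2):
    """Return (centroid, tangent, bitangent, normal) for one triangle.

    Assumption: this particular frame is a common convention, chosen
    here for illustration only.
    """
    centroid = tuple((a + b + c) / 3.0 for a, b, c in zip(v0, v1, v2))
    tangent = normalize(sub(v1, v0))
    normal = normalize(cross(sub(v1, v0), sub(v2, v0)))
    bitangent = cross(normal, tangent)
    return centroid, tangent, bitangent, normal


def gaussians_on_mesh(vertices, faces):
    """Hypothetical helper: place one Gaussian per face.

    Each Gaussian is described only by its mean and orientation axes;
    scales, colors, and opacities are omitted from this sketch.
    """
    gaussians = []
    for i0, i1, i2 in faces:
        c, t, b, n = face_frame(vertices[i0], vertices[i1], vertices[i2])
        gaussians.append({"mean": c, "axes": (t, b, n)})
    return gaussians


# A single triangle in a rest pose, then the same triangle moved along z.
rest_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2)]
posed_vertices = [(x, y, z + 2.0) for x, y, z in rest_vertices]

rest = gaussians_on_mesh(rest_vertices, faces)
posed = gaussians_on_mesh(posed_vertices, faces)
```

Under these assumptions, only the mesh vertices need to be deformed for a new pose; the per-face Gaussians are recomputed from the deformed faces, which is one way such a hybrid could stay compatible with a standard mesh pipeline.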

Code & Models

Repositories

wenj/GoMAvatar (PyTorch)

Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Human Motion and Animation

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings