GALA: Generating Animatable Layered Assets from a Single Scan

Taeksoo Kim; Byungjun Kim; Shunsuke Saito; Hanbyul Joo

arXiv:2401.12979 · cs.CV · January 24, 2024 · 1 citation

TL;DR

GALA is a framework that decomposes a single 3D human mesh into multi-layered assets using a pretrained 2D diffusion model, enabling realistic avatar creation and pose reanimation.

Contribution

GALA introduces a novel pose-guided Score Distillation Sampling (SDS) loss for synthesizing high-fidelity 3D geometry and texture from a single scan, improving both the decomposition and the pose normalization of human meshes.
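For readers unfamiliar with SDS, the following is a minimal sketch of the generic score distillation update, not GALA's pose-guided variant. In its usual form, the gradient with respect to a rendered image x is w(t) · (predicted noise − injected noise), back-propagated through the renderer. Everything here is an assumption made for illustration: the "renderer" is the identity (the image itself is optimized), `toy_noise_predictor` is a hypothetical stand-in for a pretrained diffusion model, the noise schedule is a toy one, and pose conditioning is omitted entirely.

```python
import numpy as np


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a pretrained diffusion model.

    Acts like an ideal denoiser that believes the clean image is `target`.
    A real model would be conditioned on a prompt (and, for pose guidance,
    on some pose signal); neither is modeled here.
    """
    return (x_t - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)


def sds_gradient(x, target, rng, t_range=(0.02, 0.98)):
    """One-sample estimate of the SDS gradient w.r.t. the image x."""
    # Toy schedule (assumption): alpha_bar = 1 - t, with t sampled uniformly.
    alpha_bar = 1.0 - rng.uniform(*t_range)
    eps = rng.standard_normal(x.shape)
    # Forward diffusion: noise the current image to timestep t.
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    eps_pred = toy_noise_predictor(x_t, alpha_bar, target)
    # Weighting w(t) = sigma_t^2 = 1 - alpha_bar, one common choice.
    w = 1.0 - alpha_bar
    # The denoiser's Jacobian is skipped, as in standard SDS.
    return w * (eps_pred - eps)


def optimize(x0, target, steps=200, lr=0.1, seed=0):
    """Gradient descent on the image using SDS gradient estimates."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(steps):
        x -= lr * sds_gradient(x, target, rng)
    return x
```

With this idealized predictor, each step pulls the image toward the content the "model" considers likely, which is the intuition behind using a 2D diffusion prior to fill in unobserved regions. In a real pipeline the gradient would flow through a differentiable renderer into 3D geometry or texture parameters rather than into pixels directly.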

Findings

01 Effective decomposition into layered assets demonstrated

02 Supports pose normalization and reanimation

03 Outperforms existing methods in quality and flexibility

Abstract

We present GALA, a framework that takes as input a single-layer clothed 3D human mesh and decomposes it into complete multi-layered 3D assets. The outputs can then be combined with other assets to create novel clothed human avatars with any pose. Existing reconstruction approaches often treat clothed humans as a single layer of geometry and overlook the inherent compositionality of humans with hairstyles, clothing, and accessories, thereby limiting the utility of the meshes for downstream applications. Decomposing a single-layer mesh into separate layers is a challenging task because it requires the synthesis of plausible geometry and texture for the severely occluded regions. Moreover, even with successful decomposition, meshes are not normalized in terms of poses and body shapes, which prevents coherent composition with novel identities and poses. To address these challenges, we propose to…

Taxonomy

Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis

Methods: Inpainting · Diffusion · Global-and-Local attention