TL;DR
UNICA is a unified neural framework that generates controllable 3D avatars from keyboard inputs. It handles motion, appearance, and rendering in a single model and captures avatar dynamics without manually designed physical simulation.
Contribution
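The paper introduces UNICA, the first skeleton-free neural model that unifies avatar control, motion, and rendering in a single framework. It combines an action-conditioned diffusion model, which generates geometry, with a point transformer, which maps that geometry to 3D Gaussian Splatting for rendering.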
Findings
Generates high-fidelity 3D avatars from simple keyboard controls.
Captures hair and loose-clothing dynamics without manually designed physical simulation.
Supports long autoregressive avatar generation.
Abstract
Controllable 3D human avatars have found widespread applications in 3D games, the metaverse, and AR/VR scenarios. The conventional approach to creating such a 3D avatar requires a lengthy, intricate pipeline encompassing appearance modeling, motion planning, rigging, and physical simulation. In this paper, we introduce UNICA (UNIfied neural Controllable Avatar), a skeleton-free generative model that unifies all avatar control components into a single neural framework. Given keyboard inputs akin to video game controls, UNICA generates the next frame of a 3D avatar's geometry through an action-conditioned diffusion model operating on 2D position maps. A point transformer then maps the resulting geometry to 3D Gaussian Splatting for high-fidelity free-view rendering. Our approach naturally captures hair and loose clothing dynamics without manually designed physical simulation, and supports…
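The data flow described in the abstract (key press, then next-frame geometry as a 2D position map, then Gaussians for rendering, repeated autoregressively) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation: the action vocabulary, the map resolution, and the functions `encode_action`, `sample_next_position_map`, `geometry_to_gaussians`, and `rollout` are all hypothetical names, and the learned diffusion model and point transformer are replaced by simple hand-written stand-ins.

```python
import random
from dataclasses import dataclass

# Hypothetical action vocabulary; the abstract only says
# "keyboard inputs akin to video game controls".
ACTIONS = ("W", "A", "S", "D", "IDLE")
H, W = 4, 4  # tiny position-map resolution, for illustration only


def encode_action(key):
    """One-hot encode a key press (an assumed conditioning format)."""
    return [1.0 if key == a else 0.0 for a in ACTIONS]


def rest_position_map():
    """A position map stores one 3D point (x, y, z) per 2D texel."""
    return [[(u / W, v / H, 0.0) for u in range(W)] for v in range(H)]


def sample_next_position_map(prev_map, action, rng, steps=4):
    """Stand-in for the action-conditioned diffusion model.

    Starts from Gaussian noise and iteratively moves toward a target.
    The target is a toy rule (previous geometry shifted by the action),
    not a learned denoiser.
    """
    dx = action[3] - action[1]  # D minus A
    dz = action[0] - action[2]  # W minus S
    target = [[(x + 0.1 * dx, y, z + 0.1 * dz) for (x, y, z) in row]
              for row in prev_map]
    current = [[tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in row]
               for row in prev_map]
    for i in range(steps):
        t = 1.0 / (steps - i)  # the final step lands on the target
        current = [[tuple(c + t * (g - c) for c, g in zip(cp, tp))
                    for cp, tp in zip(cur_row, tgt_row)]
                   for cur_row, tgt_row in zip(current, target)]
    return current


@dataclass
class Gaussian:
    mean: tuple      # 3D center
    scale: float     # isotropic extent (a simplification)
    opacity: float
    color: tuple     # RGB


def geometry_to_gaussians(position_map):
    """Stand-in for the point transformer: one Gaussian per texel,
    with fixed attributes instead of predicted ones."""
    return [Gaussian(mean=p, scale=0.01, opacity=1.0, color=(0.5, 0.5, 0.5))
            for row in position_map for p in row]


def rollout(keys, seed=0):
    """Autoregressive generation: each new frame is conditioned on the
    previously generated geometry and the current key press."""
    rng = random.Random(seed)
    geometry = rest_position_map()
    frames = []
    for key in keys:
        geometry = sample_next_position_map(geometry, encode_action(key), rng)
        frames.append(geometry_to_gaussians(geometry))
    return frames
```

Calling `rollout(["W", "W", "D"])` returns three frames, each a list of `H * W` Gaussians that a splatting renderer could consume. The point of the sketch is the loop structure: the output of one step becomes the conditioning input of the next, which is what makes long sequences possible without a skeleton or a separate physics pass.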
