HumanRAM: Feed-forward Human Reconstruction and Animation Model using Transformers

Zhiyuan Yu; Zhe Li; Hujun Bao; Can Yang; Xiaowei Zhou

arXiv:2506.03118 · cs.GR · June 4, 2025

TL;DR

HumanRAM is a novel feed-forward transformer-based model that enables real-time, high-quality 3D human reconstruction and animation from monocular or sparse images, surpassing previous methods in accuracy and generalization.

Contribution

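HumanRAM unifies human reconstruction and animation in a single feed-forward framework by introducing explicit pose conditions, parameterized by a shared SMPL-X neural texture, into transformer-based large reconstruction models (LRM), which avoids time-consuming per-subject optimization.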

Findings

01 Outperforms previous methods in reconstruction accuracy

02 Achieves high-fidelity pose-controlled animation

03 Demonstrates strong generalization on real-world datasets

Abstract

3D human reconstruction and animation are long-standing topics in computer graphics and vision. However, existing methods typically rely on sophisticated dense-view capture and/or time-consuming per-subject optimization procedures. To address these limitations, we propose HumanRAM, a novel feed-forward approach for generalizable human reconstruction and animation from monocular or sparse human images. Our approach integrates human reconstruction and animation into a unified framework by introducing explicit pose conditions, parameterized by a shared SMPL-X neural texture, into transformer-based large reconstruction models (LRM). Given monocular or sparse input images with associated camera parameters and SMPL-X poses, our model employs scalable transformers and a DPT-based decoder to synthesize realistic human renderings under novel viewpoints and novel poses. By leveraging the explicit…
