FFAvatar: Few-Shot, Feed-Forward, and Generalizable Avatar Reconstruction

Thuan Hoang Nguyen; Jiahao Luo; Yinyu Nie; Hao Li; Gordon Guocheng Qian; Jian Wang

arXiv:2605.15320·cs.GR·May 18, 2026

TL;DR

FFAvatar is a fast, generalizable framework that reconstructs high-quality 3D avatars from a few images, without hours of per-subject optimization or extensive preprocessing.

Contribution

FFAvatar introduces a feed-forward framework and a three-stage training curriculum that together achieve broad generalization and high-fidelity avatar reconstruction from a few input images.

Findings

01

Outperforms the state-of-the-art LAM by 5.5 dB PSNR on the NeRSemble benchmark.

02

Reconstructs avatars in 2 seconds without personalization and 10 seconds with personalization.

03

Supports 49 FPS animation on a single GPU.
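
To put the reported numbers in perspective, the short sketch below converts a PSNR gain into the corresponding reduction in mean squared error and a frame rate into a per-frame time budget. It uses only the standard PSNR definition; the function names (`psnr`, `mse_ratio_for_gain`, `frame_budget_ms`) and the example MSE value are illustrative assumptions, not taken from the paper.

```python
import math


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (gain_db / 10.0)


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Figures quoted in the findings above.
ratio = mse_ratio_for_gain(5.5)   # roughly 3.5x lower MSE
budget = frame_budget_ms(49)      # roughly 20 ms per frame

# Hypothetical baseline error, chosen only to illustrate the relationship.
baseline_mse = 0.01
improved_mse = baseline_mse / ratio
gain = psnr(improved_mse) - psnr(baseline_mse)

print(f"MSE reduction factor: {ratio:.2f}")
print(f"Per-frame budget: {budget:.1f} ms")
print(f"Recovered PSNR gain: {gain:.2f} dB")
```

Because PSNR is logarithmic, a fixed gain in dB corresponds to the same multiplicative drop in MSE regardless of the baseline quality.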

Abstract

Avatar reconstruction has traditionally relied on per-subject optimization that requires hours of computation or on expensive preprocessing that limits scalability. We introduce FFAvatar, a generalizable feed-forward framework that reconstructs high-quality, animatable 3D Gaussian head avatars from few-shot unposed portrait images in seconds. FFAvatar fuses information from multiple source images into a unified canonical Gaussian representation through a Multi-View Query-Former, which is animated via FLAME parameters predicted end-to-end directly from pixels, eliminating the overhead of offline FLAME extraction. We further propose a three-stage training curriculum that achieves both broad generalization and high-fidelity reconstruction: (i) scalable pretraining on extensive monocular video data with over 1M identities to learn strong generalizable priors; (ii) multi-view fine-tuning on a…
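
The abstract does not spell out how the Multi-View Query-Former fuses images, so the following is only a generic sketch of query-based cross-attention, the family of techniques the name suggests. It is an assumption, not the paper's architecture: there are no learned projections, the "image tokens" are random vectors, and every name (`cross_attend`, `NUM_QUERIES`, `TOKENS_PER_VIEW`) is hypothetical. What it illustrates is one property such designs share: a fixed set of queries yields a fixed-size output no matter how many source views are supplied.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, tokens):
    """Each query returns a softmax-weighted average of all tokens.

    Scaled dot-product attention with the tokens used as both keys and
    values (no learned projections, for brevity).
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ti for qi, ti in zip(q, t)) / math.sqrt(dim)
            for t in tokens
        ]
        weights = softmax(scores)
        fused.append(
            [sum(w * t[k] for w, t in zip(weights, tokens)) for k in range(dim)]
        )
    return fused


rng = random.Random(0)


def random_vectors(count, dim):
    """Stand-in for real features: count random vectors of length dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


# Hypothetical sizes, chosen only for illustration.
NUM_QUERIES, DIM, TOKENS_PER_VIEW = 8, 4, 16

queries = random_vectors(NUM_QUERIES, DIM)
one_view = random_vectors(TOKENS_PER_VIEW, DIM)
four_views = one_view + random_vectors(3 * TOKENS_PER_VIEW, DIM)

fused_1 = cross_attend(queries, one_view)
fused_4 = cross_attend(queries, four_views)

# Output size depends on the number of queries, not the number of views.
print(len(fused_1), len(fused_4))
```

In a real model the queries would be learned and the fused outputs decoded into Gaussian attributes; this sketch covers neither step.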
