Pix2NPHM: Learning to Regress NPHM Reconstructions From a Single Image

Simon Giebenhain; Tobias Kirschstein; Liam Schoneveld; Davide Davoli; Zhe Chen; Matthias Nießner

arXiv:2512.17773·cs.CV·December 22, 2025

Open Access

TL;DR

Pix2NPHM is a vision transformer (ViT) network that directly regresses Neural Parametric Head Model (NPHM) parameters from a single image, producing high-fidelity 3D face reconstructions at interactive frame rates with improved geometric accuracy.

Contribution

The paper presents Pix2NPHM, a novel ViT-based method for direct NPHM parameter regression from a single image, enhancing facial reconstruction quality and speed over prior mesh-based models.

Findings

01. Reconstructs more recognizable facial geometry and more accurate facial expressions than existing approaches.

02. Achieves high-quality 3D face reconstructions at interactive frame rates.

03. Demonstrates scalability and improved fidelity on in-the-wild data.

Abstract

Neural Parametric Head Models (NPHMs) are a recent advancement over mesh-based 3D morphable models (3DMMs) to facilitate high-fidelity geometric detail. However, fitting NPHMs to visual inputs is notoriously challenging due to the expressive nature of their underlying latent space. To this end, we propose Pix2NPHM, a vision transformer (ViT) network that directly regresses NPHM parameters, given a single image as input. Compared to existing approaches, the neural parametric space allows our method to reconstruct more recognizable facial geometry and accurate facial expressions. For broad generalization, we exploit domain-specific ViTs as backbones, which are pretrained on geometric prediction tasks. We train Pix2NPHM on a mixture of 3D data, including a total of over 100K NPHM registrations that enable direct supervision in SDF space, and large-scale 2D video datasets, for which normal…
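
The abstract mentions direct supervision in SDF space but, in the portion shown here, does not spell out the loss. As a rough illustration only, the sketch below compares signed distance values decoded from a predicted code and from a registered code at shared sample points. The sphere decoder `toy_sdf`, the four-value code layout, and the L1 penalty are all assumptions made for this example; they are not the paper's NPHM decoder or its training objective.

```python
import numpy as np


def toy_sdf(points, code):
    """Hypothetical stand-in for a neural SDF decoder: a sphere.

    code = [cx, cy, cz, radius]; a real NPHM decoder is a learned
    network conditioned on identity and expression latents.
    """
    center, radius = code[:3], code[3]
    return np.linalg.norm(points - center, axis=-1) - radius


def sdf_space_loss(pred_code, target_code, points):
    """Mean absolute difference between the two decoded SDFs (assumed L1)."""
    diff = toy_sdf(points, pred_code) - toy_sdf(points, target_code)
    return float(np.mean(np.abs(diff)))


# Sample query points in a cube around the toy shape.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(1024, 3))

target_code = np.array([0.0, 0.0, 0.0, 0.5])  # "registered" code (toy)
pred_code = np.array([0.0, 0.0, 0.0, 0.6])    # "regressed" code (toy)

loss_same = sdf_space_loss(target_code, target_code, points)
loss_off = sdf_space_loss(pred_code, target_code, points)
print(loss_same, loss_off)
```

Because the two toy codes share a center and differ only in radius by 0.1, the decoded distances differ by 0.1 at every sample point, so `loss_off` is 0.1 up to floating-point rounding while `loss_same` is zero. Comparing decoded distances rather than raw codes penalizes errors by their effect on the geometry, which is one plausible reading of supervising in SDF space.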

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis