Generative Head-Mounted Camera Captures for Photorealistic Avatars

Shaojie Bai; Seunghyeon Seo; Yida Wang; Chenghui Li; Owen Wang; Te-Li Wang; Tianyang Ma; Jason Saragih; Shih-En Wei; Nojun Kwak; Hyung Jun Kim

arXiv:2507.05620·cs.CV·October 16, 2025

TL;DR

This paper introduces GenHMC, a generative model that synthesizes photorealistic head-mounted camera (HMC) images from avatar states using unpaired HMC data, improving realism, disentanglement, and generalization for VR/AR applications.

Contribution

The work presents a novel generative approach that leverages unpaired HMC data to produce high-quality HMC images from avatar states, enabling better disentanglement and generalization to unseen identities.

Findings

01. GenHMC generates realistic HMC images from avatar states.

02. The method achieves state-of-the-art accuracy in face encoding tasks.

03. It reduces data collection costs by using unpaired captures.

Abstract

Enabling photorealistic avatar animations in virtual and augmented reality (VR/AR) has been challenging because of the difficulty of obtaining the ground-truth state of faces. It is physically impossible to obtain synchronized images from the head-mounted camera (HMC) sensing input, which has partial observations in infrared (IR), and from an array of outside-in dome cameras, which have full observations that match avatars' appearance. Prior works relying on analysis-by-synthesis methods could generate accurate ground truth, but suffer from imperfect disentanglement between expression and style in their personalized training. The reliance on extensive paired captures (HMC and dome) for the same subject makes it operationally expensive to collect large-scale datasets, which cannot be reused for different HMC viewpoints and lighting. In this work, we propose a novel generative approach, Generative…

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Face Recognition and Perception