From Blurry to Believable: Enhancing Low-quality Talking Heads with 3D Generative Priors

Ding-Jiun Huang; Yuanhao Wang; Shao-Ji Yuan; Albert Mosella-Montoro; Francisco Vicente Carrasco; Cheng Zhang; Fernando De la Torre

arXiv:2602.06122 · cs.CV · February 9, 2026

TL;DR

SuperHead is a novel framework that enhances low-quality 3D talking head avatars by leveraging pre-trained 3D generative models and a dynamics-aware inversion scheme, producing high-fidelity, animatable, and consistent 3D heads from low-res sources.

Contribution

The paper introduces a method that combines 3D generative priors with dynamics-aware inversion to super-resolve low-quality 3D head models, keeping them realistic and consistent during animation.

Findings

1. SuperHead outperforms baseline methods in visual quality.

2. It produces detailed and realistic facial features under dynamic motions.

3. The approach maintains subject identity and temporal consistency.

Abstract

Creating high-fidelity, animatable 3D talking heads is crucial for immersive applications, yet often hindered by the prevalence of low-quality image or video sources, which yield poor 3D reconstructions. In this paper, we introduce SuperHead, a novel framework for enhancing low-resolution, animatable 3D head avatars. The core challenge lies in synthesizing high-quality geometry and textures, while ensuring both 3D and temporal consistency during animation and preserving subject identity. Despite recent progress in image, video and 3D-based super-resolution (SR), existing SR techniques are ill-equipped to handle dynamic 3D inputs. To address this, SuperHead leverages the rich priors from pre-trained 3D generative models via a novel dynamics-aware 3D inversion scheme. This process optimizes the latent representation of the generative model to produce a super-resolved 3D Gaussian Splatting…
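
The idea of optimizing a generative model's latent against low-resolution observations can be illustrated with a deliberately tiny toy. The sketch below is not the authors' implementation: it assumes a frozen linear "generator" `W`, known per-frame motion matrices `A_t`, and an average-pooling downsampler `D`, all hypothetical stand-ins chosen so the example runs with NumPy alone. It fits a single shared latent to several animated low-resolution frames at once, which is one simple reading of what "dynamics-aware" inversion might mean.

```python
import numpy as np

def invert_shared_latent(W, frames, D, steps=2000):
    """Fit one latent z so that D @ A_t @ (W @ z) matches every low-res frame y_t.

    W      : (n_hi, n_z) frozen toy "generator" (hypothetical stand-in).
    frames : list of (A_t, y_t) pairs; A_t is a known per-frame motion matrix.
    D      : (n_lo, n_hi) downsampling operator.
    """
    # Compose the full linear map from latent to low-res output for each frame.
    maps = [D @ A_t @ W for A_t, _ in frames]
    # The loss is quadratic, so 1 / (largest Hessian eigenvalue) is a safe step size.
    hessian = 2.0 * sum(M.T @ M for M in maps) / len(maps)
    lr = 1.0 / np.linalg.eigvalsh(hessian).max()
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        # Gradient of the mean squared reconstruction error over all frames.
        grad = sum(2.0 * M.T @ (M @ z - y_t) for M, (_, y_t) in zip(maps, frames))
        z -= lr * grad / len(maps)
    return z

# Toy data: every size and matrix here is an illustrative assumption.
rng = np.random.default_rng(0)
n_hi, n_z = 16, 4
W = rng.normal(size=(n_hi, n_z))                       # frozen toy generator
D = np.kron(np.eye(n_hi // 2), np.full((1, 2), 0.5))   # average adjacent pairs
z_true = rng.normal(size=n_z)
motions = [np.roll(np.eye(n_hi), s, axis=0) for s in (0, 1, 3)]  # cyclic shifts as "motion"
frames = [(A_t, D @ A_t @ W @ z_true) for A_t in motions]

z_hat = invert_shared_latent(W, frames, D)
print("max latent error:", np.abs(z_hat - z_true).max())
```

Sharing one latent across all frames is what ties the reconstruction together over time in this toy; a real pipeline would replace `W` with a pre-trained 3D generative model and the matrix products with differentiable rendering, details the abstract does not specify.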

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Speech and Audio Processing