# A Hierarchical Multi-Resolution Self-Supervised Framework for High-Fidelity 3D Face Reconstruction Using Learnable Gabor-Aware Texture Modeling

**Authors:** Pichet Mareo, Rerkchai Fooprateepsiri

PMC · DOI: 10.3390/jimaging12010026 · 2026-01-05

## TL;DR

This paper introduces a self-supervised framework that reconstructs high-fidelity 3D faces from a single image by refining geometry from coarse to medium to fine scale and enhancing texture with a learnable Gabor-aware module.

## Contribution

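The main contribution is a hierarchical multi-resolution self-supervised framework (HMR-Framework) that pairs progressive coarse-to-fine geometry reconstruction with a learnable Gabor-aware texture enhancement module for high-fidelity 3D face reconstruction.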

## Key findings

- On benchmark datasets, the proposed framework reconstructs fine facial detail better than existing state-of-the-art methods while remaining robust to pose variations.
- The hierarchical design improves semantic consistency across multiple geometric scales.
- The Gabor-aware module effectively decouples spatial-frequency information for better texture fidelity.
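
To make the spatial-frequency idea in the last finding concrete, here is a minimal sketch of a Gabor filter bank in plain Python. It is not the authors' module: the function names `gabor_kernel` and `gabor_bank` are hypothetical, and the parameters (`sigma`, `theta`, `lambd`, `gamma`, `psi`) follow the standard real Gabor formulation rather than anything stated in this summary. In a learnable variant, these parameters would presumably be optimized by gradient descent instead of being fixed by hand.

```python
import math


def gabor_kernel(size, sigma, theta, lambd, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    size   : odd kernel width/height in pixels
    sigma  : standard deviation of the Gaussian envelope
    theta  : orientation of the sinusoidal carrier, in radians
    lambd  : wavelength of the carrier, in pixels
    gamma  : spatial aspect ratio of the envelope
    psi    : phase offset of the carrier
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size, sigma, lambd, n_orientations=4):
    """Build kernels at evenly spaced orientations in [0, pi)."""
    return [
        gabor_kernel(size, sigma, k * math.pi / n_orientations, lambd)
        for k in range(n_orientations)
    ]
```

Each kernel responds to one orientation and one frequency band, so convolving a texture map with the whole bank separates its content by direction and scale. That separation is the general sense in which Gabor filtering decouples spatial-frequency information.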

## Abstract

High-fidelity 3D face reconstruction from a single image is challenging because depth cues are inherently ambiguous and multi-scale facial textures are strongly entangled. To address this, we propose a hierarchical multi-resolution self-supervised framework (HMR-Framework) that progressively reconstructs coarse-, medium-, and fine-scale facial geometry through a unified pipeline. A coarse geometric prior is first estimated via 3D morphable model regression, followed by medium-scale refinement using a vertex deformation map constrained by a global–local Markov random field loss to preserve structural coherence. To improve fine-scale fidelity, a learnable Gabor-aware texture enhancement module decouples spatial–frequency information and thereby increases sensitivity to high-frequency facial attributes. Additionally, we employ a wavelet-based detail perception loss to preserve edge-aware texture features while mitigating the noise commonly observed in in-the-wild images. Extensive qualitative and quantitative evaluations on benchmark datasets indicate that the proposed framework provides better fine-detail reconstruction than existing state-of-the-art methods while maintaining robustness to pose variations. Notably, the hierarchical design increases semantic consistency across multiple geometric scales, providing a practical solution for high-fidelity 3D face reconstruction from monocular images.
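
The abstract does not give the form of the wavelet-based detail perception loss, so the following is only an illustrative sketch under stated assumptions: a one-level 2D Haar transform (the wavelet family is an assumption) and a mean absolute difference over the three high-frequency subbands. The names `haar_dwt2` and `detail_loss` are hypothetical, and images are plain nested lists with even height and width.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of an image with even height and width.

    Returns four subbands (ll, lh, hl, hh), each half the input size.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            row_lh.append((a - b + c - d) / 2)  # difference between columns
            row_hl.append((a + b - c - d) / 2)  # difference between rows
            row_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def detail_loss(pred, target):
    """Mean absolute difference over the high-frequency subbands only."""
    total, count = 0.0, 0
    # Skip index 0 (the ll band) so smooth brightness shifts are not penalized.
    for band_p, band_t in zip(haar_dwt2(pred)[1:], haar_dwt2(target)[1:]):
        for row_p, row_t in zip(band_p, band_t):
            for p, t in zip(row_p, row_t):
                total += abs(p - t)
                count += 1
    return total / count
```

Because the low-frequency band is excluded, two flat images of different brightness have zero loss under this sketch, while any mismatch in edges or fine texture is penalized. A noise-aware version would need further design choices that the abstract does not describe.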

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12843383/full.md

---
Source: https://tomesphere.com/paper/PMC12843383