CLIP-FTI: Fine-Grained Face Template Inversion via CLIP-Driven Attribute Conditioning

Longchen Dai; Zixuan Shen; Zhiheng Zhou; Peipeng Yu; Zhihua Xia

arXiv:2512.15433·cs.CV·December 18, 2025

TL;DR

CLIP-FTI is a face template inversion method that combines CLIP semantic embeddings with StyleGAN to produce more detailed, accurate, and transferable face reconstructions, exposing greater privacy risks in face recognition systems.

Contribution

The paper presents the first CLIP-driven framework for fine-grained face template inversion, improving attribute fidelity and transferability over prior methods.

Findings

01 Achieves higher identification accuracy and attribute similarity

02 Recovers sharper, component-level facial attributes

03 Enhances cross-model attack transferability

Abstract

Face recognition systems store face templates for efficient matching. Once leaked, these templates pose a threat: inverting them can yield photorealistic surrogates that compromise privacy and enable impersonation. Although existing methods achieve relatively realistic face template inversion, the reconstructed facial images exhibit over-smoothed facial-part attributes (eyes, nose, mouth) and limited transferability. To address this problem, we present CLIP-FTI, a CLIP-driven fine-grained attribute conditioning framework for face template inversion. Our core idea is to use the CLIP model to obtain semantic embeddings of facial features and thereby reconstruct specific facial feature attributes. Specifically, facial feature attribute embeddings extracted from CLIP are fused with the leaked template via a cross-modal feature interaction network and projected…
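
The fusion step described in the abstract can be pictured with a minimal sketch, assuming single-head cross-attention in which the leaked template acts as the query and the CLIP attribute embeddings act as keys and values. The excerpt does not specify the actual interaction network, so the function name fuse_template_with_attributes, the shared embedding dimension, the absence of learned projections, and the residual addition are all illustrative assumptions rather than the authors' design. Random vectors stand in for a real template and real CLIP embeddings.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_template_with_attributes(template, attribute_embeddings):
    """Hypothetical fusion: the template attends over attribute embeddings.

    Assumptions (not from the paper): template and attribute embeddings
    share one dimension, no learned projection matrices are applied, and
    the attended context is added back to the template as a residual.
    """
    dim = len(template)
    scale = math.sqrt(dim)
    # Scaled dot-product score between the template and each attribute.
    scores = [dot(template, attr) / scale for attr in attribute_embeddings]
    weights = softmax(scores)
    # Weighted sum of attribute embeddings, one coordinate at a time.
    context = [
        sum(w * attr[i] for w, attr in zip(weights, attribute_embeddings))
        for i in range(dim)
    ]
    fused = [t + c for t, c in zip(template, context)]
    return fused, weights


# Placeholder data: random vectors instead of a real template or CLIP output.
rng = random.Random(0)
dim = 8
template = [rng.gauss(0.0, 1.0) for _ in range(dim)]
attributes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(3)]
fused, weights = fuse_template_with_attributes(template, attributes)
```

In this sketch the attention weights indicate how strongly each attribute embedding contributes to the fused vector. A trained system would presumably apply learned projections before the fused representation is passed on to a generator.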

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning