Exploring Specular Reflection Inconsistency for Generalizable Face Forgery Detection
Hongyan Fei, Zexi Jia, Chuanwei Huang, Jinchao Zhang, Jie Zhou

TL;DR
This paper introduces a novel face forgery detection method that leverages inconsistencies in specular reflection, a complex physical attribute, to identify AI-generated deepfakes with high accuracy, especially those created by diffusion models.
Contribution
The paper proposes a new detection approach based on specular reflection inconsistency, utilizing a Retinex-based texture estimation and a cross-attention network to improve forgery detection robustness.
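To make the Retinex idea concrete, here is a minimal sketch of classical Multi-Scale Retinex (MSR) on a single-channel image. It is not the paper's texture estimator, whose details are not given in this summary. The function names, the `sigmas` scales, and the `eps` offset are illustrative assumptions. The sketch only shows the general principle: subtracting a log-domain blurred estimate of illumination leaves a reflectance-like (texture) component.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2D array with edge padding."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def multi_scale_retinex(img, sigmas=(2.0, 4.0, 8.0), eps=1e-6):
    """Average of log(I) - log(Gaussian * I) over several scales.

    `sigmas` and `eps` are hypothetical defaults chosen for illustration,
    not values taken from the paper.
    """
    img = np.asarray(img, dtype=np.float64) + eps  # avoid log(0)
    log_img = np.log(img)
    msr = np.zeros_like(img)
    for sigma in sigmas:
        msr += log_img - np.log(gaussian_blur(img, sigma))
    return msr / len(sigmas)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face_patch = rng.uniform(0.2, 1.0, size=(32, 32))  # stand-in for a face crop
    texture = multi_scale_retinex(face_patch)
    print(texture.shape)
```

Because the computation happens in the log domain, a global change in brightness largely cancels out, which is why Retinex-style decompositions are commonly used to separate illumination from texture.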
Findings
Achieves superior detection performance on multiple deepfake datasets.
Effectively detects diffusion-generated forgeries with high accuracy.
Demonstrates the importance of physical attribute inconsistencies in forgery detection.
Abstract
Detecting deepfakes has become increasingly challenging as forgery faces synthesized by AI-generated methods, particularly diffusion models, achieve unprecedented quality and resolution. Existing forgery detection approaches relying on spatial and frequency features demonstrate limited efficacy against high-quality, entirely synthesized forgeries. In this paper, we propose a novel detection method grounded in the observation that facial attributes governed by complex physical laws and multiple parameters are inherently difficult to replicate. Specifically, we focus on illumination, particularly the specular reflection component in the Phong illumination model, which poses the greatest replication challenge due to its parametric complexity and nonlinear formulation. We introduce a fast and accurate face texture estimation method based on Retinex theory to enable precise specular…
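For readers unfamiliar with the Phong illumination model mentioned in the abstract, the following is a minimal sketch of the textbook formulation for a single light source. It is not the paper's estimation pipeline. The function name `phong_terms` and the coefficient defaults (`ka`, `kd`, `ks`, `shininess`) are assumptions chosen for illustration. The point is that the specular term depends on the view direction and a shininess exponent, so it is nonlinear and has more parameters than the ambient and diffuse terms.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def phong_terms(normal, light_dir, view_dir,
                ka=0.1, kd=0.6, ks=0.3, shininess=32.0):
    """Ambient, diffuse, and specular terms of the textbook Phong model.

    All coefficient defaults are hypothetical; unit light intensity is assumed.
    Directions point away from the surface (toward the light and the viewer).
    """
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)

    n_dot_l = dot(n, l)
    ambient = ka
    diffuse = kd * max(n_dot_l, 0.0)

    if n_dot_l <= 0.0:
        # Light is behind the surface: no specular highlight.
        return ambient, diffuse, 0.0

    # Mirror reflection of the light direction about the normal.
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    specular = ks * (max(dot(r, v), 0.0) ** shininess)
    return ambient, diffuse, specular


if __name__ == "__main__":
    # Viewer aligned with the mirror direction versus 45 degrees off-axis.
    print(phong_terms((0, 0, 1), (0, 0, 1), (0, 0, 1)))
    print(phong_terms((0, 0, 1), (0, 0, 1), (1, 0, 1)))
```

The diffuse term is the same in both calls, while the specular term collapses once the viewer moves off the mirror direction. That sharp, view-dependent falloff is the kind of behavior the abstract describes as hard to replicate.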
Peer Reviews
Decision: ICLR 2026 Poster
- This manuscript is well-structured and easy to follow.
- This work provides a clear physical motivation by identifying specular reflection as the most complex component in generated faces, which is a reasonable choice grounded in physical principles.
- Retinex-based reflectance estimation has already been used in face forensics and related vision tasks [1, 2] for exposing subtle forgery cues, so the use of Multi-Scale Retinex is not entirely novel.
- This work does not sufficiently validate robustness to common post-processing, such as Gaussian blur and JPEG compression.
[1] Attention-based Two-stream Convolutional Networks for Face Spoofing Detection. IEEE TIFS 2020.
[2] Exposing Face Forgery Clues via Retinex-based Image Enhancement. ACC
The manuscript is clearly written and easy to follow, with a concise method formulation and a framework that is simple yet demonstrably effective. The method achieves consistently strong results across a broad spectrum of face-swapping techniques, including both diffusion-based and GAN-based pipelines, suggesting good generalization beyond a single generator family. The ablation studies provide compelling evidence that each branch, as well as the method’s design/strategy choices, contributes pos
1. In Section 3.2, the choice to employ Retinex theory and Multi-Scale Retinex (MSR) to achieve a 'smaller to the real texture' and enhance robustness is a major component of the methodology. However, these Retinex-based enhancement techniques are well-established, and their application alone does not appear to be a novel contribution. More critically, previous work [1] has utilized similar Retinex-based methods for image enhancement, yet that work is not cited.
2. The work is motivated by the
- Novelty: The paper introduces a unique perspective by focusing on specular reflection as a forgery indicator, which is grounded in well-established physical principles. This approach provides a fresh direction in deepfake detection research. The use of the Phong illumination model and Retinex theory for texture extraction is innovative and enhances the accuracy of specular reflection separation.
- Strong Experimental Results: The method achieves superior frame-level and video-level AUC score
- Limited Analysis of Failure Cases: While the paper briefly discusses misclassification due to extreme facial poses or occlusion, it does not provide detailed insights into how these issues could be mitigated or addressed in future work.
- Dependence on 3D Shape Fitting: The method relies heavily on accurate 3D shape fitting to extract specular reflection. In scenarios with severe occlusion or extreme poses, the performance may degrade significantly. The paper does not explore alternative solu
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis
