FakeTransformer: Exposing Face Forgery From Spatial-Temporal Representation Modeled By Facial Pixel Variations
Yuyang Sun, Zhiyong Zhang, Changzhen Qiu, Liang Wang, Zekai Wang

TL;DR
This paper introduces FakeTransformer, a novel method that detects DeepFake videos by analyzing facial pixel variations related to physiological signals using multi-scale Eulerian magnification and vision Transformers, achieving high accuracy and generalization.
Contribution
The paper proposes a new approach combining physiological signal magnification and vision Transformers for face forgery detection, enhancing robustness and cross-domain performance.
Findings
High detection accuracy on FaceForensics++ and DeepFake Detection datasets
Strong generalization capability across different data domains
Effective identification of physiological inconsistencies in fake videos
Abstract
With the rapid development of generative models, AI-based face manipulation technology, known as DeepFakes, has become increasingly realistic. This form of face forgery can target anyone, posing a new threat to personal privacy and property security. Moreover, the misuse of synthetic video poses dangers in many areas, such as identity harassment, pornography, and news rumors. Inspired by the fact that the spatial coherence and temporal consistency of physiological signals are destroyed in generated content, we attempt to find inconsistent patterns that distinguish real videos from synthetic ones in the variations of facial pixels, which are highly related to physiological information. Our approach first applies Eulerian Video Magnification (EVM) at multiple Gaussian scales to the original video to enlarge the physiological variations caused by…
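The magnification step named in the abstract can be illustrated with a minimal NumPy sketch of linear Eulerian magnification applied at several Gaussian scales. This is not the authors' implementation: the function names, the `sigmas` values, the amplification factor `alpha`, and the temporal pass band (`low_hz`, `high_hz`) are all assumptions chosen for illustration, and the excerpt does not state the paper's actual settings.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D normalized Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def spatial_blur(frames, sigma):
    """Separable Gaussian blur over the H and W axes of a (T, H, W) array."""
    k = gaussian_kernel(sigma)
    radius = len(k) // 2
    out = frames.astype(np.float64)
    for axis in (1, 2):
        pad = [(0, 0)] * 3
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="valid"), axis, padded
        )
    return out


def temporal_bandpass(frames, fps, low_hz, high_hz):
    """Ideal band-pass filter along the time axis, implemented with an FFT."""
    freqs = np.fft.rfftfreq(frames.shape[0], d=1.0 / fps)
    spectrum = np.fft.rfft(frames, axis=0)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0
    return np.fft.irfft(spectrum, n=frames.shape[0], axis=0)


def magnify(frames, fps, sigma, alpha=20.0, low_hz=0.8, high_hz=3.0):
    """Amplify band-limited temporal variations of the blurred video.

    alpha, low_hz and high_hz are illustrative assumptions, not paper values.
    """
    blurred = spatial_blur(frames, sigma)
    band = temporal_bandpass(blurred, fps, low_hz, high_hz)
    return frames + alpha * band


def multi_scale_magnify(frames, fps, sigmas=(1.0, 2.0, 4.0), **kwargs):
    """Return one magnified video per Gaussian scale (hypothetical scales)."""
    return [magnify(frames, fps, s, **kwargs) for s in sigmas]


if __name__ == "__main__":
    # Synthetic grayscale clip: constant brightness plus a faint 1.5 Hz flicker.
    fps, n_frames = 30, 60
    t = np.arange(n_frames) / fps
    clip = 0.5 + 0.01 * np.sin(2 * np.pi * 1.5 * t)[:, None, None] * np.ones((1, 16, 16))
    outputs = multi_scale_magnify(clip, fps)
    print([o.shape for o in outputs])
```

Each scale trades spatial detail for noise suppression before the temporal filter is applied, so stacking the outputs gives a downstream model several views of the same subtle pixel variations. How the paper combines these views is not described in the excerpt above.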
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Adam · Dense Connections · Layer Normalization
