FakeTransformer: Exposing Face Forgery From Spatial-Temporal Representation Modeled By Facial Pixel Variations

Yuyang Sun; Zhiyong Zhang; Changzhen Qiu; Liang Wang; Zekai Wang

arXiv:2111.07601·cs.CV·November 16, 2021

TL;DR

This paper introduces FakeTransformer, a novel method that detects DeepFake videos by analyzing facial pixel variations related to physiological signals using multi-scale Eulerian magnification and vision Transformers, achieving high accuracy and generalization.

Contribution

The paper proposes a new approach combining physiological signal magnification and vision Transformers for face forgery detection, enhancing robustness and cross-domain performance.

Findings

01

High detection accuracy on FaceForensics++ and DeepFake Detection datasets

02

Strong generalization capability across different data domains

03

Effective identification of physiological inconsistencies in fake videos

Abstract

With the rapid development of generative models, AI-based face manipulation technology, known as DeepFakes, has become increasingly realistic. This form of face forgery can target anyone, posing a new threat to personal privacy and property security. Moreover, the misuse of synthetic video poses potential dangers in many areas, such as identity harassment, pornography, and news rumors. Inspired by the fact that the spatial coherence and temporal consistency of physiological signals are destroyed in generated content, we attempt to find inconsistent patterns that distinguish real videos from synthetic ones in the variations of facial pixels, which are highly related to physiological information. Our approach first applies Eulerian Video Magnification (EVM) at multiple Gaussian scales to the original video to enlarge the physiological variations caused by…
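To make the first step of the abstract concrete, the following is a minimal sketch of Eulerian-style magnification at several Gaussian scales. It is not the authors' implementation: the function names, the `sigmas`, `alpha`, `low_hz`, and `high_hz` parameters, and the FFT-based ideal band-pass filter are all assumptions chosen for illustration. The idea is to blur each frame spatially, keep only a temporal frequency band, amplify it, and add it back to the input.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def spatial_blur(frames, sigma):
    """Separable Gaussian blur over the H and W axes of a (T, H, W) array."""
    kernel = gaussian_kernel_1d(sigma)
    pad = len(kernel) // 2
    out = frames.astype(np.float64)
    for axis in (1, 2):
        pad_width = [(0, 0)] * out.ndim
        pad_width[axis] = (pad, pad)
        padded = np.pad(out, pad_width, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


def temporal_bandpass(frames, fps, low_hz, high_hz):
    """Ideal band-pass filter along the time axis using the real FFT."""
    n_frames = frames.shape[0]
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / fps)
    spectrum = np.fft.rfft(frames, axis=0)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0
    return np.fft.irfft(spectrum, n=n_frames, axis=0)


def magnify_multiscale(frames, fps, sigmas=(2.0, 4.0), alpha=20.0,
                       low_hz=0.8, high_hz=3.0):
    """Return one magnified copy of the video per Gaussian scale.

    The band limits and the gain alpha are illustrative assumptions,
    not values taken from the paper.
    """
    frames = frames.astype(np.float64)
    magnified = []
    for sigma in sigmas:
        blurred = spatial_blur(frames, sigma)
        band = temporal_bandpass(blurred, fps, low_hz, high_hz)
        magnified.append(frames + alpha * band)
    return np.stack(magnified, axis=0)  # shape: (num_scales, T, H, W)
```

In this sketch, a grayscale clip of shape `(T, H, W)` yields an array of shape `(len(sigmas), T, H, W)`, one magnified video per scale, which a downstream model could then consume. A Laplacian pyramid or a different temporal filter would be equally plausible choices here.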

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Adam · Dense Connections · Layer Normalization