Extracting deep local features to detect manipulated images of human faces
Michail Tarasiou, Stefanos Zafeiriou

TL;DR
This paper introduces a lightweight, multitask neural network architecture that leverages local image features to effectively detect manipulated face images, achieving state-of-the-art results with fewer parameters.
Contribution
The authors propose a novel local feature-based approach with a specialized architecture and training scheme for improved manipulation detection.
Findings
State-of-the-art accuracy on the FaceForensics++ dataset
Reduced model complexity with fewer parameters
Effective detection of fully generated face images
Abstract
Recent developments in computer vision and machine learning have made it possible to create realistic manipulated videos of human faces, raising the issue of ensuring adequate protection against the malevolent effects unlocked by such capabilities. In this paper we propose that local image features shared across manipulated regions are the key element for the automatic detection of manipulated face images. We also design a lightweight architecture with the correct structural biases for extracting such features and derive a multitask training scheme that consistently outperforms image class supervision alone. The trained networks achieve state-of-the-art results on the FaceForensics++ dataset using a significantly reduced number of parameters and are shown to work well in detecting fully generated face images.
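The abstract does not say what the auxiliary task in the multitask scheme is. As a rough illustration only, the sketch below assumes it is a per-location prediction of which regions were manipulated, added to the usual image-level real/fake loss. The function names, the binary cross-entropy form, and the `aux_weight` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Mean binary cross-entropy between probabilities p and 0/1 targets y."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))

def multitask_loss(image_prob, image_label, local_probs, local_mask, aux_weight=1.0):
    """Image-level loss plus a weighted local (per-location) loss.

    image_prob, image_label: shape (batch,), probability / label that the image is manipulated.
    local_probs, local_mask: shape (batch, H, W), per-location probability / label (assumed task).
    aux_weight: hypothetical weight balancing the two terms.
    """
    class_loss = bce(image_prob, image_label)   # image class supervision
    local_loss = bce(local_probs, local_mask)   # assumed auxiliary supervision
    return class_loss + aux_weight * local_loss
```

Setting `aux_weight=0.0` reduces this to image class supervision alone, the baseline the abstract compares against.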
