FAME: A Lightweight Spatio-Temporal Network for Model Attribution of Face-Swap Deepfakes
Wasim Ahmad, Yan-Tsung Peng, Yuan-Hao Chang

TL;DR
FAME is a lightweight spatio-temporal network that effectively identifies the specific face-swap model used in Deepfake videos, improving accuracy and efficiency for forensic applications.
Contribution
Introduces FAME, a novel, efficient framework that captures model-specific artifacts for Deepfake attribution using multilevel embeddings and spatial and temporal attention mechanisms.
Findings
Outperforms existing methods in attribution accuracy
Achieves faster runtime than existing methods
Effective across three diverse datasets, including DFDM and FaceForensics++
Abstract
The widespread emergence of face-swap Deepfake videos poses growing risks to digital security, privacy, and media integrity, necessitating effective forensic tools for identifying the source of such manipulations. Although most prior research has focused primarily on binary Deepfake detection, the task of model attribution -- determining which generative model produced a given Deepfake -- remains underexplored. In this paper, we introduce FAME (Fake Attribution via Multilevel Embeddings), a lightweight and efficient spatio-temporal framework designed to capture subtle generative artifacts specific to different face-swap models. FAME integrates spatial and temporal attention mechanisms to improve attribution accuracy while remaining computationally efficient. We evaluate our model on three challenging and diverse datasets: Deepfake Detection and Manipulation (DFDM), FaceForensics++, and…
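The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of attention-based temporal pooling followed by a softmax over candidate source models. It is not FAME's actual design: the function names, the single learned `query` vector, the toy frame embeddings, and the linear classifier weights are all hypothetical and chosen purely for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def temporal_attention_pool(frame_embeddings, query):
    """Weight each per-frame embedding by softmax(query . frame), then sum.

    `query` stands in for a learned attention vector (an assumption here).
    """
    weights = softmax([dot(query, f) for f in frame_embeddings])
    dim = len(frame_embeddings[0])
    pooled = [
        sum(w * f[i] for w, f in zip(weights, frame_embeddings))
        for i in range(dim)
    ]
    return pooled, weights

def attribute(frame_embeddings, query, class_weights):
    """Return one probability per candidate face-swap model."""
    pooled, _ = temporal_attention_pool(frame_embeddings, query)
    logits = [dot(w, pooled) for w in class_weights]
    return softmax(logits)

# Toy inputs: three frames with 2-D embeddings, three candidate models.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, 0.5]
class_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

pooled, weights = temporal_attention_pool(frames, query)
probs = attribute(frames, query, class_weights)
predicted_model = probs.index(max(probs))
```

In a real system the per-frame embeddings would come from a spatial feature extractor and the query and classifier weights would be trained; here they are fixed constants so the flow from frames to attribution probabilities is easy to follow.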
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis
Methods: Softmax · Attention Is All You Need
