Standing on the Shoulders of Giants: Reprogramming Visual-Language Model for General Deepfake Detection

Kaiqing Lin; Yuzhen Lin; Weixiang Li; Taiping Yao; Bin Li

arXiv:2409.02664 · cs.CV · April 14, 2025

TL;DR

This paper introduces a deepfake detection method that reprograms a pre-trained vision-language model such as CLIP through input perturbations alone, without tuning the model's inner parameters, with the aim of improving cross-dataset and cross-manipulation detection.

Contribution

It proposes a reprogramming approach that adapts a pre-trained VLM to general deepfake detection by manipulating only the model's input, which improves robustness and reduces training complexity.

Findings

01 Over 88% AUC in cross-dataset detection

02 Significant performance improvements across multiple benchmarks

03 Fewer trainable parameters needed for effective detection

Abstract

The proliferation of deepfake faces poses huge potential negative impacts on our daily lives. Despite substantial advancements in deepfake detection in recent years, the generalizability of existing methods against forgeries from unseen datasets or created by emerging generative models remains constrained. In this paper, inspired by the zero-shot advantages of Vision-Language Models (VLMs), we propose a novel approach that repurposes a well-trained VLM for general deepfake detection. Motivated by the model reprogramming paradigm, which manipulates the model's prediction via input perturbations, our method reprograms a pre-trained VLM (e.g., CLIP) solely by manipulating its input, without tuning the inner parameters. First, learnable visual perturbations are used to refine feature extraction for deepfake detection. Then, we exploit information of face embedding to create…
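
The core idea named in the abstract, adapting a frozen model purely by learning a perturbation on its input, can be illustrated with a toy sketch. This is not the paper's method and it does not use CLIP: the frozen scorer is a hypothetical three-weight linear model, the data are synthetic, and every name (`frozen_score`, `train_perturbation`, `delta`) is an assumption made for illustration. It only shows that gradient descent on an additive input perturbation can lower the detection loss while the model's own weights stay untouched.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def frozen_score(x, w):
    """Toy stand-in for a frozen pre-trained model: a fixed linear scorer."""
    return sum(wi * xi for wi, xi in zip(w, x))


def bce_loss(data, w, delta):
    """Mean binary cross-entropy of the frozen scorer on perturbed inputs."""
    total = 0.0
    for x, y in data:
        p = sigmoid(frozen_score([xi + di for xi, di in zip(x, delta)], w))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train_perturbation(data, w, steps=200, lr=0.1):
    """Learn one shared additive perturbation; w is read but never updated."""
    delta = [0.0] * len(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            p = sigmoid(frozen_score([xi + di for xi, di in zip(x, delta)], w))
            # For this linear scorer, d(BCE)/d(delta_j) = (p - y) * w_j.
            for j, wj in enumerate(w):
                grad[j] += (p - y) * wj / len(data)
        delta = [dj - lr * gj for dj, gj in zip(delta, grad)]
    return delta


random.seed(0)
w = [1.0, -0.5, 0.25]  # frozen weights (hypothetical values)

# Synthetic data: label 0 = real, label 1 = fake. The frozen scorer is
# miscalibrated for this task: both classes get positive scores at first.
data = []
for _ in range(50):
    data.append(([random.gauss(2.0, 0.3), random.gauss(0.0, 0.3), random.gauss(0.0, 0.3)], 0))
    data.append(([random.gauss(4.0, 0.3), random.gauss(0.0, 0.3), random.gauss(0.0, 0.3)], 1))

loss_before = bce_loss(data, w, [0.0, 0.0, 0.0])
delta = train_perturbation(data, w)
loss_after = bce_loss(data, w, delta)
print(f"loss before: {loss_before:.3f}  loss after: {loss_after:.3f}")
```

Because the toy scorer is linear, the learned perturbation can do no more than shift its decision threshold. The sketch is meant to show the training setup (frozen weights, trainable input perturbation), not the representational benefit a perturbation can have on a deep nonlinear model.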

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Contrastive Language-Image Pre-training