AdaptPrompt: Parameter-Efficient Adaptation of VLMs for Generalizable Deepfake Detection
Yichen Jiang, Mohammed Talha Alam, Sohail Ahmed Khan, Duc-Tien Dang-Nguyen, Fakhri Karray

TL;DR
This paper introduces AdaptPrompt, a parameter-efficient framework leveraging CLIP for robust deepfake detection across diverse generative models, supported by a new large-scale diffusion-generated fake dataset and extensive evaluations.
Contribution
The paper presents a parameter-efficient transfer learning method that adapts CLIP with textual prompts and visual adapters, and introduces Diff-Gen, a dataset of 100k diffusion-generated fakes intended to improve generalization to unseen generators.
Findings
Models trained on Diff-Gen generalize better across domains than models trained on other datasets, particularly on previously unseen generators.
Pruning the final transformer layer improves artifact retention and detection accuracy.
The framework achieves state-of-the-art results across multiple test sets and supports few-shot and source attribution tasks.
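As a rough illustration of two ideas named in the findings (a small trainable adapter on top of a frozen encoder, and taking features before the final layer), here is a minimal pure-Python sketch. It is not the authors' implementation: the summary does not specify the adapter design, so `FrozenEncoder`, `BottleneckAdapter`, the residual bottleneck form, the `skip_last` argument, and all sizes are hypothetical stand-ins chosen only to make the parameter-efficiency argument concrete.

```python
import random


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def make_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class FrozenEncoder:
    """Hypothetical stand-in for a pretrained encoder: a stack of frozen layers."""

    def __init__(self, dim, depth, rng):
        self.layers = [make_matrix(dim, dim, rng) for _ in range(depth)]

    def forward(self, x, skip_last=0):
        # skip_last=1 mimics pruning the final layer: the output is the
        # representation produced by the penultimate layer.
        active = self.layers[: len(self.layers) - skip_last]
        for w in active:
            x = [a + b for a, b in zip(x, relu(linear(w, x)))]  # residual block
        return x


class BottleneckAdapter:
    """Assumed adapter form: x + alpha * up(relu(down(x)))."""

    def __init__(self, dim, hidden, rng, alpha=0.5):
        self.down = make_matrix(hidden, dim, rng)
        self.up = make_matrix(dim, hidden, rng)
        self.alpha = alpha

    def forward(self, x):
        delta = linear(self.up, relu(linear(self.down, x)))
        return [a + self.alpha * d for a, d in zip(x, delta)]

    def num_params(self):
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def count_params(encoder):
    return sum(len(row) for layer in encoder.layers for row in layer)


rng = random.Random(0)
DIM, DEPTH, HIDDEN = 16, 4, 4  # toy sizes, not taken from the paper
encoder = FrozenEncoder(DIM, DEPTH, rng)
adapter = BottleneckAdapter(DIM, HIDDEN, rng)

image_features = [rng.uniform(-1, 1) for _ in range(DIM)]
pruned_out = encoder.forward(image_features, skip_last=1)  # drop the last layer
adapted = adapter.forward(pruned_out)  # only the adapter would be trained

frozen = count_params(encoder)    # 4 layers * 16 * 16 weights
trainable = adapter.num_params()  # 2 matrices * 16 * 4 weights
print(f"frozen={frozen} trainable={trainable}")
```

In this toy setup the adapter holds 128 weights against 1024 frozen encoder weights, which is the sense in which such an approach is parameter-efficient; the actual ratios for CLIP would differ and are not given in this summary.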
Abstract
Recent advances in image generation have led to the widespread availability of highly realistic synthetic media, increasing the difficulty of reliable deepfake detection. A key challenge is generalization, as detectors trained on a narrow class of generators often fail when confronted with unseen models. In this work, we address the pressing need for generalizable detection by leveraging large vision-language models, specifically CLIP, to identify synthetic content across diverse generative techniques. First, we introduce Diff-Gen, a large-scale benchmark dataset comprising 100k diffusion-generated fakes that capture broad spectral artifacts unlike traditional GAN datasets. Models trained on Diff-Gen demonstrate stronger cross-domain generalization, particularly on previously unseen image generators. Second, we propose AdaptPrompt, a parameter-efficient transfer learning framework that…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis
