FIDAVL: Fake Image Detection and Attribution using Vision-Language Model

Mamadou Keita; Wassim Hamidouche; Hessen Bougueffa Eutamene; Abdelmalik Taleb-Ahmed; Abdenour Hadid

arXiv:2409.03109 · cs.CV · September 6, 2024

TL;DR

FIDAVL is a vision-language model that detects fake images and attributes them to their source models using zero-shot learning and soft prompt-tuning, reporting an average detection accuracy of 95.42% and an average attribution F1-score of 92.64%.

Contribution

The paper introduces FIDAVL, a new multitask approach leveraging vision-language synergy and soft prompt-tuning for fake image detection and attribution, with state-of-the-art performance.

Findings

01

Achieves an average detection accuracy of 95.42% and a detection F1-score of 95.47% on synthetic images.

02

Attains an average F1-score of 92.64% and a ROUGE-L score of 96.50% in fake image attribution.

03

Demonstrates strong performance across diverse synthetic image sources.

Abstract

We introduce FIDAVL: Fake Image Detection and Attribution using a Vision-Language Model. FIDAVL is a novel and efficient multitask approach inspired by the synergies between vision and language processing. Leveraging the benefits of zero-shot learning, FIDAVL exploits the complementarity between vision and language, along with a soft prompt-tuning strategy, to detect fake images and accurately attribute them to their originating source models. We conducted extensive experiments on a comprehensive dataset comprising synthetic images generated by various state-of-the-art models. Our results demonstrate that FIDAVL achieves an encouraging average detection accuracy of 95.42% and F1-score of 95.47% while also obtaining noteworthy performance metrics, with an average F1-score of 92.64% and ROUGE-L score of 96.50% for attributing synthetic images to their respective source generation models. The…
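
The abstract scores attribution with ROUGE-L, a text-overlap metric, which suggests the model's attribution output is compared to a reference sentence as text. The sketch below shows one common way to compute a ROUGE-L F-measure from the longest common subsequence (LCS) of two token sequences. It is an illustration only: the whitespace tokenization, the balanced F-measure, and the example sentences are assumptions, and the exact ROUGE-L variant used by the authors is not stated on this page.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Balanced ROUGE-L F-measure over lowercase, whitespace-split tokens.

    Assumption: precision and recall are weighted equally; other
    implementations may weight recall more heavily.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical model output and reference sentence, for illustration only.
generated = "this image is fake and was generated by model A"
reference = "this image is fake and was generated by model B"
score = rouge_l_f1(generated, reference)
```

Because nine of the ten tokens match in order in this hypothetical pair, the score stays high even though the attributed model name is wrong. This is one reason a class-level F1-score is a useful complement to ROUGE-L when judging attribution quality.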

Code & Models

Repositories

mamadou-keita/fidavl
Official

Taxonomy

Topics: Misinformation and Its Impacts · Digital Media Forensic Detection