Can ChatGPT Perform Image Splicing Detection? A Preliminary Study

Souradip Nath

arXiv:2506.05358 · cs.CV · June 9, 2025

TL;DR

This study explores GPT-4V's ability to detect image splicing manipulations without fine-tuning, showing promising zero-shot performance and the ability to leverage contextual knowledge for forensic analysis.

Contribution

The paper demonstrates GPT-4V's out-of-the-box capabilities in image forensics and highlights its potential as a flexible splicing detector under zero-shot, few-shot, and chain-of-thought prompting.

Findings

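01 GPT-4V exceeds 85% accuracy in zero-shot detection on a curated subset of CASIA v2.0.

02 Chain-of-Thought prompting yields the most balanced trade-off between authentic and spliced images.

03 The model draws on both contextual knowledge and low-level visual cues to identify artifacts.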
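The "balance" in the second finding can be made concrete by reporting accuracy separately for authentic and spliced images. The following is a minimal sketch under stated assumptions, not the paper's evaluation code: the function name `per_class_accuracy`, the `ClassAccuracy` container, and the label strings are hypothetical, and the toy data in the usage note is invented for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ClassAccuracy:
    """Accuracy overall and per class (hypothetical container)."""
    overall: float
    authentic: float
    spliced: float

    @property
    def gap(self) -> float:
        # A small gap means the detector is balanced across the two classes.
        return abs(self.authentic - self.spliced)


def per_class_accuracy(labels: Sequence[str],
                       predictions: Sequence[str]) -> ClassAccuracy:
    """Compute overall and per-class accuracy for 'authentic'/'spliced' labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    def class_acc(target: str) -> float:
        # Indices of the samples whose ground truth is `target`.
        idx = [i for i, y in enumerate(labels) if y == target]
        if not idx:
            return float("nan")  # class absent from the evaluation set
        return sum(predictions[i] == target for i in idx) / len(idx)

    overall = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
    return ClassAccuracy(overall, class_acc("authentic"), class_acc("spliced"))


# Toy data, invented for illustration only (not results from the paper).
toy_labels = ["authentic"] * 4 + ["spliced"] * 4
toy_preds = ["authentic"] * 4 + ["spliced", "spliced", "authentic", "authentic"]
toy_result = per_class_accuracy(toy_labels, toy_preds)
```

On this toy data the detector looks acceptable overall but is biased toward "authentic"; the per-class gap exposes that imbalance, which a single accuracy number would hide.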

Abstract

Multimodal Large Language Models (MLLMs) like GPT-4V are capable of reasoning across text and image modalities, showing promise in a variety of complex vision-language tasks. In this preliminary study, we investigate the out-of-the-box capabilities of GPT-4V in the domain of image forensics, specifically, in detecting image splicing manipulations. Without any task-specific fine-tuning, we evaluate GPT-4V using three prompting strategies: Zero-Shot (ZS), Few-Shot (FS), and Chain-of-Thought (CoT), applied over a curated subset of the CASIA v2.0 splicing dataset. Our results show that GPT-4V achieves competitive detection performance in zero-shot settings (more than 85% accuracy), with CoT prompting yielding the most balanced trade-off across authentic and spliced images. Qualitative analysis further reveals that the model not only detects low-level visual artifacts but also draws upon…
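To illustrate how the three prompting strategies named in the abstract differ, here is a minimal sketch of a prompt builder. The prompt wording, the `build_prompt` function, the strategy keys, and the example file names are all assumptions made for illustration; the paper's actual prompts are not reproduced here, and no model API is called.

```python
from typing import Sequence, Tuple

# Hypothetical task wording; not taken from the paper.
TASK = ("You are shown an image. Decide whether it is authentic or spliced "
        "(contains a region pasted in from another image).")
ANSWER_FORMAT = "Reply with exactly one word: 'authentic' or 'spliced'."


def build_prompt(strategy: str,
                 examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Return the text part of a prompt for one of three strategies.

    `examples` is a sequence of (image_name, label) pairs used only for
    few-shot prompting; the images themselves would be attached separately.
    """
    if strategy == "zero_shot":
        parts = [TASK, ANSWER_FORMAT]
    elif strategy == "few_shot":
        if not examples:
            raise ValueError("few_shot requires at least one labeled example")
        shots = [f"Example {i}: [image {name}] -> {label}"
                 for i, (name, label) in enumerate(examples, start=1)]
        parts = [TASK, *shots, ANSWER_FORMAT]
    elif strategy == "chain_of_thought":
        parts = [
            TASK,
            "Reason step by step about lighting, edges, and scene "
            "consistency before answering.",
            "End with a final line of the form 'Answer: authentic' "
            "or 'Answer: spliced'.",
        ]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return "\n".join(parts)


# Hypothetical file names, for illustration only.
zs_prompt = build_prompt("zero_shot")
fs_prompt = build_prompt("few_shot", [("a.jpg", "authentic"), ("b.jpg", "spliced")])
cot_prompt = build_prompt("chain_of_thought")
```

The only differences between the strategies in this sketch are the labeled examples (few-shot) and the request for intermediate reasoning before the final answer (chain-of-thought); the task statement stays fixed so that results remain comparable.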

Code & Models

Repositories

confusedDip/LLM-Image-Splicing-Detection
none · Official

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques

Methods: Chain-of-thought prompting