INSIGHT: An Interpretable Neural Vision-Language Framework for Reasoning of Generative Artifacts

Anshul Bagaria

arXiv:2511.22351·cs.CV·December 1, 2025

Open Access

TL;DR

INSIGHT is a multimodal framework that detects AI-generated images and explains its decisions in transparent, human-interpretable terms, even at very low resolutions, with the aim of improving trust and reliability in media forensics.

Contribution

The paper introduces an interpretable, multimodal approach that combines super-resolution, localization, semantic alignment, and reasoning protocols for robust detection and explanation of AI-generated images.

Findings

01. Outperforms prior detectors in robustness and explanation quality.

02. Effective at extremely low resolutions (16x16 to 64x64).

03. Provides human-interpretable explanations verified by a dual-stage evaluation.

Abstract

The growing realism of AI-generated images produced by recent GAN and diffusion models has intensified concerns over the reliability of visual media. Yet, despite notable progress in deepfake detection, current forensic systems degrade sharply under real-world conditions such as severe downsampling, compression, and cross-domain distribution shifts. Moreover, most detectors operate as opaque classifiers, offering little insight into why an image is flagged as synthetic, undermining trust and hindering adoption in high-stakes settings. We introduce INSIGHT (Interpretable Neural Semantic and Image-based Generative-forensic Hallucination Tracing), a unified multimodal framework for robust detection and transparent explanation of AI-generated images, even at extremely low resolutions (16x16 - 64x64). INSIGHT combines hierarchical super-resolution for amplifying subtle forensic cues…

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning