CausalCLIP: Causally-Informed Feature Disentanglement and Filtering for Generalizable Detection of Generated Images

Bo Liu; Qiao Qin; Qinghui He

arXiv:2512.13285·cs.CV·March 24, 2026

CausalCLIP: Causally-Informed Feature Disentanglement and Filtering for Generalizable Detection of Generated Images

Bo Liu, Qiao Qin, Qinghui He

PDF

Open Access 1 Video

TL;DR

CausalCLIP introduces a causally-informed framework that disentangles and filters features to improve the generalization of generated image detectors across diverse models.

Contribution

It presents a novel causal disentanglement approach using structural causal models and independence constraints to enhance detection robustness.

Findings

01

Achieves 6.83% higher accuracy on unseen models.

02

Improves average precision by 4.06%.

03

Demonstrates strong generalization across diverse generative techniques.

Abstract

The rapid advancement of generative models has increased the demand for generated image detectors capable of generalizing across diverse and evolving generation techniques. However, existing methods, including those leveraging pre-trained vision-language models, often produce highly entangled representations, mixing task-relevant forensic cues (causal features) with spurious or irrelevant patterns (non-causal features), thus limiting generalization. To address this issue, we propose CausalCLIP, a framework that explicitly disentangles causal from non-causal features and employs targeted filtering guided by causal inference principles to retain only the most transferable and discriminative forensic cues. By modeling the generation process with a structural causal model and enforcing statistical independence through Gumbel-Softmax-based feature masking and Hilbert-Schmidt Independence…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

CausalCLIP: Causally-Informed Feature Disentanglement and Filtering for Generalizable Detection of Generated Images· underline

Taxonomy

TopicsGenerative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Face recognition and analysis