Can We Build a Monolithic Model for Fake Image Detection? SICA: Semantic-Induced Constrained Adaptation for Unified-Yet-Discriminative Artifact Feature Space Reconstruction

Bo Du; Xiaochen Ma; Xuekang Zhu; Zhe Yang; Chaogun Niu; Chenfan Qu; Mingqi Fang; Zhenming Wang; Jingjing Liu; Jian Liu; Ji-Zhe Zhou

arXiv:2602.06676·cs.CV·May 22, 2026

TL;DR

This paper introduces SICA, a monolithic fake image detection model that uses high-level semantics to reconstruct a unified-yet-discriminative artifact feature space, and reports that it outperforms 15 state-of-the-art methods on the OpenMMSec dataset.

Contribution

Proposes SICA, presented as the first monolithic fake image detection (FID) paradigm, which uses high-level semantics as a structural prior to reconstruct a unified-yet-discriminative artifact feature space.

Findings

01 SICA outperforms 15 state-of-the-art methods on the OpenMMSec dataset.

02 SICA reconstructs the artifact feature space in a near-orthogonal manner.

03 The results support the hypothesis that high-level semantics aid artifact feature space reconstruction.
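To make the "near-orthogonal" finding concrete, here is a minimal sketch of one way such a property could be checked. This is not the authors' evaluation procedure: it assumes each subdomain is summarized by the centroid of its feature vectors, and it reports the largest absolute cosine similarity between centroids of different subdomains. The function names, the subdomain labels, and the toy feature values are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def max_cross_subdomain_cosine(features_by_subdomain):
    """Largest |cosine| between centroids of different subdomains.

    A value close to 0 means the centroids are close to mutually orthogonal.
    """
    names = sorted(features_by_subdomain)
    centroids = {name: centroid(features_by_subdomain[name]) for name in names}
    worst = 0.0
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            similarity = abs(cosine(centroids[first], centroids[second]))
            worst = max(worst, similarity)
    return worst


# Toy, made-up features for four placeholder subdomains (not data from the paper).
toy_features = {
    "subdomain_a": [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0]],
    "subdomain_b": [[0.0, 1.0, 0.0, 0.0], [0.0, 0.9, 0.1, 0.0]],
    "subdomain_c": [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.9, 0.1]],
    "subdomain_d": [[0.0, 0.0, 0.0, 1.0], [0.1, 0.0, 0.0, 0.9]],
}

score = max_cross_subdomain_cosine(toy_features)
print(f"max cross-subdomain |cosine|: {score:.4f}")
```

A collapsed feature space would push this score toward 1, since all subdomain centroids would point in nearly the same direction. A score near 0 is consistent with subdomains occupying separate, near-orthogonal directions.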

Abstract

Fake Image Detection (FID), aiming at unified detection across four image forensic subdomains, is critical in real-world forensic scenarios. Compared with ensemble approaches, monolithic FID models are theoretically more promising, but to date, consistently yield inferior performance in practice. In this work, we identify the intrinsic distinctness of artifacts across subdomains, a critical barrier we term the "Ji-Zhe phenomenon". Driven by this phenomenon, we diagnose the cause of this underperformance for the first time: the collapse of the artifact feature space. The core challenge for developing a practical monolithic FID model thus boils down to the "unified-yet-discriminative" reconstruction of the artifact feature space. To address this paradoxical challenge, we hypothesize that high-level semantics can serve as a structural prior for the reconstruction, and further propose…

Code & Models

Repositories

venus-guangjian/SICA_OpenMMSec (GitHub)
