Trinity Detector: text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection
Jiawei Song, Dengpan Ye, Yunming Zhang

TL;DR
The paper introduces Trinity Detector, a multimodal method that combines CLIP text features with spectral and pixel-domain artifacts to detect images generated by diffusion models, a setting where traditional forgery detectors adapt poorly.
Contribution
It proposes a new diffusion-specific forgery detection approach that integrates text features with spectral and pixel artifacts using attention mechanisms.
Findings
Outperforms state-of-the-art detection methods
Achieves up to 17.6% improvement in transferability
Demonstrates robustness across multiple datasets
Abstract
Artificial Intelligence Generated Content (AIGC) techniques, represented by text-to-image generation, have led to a malicious use of deep forgeries, raising concerns about the trustworthiness of multimedia content. Adapting traditional forgery detection methods to diffusion models proves challenging. Thus, this paper proposes a forgery detection method explicitly designed for diffusion models called Trinity Detector. Trinity Detector incorporates coarse-grained text features through a CLIP encoder, coherently integrating them with fine-grained artifacts in the pixel domain for comprehensive multimodal detection. To heighten sensitivity to diffusion-generated image features, a Multi-spectral Channel Attention Fusion Unit (MCAF) is designed, extracting spectral inconsistencies through adaptive fusion of diverse frequency bands and further integrating spatial co-occurrence of the two…
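The abstract describes MCAF only at a high level (adaptive fusion of frequency bands), so the following is a minimal sketch of that general idea and not the authors' implementation. It assumes a naive 2-D DCT as the spectral transform, three bands split by diagonal frequency index, and fixed scores standing in for learned attention logits. All function names (`dct2`, `band_energies`, `fuse_bands`) are hypothetical.

```python
import math


def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square block (list of lists of floats)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def band_energies(coeffs, num_bands=3):
    """Mean squared coefficient per band; the band index grows with u + v."""
    n = len(coeffs)
    sums = [0.0] * num_bands
    counts = [0] * num_bands
    for u in range(n):
        for v in range(n):
            band = (u + v) * num_bands // (2 * n - 1)
            sums[band] += coeffs[u][v] ** 2
            counts[band] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_bands(energies, scores):
    """Attention-style fusion: softmax the per-band scores, then weight and sum.

    In a trained model the scores would be produced by a learned layer;
    here they are plain inputs (an assumption made for illustration).
    """
    weights = softmax(scores)
    fused = sum(w * e for w, e in zip(weights, energies))
    return fused, weights


# Two toy 4x4 patches: a flat one and a checkerboard (high-frequency content).
flat = [[1.0] * 4 for _ in range(4)]
checker = [[(-1.0) ** (x + y) for y in range(4)] for x in range(4)]

flat_bands = band_energies(dct2(flat))
checker_bands = band_energies(dct2(checker))

# Hypothetical scores that emphasise the highest band.
fused_flat, _ = fuse_bands(flat_bands, [0.0, 0.0, 2.0])
fused_checker, _ = fuse_bands(checker_bands, [0.0, 0.0, 2.0])
```

With scores that favour the highest band, the checkerboard patch yields a larger fused value than the flat patch, which is the kind of frequency-sensitive signal a spectral branch could pass on to a classifier.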
Taxonomy
Topics: Advanced Image Fusion Techniques · Geochemistry and Geologic Mapping · Brain Tumor Detection and Classification
Methods: Contrastive Language-Image Pre-training · Diffusion
