ImagiNet: A Multi-Content Benchmark for Synthetic Image Detection
Delyan Boychev, Radostin Cholakov

TL;DR
ImagiNet introduces a balanced dataset of 200K examples across multiple content types to improve the generalizability of synthetic image detectors, along with a baseline detector that reaches an AUC of up to 0.99.
Contribution
The paper presents ImagiNet, a diverse and balanced dataset for synthetic image detection with a two-track evaluation system, and establishes a robust baseline model for the detection task.
Findings
High detection accuracy with AUC up to 0.99
Balanced content types improve generalizability
Model performs well under compression and resizing
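For readers unfamiliar with the AUC figure quoted in the findings, the following is a minimal sketch of how ROC AUC can be computed from detector scores, overall and per content type. It is not the authors' evaluation code: the function names `roc_auc` and `auc_by_content_type`, the record layout, and the toy scores are all assumptions made for illustration.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """Rank-based ROC AUC: the probability that a randomly chosen synthetic
    image (label 1) scores higher than a randomly chosen real one (label 0),
    with ties counted as one half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores and give each the average 1-based rank.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one real and one synthetic example")
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def auc_by_content_type(records):
    """records: iterable of (content_type, label, score) tuples (hypothetical layout).
    Returns a dict mapping each content type to its AUC."""
    grouped = defaultdict(lambda: ([], []))
    for content_type, label, score in records:
        grouped[content_type][0].append(label)
        grouped[content_type][1].append(score)
    return {ct: roc_auc(lbls, scs) for ct, (lbls, scs) in grouped.items()}


# Toy data, invented for illustration only (not results from the paper).
toy_records = [
    ("photos", 0, 0.10), ("photos", 0, 0.40), ("photos", 1, 0.35), ("photos", 1, 0.80),
    ("faces", 0, 0.20), ("faces", 1, 0.90),
]
per_type = auc_by_content_type(toy_records)
```

Reporting AUC separately per content type, as in this sketch, is one way to check whether a detector that looks strong overall is leaning on a single predominant category.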
Abstract
Recent generative models produce images with a level of authenticity that makes them nearly indistinguishable from real photos and artwork. Potential harmful use cases of these models necessitate the creation of robust synthetic image detectors. However, current datasets in the field contain generated images of questionable quality or have examples from one predominant content type, which leads to poor generalizability of the underlying detectors. We find that the curation of a balanced amount of high-resolution generated images across various content types is crucial for the generalizability of detectors, and introduce ImagiNet, a dataset of 200K examples spanning four categories: photos, paintings, faces, and miscellaneous. Synthetic images in ImagiNet are produced with both open-source and proprietary generators, whereas real counterparts for each content type are collected from…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · AI in cancer detection
Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
