TL;DR
This paper introduces SciFigDetect, the first benchmark for detecting AI-generated scientific figures, and shows that current detection methods struggle in zero-shot and cross-generator settings.
Contribution
The paper builds a dataset and evaluation framework dedicated to detecting AI-generated scientific figures, a setting that existing benchmarks, developed almost entirely for open-domain imagery, leave largely unexplored.
Findings
Current detection methods fail in zero-shot transfer settings.
Detectors tend to overfit to specific generators.
Detection robustness decreases with common image corruptions.
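To make the cross-generator finding concrete, here is a minimal sketch of how such an evaluation is commonly set up: fit a detector on figures from one generator, then score it on every generator to get a train-by-test accuracy matrix. This is not the paper's protocol or code. The one-feature threshold "detector", the generator names gen_a and gen_b, and all numbers are made-up toy values chosen only to show the mechanics.

```python
from typing import Dict, List, Tuple

# (feature, label): feature is a hypothetical scalar "artifact score";
# label is 1 for AI-generated and 0 for real. All values are toy data.
Sample = Tuple[float, int]


def fit_threshold(train: List[Sample]) -> float:
    """Toy detector: threshold at the midpoint between the two class means."""
    fake = [x for x, y in train if y == 1]
    real = [x for x, y in train if y == 0]
    return (sum(fake) / len(fake) + sum(real) / len(real)) / 2


def accuracy(threshold: float, test: List[Sample]) -> float:
    """Fraction of samples where 'feature > threshold' matches the label."""
    correct = sum(int((x > threshold) == (y == 1)) for x, y in test)
    return correct / len(test)


def cross_generator_matrix(
    data: Dict[str, List[Sample]]
) -> Dict[str, Dict[str, float]]:
    """matrix[train_gen][test_gen] = accuracy of a detector fit on train_gen."""
    matrix: Dict[str, Dict[str, float]] = {}
    for train_gen, train_set in data.items():
        threshold = fit_threshold(train_set)
        matrix[train_gen] = {
            test_gen: accuracy(threshold, test_set)
            for test_gen, test_set in data.items()
        }
    return matrix


# Hypothetical generators: gen_a leaves strong artifacts, gen_b weak ones.
toy_data = {
    "gen_a": [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)],
    "gen_b": [(0.1, 0), (0.3, 0), (0.4, 1), (0.45, 1)],
}

matrix = cross_generator_matrix(toy_data)
for train_gen, row in matrix.items():
    print(train_gen, row)
```

In this toy setup the detector fit on gen_a scores 1.0 on gen_a but only 0.5 on gen_b, because its threshold sits above gen_b's weaker artifact scores. The diagonal of the matrix is in-domain accuracy and the off-diagonal entries are cross-generator transfer. A large gap between the two is the kind of overfitting to a specific generator that the findings describe.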
Abstract
Modern multimodal generators can now produce scientific figures at near-publishable quality, creating a new challenge for visual forensics and research integrity. Unlike conventional AI-generated natural images, scientific figures are structured, text-dense, and tightly aligned with scholarly semantics, making them a distinct and difficult detection target. However, existing AI-generated image detection benchmarks and methods are almost entirely developed for open-domain imagery, leaving this setting largely unexplored. We present the first benchmark for AI-generated scientific figure detection. To construct it, we develop an agent-based data pipeline that retrieves licensed source papers, performs multimodal understanding of paper text and figures, builds structured prompts, synthesizes candidate figures, and filters them through a review-driven refinement loop. The resulting benchmark…
