ArtifactLens: Hundreds of Labels Are Enough for Artifact Detection with VLMs
James Burgess, Rameen Abdal, Dan Stoddart, Sergey Tulyakov, Serena Yeung-Levy, Kuan-Chieh Jackson Wang

TL;DR
ArtifactLens leverages pretrained vision-language models with minimal labeled data to effectively detect various image artifacts, outperforming existing methods and generalizing across multiple artifact types and detection tasks.
Contribution
The paper introduces ArtifactLens, a novel system that uses few-shot learning and text instruction optimization to detect image artifacts with minimal labeled data, surpassing prior methods.
Findings
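Achieves state-of-the-art performance on five human artifact benchmarks, the first evaluation across multiple datasets.
Requires only a few hundred labeled examples per artifact category, orders of magnitude fewer than detectors fine-tuned on tens of thousands of labeled images.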
Generalizes to diverse artifact types and related detection tasks.
Abstract
Modern image generators produce strikingly realistic images, where only artifacts like distorted hands or warped objects reveal their synthetic origin. Detecting these artifacts is essential: without detection, we cannot benchmark generators or train reward models to improve them. Current detectors fine-tune VLMs on tens of thousands of labeled images, but this is expensive to repeat whenever generators evolve or new artifact types emerge. We show that pretrained VLMs already encode the knowledge needed to detect artifacts; with the right scaffolding, this capability can be unlocked using only a few hundred labeled examples per artifact category. Our system, ArtifactLens, achieves state-of-the-art on five human artifact benchmarks (the first evaluation across multiple datasets) while requiring orders of magnitude less labeled data. The scaffolding consists of a multi-component…
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
