See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis

Jaehyun Park; Minyoung Ahn; Minkyu Kim; Jonghyun Lee; Jae-Gil Lee; Dongmin Park

arXiv:2602.20951·cs.CV·March 27, 2026

TL;DR

This paper introduces ArtiAgent, an automated system that synthesizes and annotates images with visual artifacts to improve the training and evaluation of vision-language models and diffusion models.

Contribution

ArtiAgent automates the creation of large-scale artifact-annotated datasets, reducing reliance on costly human labeling and enabling scalable artifact mitigation in image generation models.

Findings

01. Synthesized 100K images with artifact annotations

02. Demonstrated effectiveness across diverse applications

03. Enabled improved artifact detection and mitigation

Abstract

Despite recent advances in diffusion models, AI-generated images still often contain visual artifacts that compromise realism. Although more thorough pre-training and bigger models might reduce artifacts, there is no assurance that they can be completely eliminated, which makes artifact mitigation a crucial area of study. Previous artifact-aware methodologies depend on human-labeled artifact datasets, which are costly and difficult to scale, underscoring the need for an automated approach to reliably acquire artifact-annotated datasets. In this paper, we propose ArtiAgent, which efficiently creates pairs of real and artifact-injected images. It comprises three agents: a perception agent that recognizes and grounds entities and subentities from real images, a synthesis agent that introduces artifacts via artifact injection tools through novel patch-wise embedding manipulation…
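
To make the idea of a paired real and artifact-injected sample concrete, here is a toy sketch in Python. It is not ArtiAgent's implementation: the abstract does not specify how the patch-wise embedding manipulation works, so the names `ArtifactPair` and `inject_patch_copy`, the list-based patch grid, and the choice of "copy one block of patches over another" as the corruption are all assumptions made purely for illustration. The sketch only shows how a corrupted copy and a per-patch annotation mask can be produced together, without human labeling.

```python
from dataclasses import dataclass
from typing import List, Tuple

Patch = List[float]        # one patch embedding (toy: a short vector)
Grid = List[List[Patch]]   # rows x cols grid of patch embeddings


@dataclass
class ArtifactPair:
    """Hypothetical container: original grid, corrupted copy, per-patch mask."""
    real: Grid
    corrupted: Grid
    mask: List[List[int]]  # 1 where a patch was overwritten, else 0


def inject_patch_copy(grid: Grid,
                      src: Tuple[int, int],
                      dst: Tuple[int, int],
                      size: int) -> ArtifactPair:
    """Overwrite the size x size block at dst with the block at src.

    This is an assumed, simplified corruption (a duplicated region),
    chosen only to illustrate a patch-level edit with a known location.
    """
    rows, cols = len(grid), len(grid[0])
    for r, c in (src, dst):
        if r < 0 or c < 0 or r + size > rows or c + size > cols:
            raise ValueError("block falls outside the patch grid")

    # Deep-copy so the real grid stays untouched.
    corrupted = [[list(p) for p in row] for row in grid]
    mask = [[0] * cols for _ in range(rows)]

    for dr in range(size):
        for dc in range(size):
            # Always read from the original grid, so overlapping
            # source and destination blocks behave predictably.
            corrupted[dst[0] + dr][dst[1] + dc] = list(grid[src[0] + dr][src[1] + dc])
            mask[dst[0] + dr][dst[1] + dc] = 1

    return ArtifactPair(real=grid, corrupted=corrupted, mask=mask)
```

Because the edit location is chosen by the program, the mask comes for free with each pair, which is the general property that lets a synthesis pipeline scale without manual annotation.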

Code & Models

Datasets

KRAFTON/ArtiBench
dataset · 72 downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Domain Adaptation and Few-Shot Learning