IPAD-CLIP: Teaching CLIP to Detect Image Local Perceptual Artifacts

Juan Wang; Xinyu Sun; Ke Zhang; Jin Wang; Bing Li; Weiming Hu; Liang Wang

arXiv:2605.08664 · cs.CV · May 12, 2026

TL;DR

This paper introduces IPAD-CLIP, a CLIP-based framework for detecting local perceptual artifacts in images. It is accompanied by a new benchmark dataset and is reported to outperform existing methods.

Contribution

The paper formalizes the IPAD task, provides a new dataset with pixel-level masks for artifacts, and develops IPAD-CLIP, a CLIP-based model that improves local artifact detection.

Findings

01. IPAD-CLIP significantly outperforms existing anomaly detection methods.

02. The dataset includes 3,520 images with pixel-level artifact masks.

03. Local artifacts are better detected using artifact-aware text embeddings.
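
To make the idea of scoring image regions against text embeddings concrete, here is a minimal sketch of the generic patch-to-text similarity map often used in CLIP-based anomaly localization. It is not the IPAD-CLIP method, whose details are not given on this page. Everything in it is an assumption for illustration: the function names `l2_normalize` and `artifact_map`, the two-prompt setup ("clean" vs. "artifact"), the temperature of 0.07, and the random arrays that stand in for real CLIP patch and text features.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def artifact_map(patch_feats, text_feats, grid_hw, temperature=0.07):
    """Hypothetical helper: per-patch probability of the 'artifact' prompt.

    patch_feats: (N, D) patch embeddings, N = H * W (stand-ins for CLIP features).
    text_feats:  (2, D) text embeddings, row 0 = 'clean', row 1 = 'artifact'.
    Returns an (H, W) array of probabilities in [0, 1].
    """
    h, w = grid_hw
    p = l2_normalize(patch_feats)
    t = l2_normalize(text_feats)
    logits = p @ t.T / temperature                 # (N, 2) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)     # softmax over the two prompts
    return probs[:, 1].reshape(h, w)


# Toy data: random features, with one patch planted near the 'artifact' embedding.
rng = np.random.default_rng(0)
dim = 16
text_feats = rng.normal(size=(2, dim))
patch_feats = rng.normal(size=(4 * 4, dim))
patch_feats[5] = 3.0 * text_feats[1]   # patch index 5 maps to grid cell (1, 1)

amap = artifact_map(patch_feats, text_feats, grid_hw=(4, 4))
pred_mask = amap > 0.5                 # coarse binary mask, comparable to a pixel-level label
```

In a real pipeline the patch grid would be upsampled to image resolution before being compared with the dataset's pixel-level masks; that step is omitted here.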

Abstract

Current image quality assessment methods are heavily biased towards global distortions (e.g., noise, blur), neglecting local perceptual artifacts such as ghosting, lens flare, and moire effects. Although significant progress has been made in artifact removal, the fundamental problem of automatic artifact detection remains largely unexplored. In this paper, we formalize the Image Perceptual Artifact Detection (IPAD) task to address this gap. We contribute a benchmark dataset comprising 3,520 artifact images, including 520 real-captured and 3,000 synthetic samples, each paired with pixel-level masks across three representative artifact categories. The core challenge of IPAD lies in the localized, subtle, and semantically weak nature of these artifacts, which makes them prone to missed detection. To overcome this, we introduce IPAD-CLIP, a novel framework built upon CLIP that enhances…
