VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation

Zhen Qu; Xian Tao; Mukesh Prasad; Fei Shen; Zhengtao Zhang; Xinyi Gong; Guiguang Ding

arXiv:2407.12276·cs.CV·July 18, 2024

TL;DR

VCP-CLIP introduces a visual context prompting approach that enhances zero-shot anomaly segmentation by embedding global visual information into prompts, eliminating the need for product-specific prompts and achieving state-of-the-art results.

Contribution

The paper presents a novel visual context prompting method for CLIP that improves zero-shot anomaly segmentation without product-specific prompts.

Findings

01

Achieved state-of-the-art performance on 10 industrial datasets.

02

Eliminated the need for product-specific text prompts.

03

Enhanced CLIP's anomaly detection capabilities through visual context embedding.

Abstract

Recently, large-scale vision-language models such as CLIP have demonstrated immense potential in the zero-shot anomaly segmentation (ZSAS) task, using a unified model to directly detect anomalies on any unseen product with painstakingly crafted text prompts. However, existing methods often assume that the product category to be inspected is known and therefore set product-specific text prompts, which is difficult to achieve in data-privacy scenarios. Moreover, even products of the same type exhibit significant differences due to specific components and variations in the production process, which makes text prompts hard to design. To this end, we propose a visual context prompting model (VCP-CLIP) for the ZSAS task based on CLIP. The insight behind VCP-CLIP is to employ visual context prompting to activate CLIP's anomalous semantic perception ability. Specifically, we…
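
The general idea of scoring image patches against "normal" and "abnormal" text embeddings, with the prompts conditioned on the image itself rather than on a product name, can be illustrated with a toy sketch. This is not the paper's implementation: the function names, the additive conditioning in `add_visual_context`, the `weight` and `temperature` values, and the three-dimensional vectors are all assumptions chosen for illustration. In a real system the vectors would come from CLIP's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def add_visual_context(prompt_embedding, global_feature, weight=0.5):
    """Hypothetical conditioning: shift a prompt embedding by the image's global feature.

    This additive form is an assumption for illustration only.
    """
    return [p + weight * g for p, g in zip(prompt_embedding, global_feature)]


def anomaly_map(patch_features, normal_emb, abnormal_emb, temperature=0.07):
    """Per-patch probability of 'abnormal' via a two-way softmax over cosine similarities."""
    scores = []
    for patch in patch_features:
        s_n = cosine(patch, normal_emb) / temperature
        s_a = cosine(patch, abnormal_emb) / temperature
        m = max(s_n, s_a)  # subtract the max for numerical stability
        e_n, e_a = math.exp(s_n - m), math.exp(s_a - m)
        scores.append(e_a / (e_n + e_a))
    return scores


# Toy stand-ins for encoder outputs (hypothetical values).
global_feature = [0.2, 0.1, 0.0]
normal_prompt = [1.0, 0.0, 0.0]
abnormal_prompt = [0.0, 1.0, 0.0]
patches = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]

normal_emb = add_visual_context(normal_prompt, global_feature)
abnormal_emb = add_visual_context(abnormal_prompt, global_feature)
scores = anomaly_map(patches, normal_emb, abnormal_emb)
print(scores)  # one anomaly probability per patch
```

In this toy setup the first patch aligns with the "normal" embedding and the second with the "abnormal" one, so the second receives the higher score. No product name appears anywhere in the prompts; the only image-specific input is the global feature.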

Code & Models

Repositories

xiaozhen228/vcp-clip (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Contrastive Language-Image Pre-training