PromptMAD: Cross-Modal Prompting for Multi-Class Visual Anomaly Localization
Duncan McCain, Hossein Kashiani, Fatemeh Afghah

TL;DR
PromptMAD introduces a cross-modal prompting framework that leverages vision-language alignment and semantic guidance to enhance unsupervised multi-class visual anomaly detection and localization, achieving state-of-the-art results.
Contribution
The paper proposes a novel cross-modal prompting approach that combines CLIP-based semantic guidance, a specialized segmentor, and Focal loss to improve anomaly detection across diverse categories.
Findings
Achieves 98.35% mean AUC on MVTec-AD
Attains 66.54% AP, outperforming previous methods
Demonstrates robustness across multiple object categories
Abstract
Visual anomaly detection in multi-class settings poses significant challenges due to the diversity of object categories, the scarcity of anomalous examples, and the presence of camouflaged defects. In this paper, we propose PromptMAD, a cross-modal prompting framework for unsupervised visual anomaly detection and localization that integrates semantic guidance through vision-language alignment. By leveraging CLIP-encoded text prompts describing both normal and anomalous class-specific characteristics, our method enriches visual reconstruction with semantic context, improving the detection of subtle and textural anomalies. To further address the challenge of class imbalance at the pixel level, we incorporate the Focal loss function, which emphasizes hard-to-detect anomalous regions during training. Our architecture also includes a supervised segmentor that fuses multi-scale convolutional…
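The abstract's point about Focal loss and pixel-level class imbalance can be illustrated with a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). This is not the authors' implementation: the function name `binary_focal_loss`, the flattened-list input format, and the defaults `gamma=2.0` and `alpha=0.25` (the values commonly used in the original focal loss formulation) are assumptions, since the summary does not state the paper's settings.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a flat list of pixel predictions.

    probs   -- predicted anomaly probabilities in [0, 1], one per pixel
    targets -- ground-truth labels, 1 for anomalous and 0 for normal
    gamma, alpha -- assumed defaults, not values taken from the paper
    """
    if len(probs) != len(targets) or not probs:
        raise ValueError("probs and targets must be non-empty and equal length")

    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        # alpha_t weights the anomalous class by alpha, the normal class by 1 - alpha.
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma down-weights pixels that are already well classified.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)
```

With `gamma=0` the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy; larger `gamma` values shift more of the loss onto pixels the model currently gets wrong, which is the behaviour the abstract describes as emphasizing hard-to-detect anomalous regions.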
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
