Beyond Human-prompting: Adaptive Prompt Tuning with Semantic Alignment for Anomaly Detection

Pi-Wei Chen; Jerry Chun-Wei Lin; Wei-Han Chen; Jia Ji; Zih-Ching Chen; Feng-Hao Yeh; Chao-Chun Chen

arXiv:2508.16157·cs.CV·August 25, 2025

TL;DR

This paper introduces APT, a few-shot framework that requires no prior knowledge. It uses adaptive prompt tuning with semantic alignment and synthetic anomalies to improve context-specific anomaly detection with vision-language models.

Contribution

It presents a new adaptive prompt tuning method with semantic alignment and synthetic anomalies, overcoming limitations of human-designed prompts in anomaly detection.

Findings

01 Achieves state-of-the-art results on multiple benchmarks.

02 Effectively captures context-dependent anomalies.

03 Operates without prior knowledge or human-crafted prompts.

Abstract

Pre-trained Vision-Language Models (VLMs) have recently shown promise in detecting anomalies. However, previous approaches are fundamentally limited by their reliance on human-designed prompts and the lack of accessible anomaly samples, leading to significant gaps in context-specific anomaly understanding. In this paper, we propose Adaptive Prompt Tuning with semantic alignment for anomaly detection (APT), a groundbreaking prior-knowledge-free, few-shot framework that overcomes the limitations of traditional prompt-based approaches. APT uses self-generated anomaly samples with noise perturbations to train learnable prompts that capture context-dependent anomalies in different scenarios. To prevent overfitting to synthetic noise, we propose a Self-Optimizing Meta-prompt Guiding Scheme (SMGS) that iteratively aligns the prompts with general anomaly semantics…
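
The abstract says that APT trains its prompts on self-generated anomaly samples made with noise perturbations, but it does not say how the noise is applied. The sketch below is one plausible reading, not the authors' procedure: it assumes Gaussian noise added to a single randomly placed square patch of a normal grayscale image, and it returns the perturbed image together with a pixel mask marking the synthetic anomaly. The function name `make_synthetic_anomaly` and the parameters `patch_size`, `noise_std`, and `seed` are hypothetical.

```python
import random


def make_synthetic_anomaly(image, patch_size=4, noise_std=0.3, seed=None):
    """Return (perturbed_image, mask) for a 2D grayscale image with values in [0, 1].

    Assumption: Gaussian noise is added to one randomly placed square patch.
    The abstract only mentions "noise perturbations"; the patch shape, the
    noise distribution, and all parameter values here are illustrative.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    if patch_size > height or patch_size > width:
        raise ValueError("patch_size must fit inside the image")

    # Pick the top-left corner so the whole patch stays inside the image.
    top = rng.randrange(height - patch_size + 1)
    left = rng.randrange(width - patch_size + 1)

    perturbed = [row[:] for row in image]        # copy so the input is untouched
    mask = [[0] * width for _ in range(height)]  # 1 marks synthetic-anomaly pixels

    for r in range(top, top + patch_size):
        for c in range(left, left + patch_size):
            noisy = perturbed[r][c] + rng.gauss(0.0, noise_std)
            perturbed[r][c] = min(1.0, max(0.0, noisy))  # clamp to [0, 1]
            mask[r][c] = 1
    return perturbed, mask


# Usage: perturb a flat 8x8 "normal" image.
normal = [[0.5] * 8 for _ in range(8)]
anomalous, mask = make_synthetic_anomaly(normal, patch_size=4, seed=0)
```

Under these assumptions, the pair `(anomalous, mask)` could serve as a labeled anomalous training example alongside the unmodified normal image; how APT actually uses such samples to tune prompts is not covered by the abstract.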
