SPG: Sparse-Projected Guides with Sparse Autoencoders for Zero-Shot Anomaly Detection
Tomoyasu Nanaumi, Yukino Tsuzuki, Junichi Okubo, Junichiro Fujii, Takayoshi Yamashita

TL;DR
This paper introduces SPG, a novel zero-shot anomaly detection framework that leverages sparse autoencoders and learned guide coefficients, eliminating the need for prompt engineering.
Contribution
SPG is a prompt-free, two-stage learning approach that uses sparse autoencoders to generate guide vectors for zero-shot anomaly detection and segmentation.
Findings
SPG achieves competitive image-level detection on the MVTec AD and VisA datasets.
SPG attains the highest pixel-level AUROC with DINOv3 among compared methods.
Learned guide coefficients reveal category-general and category-specific factors.
Abstract
We study zero-shot anomaly detection and segmentation using frozen foundation model features, where all learnable parameters are trained only on a labeled auxiliary dataset and deployed to unseen target categories without any target-domain adaptation. Existing prompt-based approaches use handcrafted or learned prompt embeddings as reference vectors for normal/anomalous states. We propose Sparse-Projected Guides (SPG), a prompt-free framework that learns sparse guide coefficients in the Sparse Autoencoder (SAE) latent space, which generate normal/anomaly guide vectors via the SAE dictionary. SPG employs a two-stage learning strategy on the labeled auxiliary dataset: (i) train an SAE on patch-token features, and (ii) optimize only guide coefficients using auxiliary pixel-level masks while freezing the backbone and SAE. On MVTec AD and VisA under cross-dataset zero-shot settings, SPG…
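The guide-vector idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract states only that sparse coefficients in the SAE latent space produce normal/anomaly guide vectors through the SAE dictionary. Everything else here is an assumption made for illustration: the function names, the cosine-similarity scoring with a two-way softmax, the `temperature` value, and the toy dictionary. It is not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def guide_vectors(coeffs, dictionary):
    """Project guide coefficients through the SAE dictionary.

    coeffs:     (2, k) sparse coefficients, row 0 = normal, row 1 = anomaly.
    dictionary: (k, d) SAE decoder atoms (hypothetical layout).
    Returns a (2, d) array of guide vectors in feature space.
    """
    return coeffs @ dictionary

def patch_anomaly_scores(patch_feats, coeffs, dictionary, temperature=0.07):
    """Assumed scoring rule: softmax over cosine similarity to the two guides."""
    guides = l2_normalize(guide_vectors(coeffs, dictionary))
    feats = l2_normalize(patch_feats)
    logits = feats @ guides.T / temperature          # (n_patches, 2)
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[:, 1]                               # probability of "anomaly"

# Toy example: 4 dictionary atoms in a 4-dimensional feature space.
dictionary = np.eye(4)
coeffs = np.array([
    [1.0, 0.0, 0.0, 0.0],   # normal guide uses atom 0 only
    [0.0, 0.0, 1.0, 0.5],   # anomaly guide uses atoms 2 and 3
])
patch_feats = np.array([
    [2.0, 0.0, 0.0, 0.0],   # aligned with the normal guide
    [0.0, 0.0, 1.0, 0.5],   # aligned with the anomaly guide
])
scores = patch_anomaly_scores(patch_feats, coeffs, dictionary)
```

In this sketch the first patch receives a score near 0 and the second a score near 1. Under the two-stage strategy described above, only `coeffs` would be trained in stage (ii), while `dictionary` and the backbone producing `patch_feats` stay frozen.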
