Res$^2$CLIP: Few-Shot Generalist Anomaly Detection with Residual-to-Residual Alignment

Xinyue Liu; Jianyuan Wang; Biao Leng; Shuo Zhang

arXiv:2605.16171·cs.CV·May 18, 2026

TL;DR

Res$^2$CLIP introduces a residual-to-residual alignment framework within CLIP to improve few-shot generalist anomaly detection, effectively handling fine-grained differences and preserving open-world generalization.

Contribution

The paper presents what it describes as the first residual-to-residual alignment approach, one that symmetrically bridges the visual and text modalities within CLIP's residual space for anomaly detection.

Findings

01. Effective on multiple anomaly detection datasets.

02. Addresses fine-grained normal feature differences.

03. Maintains CLIP's open-world generalization.

Abstract

Few-shot Generalist Anomaly Detection requires models to generalize to novel categories without retraining, posing significant challenges in real-world scenarios with scarce samples and rapidly changing categories. Existing CLIP-based methods face two major challenges: coarse-grained unified text prompts struggle to adapt to fine-grained foreground-background differences, causing cross-granularity mismatch; and fine-tuning on auxiliary datasets disrupts CLIP's inherent open-world generalization due to domain shift, leading to cross-category generalization degradation. To address these, we propose to shift multimodal alignment entirely into a unified residual space, where residual representations naturally eliminate fine-grained normal feature differences across regions and class-specific biases, simultaneously resolving both problems. Based on this insight, Res$^2$CLIP, the first…
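
The abstract is cut off before it says how the residuals are built, so the following is only a toy sketch of the general idea of aligning residuals, not the paper's method. It assumes, hypothetically, that a visual residual is a patch feature minus its nearest few-shot normal reference feature, that a text residual is an abnormal-prompt embedding minus a normal-prompt embedding, and that the anomaly score is the cosine similarity between the two. Every function name and the 3-dimensional vectors are invented for illustration; real CLIP features would take their place.

```python
import math


def subtract(a, b):
    """Element-wise difference a - b of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_reference(patch, normal_refs):
    """Pick the few-shot normal feature most similar to the query patch."""
    return max(normal_refs, key=lambda ref: cosine(patch, ref))


def residual_anomaly_score(patch, normal_refs, text_normal, text_abnormal):
    """Assumed scoring rule: compare a visual residual with a text residual.

    Subtracting the nearest normal reference removes most of the
    class-specific content from the patch; subtracting the normal text
    embedding does the same on the text side.
    """
    visual_residual = subtract(patch, nearest_reference(patch, normal_refs))
    text_residual = subtract(text_abnormal, text_normal)
    return cosine(visual_residual, text_residual)


# Toy 3-d features (made up): two normal reference patches of one object.
normal_refs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Toy text embeddings: the third axis stands in for "abnormality".
text_normal = [0.5, 0.5, 0.0]
text_abnormal = [0.5, 0.5, 1.0]

normal_patch = [0.9, 0.1, 0.0]   # small deviation inside the normal subspace
anomalous_patch = [0.9, 0.1, 0.8]  # same content plus an off-subspace component

normal_score = residual_anomaly_score(
    normal_patch, normal_refs, text_normal, text_abnormal
)
anomalous_score = residual_anomaly_score(
    anomalous_patch, normal_refs, text_normal, text_abnormal
)
```

In this toy setup the patch that departs from its reference along the text residual direction scores higher than the patch that differs only along directions the text residual ignores, which is the behavior a residual-space comparison is meant to give.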

Code & Models

Repositories

hito2448/Res2CLIP (GitHub)
