Conditional Compatibility Learning for Context-Dependent Anomaly Detection
Shashank Mishra, Didier Stricker, Jason Rambach

TL;DR
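This paper introduces conditional compatibility learning, an approach to detecting contextual anomalies: cases where the same subject is normal in one context and anomalous in another. Traditional anomaly detection treats abnormality as an intrinsic property of the observation and therefore ignores context.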
Contribution
The paper proposes a formal framework and a vision-language model, CC-CLIP, that learns disentangled subject and context representations for improved contextual anomaly detection.
Findings
CC-CLIP achieves state-of-the-art results on real-world contextual anomaly detection datasets.
A single-branch variant of CC-CLIP performs competitively on structural anomaly benchmarks.
The approach outperforms existing CLIP-based and context-reasoning baselines.
Abstract
Anomaly detection usually assumes that abnormality is an intrinsic property of an observation. A defect is a defect, and a rare object is rare, regardless of where it appears. Many real-world anomalies do not work this way. A runner on a track is normal, but the same runner on a highway is not. The subject is unchanged; only the context makes it anomalous. This setting, long recognized as contextual anomaly detection, remains largely underexplored in modern vision-language systems. The difficulty is not merely empirical; it is formal. When anomaly labels depend on the relation between a subject and its context, any detector reasoning from a global representation that conflates subject and context is provably non-identifiable: two different subject-context configurations can map to the same embedding while requiring opposite labels, and no such detector can be correct on both. This…
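The non-identifiability argument in the abstract can be illustrated with a small toy, offered here as a sketch rather than the paper's construction. Everything in it is an assumption made for illustration: the hand-picked 2-D vectors, the additive pooling in `global_embedding`, and the extra `car` and `street` entries (the abstract itself only mentions the runner, the track, and the highway). The vectors are chosen so that "runner on highway" (anomalous) and "car on street" (normal) collide in the global embedding. The code then enumerates every possible detector over the observed embeddings and reports the best achievable accuracy.

```python
from itertools import product

# Hypothetical hand-picked embeddings, chosen so that a collision occurs.
SUBJECT = {"runner": (1, 0), "car": (0, 1)}
CONTEXT = {"track": (2, 2), "highway": (0, 1), "street": (1, 0)}

# (subject, context, label) with 1 = anomalous, 0 = normal.
DATA = [
    ("runner", "track", 0),
    ("runner", "highway", 1),
    ("car", "street", 0),  # assumed normal for this toy
]

def global_embedding(subject, context):
    """Additive pooling that conflates subject and context (an assumption)."""
    s, c = SUBJECT[subject], CONTEXT[context]
    return (s[0] + c[0], s[1] + c[1])

def best_accuracy(featurize):
    """Best accuracy over all detectors that see only featurize(subject, context)."""
    keys = sorted({featurize(s, c) for s, c, _ in DATA})
    best = 0.0
    # Enumerate every assignment of a 0/1 label to each distinct feature value.
    for labels in product((0, 1), repeat=len(keys)):
        detector = dict(zip(keys, labels))
        correct = sum(detector[featurize(s, c)] == y for s, c, y in DATA)
        best = max(best, correct / len(DATA))
    return best

# Detectors restricted to the conflated global embedding.
global_best = best_accuracy(global_embedding)
# Detectors that see subject and context as separate factors.
pair_best = best_accuracy(lambda s, c: (s, c))

print(global_best, pair_best)
```

Because the two colliding configurations share one embedding, every detector defined on that embedding gives them the same label and is wrong on one of them, so no detector in the first family classifies all three samples correctly. A detector that keeps subject and context separate has no such limit in this toy.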
