Out of Context: Reliability in Multimodal Anomaly Detection Requires Contextual Inference
Kevin Wilkinghoff, Neelu Madan, Juan Miguel Valverde, Kamal Nasrollahi, Radu Tudor Ionescu, Rafal Wisniewski, Thomas B. Moeslund, Wenwu Wang, Zheng-Hua Tan

TL;DR
This paper argues that reliable multimodal anomaly detection requires contextual inference: judging observations relative to their operating conditions rather than against a single fixed reference model of normality.
Contribution
It proposes reframing multimodal anomaly detection as a cross-modal contextual inference problem, highlighting the roles of different modalities in defining abnormality.
Findings
Traditional models assume a single normality distribution, leading to instability in dynamic environments.
Context-aware approaches can better distinguish genuine anomalies from normal variations.
The paper outlines new evaluation protocols and benchmarks for context-aware multimodal anomaly detection.
Abstract
Anomaly detection aims to identify observations that deviate from expected behavior. Because anomalous events are inherently sparse, most frameworks are trained exclusively on normal data to learn a single reference model of normality. This implicitly assumes that normal behavior can be captured by a single, unconditional reference distribution. In practice, however, anomalies are often context-dependent: A specific observation may be normal under one operating condition, yet anomalous under another. As machine learning systems are deployed in dynamic and heterogeneous environments, these fixed-context assumptions introduce structural ambiguity, i.e., the inability to distinguish contextual variation from genuine abnormality under marginal modeling, leading to unstable performance and unreliable anomaly assessments. While modern sensing systems frequently collect multimodal data…
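The gap between marginal and context-conditional modeling described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method. It assumes a hypothetical single sensor reading observed under two made-up operating contexts, and uses simple Gaussian z-scores as a stand-in for an anomaly score. All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data (normal samples only): one sensor reading
# observed under two operating contexts. Context 0 produces readings near
# 1.0, context 1 near 5.0. These values are assumptions for illustration.
context = rng.integers(0, 2, size=2000)
context_means = np.array([1.0, 5.0])
reading = rng.normal(loc=context_means[context], scale=0.3)

# Marginal (fixed-reference) model: a single mean and standard deviation
# fitted to all normal data, ignoring the operating context.
mu_marginal = reading.mean()
sd_marginal = reading.std()

# Context-conditional model: one mean and standard deviation per context.
mu_ctx = np.array([reading[context == c].mean() for c in (0, 1)])
sd_ctx = np.array([reading[context == c].std() for c in (0, 1)])


def marginal_score(x):
    """Absolute z-score of x under the single, unconditional reference model."""
    return abs(x - mu_marginal) / sd_marginal


def contextual_score(x, c):
    """Absolute z-score of x under the reference model for context c."""
    return abs(x - mu_ctx[c]) / sd_ctx[c]


# The same reading, scored three ways.
x = 5.0
print("marginal score:          ", round(float(marginal_score(x)), 2))
print("score given context 0:   ", round(float(contextual_score(x, 0)), 2))
print("score given context 1:   ", round(float(contextual_score(x, 1)), 2))
```

In this toy setting, a reading of 5.0 is typical in context 1 and far outside the normal range in context 0, yet the marginal model assigns it an unremarkable score in both cases because it sits inside the pooled distribution. This is one simple instance of the structural ambiguity the abstract describes.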
