Harnessing Large Language Models for Training-free Video Anomaly Detection

Luca Zanella; Willi Menapace; Massimiliano Mancini; Yiming Wang; Elisa Ricci

arXiv:2404.01014·cs.CV·April 2, 2024·2 cites

Open Access

TL;DR

This paper introduces LAVAD, a training-free video anomaly detection method. It uses pre-trained vision-language models to caption video frames and a pre-trained large language model to score anomalies from those captions, and it outperforms unsupervised and one-class methods on the UCF-Crime and XD-Violence surveillance datasets.

Contribution

LAVAD is presented as the first training-free VAD approach: it relies on pre-trained models for frame description and anomaly scoring, which removes the need for domain-specific training.

Findings

01

Outperforms unsupervised and one-class methods on UCF-Crime and XD-Violence datasets.

02

Does not require any training or data collection for deployment.

03

Uses cross-modal similarity effectively for caption cleaning and score refinement.
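
The third finding names cross-modal similarity without saying how it is computed. The sketch below is one way to picture it, not the paper's procedure. It assumes that frames and captions already have embeddings in a shared space, that cleaning means assigning each frame the caption closest to it, and that refinement means averaging scores over the most similar frames. The function names, the toy 2-D vectors, and the parameter k are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def clean_captions(frame_embs, caption_embs, captions):
    """Assumed cleaning rule: give each frame the caption (from the whole
    video) whose embedding is most similar to the frame embedding."""
    cleaned = []
    for f in frame_embs:
        best = max(range(len(captions)), key=lambda j: cosine(f, caption_embs[j]))
        cleaned.append(captions[best])
    return cleaned


def refine_scores(frame_embs, scores, k=2):
    """Assumed refinement rule: replace each score with the mean score of the
    k frames most similar to it (the frame itself included)."""
    refined = []
    for f in frame_embs:
        ranked = sorted(
            range(len(scores)),
            key=lambda j: cosine(f, frame_embs[j]),
            reverse=True,
        )
        top = ranked[:k]
        refined.append(sum(scores[j] for j in top) / len(top))
    return refined


# Toy data: hypothetical embeddings, captions, and raw anomaly scores.
frame_embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
caption_embs = [[1.0, 0.0], [0.2, 0.8], [0.0, 1.0]]
captions = [
    "a man walks down a street",
    "a blurry image",
    "a car crashes into a wall",
]
raw_scores = [0.1, 0.7, 0.9]

cleaned = clean_captions(frame_embs, caption_embs, captions)
refined = refine_scores(frame_embs, raw_scores, k=2)
```

In this toy run the uninformative caption of the second frame is replaced by the caption of its near-identical neighbour, and that frame's outlying raw score is pulled toward the neighbour's score.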

Abstract

Video anomaly detection (VAD) aims to temporally locate abnormal events in a video. Existing works mostly rely on training deep models to learn the distribution of normality with either video-level supervision, one-class supervision, or in an unsupervised setting. Training-based methods are prone to be domain-specific, thus being costly for practical deployment as any domain change will involve data collection and model training. In this paper, we radically depart from previous efforts and propose LAnguage-based VAD (LAVAD), a method tackling VAD in a novel, training-free paradigm, exploiting the capabilities of pre-trained large language models (LLMs) and existing vision-language models (VLMs). We leverage VLM-based captioning models to generate textual descriptions for each frame of any test video. With the textual scene description, we then devise a prompting mechanism to unlock the…
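
The two steps the abstract spells out before it is cut off, per-frame captioning followed by prompting an LLM, can be pictured with a minimal sketch. Everything specific here is an assumption for illustration: the prompt wording, the 0-to-1 score scale, the reply parsing, and the `captioner` and `llm` callables, which stand in for real models and are stubbed below.

```python
import re
from typing import Any, Callable, List, Sequence


def build_prompt(caption: str) -> str:
    """Hypothetical prompt; the wording used in the paper is not shown on this page."""
    return (
        "Scene description: " + caption + "\n"
        "On a scale from 0 (normal) to 1 (anomalous), how anomalous is this scene? "
        "Answer with a single number."
    )


def parse_score(reply: str) -> float:
    """Take the first number in the reply and clamp it to [0, 1]; default to 0.0."""
    match = re.search(r"\d*\.?\d+", reply)
    if match is None:
        return 0.0
    return min(1.0, max(0.0, float(match.group())))


def score_video(
    frames: Sequence[Any],
    captioner: Callable[[Any], str],
    llm: Callable[[str], str],
) -> List[float]:
    """Caption every frame, ask the LLM about each caption, return one score per frame."""
    return [parse_score(llm(build_prompt(captioner(frame)))) for frame in frames]


# Stubs standing in for a captioning model and an LLM (no real model is called).
def stub_captioner(frame: dict) -> str:
    return frame["caption"]


def stub_llm(prompt: str) -> str:
    return "0.9" if "fight" in prompt else "0.1"


frames = [
    {"caption": "people walking in a mall"},
    {"caption": "two men fight near the entrance"},
]
scores = score_video(frames, stub_captioner, stub_llm)
```

Because no component is trained, moving to a new domain in this sketch only means passing in different frames; the callables stay unchanged.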

Taxonomy

Topics: Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI · Network Security and Intrusion Detection