A Benchmark for Crime Surveillance Video Analysis with Large Models

Haoran Chen; Dong Yi; Moyan Cao; Chensen Huang; Guibo Zhu; Jinqiao Wang

arXiv:2502.09325·cs.CV·February 14, 2025

Open Access

TL;DR

This paper introduces UCVL, a new benchmark for crime surveillance video analysis with multimodal large language models (MLLMs), and evaluates prevailing MLLMs on it using GPT-4o-based assessment of their answers, along with fine-tuning.

Contribution

It provides a comprehensive benchmark with diverse QA pairs and assessment methods for MLLMs in crime video analysis, filling a gap in current evaluation standards.

Findings

01 MLLMs show varying performance on the benchmark

02 Fine-tuning improves model accuracy in anomaly detection

03 The benchmark is reliable for evaluating large models' capabilities

Abstract

Anomaly analysis in surveillance videos is a crucial topic in computer vision. In recent years, multimodal large language models (MLLMs) have outperformed task-specific models in various domains. Although MLLMs are particularly versatile, their ability to understand anomalous concepts and details remains insufficiently studied, because the outdated benchmarks in this field provide neither MLLM-style QAs nor efficient algorithms for assessing a model's open-ended text responses. To fill this gap, we propose UCVL, a benchmark for crime surveillance video analysis with large models, comprising 1,829 videos and reorganized annotations from the UCF-Crime and UCF-Crime Annotation datasets. We design six types of questions and generate diverse QA pairs. We then develop detailed instructions and use OpenAI's GPT-4o for accurate assessment. We benchmark eight prevailing MLLMs ranging from…

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis