VIDI: A Video Dataset of Incidents

Duygu Sesver; Alp Eren Gençoğlu; Çağrı Emre Yıldız; Zehra Günindi; Faeze Habibi; Ziya Ata Yazıcı; Hazım Kemal Ekenel

arXiv:2205.13277 · cs.CV · January 10, 2023

TL;DR

This paper introduces VIDI, a video dataset of 4,534 clips spanning 43 incident categories, and shows that using video rather than still images improves incident classification accuracy from 67.37% to 76.56%.

Contribution

The paper presents VIDI, a new diverse video dataset for incident detection, and benchmarks state-of-the-art models showing the benefits of video data for classification.

Findings

01. Using video data instead of still images improves incident classification accuracy from 67.37% to 76.56%.

02. Recent models such as Vision Transformer and TimeSformer outperform previous approaches.

03. The VIDI dataset will be made publicly available for further research.
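
To put the reported accuracies in perspective, the short sketch below works out the absolute gain, the relative gain, and the share of classification errors removed. It assumes that 67.37% is the still-image baseline and 76.56% is the video-based result, as the summary suggests. The function name `accuracy_gain` is purely illustrative and is not taken from the paper or its repository.

```python
# Accuracies reported in the findings (percent).
# Assumption: 67.37 is the still-image baseline, 76.56 the video-based result.
IMAGE_ACC = 67.37
VIDEO_ACC = 76.56


def accuracy_gain(baseline: float, improved: float) -> tuple[float, float, float]:
    """Return (absolute gain in points, relative gain in %, error reduction in %)."""
    absolute = improved - baseline
    relative = absolute / baseline * 100
    # Error rate is 100 - accuracy, so the absolute drop in error equals the absolute gain.
    error_reduction = absolute / (100 - baseline) * 100
    return absolute, relative, error_reduction


absolute, relative, error_reduction = accuracy_gain(IMAGE_ACC, VIDEO_ACC)
print(f"absolute gain:   {absolute:.2f} points")      # 9.19 points
print(f"relative gain:   {relative:.2f}%")            # 13.64%
print(f"error reduction: {error_reduction:.2f}%")     # 28.16%
```

Under these assumptions, the 9.19-point gain corresponds to removing a little over a quarter of the baseline's classification errors.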

Abstract

Automatic detection of natural disasters and incidents has become more important as a tool for fast response. There have been many studies to detect incidents using still images and text. However, the number of approaches that exploit temporal information is rather limited. One of the main reasons for this is that a diverse video dataset with various incident types does not exist. To address this need, in this paper we present a video dataset, Video Dataset of Incidents, VIDI, that contains 4,534 video clips corresponding to 43 incident categories. Each incident class has around 100 videos with a duration of ten seconds on average. To increase diversity, the videos have been searched in several languages. To assess the performance of the recent state-of-the-art approaches, Vision Transformer and TimeSformer, as well as to explore the contribution of video-based information for incident…

Code & Models

Repositories

vididataset/vidi
Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Data-Driven Disease Surveillance · Video Analysis and Summarization

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Absolute Position Encodings · Dropout · Adam · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing