VidHarm: A Clip Based Dataset for Harmful Content Detection
Johan Edstedt, Amanda Berg, Michael Felsberg, Johan Karlsson, Francisca Benavente, Anette Novak, Gustav Grund Pihlgren

TL;DR
VidHarm introduces a professionally annotated video dataset for harmful content detection, demonstrating that multimodal audiovisual models with pre-training and balanced sampling significantly improve detection performance.
Contribution
The paper presents VidHarm, a new open dataset of professionally labeled video clips, and provides insights into effective modeling strategies for harmful content detection.
Findings
Combining visual and audio data improves detection accuracy.
Pre-training on large-scale datasets enhances model performance.
Class balanced sampling improves detection performance; model biases are investigated separately using discrimination probing.
Abstract
Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of professionally labeled open datasets available. In this work VidHarm, an open dataset of 3589 video clips from film trailers annotated by professionals, is presented. An analysis of the dataset is performed, revealing among other things the relation between clip and trailer level annotations. Audiovisual models are trained on the dataset and an in-depth study of modeling choices conducted. The results show that performance is greatly improved by combining the visual and audio modality, pre-training on large-scale video recognition datasets, and class balanced sampling. Lastly, biases of the trained models are investigated using discrimination probing. VidHarm is openly available, and further details are available at: https://vidharm.github.io
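The class balanced sampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `class_balanced_sample`, the label names "A" and "B", and the 90/10 split are hypothetical and chosen only for illustration. The idea is to pick a class uniformly at random first and then pick a clip within that class, so that rare classes appear as often as common ones during training.

```python
import random
from collections import Counter


def class_balanced_sample(labels, num_samples, seed=0):
    """Draw example indices so that every class is equally likely to be picked."""
    rng = random.Random(seed)
    # Group example indices by their class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    classes = sorted(by_class)
    sampled = []
    for _ in range(num_samples):
        cls = rng.choice(classes)                  # pick a class uniformly
        sampled.append(rng.choice(by_class[cls]))  # then a clip within that class
    return sampled


# Hypothetical imbalanced label set: 90 clips of class "A", 10 of class "B".
labels = ["A"] * 90 + ["B"] * 10
indices = class_balanced_sample(labels, num_samples=10_000)
counts = Counter(labels[i] for i in indices)
```

With this scheme each class contributes roughly half of the drawn samples even though class "B" makes up only a tenth of the data. The trade-off is that clips from the rare class are repeated more often within an epoch.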
Taxonomy
Topics: Digital Media Forensic Detection · Anomaly Detection Techniques and Applications · Video Analysis and Summarization
