JOSENet: A Joint Stream Embedding Network for Violence Detection in Surveillance Videos
Pietro Nardelli, Danilo Comminiello

TL;DR
JOSENet is a self-supervised deep learning framework that efficiently detects violence in surveillance videos by processing dual video streams, achieving high accuracy with lower computational costs and fewer frames.
Contribution
The paper introduces JOSENet, a novel self-supervised model that improves violence detection in surveillance videos while reducing memory and computational requirements.
Findings
Outperforms state-of-the-art methods in violence detection
Uses only one-fourth of frames per video segment
Operates at a lower frame rate with high accuracy
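The frame-reduction finding can be illustrated with a minimal sketch of uniform temporal subsampling. This is not the authors' pipeline: the function name `subsample_indices`, the 64-frame segment length, and the choice of evenly spaced sampling are all assumptions made for illustration. It only shows what keeping one-fourth of the frames in a segment amounts to.

```python
def subsample_indices(num_frames: int, keep_every: int = 4) -> list[int]:
    """Return the indices kept when retaining one frame out of every `keep_every`.

    Hypothetical helper: uniform striding is an assumption, not a detail
    taken from the paper.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    # Evenly spaced indices starting from the first frame of the segment.
    return list(range(0, num_frames, keep_every))


# Stand-in for a decoded video segment; 64 frames is an assumed length.
segment = list(range(64))
kept_frames = [segment[i] for i in subsample_indices(len(segment))]
print(len(kept_frames), "of", len(segment), "frames kept")
```

With `keep_every=4`, the model would see a quarter of the frames, which also lowers the effective frame rate by the same factor.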
Abstract
The increasing proliferation of video surveillance cameras and the escalating demand for crime prevention have intensified interest in the task of violence detection within the research community. Compared to other action recognition tasks, violence detection in surveillance videos presents additional issues, such as the wide variety of real fight scenes. Unfortunately, existing datasets for violence detection are relatively small in comparison to those for other action recognition tasks. Moreover, surveillance footage often features different individuals in each video and varying backgrounds for each camera. In addition, fast detection of violent actions in real-life surveillance videos is crucial to prevent adverse outcomes, thus necessitating models that are optimized for reduced memory usage and computational costs. These challenges complicate the application of traditional action…
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Anomaly Detection Techniques and Applications
