DIFEM: Key-points Interaction based Feature Extraction Module for Violence Recognition in Videos
Himanshu Mittal, Suvramalya Basak, Anjali Gautam

TL;DR
This paper introduces DIFEM, a lightweight feature extraction module that uses human skeleton key-points to improve violence recognition in videos and outperforms several state-of-the-art methods.
Contribution
The paper presents a novel Dynamic Interaction Feature Extraction Module (DIFEM) that captures violence-related motion features efficiently, with fewer parameters than deep learning models.
Findings
DIFEM effectively captures motion dynamics like velocity and joint intersections.
The method outperforms several state-of-the-art violence recognition techniques.
Experiments on three datasets show promising results on each of them.
Abstract
Violence detection in surveillance videos is a critical task for ensuring public safety. As a result, there is an increasing need for efficient and lightweight systems for automatic detection of violent behavior. In this work, we propose an effective method which leverages human skeleton key-points to capture inherent properties of violence, such as rapid movement of specific joints and their close proximity. At the heart of our method is our novel Dynamic Interaction Feature Extraction Module (DIFEM), which extracts features such as velocity and joint intersections to capture the dynamics of violent behavior. With the features extracted by DIFEM, we use various classification algorithms such as Random Forest, Decision Tree, AdaBoost and k-Nearest Neighbor. Our approach has substantially fewer parameters than the existing state-of-the-art (SOTA) methods…
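To make the two feature families named in the abstract concrete, here is a minimal Python sketch of per-joint velocity and joint proximity computed from 2D key-points. It is not the authors' implementation. The function names, the `radius` threshold, the use of a simple distance test as a stand-in for "joint intersections", and the choice of summary statistics are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]   # (x, y) of one key-point, in pixels
Pose = List[Point]            # one person's key-points in one frame
Track = List[Pose]            # one person's poses across consecutive frames


def joint_speeds(track: Track, fps: float = 1.0) -> List[float]:
    """Mean speed of each joint across consecutive frames (pixels per second)."""
    n_joints = len(track[0])
    totals = [0.0] * n_joints
    for prev, curr in zip(track, track[1:]):
        for j in range(n_joints):
            # Displacement between frames times frame rate gives speed.
            totals[j] += math.dist(prev[j], curr[j]) * fps
    steps = max(len(track) - 1, 1)  # avoid dividing by zero on 1-frame clips
    return [t / steps for t in totals]


def close_joint_pairs(pose_a: Pose, pose_b: Pose, radius: float) -> int:
    """Count joint pairs, one joint from each person, closer than `radius`.

    Assumption: proximity within a radius approximates a joint intersection.
    """
    return sum(1 for p in pose_a for q in pose_b if math.dist(p, q) < radius)


def clip_features(track_a: Track, track_b: Track,
                  radius: float, fps: float = 1.0) -> List[float]:
    """Hypothetical fixed-length feature vector for a two-person clip."""
    speeds = joint_speeds(track_a, fps) + joint_speeds(track_b, fps)
    overlaps = [close_joint_pairs(a, b, radius)
                for a, b in zip(track_a, track_b)]
    return [
        max(speeds),                    # fastest joint in the clip
        sum(speeds) / len(speeds),      # average joint speed
        max(overlaps),                  # peak number of close joint pairs
        sum(overlaps) / len(overlaps),  # average close joint pairs per frame
    ]
```

A vector like the one returned by `clip_features` could then be passed to a conventional classifier such as the Random Forest or k-Nearest Neighbor models the abstract mentions. The key-points themselves would come from a separate pose estimator, which this sketch does not cover.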
