Convolutional Relational Machine for Group Activity Recognition
Sina Mokhtarzadeh Azar, Mina Ghadimi Atigh, Ahmad Nickabadi, Alexandre Alahi

TL;DR
This paper introduces CRM, a deep convolutional network that captures spatial relations among individuals to improve group activity recognition in images and videos, and reports advantages over state-of-the-art models on the Volleyball and Collective Activity datasets.
Contribution
The paper proposes an end-to-end CNN that learns an intermediate activity map, refines it over multiple stages, and aggregates the refined map to recognize group activities, with reported advantages on the Volleyball and Collective Activity datasets.
Findings
CRM shows advantages over state-of-the-art models on the Volleyball and Collective Activity datasets.
CRM effectively utilizes spatial relations for activity recognition.
The activity map and refinement improve prediction accuracy.
Abstract
We present an end-to-end deep Convolutional Neural Network called Convolutional Relational Machine (CRM) for recognizing group activities that utilizes the information in spatial relations between individual persons in image or video. It learns to produce an intermediate spatial representation (activity map) based on individual and group activities. A multi-stage refinement component is responsible for decreasing the incorrect predictions in the activity map. Finally, an aggregation component uses the refined information to recognize group activities. Experimental results demonstrate the constructive contribution of the information extracted and represented in the form of the activity map. CRM shows advantages over state-of-the-art models on Volleyball and Collective Activity datasets.
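The three steps named in the abstract (activity map, refinement, aggregation) can be pictured with a toy, non-learned sketch. This is not the authors' implementation: CRM learns each component with convolutional layers, whereas the code below hand-builds a map from assumed person centers, uses a simple threshold as a stand-in for refinement, and sums each channel as a stand-in for aggregation. All function names, the Gaussian blob shape, and the parameters `sigma` and `threshold` are hypothetical choices made for illustration.

```python
import math


def build_activity_map(people, num_classes, height, width, sigma=2.0):
    """Build a toy activity map: one 2D field per activity class.

    people: list of (cx, cy, class_id) tuples (assumed input format).
    Each person contributes a Gaussian blob to the channel of their class.
    """
    amap = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for cx, cy, cls in people:
        for y in range(height):
            for x in range(width):
                dist2 = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-dist2 / (2 * sigma ** 2))
                # Keep the strongest response where blobs overlap.
                amap[cls][y][x] = max(amap[cls][y][x], value)
    return amap


def refine(amap, threshold=0.1):
    """Stand-in for refinement: zero out weak responses.

    The paper's refinement is multi-stage and learned; this fixed
    threshold only mimics the idea of suppressing spurious activations.
    """
    return [
        [[v if v >= threshold else 0.0 for v in row] for row in channel]
        for channel in amap
    ]


def aggregate(amap):
    """Stand-in for aggregation: pick the class with the largest total mass."""
    scores = [sum(sum(row) for row in channel) for channel in amap]
    return max(range(len(scores)), key=scores.__getitem__)


# Two people doing activity 0 and one doing activity 1 on a 10x10 grid.
people = [(2, 2, 0), (5, 5, 0), (8, 2, 1)]
activity_map = build_activity_map(people, num_classes=2, height=10, width=10)
predicted = aggregate(refine(activity_map))
print(predicted)
```

In this toy setup the predicted class is simply the one with the most spatial support in the map. The point of the sketch is the data flow (per-class spatial fields, then cleanup, then a global decision), not the specific operations.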
Taxonomy
Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Anomaly Detection Techniques and Applications
