Convolutional Relational Machine for Group Activity Recognition

Sina Mokhtarzadeh Azar; Mina Ghadimi Atigh; Ahmad Nickabadi; Alexandre Alahi

arXiv:1904.03308·cs.CV·April 9, 2019·5 cites

Open Access

TL;DR

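This paper introduces the Convolutional Relational Machine (CRM), an end-to-end convolutional network that uses the spatial relations among individuals in an image or video to recognize group activities, and reports advantages over state-of-the-art models on the Volleyball and Collective Activity datasets.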

Contribution

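The paper proposes an end-to-end CNN that produces an intermediate activity map, refines it over multiple stages to reduce incorrect predictions, and aggregates the refined map to recognize group activities, reporting advantages over state-of-the-art models on the Volleyball and Collective Activity datasets.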

Findings

01. CRM shows advantages over state-of-the-art models on the Volleyball and Collective Activity datasets.

02. CRM uses the spatial relations between individuals to recognize group activities.

03. The activity map and its multi-stage refinement contribute to prediction accuracy.

Abstract

We present an end-to-end deep Convolutional Neural Network called Convolutional Relational Machine (CRM) for recognizing group activities that utilizes the information in spatial relations between individual persons in image or video. It learns to produce an intermediate spatial representation (activity map) based on individual and group activities. A multi-stage refinement component is responsible for decreasing the incorrect predictions in the activity map. Finally, an aggregation component uses the refined information to recognize group activities. Experimental results demonstrate the constructive contribution of the information extracted and represented in the form of the activity map. CRM shows advantages over state-of-the-art models on Volleyball and Collective Activity datasets.
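
The idea of an activity map can be made concrete with a small sketch. The abstract only says the map is a spatial representation based on individual and group activities, so everything below is an illustrative assumption rather than the authors' implementation: one channel per activity class, each person's bounding box filled with 1.0 in the channel of that person's individual activity and in the channel of the group activity, and a toy aggregation that sums each group channel. In CRM the map is predicted and refined by the network and the aggregation is learned; the function names `build_activity_map`, `aggregate_group_scores`, and `predict_group_activity` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in map coordinates, max bounds exclusive.
Box = Tuple[int, int, int, int]
ActivityMap = List[List[List[float]]]  # [channel][row][column]


def build_activity_map(
    boxes: Sequence[Box],
    individual_labels: Sequence[int],
    group_label: int,
    num_individual: int,
    num_group: int,
    height: int,
    width: int,
) -> ActivityMap:
    """Build a target map with individual channels first, then group channels.

    Assumption: a box is marked with a constant 1.0; the abstract does not
    specify how regions are encoded.
    """
    num_channels = num_individual + num_group
    amap = [[[0.0] * width for _ in range(height)] for _ in range(num_channels)]
    for (x0, y0, x1, y1), ind in zip(boxes, individual_labels):
        # Each person contributes to their own activity channel and to the
        # channel of the group activity shared by the whole scene.
        for channel in (ind, num_individual + group_label):
            for y in range(max(0, y0), min(height, y1)):
                for x in range(max(0, x0), min(width, x1)):
                    amap[channel][y][x] = 1.0
    return amap


def aggregate_group_scores(amap: ActivityMap, num_individual: int) -> List[float]:
    """Toy aggregation: total mass in each group channel (not a learned layer)."""
    return [sum(sum(row) for row in channel) for channel in amap[num_individual:]]


def predict_group_activity(amap: ActivityMap, num_individual: int) -> int:
    """Return the index of the group channel with the largest total mass."""
    scores = aggregate_group_scores(amap, num_individual)
    return max(range(len(scores)), key=scores.__getitem__)


# Two people in a 6x8 map, 3 individual classes, 2 group classes.
example_map = build_activity_map(
    boxes=[(1, 1, 3, 4), (5, 2, 7, 5)],
    individual_labels=[0, 2],
    group_label=1,
    num_individual=3,
    num_group=2,
    height=6,
    width=8,
)
print(predict_group_activity(example_map, num_individual=3))
```

With a clean target map the toy aggregation trivially recovers the group label. The refinement stages described in the abstract exist because a predicted map contains incorrect responses that have to be reduced before aggregation.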

Taxonomy

Topics: Human Pose and Action Recognition · Video Analysis and Summarization · Anomaly Detection Techniques and Applications