An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos

S.H. Shabbeer Basha; Viswanath Pulabaigari; Snehasis Mukherjee

arXiv:2002.02100·cs.CV·February 10, 2020·6 cites

Open Access

TL;DR

This paper introduces an information-rich sampling method that aggregates consecutive video frames using Gaussian-weighted summation, combined with a 3D CNN and LSTM, to improve human action recognition in videos, especially with distant cameras.

Contribution

The paper proposes a frame-aggregation sampling technique that preserves more information than conventional frame sampling and improves the performance of 3D CNN-based human action recognition.

Findings

01

Achieved results comparable to state-of-the-art methods on the KTH and WEIZMANN datasets.

02

Retained more information than traditional frame sampling methods.

03

Recognized actions effectively even when the camera is far from the subject.

Abstract

We propose a novel scheme for human action recognition in videos, using a 3-dimensional Convolutional Neural Network (3D CNN) based classifier. Traditionally, in deep-learning-based human activity recognition approaches, either a few random frames or every $k^{\text{th}}$ frame of the video is considered for training the 3D CNN, where $k$ is a small positive integer such as 4, 5, or 6. This kind of sampling reduces the volume of the input data, which speeds up training of the network and also avoids over-fitting to some extent, thus enhancing the performance of the 3D CNN model. In the proposed video sampling technique, $k$ consecutive frames of a video are aggregated into a single frame by computing a Gaussian-weighted summation of the $k$ frames. The resulting frame (the aggregated frame) preserves the information better than the conventional approaches and is experimentally shown to perform…
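The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gaussian_weights` and `aggregate_frames`, the `sigma` parameter, the choice to centre the Gaussian on the middle frame of each window, the normalisation of the weights to sum to 1, and the use of non-overlapping windows are all assumptions made for illustration, since the abstract only states that $k$ consecutive frames are combined by a Gaussian-weighted summation.

```python
import numpy as np


def gaussian_weights(k, sigma=1.0):
    """Return k Gaussian weights centred on the middle of the window.

    Assumption: the weights are normalised to sum to 1 so the aggregated
    frame stays in the same intensity range as the input frames.
    """
    offsets = np.arange(k) - (k - 1) / 2.0
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return weights / weights.sum()


def aggregate_frames(video, k=5, sigma=1.0):
    """Collapse each run of k consecutive frames into one weighted frame.

    video: array of shape (T, H, W) or (T, H, W, C).
    Assumption: windows do not overlap, and trailing frames that do not
    fill a complete window are dropped.
    """
    video = np.asarray(video, dtype=np.float64)
    n_windows = video.shape[0] // k
    if n_windows == 0:
        raise ValueError("video has fewer than k frames")
    # Group frames into windows: (n_windows, k, H, W[, C]).
    windows = video[: n_windows * k].reshape((n_windows, k) + video.shape[1:])
    weights = gaussian_weights(k, sigma)
    # Weighted sum over the k axis of every window.
    return np.tensordot(weights, windows, axes=([0], [1]))


def every_kth_frame(video, k=5):
    """Conventional baseline mentioned in the abstract: keep every k-th frame."""
    return np.asarray(video)[::k]
```

For a clip of 30 grayscale frames and `k=5`, `aggregate_frames` returns 6 frames, each a weighted blend of 5 inputs, whereas `every_kth_frame` returns 6 frames and discards the other 24 entirely. Both reduce the input volume by the same factor; the difference is that the aggregated frames still carry a contribution from every input frame.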

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods