An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos
S.H. Shabbeer Basha, Viswanath Pulabaigari, Snehasis Mukherjee

TL;DR
This paper introduces an information-rich sampling method that aggregates consecutive video frames using Gaussian-weighted summation, combined with a 3D CNN and LSTM, to improve human action recognition in videos, especially with distant cameras.
Contribution
It proposes a novel frame aggregation sampling technique that preserves more information and enhances 3D CNN-based human action recognition performance.
Findings
Achieved results comparable to state-of-the-art methods on the KTH and WEIZMANN datasets.
Demonstrated improved information retention over traditional frame sampling methods.
Effective in recognizing actions even with distant camera setups.
Abstract
We propose a novel scheme for human action recognition in videos, using a 3-dimensional Convolutional Neural Network (3D CNN) based classifier. Traditionally, in deep-learning-based human activity recognition approaches, either a few random frames or every k-th frame of the video is considered for training the 3D CNN, where k is a small positive integer such as 4, 5, or 6. This kind of sampling reduces the volume of the input data, which speeds up training of the network and also avoids over-fitting to some extent, thus enhancing the performance of the 3D CNN model. In the proposed video sampling technique, consecutive frames of a video are aggregated into a single frame by computing a Gaussian-weighted summation of the frames. The resulting frame (aggregated frame) preserves the information in a better way than the conventional approaches and is experimentally shown to perform…
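The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the window size, the Gaussian width, or whether the weights are normalized, so everything specific here is an assumption: the function names `gaussian_weights` and `aggregate_frames` are hypothetical, the Gaussian is centred on the middle of each window, the weights are normalized to sum to 1, and `sigma=1.0` is an arbitrary default. It is not the authors' implementation.

```python
import numpy as np


def gaussian_weights(k, sigma=1.0):
    """Return k Gaussian weights centred on the window middle (assumed layout)."""
    offsets = np.arange(k) - (k - 1) / 2.0
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    # Normalizing keeps the aggregated frame in the input intensity range
    # (an assumption; the abstract does not say whether this is done).
    return weights / weights.sum()


def aggregate_frames(video, k=5, sigma=1.0):
    """Collapse each run of k consecutive frames into one weighted frame.

    video: array of shape (T, H, W) or (T, H, W, C).
    Returns an array with T // k frames; trailing frames that do not
    fill a complete window are dropped.
    """
    video = np.asarray(video, dtype=np.float64)
    n_groups = video.shape[0] // k
    if n_groups == 0:
        raise ValueError("video has fewer than k frames")
    # Split the time axis into (n_groups, k) windows.
    groups = video[: n_groups * k].reshape(n_groups, k, *video.shape[1:])
    weights = gaussian_weights(k, sigma)
    # Weighted sum over the within-window axis.
    return np.tensordot(groups, weights, axes=([1], [0]))


def sample_every_kth(video, k=5):
    """Conventional baseline: keep every k-th frame and discard the rest."""
    return np.asarray(video)[::k]
```

Both functions shrink the temporal axis by roughly a factor of k. The difference is that `sample_every_kth` discards the intermediate frames, while `aggregate_frames` lets every frame in a window contribute to the output.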
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
