Context-LSTM: a robust classifier for video detection on UCF101

Dengshan Li; Rujing Wang

arXiv:2203.06610 · cs.CV · March 15, 2022 · 5 cites

Open Access

TL;DR

This paper introduces Context-LSTM, a simplified yet effective LSTM-based model for video detection that reduces training time and GPU memory usage while maintaining high accuracy on UCF101.

Contribution

The paper proposes a novel LSTM-based architecture called Context-LSTM that is computationally efficient and achieves competitive accuracy in video detection tasks.

Findings

1. Reduces training time and GPU memory usage
2. Achieves top-1 accuracy comparable to state-of-the-art methods
3. Demonstrates robust performance on the UCF101 dataset

Abstract

Video detection and human action recognition can be computationally expensive, and the models can take a long time to train. In this paper, we aimed to reduce the training time and GPU memory usage of video detection while achieving competitive detection accuracy. Other works such as Two-stream, C3D, and TSN have shown excellent performance on UCF101. Here, we used a simple LSTM structure for video detection and obtained a competitive top-1 accuracy on the entire validation dataset of UCF101. The LSTM structure is named Context-LSTM, since it may process deep temporal features. The Context-LSTM may simulate the human recognition system. We cascaded the LSTM blocks in PyTorch and connected the cell state flow and hidden output flow. At the connections between blocks, we used ReLU, Batch Normalization, and MaxPooling functions. The Context-LSTM could…
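The cascade described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name `CascadedLSTM`, the feature size (512), hidden size (256), number of blocks (3), pooling over the time axis, and classification from the last time step are all assumptions made for illustration. Only the general idea comes from the abstract: LSTM blocks chained so that the hidden and cell states flow from one block into the next, with ReLU, Batch Normalization, and MaxPooling applied at each connection.

```python
import torch
import torch.nn as nn


class CascadedLSTM(nn.Module):
    """Hypothetical sketch of cascaded LSTM blocks; not the paper's code."""

    def __init__(self, feature_dim=512, hidden_dim=256, num_blocks=3, num_classes=101):
        super().__init__()
        # The first block maps input features to hidden_dim; later blocks keep
        # hidden_dim so the (h, c) states of one block can seed the next.
        self.blocks = nn.ModuleList(
            [
                nn.LSTM(feature_dim if i == 0 else hidden_dim, hidden_dim, batch_first=True)
                for i in range(num_blocks)
            ]
        )
        # One BatchNorm layer per connection between consecutive blocks.
        self.norms = nn.ModuleList(
            [nn.BatchNorm1d(hidden_dim) for _ in range(num_blocks - 1)]
        )
        # Assumption: pooling is applied along the time axis and halves it.
        self.pool = nn.MaxPool1d(kernel_size=2)
        # UCF101 has 101 action classes.
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        # x: (batch, time, feature_dim), e.g. per-frame features from a CNN.
        state = None  # the first block starts from zero hidden and cell states
        for i, lstm in enumerate(self.blocks):
            # state is (h_n, c_n); passing it on connects the hidden output
            # flow and the cell state flow across blocks.
            x, state = lstm(x, state)
            if i < len(self.norms):
                x = torch.relu(x)
                x = x.transpose(1, 2)            # (batch, hidden_dim, time)
                x = self.pool(self.norms[i](x))  # normalize, then halve the time axis
                x = x.transpose(1, 2)            # (batch, time // 2, hidden_dim)
        # Assumption: classify from the output at the last remaining time step.
        return self.head(x[:, -1])
```

With a batch of two clips of 16 feature vectors each, `CascadedLSTM()(torch.randn(2, 16, 512))` returns logits of shape `(2, 101)`, one score per UCF101 class. Keeping the hidden size constant across blocks is a design shortcut in this sketch: it lets the `(h, c)` pair be handed to the next block without any projection.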

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Advanced Neural Network Applications · Human Pose and Action Recognition

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Batch Normalization