Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, and Oriol Vinyals, Rajat Monga, George Toderici

TL;DR
This paper introduces novel deep neural network architectures that effectively model long-term temporal dependencies in videos, significantly improving classification accuracy over previous methods on benchmark datasets.
Contribution
It proposes two new methods for video classification: convolutional temporal pooling architectures and an LSTM-based sequence modeling approach, handling full-length videos.
Findings
Achieved 73.1% accuracy on Sports 1 million dataset.
Improved UCF-101 accuracy to 88.6%.
Outperformed previous methods with and without optical flow.
Abstract
Convolutional neural networks (CNNs) have been extensively applied for image recognition problems giving state-of-the-art results on recognition, detection, segmentation and retrieval. In this work we propose and evaluate several deep neural network architectures to combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handling full length videos. The first method explores various convolutional temporal feature pooling architectures, examining the various design choices which need to be made when adapting a CNN for this task. The second proposed method explicitly models the video as an ordered sequence of frames. For this purpose we employ a recurrent neural network that uses Long Short-Term Memory (LSTM) cells which are connected to the output of the underlying CNN. Our best networks exhibit significant…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHuman Pose and Action Recognition · Video Analysis and Summarization · Anomaly Detection Techniques and Applications
