A Comprehensive Study of Deep Video Action Recognition
Yi Zhu, Xinyu Li, Chunhui Liu, Mohammadreza Zolfaghari, Yuanjun Xiong, Chongruo Wu, Zhi Zhang, Joseph Tighe, R. Manmatha, Mu Li

TL;DR
This paper provides a comprehensive survey of deep learning methods for video action recognition, covering datasets, model evolution, benchmarking, and future challenges in the field.
Contribution
It offers an extensive review of over 200 papers, benchmarks key models, and discusses open problems to guide future research in deep video action recognition.
Findings
Benchmarking of popular methods on key datasets
Analysis of model evolution over time
Discussion of open challenges and future directions
Abstract
Video action recognition is one of the representative tasks for video understanding. Over the last decade, we have witnessed great advancements in video action recognition thanks to the emergence of deep learning. But we also encountered new challenges, including modeling long-range temporal information in videos, high computation costs, and incomparable results due to datasets and evaluation protocol variances. In this paper, we provide a comprehensive survey of over 200 existing papers on deep learning for video action recognition. We first introduce the 17 video action recognition datasets that influenced the design of models. Then we present video action recognition models in chronological order: starting with early attempts at adapting deep learning, then to the two-stream networks, followed by the adoption of 3D convolutional kernels, and finally to the recent compute-efficient…
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
