Predictive Coding Networks Meet Action Recognition
Xia Huang, Hossein Mousavi, Gemma Roig

TL;DR
This paper investigates whether predictive coding networks can improve action recognition by capturing motion information directly from video frames without relying on pre-computed optical flow, showing promising results on standard datasets.
Contribution
It introduces the use of PredNet, a predictive coding network, for action recognition to better extract motion cues directly from raw frames, eliminating the need for optical flow inputs.
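To make the idea of a motion cue taken from raw frames concrete, here is a minimal sketch of a prediction-error signal. It is not the authors' model: it assumes a trivial predictor that reuses the previous frame, whereas PredNet learns its predictions with recurrent convolutional layers. The split into rectified positive and negative parts follows the general PredNet error-unit design; the names `prediction_error` and `make_frame` are hypothetical.

```python
def prediction_error(frame, predicted):
    """Return the rectified positive and negative parts of (frame - predicted)."""
    pos = [[max(a - p, 0.0) for a, p in zip(row_a, row_p)]
           for row_a, row_p in zip(frame, predicted)]
    neg = [[max(p - a, 0.0) for a, p in zip(row_a, row_p)]
           for row_a, row_p in zip(frame, predicted)]
    return pos, neg


def make_frame(t, size=6):
    """Toy frame: a bright 2x2 square whose left edge sits at column t."""
    frame = [[0.0] * size for _ in range(size)]
    for r in (2, 3):
        for c in (t, t + 1):
            frame[r][c] = 1.0
    return frame


# Toy video: the square moves one pixel to the right per frame.
frames = [make_frame(t) for t in range(3)]

# Assumption: the "prediction" for frame t is simply frame t-1.
errors = [prediction_error(frames[t], frames[t - 1])
          for t in range(1, len(frames))]

pos, neg = errors[0]
leading = sum(map(sum, pos))   # pixels the square moved into
trailing = sum(map(sum, neg))  # pixels the square left behind
```

In this toy case the error is non-zero only at the leading and trailing edges of the moving square and vanishes for a static scene, which is why such a signal can serve as a motion cue without a separate optical-flow step.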
Findings
PredNet effectively captures motion information from raw video frames.
The model achieves competitive accuracy on the UCF101 and HMDB51 datasets.
Predictive coding networks can enhance action recognition performance without optical flow.
Abstract
Action recognition is a key problem in computer vision in which videos are labeled with one of a set of predefined actions. Capturing both semantic content and motion across the video frames is key to achieving high accuracy on this task. Most state-of-the-art methods rely on RGB frames to extract the semantics and on pre-computed optical flow fields as a motion cue, and then combine the two using deep neural networks. Yet it has been argued that such models are not able to leverage the motion information extracted from the optical flow, and that the optical flow instead allows for better recognition of people and objects in the video. This motivates the exploration of different cues or models that can extract motion in a more informative fashion. To tackle this issue, we propose to explore the predictive coding network, the so-called PredNet, a recurrent neural network that propagates…
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
