On Encoding Temporal Evolution for Real-time Action Prediction
Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis

TL;DR
This paper introduces a method for real-time action prediction by forecasting future motion in videos using dynamic images and convolutional LSTMs, achieving high accuracy on benchmark datasets.
Contribution
It presents a novel approach that predicts future motion evolution in videos through unsupervised learning of dynamic images, enhancing real-time action anticipation.
Findings
Outperforms state-of-the-art methods on three benchmark action datasets
Effective in complex activity scenarios
Accurate anticipation of next human actions
Abstract
Anticipating future actions is a key component of intelligence, specifically when it applies to real-time systems, such as robots or autonomous cars. While recent works have addressed prediction of raw RGB pixel values, we focus on anticipating the motion evolution in future video frames. To this end, we construct dynamic images (DIs) by summarising moving pixels through a sequence of future frames. We train convolutional LSTMs to predict the next DIs based on an unsupervised learning process, and then recognise the activity associated with the predicted DI. We demonstrate the effectiveness of our approach on three benchmark action datasets, showing that despite running on videos with complex activities, our approach is able to anticipate the next human action with high accuracy and obtain better results than the state-of-the-art methods.
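To make the idea of a dynamic image concrete, here is a minimal sketch of how a short clip can be collapsed into a single motion summary. The abstract does not say which construction the authors use; this sketch assumes a simplified linear weighting in the style of approximate rank pooling, where frame t of T gets weight 2t - T - 1. The function names `rank_pooling_weights` and `dynamic_image` are hypothetical, and frames are plain nested lists of grayscale values rather than real video tensors.

```python
def rank_pooling_weights(num_frames):
    """Assumed simplified weights: alpha_t = 2t - T - 1 for t = 1..T."""
    return [2 * t - num_frames - 1 for t in range(1, num_frames + 1)]


def dynamic_image(frames):
    """Collapse equally sized 2D grayscale frames into one 2D map.

    Each output pixel is the weighted sum of that pixel over time, so later
    frames count positively and earlier frames negatively.
    """
    if not frames:
        raise ValueError("need at least one frame")
    weights = rank_pooling_weights(len(frames))
    height, width = len(frames[0]), len(frames[0][0])
    out = [[0.0] * width for _ in range(height)]
    for w, frame in zip(weights, frames):
        for y in range(height):
            for x in range(width):
                out[y][x] += w * frame[y][x]
    return out


# Toy clip: a bright pixel moving left to right across a 1x3 frame.
clip = [
    [[1.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0]],
]
di = dynamic_image(clip)
```

Because these weights sum to zero, pixels that never change cancel out and only moving content leaves a trace, which matches the abstract's description of "summarising moving pixels". In the pipeline the abstract describes, maps of this kind would be the targets that the convolutional LSTMs learn to predict; that prediction step is not shown here.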
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Anomaly Detection Techniques and Applications
