Deep Learning for Vision-based Prediction: A Survey
Amir Rasouli

TL;DR
This survey reviews recent deep learning methods for vision-based prediction across various applications, highlighting architectures, datasets, and evaluation metrics used in the field over the past five years.
Contribution
It categorizes vision-based prediction tasks, summarizes common deep learning architectures, datasets, and evaluation metrics, and provides an organized online database of this information.
Findings
Deep learning dominates recent vision-based prediction methods.
Various architectures are tailored for specific prediction tasks.
Standard datasets and metrics are identified for benchmarking.
Abstract
Vision-based prediction algorithms have a wide range of applications, including autonomous driving, surveillance, human-robot interaction, and weather prediction. The objective of this paper is to provide an overview of the field over the past five years, with a particular focus on deep learning approaches. For this purpose, we categorize these algorithms into video prediction, action prediction, trajectory prediction, body motion prediction, and other prediction applications. For each category, we highlight the common architectures, training methods, and types of data used. In addition, we discuss the common evaluation metrics and datasets used for vision-based prediction tasks. A database of all the information presented in this survey, cross-referenced according to papers, datasets, and metrics, can be found online at https://github.com/aras62/vision-based-prediction.
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
