Visual Forecasting by Imitating Dynamics in Natural Sequences
Kuo-Hao Zeng, William B. Shen, De-An Huang, Min Sun, Juan Carlos Niebles

TL;DR
This paper presents a deep framework for visual forecasting based on inverse reinforcement learning (IRL). It directly imitates the dynamics of natural sequences from raw pixels, applies at multiple semantic levels, and outperforms existing methods.
Contribution
It introduces a scalable IRL approach with deep feature reparametrization for high-dimensional visual sequence imitation without domain-specific supervision.
Findings
Outperforms existing methods at multiple semantic levels.
Effectively models high-dimensional visual dynamics.
Enables scalable imitation of raw pixel sequences.
Abstract
We introduce a general framework for visual forecasting, which directly imitates visual sequences without additional supervision. As a result, our model can be applied at several semantic levels and does not require any domain knowledge or handcrafted features. We achieve this by formulating visual forecasting as an inverse reinforcement learning (IRL) problem and directly imitating the dynamics in natural sequences from their raw pixel values. The key challenge is the high-dimensional and continuous state-action space that prohibits the application of previous IRL algorithms. We address this computational bottleneck by extending recent progress in model-free imitation with trainable deep feature representations, which (1) bypasses the exhaustive state-action pair visits in dynamic programming by using a dual formulation and (2) avoids explicit state sampling at gradient computation…
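To make the framing in the abstract concrete, here is a minimal sketch of forecasting treated as an imitation problem. It is not the authors' implementation. It assumes a state is a flattened frame, an action is a predicted change to that frame, and the imitation signal comes from a discriminator that scores (frame, next frame) transitions, in the spirit of adversarial model-free imitation. The names `rollout`, `transitions`, and `discriminator_loss`, and the linear policy and discriminator, are hypothetical stand-ins for the deep networks the paper describes.

```python
import numpy as np

def rollout(policy_w, first_frame, horizon):
    """Forecast `horizon` frames by repeatedly applying a policy.

    Assumption: state = current frame (flattened pixels in [0, 1]),
    action = predicted per-pixel change. The linear policy is a toy
    placeholder, not the paper's architecture.
    """
    frames = [first_frame]
    state = first_frame
    for _ in range(horizon):
        action = np.tanh(policy_w @ state)          # hypothetical policy
        state = np.clip(state + action, 0.0, 1.0)   # next frame
        frames.append(state)
    return np.stack(frames)                         # (horizon + 1, n_pixels)

def transitions(frames):
    """Pair each frame with its successor: rows are [s_t, s_{t+1}]."""
    return np.concatenate([frames[:-1], frames[1:]], axis=1)

def discriminator_loss(d_w, expert_pairs, policy_pairs):
    """Binary cross-entropy: natural transitions -> 1, generated -> 0."""
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))
    eps = 1e-8  # guards against log(0)
    p_expert = sigmoid(expert_pairs @ d_w)
    p_policy = sigmoid(policy_pairs @ d_w)
    return -(np.log(p_expert + eps).mean()
             + np.log(1.0 - p_policy + eps).mean())

rng = np.random.default_rng(0)
n_pixels, horizon = 16, 5
natural = rng.random((horizon + 1, n_pixels))       # stand-in for a real clip
policy_w = 0.1 * rng.standard_normal((n_pixels, n_pixels))
generated = rollout(policy_w, natural[0], horizon)
d_w = np.zeros(2 * n_pixels)                        # untrained discriminator
loss = discriminator_loss(d_w, transitions(natural), transitions(generated))
```

In a full training loop under these assumptions, the discriminator would be updated to lower this loss while the policy is updated to make its transitions harder to tell apart from natural ones. Only sampled rollouts are needed, so no exhaustive pass over the state-action space occurs.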
Taxonomy
Topics: Advanced Vision and Imaging · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
