Predicting Scene Parsing and Motion Dynamics in the Future

Xiaojie Jin; Huaxin Xiao; Xiaohui Shen; Jimei Yang; Zhe Lin; Yunpeng Chen; Zequn Jie; Jiashi Feng; Shuicheng Yan

arXiv:1711.03270 · cs.CV · November 10, 2017 · 49 cites

Open Access

TL;DR

This paper introduces a novel joint model for predicting future scene parsing and optical flow in videos, enhancing understanding of scene dynamics for autonomous systems.

Contribution

To the authors' knowledge, this is the first model to jointly predict scene parsing and motion dynamics, exploiting the mutual benefits of the two tasks to improve accuracy.

Findings

1. Reports significantly better parsing and motion prediction results on the Cityscapes dataset.

2. Joint modeling of scene semantics and motion improves understanding of future frames.

3. The model can also predict vehicle steering angles, demonstrating an understanding of scene dynamics.

Abstract

The ability to predict the future is important for intelligent systems, e.g. autonomous vehicles and robots, to plan early and make decisions accordingly. Future scene parsing and optical flow estimation are two key tasks that help agents better understand their environments: the former provides dense semantic information, i.e. what objects will be present and where they will appear, while the latter provides dense motion information, i.e. how the objects will move. In this paper, we propose a novel model to simultaneously predict scene parsing and optical flow in unobserved future video frames. To the best of our knowledge, this is the first attempt at jointly predicting scene parsing and motion dynamics. In particular, scene parsing enables structured motion prediction by decomposing optical flow into different groups while optical flow estimation brings reliable pixel-wise…
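
The abstract's idea of "decomposing optical flow into different groups" by means of scene parsing can be illustrated with a toy example. The following is only a minimal sketch, not the paper's implementation: the group names, the class-to-group mapping `CLASS_TO_GROUP`, and the helper `decompose_flow` are all hypothetical and chosen for illustration. It shows how a per-pixel label map can split one dense flow field into one field per semantic group.

```python
# Illustrative sketch only. The group names and the class-to-group
# mapping below are assumptions, not taken from the paper.
from typing import Dict, List, Tuple

Flow = List[List[Tuple[float, float]]]  # per-pixel (dx, dy) displacement
Labels = List[List[int]]                # per-pixel semantic class id

# Hypothetical mapping from semantic class id to motion group.
CLASS_TO_GROUP: Dict[int, str] = {0: "static", 1: "static", 2: "moving"}


def decompose_flow(flow: Flow, labels: Labels) -> Dict[str, Flow]:
    """Split a dense flow field into one field per motion group.

    Pixels that do not belong to a group get zero flow in that group's
    field, so adding the per-group fields reproduces the input.
    """
    groups = sorted(set(CLASS_TO_GROUP.values()))
    out = {g: [[(0.0, 0.0) for _ in row] for row in flow] for g in groups}
    for y, row in enumerate(flow):
        for x, vec in enumerate(row):
            group = CLASS_TO_GROUP[labels[y][x]]
            out[group][y][x] = vec
    return out


# A 2x2 toy frame: the right column is a "moving" class, the left is "static".
flow = [[(0.0, 0.0), (1.5, -0.5)],
        [(0.1, 0.0), (2.0, 0.0)]]
labels = [[0, 2],
          [1, 2]]
parts = decompose_flow(flow, labels)
```

In a learned model the per-group fields would be predicted rather than sliced out of a known flow field; the sketch only shows the masking structure that a semantic label map makes possible.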

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis