Cascaded Scene Flow Prediction using Semantic Segmentation
Zhile Ren, Deqing Sun, Jan Kautz, Erik B. Sudderth

TL;DR
This paper introduces a cascaded classification approach that leverages semantic segmentation to improve 3D scene flow estimation from stereo images, especially for rigidly moving objects, achieving state-of-the-art results on KITTI.
Contribution
It proposes a novel cascaded framework that iteratively refines semantic segmentation, stereo matching, and motion estimates to produce accurate scene flow in complex scenes.
Findings
Achieves state-of-the-art performance on the KITTI benchmark.
Effectively models rigid motion of foreground objects.
Improves scene flow accuracy by integrating semantic cues.
Abstract
Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static background, and use semantic cues to produce pixel-accurate scene flow estimates. Our cascaded classification framework accurately models 3D scenes by iteratively refining semantic segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields. We evaluate our method on the challenging KITTI autonomous driving benchmark, and show that accounting for the motion of segmented vehicles leads to state-of-the-art performance.
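The abstract's modeling assumption, rigid foreground objects in front of a static background, implies that the 2D motion of a pixel follows from its depth and the rigid motion of the region it belongs to. The sketch below illustrates only that geometric relation under a pinhole camera model; it is not the authors' cascaded pipeline. The function names, the intrinsics tuple `(fx, fy, cx, cy)`, and the representation of a motion as a `(R, t)` pair are assumptions made for illustration.

```python
def rigid_flow(u, v, depth, fx, fy, cx, cy, R, t):
    """2D displacement of pixel (u, v) when its 3D point moves as X' = R X + t.

    Assumes a pinhole camera; R is a 3x3 nested list, t a length-3 sequence,
    both expressed in the camera frame (hypothetical conventions).
    """
    # Back-project the pixel to a 3D point in the camera frame.
    X = ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
    # Apply the rigid motion.
    Xm = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Re-project into the image and subtract the original position.
    u2 = fx * Xm[0] / Xm[2] + cx
    v2 = fy * Xm[1] / Xm[2] + cy
    return u2 - u, v2 - v


def pixel_flow(u, v, depth, is_foreground, intrinsics,
               background_motion, object_motion):
    """Pick the motion by segmentation label, then compute the induced flow.

    background_motion is the apparent motion of the static scene caused by
    camera ego-motion; object_motion is the motion of the segmented object.
    """
    R, t = object_motion if is_foreground else background_motion
    return rigid_flow(u, v, depth, *intrinsics, R, t)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intrinsics = (100.0, 100.0, 2.0, 1.5)        # fx, fy, cx, cy (made-up values)
static = (IDENTITY, (0.0, 0.0, 0.0))         # no apparent background motion
car = (IDENTITY, (0.5, 0.0, 0.0))            # object shifts 0.5 units along x

fg = pixel_flow(3.0, 1.0, 10.0, True, intrinsics, static, car)
bg = pixel_flow(3.0, 1.0, 10.0, False, intrinsics, static, car)
```

With a pure translation along x, the horizontal flow of the foreground pixel reduces to `fx * tx / depth`, while a background pixel under an identity motion does not move. This is why a per-pixel segmentation label matters: it decides which rigid motion explains each pixel.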
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Human Pose and Action Recognition
