FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Eddy Ilg; Nikolaus Mayer; Tonmoy Saikia; Margret Keuper; Alexey Dosovitskiy; Thomas Brox

arXiv:1612.01925 · cs.CV · December 7, 2016 · 50 cites

TL;DR

FlowNet 2.0 significantly improves deep learning-based optical flow estimation by enhancing training strategies, architecture, and small displacement handling, achieving state-of-the-art accuracy at real-time speeds.

Contribution

We introduce a stacked architecture with warping and a specialized small displacement sub-network, along with optimized training data scheduling, to greatly enhance optical flow estimation quality.

Findings

01 Reduces estimation error by more than 50% relative to the original FlowNet

02 Faster variants run at up to 140 fps

03 Performs on par with traditional state-of-the-art methods
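
To make the error-reduction finding concrete, here is a minimal sketch of how such a number could be computed. It assumes the error in question is the average endpoint error (the mean Euclidean distance between predicted and ground-truth flow vectors), which is the usual metric for optical flow. The function names and the toy flow fields are hypothetical and do not come from the paper.

```python
import math


def average_endpoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    pred, gt: H x W lists of (u, v) tuples, in pixels.
    """
    total = 0.0
    count = 0
    for row_pred, row_gt in zip(pred, gt):
        for (u, v), (u_gt, v_gt) in zip(row_pred, row_gt):
            total += math.hypot(u - u_gt, v - v_gt)
            count += 1
    return total / count


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Toy 1 x 2 flow fields (made-up values, for illustration only).
gt = [[(0.0, 0.0), (0.0, 0.0)]]
baseline_pred = [[(3.0, 4.0), (0.0, 0.0)]]   # errors 5 and 0
improved_pred = [[(0.0, 2.0), (0.0, 0.0)]]   # errors 2 and 0

baseline_epe = average_endpoint_error(baseline_pred, gt)
improved_epe = average_endpoint_error(improved_pred, gt)
reduction = relative_reduction(baseline_epe, improved_epe)
```

In this toy case the average endpoint error drops from 2.5 to 1.0 pixels, a 60% relative reduction.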

Abstract

The FlowNet demonstrated that optical flow estimation can be cast as a learning problem. However, the state of the art with regard to the quality of the flow has still been defined by traditional methods. Particularly on small displacements and real-world data, FlowNet cannot compete with variational methods. In this paper, we advance the concept of end-to-end learning of optical flow and make it work really well. The large improvements in quality and speed are caused by three major contributions: first, we focus on the training data and show that the schedule of presenting data during training is very important. Second, we develop a stacked architecture that includes warping of the second image with intermediate optical flow. Third, we elaborate on small displacements by introducing a sub-network specializing on small motions. FlowNet 2.0 is only marginally slower than the original…
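
The warping step mentioned in the abstract can be illustrated with a small sketch. It assumes backward warping with bilinear interpolation and border clamping on a grayscale image stored as nested lists. The function name `warp_backward` and the data layout are hypothetical, and this is not the authors' implementation, which operates inside the network on batched tensors.

```python
import math


def warp_backward(image, flow):
    """Warp the second frame toward the first using a flow estimate.

    image: H x W list of lists of floats (second frame, grayscale).
    flow:  H x W list of lists of (u, v) tuples giving, for each pixel of
           the first frame, its displacement into the second frame.
    Returns an H x W image where out[y][x] samples image at (x + u, y + v).
    Sample positions outside the image are clamped to the border
    (an assumption made for this sketch).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            u, v = flow[y][x]
            # Sub-pixel sample position in the second frame.
            sx = min(max(x + u, 0.0), w - 1.0)
            sy = min(max(y + v, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbours.
            top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
            bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


# The second frame is the first frame shifted one pixel to the right,
# so the true flow is (1, 0) everywhere.
frame1 = [[0.0, 1.0, 2.0, 3.0]]
frame2 = [[0.0, 0.0, 1.0, 2.0]]
flow = [[(1.0, 0.0)] * 4]
warped = warp_backward(frame2, flow)
```

With a correct flow, the warped second frame matches the first frame everywhere except at the border, where the true content has left the image. In a stacked design, a later network then only needs to estimate the remaining residual motion between the first frame and the warped second frame.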

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings