StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences
Shangkun Sun, Jiaming Liu, Thomas H. Li, Huaxia Li, Guoqing Liu, Wei Gao

TL;DR
StreamFlow introduces a fast, efficient multi-frame optical flow estimation method that effectively handles occlusions in video sequences by eliminating redundant computations and modeling spatio-temporal relations during encoding and decoding.
Contribution
The paper proposes a novel in-batch multi-frame optical flow framework with integrated spatio-temporal modeling modules that improve efficiency and accuracy over prior recursive methods.
Findings
Achieves similar speed to two-frame networks while utilizing multi-frame information.
Significantly improves optical flow accuracy in occluded regions.
Reduces computational complexity by 63.82% compared to previous multi-frame methods.
Abstract
Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation. The inherent ambiguity introduced by occlusions directly violates the brightness constancy constraint and considerably hinders pixel-to-pixel matching. To address this issue, multi-frame optical flow methods leverage adjacent frames to mitigate the local ambiguity. Nevertheless, prior multi-frame methods predominantly adopt recursive flow estimation, resulting in a considerable computational overlap. In contrast, we propose a streamlined in-batch framework that eliminates the need for extensive redundant recursive computations while concurrently developing effective spatio-temporal modeling approaches under in-batch estimation constraints. Specifically, we present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video input, attaining a similar level of time efficiency…
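To make the "computational overlap" point concrete, here is a minimal counting sketch. It is not the paper's implementation and does not reproduce its reported numbers. It assumes a hypothetical sliding-window baseline that re-encodes every frame of a `window`-frame clip for each predicted flow, and a hypothetical in-batch scheme that encodes each clip once, emits all of its consecutive flows together, and shares one boundary frame with the next clip. The function names and the 13-frame, 4-frame-window example are illustrative assumptions.

```python
from math import ceil


def sliding_window_encoder_calls(num_frames: int, window: int) -> int:
    """Frames encoded by the assumed sliding-window baseline.

    Assumption: one full window is processed per predicted flow and
    nothing is cached, so each window re-encodes all of its frames.
    Truncated windows at the video boundaries are ignored.
    """
    num_flows = num_frames - 1
    return num_flows * window


def in_batch_encoder_calls(num_frames: int, window: int) -> int:
    """Frames encoded by the assumed in-batch scheme.

    Assumption: each clip of up to `window` frames yields all of its
    consecutive flows in one pass; adjacent clips share one frame.
    """
    num_flows = num_frames - 1
    flows_per_clip = window - 1
    num_clips = ceil(num_flows / flows_per_clip)
    # Each clip encodes (flows in that clip + 1) frames.
    return num_flows + num_clips


if __name__ == "__main__":
    frames, window = 13, 4
    print("sliding window:", sliding_window_encoder_calls(frames, window))
    print("in-batch:      ", in_batch_encoder_calls(frames, window))
```

Under these assumptions the in-batch count grows with the number of frames plus the number of clips, while the sliding-window count grows with the number of frames times the window size. Real systems can cache features, so the actual gap depends on the implementation.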
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
