Rethinking RAFT for Efficient Optical Flow
Navid Eslami, Farnoosh Arefi, Amir M. Mansourian, Shohreh Kasaei

TL;DR
This paper introduces Ef-RAFT, an optical flow method built on RAFT that adds an attention mechanism and a new lookup operator to better handle large displacements and repetitive patterns, improving accuracy at a modest cost in speed and memory.
Contribution
It proposes two new components, Attention-based Feature Localization (AFL) and the Amorphous Lookup Operator (ALO), to improve RAFT's handling of repetitive patterns and large displacements in optical flow estimation.
Findings
Achieves a 10% improvement over RAFT on the Sintel dataset
Achieves a 5% improvement over RAFT on the KITTI dataset
Incurs a 33% reduction in speed and a 13% increase in memory usage relative to RAFT
Abstract
Despite significant progress in deep learning-based optical flow methods, accurately estimating large displacements and repetitive patterns remains a challenge. The limitations of local features and similarity search patterns used in these algorithms contribute to this issue. Additionally, some existing methods suffer from slow runtime and excessive graphics memory consumption. To address these problems, this paper proposes a novel approach based on the RAFT framework. The proposed Attention-based Feature Localization (AFL) approach incorporates the attention mechanism to handle global feature extraction and address repetitive patterns. It introduces an operator for matching pixels with corresponding counterparts in the second frame and assigning accurate flow values. Furthermore, an Amorphous Lookup Operator (ALO) is proposed to enhance convergence speed and improve RAFT's ability to…
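To make the "similarity search pattern" mentioned in the abstract concrete, the following is a minimal NumPy sketch of a RAFT-style all-pairs correlation volume with a fixed square lookup window. It is not the authors' AFL or ALO implementation; the function names, the nearest-neighbour sampling (RAFT itself uses bilinear sampling over a correlation pyramid), and the single-scale setup are all simplifying assumptions made for illustration.

```python
import numpy as np


def all_pairs_correlation(f1, f2):
    """Dot-product similarity between every pixel of f1 and every pixel of f2.

    f1, f2: feature maps of shape (H, W, D).
    Returns a correlation volume of shape (H, W, H, W).
    """
    depth = f1.shape[-1]
    corr = np.einsum("ijd,kld->ijkl", f1, f2)
    return corr / np.sqrt(depth)


def local_lookup(corr, flow, radius=1):
    """Sample a square window of correlations around each current match estimate.

    corr: (H, W, H, W) volume from all_pairs_correlation.
    flow: (H, W, 2) current flow estimate, stored as (dx, dy) per pixel.
    Returns (H, W, (2 * radius + 1) ** 2), ordered row by row over (dy, dx).

    Assumption for brevity: nearest-neighbour sampling with border clamping.
    """
    height, width = corr.shape[:2]
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    # Centre of the search window in the second frame.
    cy = np.rint(ys + flow[..., 1]).astype(int)
    cx = np.rint(xs + flow[..., 0]).astype(int)

    samples = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ty = np.clip(cy + dy, 0, height - 1)
            tx = np.clip(cx + dx, 0, width - 1)
            samples.append(corr[ys, xs, ty, tx])
    return np.stack(samples, axis=-1)
```

In this sketch, a true displacement larger than `radius` falls outside the window when the flow estimate starts at zero, so the correct match only becomes visible once the window is enlarged or the estimate moves closer. That is the kind of limitation on large displacements that the abstract attributes to fixed local search patterns.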
Taxonomy
Topics: Advanced Vision and Imaging · Retinal Imaging and Analysis · Image Processing Techniques and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
