TransFlow: Transformer as Flow Learner
Yawen Lu, Qifan Wang, Siqi Ma, Tong Geng, Yingjie Victor Chen, Huaijin Chen, and Dongfang Liu

TL;DR
TransFlow introduces a transformer-based architecture for optical flow estimation, outperforming CNN methods by capturing global dependencies, handling occlusions better, and simplifying training procedures, achieving state-of-the-art results on multiple benchmarks.
Contribution
The paper presents TransFlow, a novel pure transformer architecture for optical flow that improves accuracy, robustness, and training simplicity over traditional CNN-based methods.
Findings
Achieves state-of-the-art results on Sintel and KITTI-15 datasets.
Effectively captures global dependencies with self-attention mechanisms.
Improves handling of occlusion and motion blur in flow estimation.
Abstract
Optical flow is an indispensable building block for various important computer vision tasks, including motion estimation, object tracking, and disparity measurement. In this work, we propose TransFlow, a pure transformer architecture for optical flow estimation. Compared to dominant CNN-based methods, TransFlow demonstrates three advantages. First, it provides more accurate correlation and trustworthy matching in flow estimation by utilizing spatial self-attention and cross-attention mechanisms between adjacent frames to effectively capture global dependencies. Second, it recovers more compromised information (e.g., occlusion and motion blur) in flow estimation through long-range temporal association in dynamic scenes. Third, it enables a concise self-learning paradigm and effectively eliminates the complex and laborious multi-stage pre-training procedures. We achieve the…
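To make the idea of cross-attention between adjacent frames more concrete, here is a minimal NumPy sketch. It is not the TransFlow architecture: it has no learned projections, no multi-head structure, and no temporal association. It only illustrates how scaled dot-product attention from frame-1 tokens over all frame-2 tokens yields a global, soft correspondence, and how a flow field could be read off as the attention-weighted target position minus the source position. The helper names (`cross_attention`, `pixel_grid`, `soft_matching_flow`) and the soft-matching readout are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, key_tokens, value_tokens):
    """Scaled dot-product attention: every query attends over all keys."""
    d = query_tokens.shape[-1]
    scores = query_tokens @ key_tokens.T / np.sqrt(d)   # (N1, N2)
    weights = softmax(scores, axis=-1)                  # rows sum to 1
    return weights @ value_tokens, weights

def pixel_grid(h, w):
    """(h*w, 2) array of (x, y) pixel coordinates in row-major order."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float64)

def soft_matching_flow(feat1, feat2):
    """Hypothetical readout: flow = expected matched position - own position.

    feat1, feat2: (h, w, c) feature maps of two adjacent frames.
    """
    h, w, c = feat1.shape
    coords = pixel_grid(h, w)
    matched, weights = cross_attention(
        feat1.reshape(-1, c), feat2.reshape(-1, c), coords
    )
    return (matched - coords).reshape(h, w, 2), weights

# Toy input: each pixel carries a distinct, strongly scaled one-hot feature,
# and frame 2 is frame 1 shifted one pixel to the right (with wrap-around).
h, w = 4, 4
feat1 = 20.0 * np.eye(h * w).reshape(h, w, h * w)
feat2 = np.roll(feat1, shift=1, axis=1)
flow, weights = soft_matching_flow(feat1, feat2)
```

In this toy setup the recovered flow is approximately (1, 0) everywhere except the last column, where the wrap-around of `np.roll` produces a large negative horizontal displacement. Because every query attends over the whole second frame, the match is found regardless of how far the pixel moved, which is the global-dependency property the abstract refers to.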
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Robotics and Sensor-Based Localization
Methods: Self-Learning
