Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images

Yingping Liang; Ying Fu; Yutao Hu; Wenqi Shao; Jiaming Liu; Debing Zhang

arXiv:2506.07740 · cs.CV · June 10, 2025

TL;DR

Flow-Anything introduces a novel framework for generating large-scale real-world optical flow datasets from single-view images, improving robustness and performance in real-world applications.

Contribution

The paper presents a new data generation method that leverages monocular depth estimation and volume rendering to create realistic optical flow datasets from single-view images.

Findings

01

Outperforms state-of-the-art unsupervised methods, as well as supervised methods trained on synthetic datasets.

02

Enhances downstream video task performance.

03

Demonstrates the effectiveness of real-world data for optical flow training.

Abstract

Optical flow estimation is a crucial subfield of computer vision, serving as a foundation for video tasks. However, its real-world robustness is limited by the animated synthetic datasets used for training. This introduces a domain gap in real-world applications and limits the benefits of scaling up datasets. To address these challenges, we propose Flow-Anything, a large-scale data generation framework designed to learn optical flow estimation from any single-view image in the real world. We employ two effective steps to make data scaling-up promising. First, we convert a single-view image into a 3D representation using advanced monocular depth estimation networks. This allows us to render optical flow and novel-view images under a virtual camera. Second, we develop an Object-Independent Volume Rendering module and a Depth-Aware Inpainting module to model the dynamic…
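
The first step in the abstract (lifting an image to 3D with a depth map, then rendering flow under a virtual camera) can be illustrated with the standard rigid-flow computation. The sketch below is not the authors' implementation: it assumes a pinhole camera with known intrinsics `K`, a static scene, and a hypothetical function name `flow_from_depth`, and it leaves out the Object-Independent Volume Rendering and Depth-Aware Inpainting modules entirely. The pose convention (a point `X` in the source frame maps to `R @ X + t` in the virtual camera frame) is also an assumption made for the example.

```python
import numpy as np

def flow_from_depth(depth, K, R, t):
    """Rigid optical flow induced by a virtual camera move over a static scene.

    depth: (h, w) per-pixel depth, e.g. from a monocular depth network.
    K:     (3, 3) pinhole intrinsics (assumed known for this sketch).
    R, t:  rotation (3, 3) and translation (3,) taking source-frame points
           into the virtual camera frame: X_new = R @ X + t.
    Returns an (h, w, 2) array of (du, dv) pixel displacements.
    """
    h, w = depth.shape
    # Homogeneous pixel grid, shape (3, h*w), in row-major order.
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])
    # Back-project every pixel to a 3D point using its depth.
    pts = (np.linalg.inv(K) @ pix) * depth.ravel()
    # Express the points in the virtual camera frame and project them.
    proj = K @ (R @ pts + t.reshape(3, 1))
    uv_new = proj[:2] / proj[2:3]
    # Flow is the displacement between projected and original pixels.
    return (uv_new - pix[:2]).T.reshape(h, w, 2)

# Usage: a fronto-parallel plane at depth 2 with a sideways point shift of 0.1.
depth = np.full((4, 6), 2.0)
K = np.array([[100.0, 0.0, 3.0],
              [0.0, 100.0, 2.0],
              [0.0, 0.0, 1.0]])
flow = flow_from_depth(depth, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
```

For a plane at constant depth and a pure sideways translation, every pixel moves by the same horizontal amount, focal length times translation divided by depth. Real scenes also produce occlusions and holes in the novel view, which this sketch does not handle.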

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition

Methods: Inpainting