The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation
Saurabh Saxena, Charles Herrmann, Junhwa Hur, Abhishek Kar, Mohammad Norouzi, Deqing Sun, David J. Fleet

TL;DR
This paper demonstrates that diffusion models, traditionally used for image generation, are highly effective for optical flow and monocular depth estimation, outperforming existing methods and enabling uncertainty quantification.
Contribution
The authors introduce DDVM (Denoising Diffusion Vision Model), a diffusion-based model for monocular depth and optical flow estimation that achieves state-of-the-art results without task-specific architectures or loss functions.
Findings
Achieves a relative depth error of 0.074 on the indoor NYU benchmark
Attains an Fl-all outlier rate of 3.26% on the KITTI optical flow benchmark
The KITTI result is about 25% better than the best published method
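For readers unfamiliar with the two metrics behind these numbers, the sketch below shows how they are commonly defined: absolute relative depth error, and the KITTI-style outlier percentage (a pixel counts as an outlier when its endpoint error exceeds both 3 pixels and 5% of the ground-truth flow magnitude). The function names and the flat-list data layout are assumptions made for illustration, not the paper's evaluation code.

```python
import math

def abs_rel_error(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)

def fl_all(pred_flow, gt_flow, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of pixels whose endpoint error exceeds both thresholds."""
    outliers = 0
    for (pu, pv), (gu, gv) in zip(pred_flow, gt_flow):
        epe = math.hypot(pu - gu, pv - gv)   # endpoint error in pixels
        mag = math.hypot(gu, gv)             # ground-truth flow magnitude
        if epe > abs_thresh and epe > rel_thresh * mag:
            outliers += 1
    return 100.0 * outliers / len(gt_flow)

# Toy usage: two depth pixels and two flow vectors.
depth_err = abs_rel_error([1.1, 2.0], [1.0, 2.0])
outlier_pct = fl_all([(0.0, 0.0), (10.0, 0.0)], [(0.0, 0.0), (5.0, 0.0)])
```

Both metrics are lower-is-better, which is why a smaller outlier rate counts as an improvement.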
Abstract
Denoising diffusion probabilistic models have transformed image generation with their impressive fidelity and diversity. We show that they also excel in estimating optical flow and monocular depth, surprisingly, without task-specific architectures and loss functions that are predominant for these tasks. Compared to the point estimates of conventional regression-based methods, diffusion models also enable Monte Carlo inference, e.g., capturing uncertainty and ambiguity in flow and depth. With self-supervised pre-training, the combined use of synthetic and real data for supervised training, and technical innovations (infilling and step-unrolled denoising diffusion training) to handle noisy-incomplete training data, and a simple form of coarse-to-fine refinement, one can train state-of-the-art diffusion models for depth and optical flow estimation. Extensive experiments focus on…
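The Monte Carlo inference mentioned in the abstract can be illustrated with a minimal sketch: draw several samples for the same input and summarize them per pixel, so that a large spread flags ambiguous regions. The sampler here is a hypothetical toy stand-in (`make_toy_sampler`), not the DDVM sampler, and all names are assumptions for illustration.

```python
import random
import statistics

def monte_carlo_estimate(sample_fn, num_samples=8):
    """Draw several samples and return per-pixel mean and standard deviation."""
    samples = [sample_fn() for _ in range(num_samples)]
    per_pixel = list(zip(*samples))  # regroup sampled values by pixel
    mean = [statistics.fmean(values) for values in per_pixel]
    std = [statistics.pstdev(values) for values in per_pixel]
    return mean, std

def make_toy_sampler(seed=0):
    """Hypothetical stand-in for a diffusion sampler over a 2-pixel depth map."""
    rng = random.Random(seed)
    def sample():
        confident = 2.0 + rng.gauss(0.0, 0.01)  # pixel 0: nearly deterministic
        ambiguous = rng.choice([1.0, 5.0])      # pixel 1: two plausible depths
        return [confident, ambiguous]
    return sample

mean, std = monte_carlo_estimate(make_toy_sampler(), num_samples=64)
```

In this toy setting the second pixel has a much larger standard deviation than the first, which is the kind of signal a single regression output cannot provide. For a bimodal pixel the mean falls between the two modes, so inspecting the individual samples can be more informative than averaging them.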
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Cell Image Analysis Techniques
Methods: Diffusion · Focus
