ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video

Han Ling; Quansen Sun

arXiv:2407.09797 · cs.CV · October 17, 2024 · 1 citation

TL;DR

ScaleFlow++ introduces a robust, end-to-end method for estimating 3D motion from just two RGB images, leveraging cross-scale matching to improve accuracy and generalization in various scenes.

Contribution

It proposes a novel cross-scale matching approach and an integrated architecture for joint optical flow and motion-in-depth estimation from monocular images.
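
The idea behind cross-scale matching can be illustrated with a toy example. The sketch below is not the paper's network: it assumes a 1D signal, a hand-picked list of candidate scales, and cosine similarity as the matching score, and the helper names `rescale_about` and `best_scale` are hypothetical. It also assumes MID denotes the depth ratio between the second and first frame, so that under a pinhole camera an object that appears `k` times larger has MID `1 / k`.

```python
import numpy as np

def rescale_about(signal, center, scale):
    """Resample `signal` so that content around `center` shrinks by `scale`."""
    x = np.arange(signal.size, dtype=float)
    # Sample the signal at positions stretched away from the center.
    return np.interp(center + (x - center) * scale, x, signal)

def best_scale(f1, f2, center, candidates):
    """Return the candidate scale whose rescaled f2 best matches f1."""
    scores = []
    for s in candidates:
        g = rescale_about(f2, center, s)
        # Cosine similarity as a simple matching score (an assumption).
        denom = np.linalg.norm(f1) * np.linalg.norm(g) + 1e-12
        scores.append(float(f1 @ g / denom))
    return candidates[int(np.argmax(scores))], scores

# Toy "object": a Gaussian bump that appears 1.25x larger in the second frame.
x = np.arange(200, dtype=float)
center, sigma, true_scale = 100.0, 8.0, 1.25
f1 = np.exp(-(x - center) ** 2 / (2 * sigma ** 2))
f2 = np.exp(-(x - center) ** 2 / (2 * (sigma * true_scale) ** 2))

candidates = [0.5, 0.8, 1.0, 1.25, 2.0]
scale, scores = best_scale(f1, f2, center, candidates)

# Apparent size is inversely proportional to depth, so MID = Z2 / Z1 = 1 / scale.
mid = 1.0 / scale
print(scale, mid)
```

Matching at a single scale (candidate 1.0) still produces a fairly high score here, which is why comparing across several scales is needed to recover the size change and, from it, the depth change.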

Findings

01  Achieved state-of-the-art performance on the KITTI dataset

02  Surpassed RGB-D methods in motion-in-depth estimation

03  Exhibited excellent zero-shot generalization across diverse scenes

Abstract

Perceiving and understanding 3D motion is a core technology in fields such as autonomous driving, robotics, and motion prediction. This paper proposes ScaleFlow++, a 3D motion perception method that generalizes easily. Given only a pair of RGB images, ScaleFlow++ robustly estimates optical flow and motion-in-depth (MID). Most existing methods regress MID directly from two RGB frames or from optical flow, which yields inaccurate and unstable results. Our key insight is cross-scale matching, which extracts deep motion clues by matching objects in pairs of images at different scales. Unlike previous methods, ScaleFlow++ integrates optical flow and MID estimation into a unified architecture, estimating both end-to-end based on feature matching. Moreover, we also propose modules such as global initialization network, global iterative optimizer, and hybrid training…
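
To make the link between the two estimated quantities and 3D motion concrete, here is a minimal sketch of how optical flow and MID could be lifted to a 3D displacement. It rests on assumptions that are not stated in the abstract: a pinhole camera with known intrinsics `K`, MID defined as the depth ratio Z2 / Z1, and a known first-frame depth `depth1` (without it, the result is only defined up to that depth). The function name `lift_to_3d_motion` is hypothetical.

```python
import numpy as np

def lift_to_3d_motion(pixels, flow, mid, depth1, K):
    """Lift 2D flow plus motion-in-depth to 3D displacement vectors.

    pixels: (N, 2) pixel coordinates in frame 1
    flow:   (N, 2) optical flow in pixels
    mid:    (N,)   assumed depth ratio Z2 / Z1
    depth1: (N,)   assumed known depth in frame 1
    K:      (3, 3) pinhole intrinsics
    """
    K_inv = np.linalg.inv(K)
    ones = np.ones((pixels.shape[0], 1))
    # Back-project both pixel positions to rays with unit depth.
    rays1 = np.hstack([pixels, ones]) @ K_inv.T
    rays2 = np.hstack([pixels + flow, ones]) @ K_inv.T
    # Scale each ray by its depth; frame-2 depth follows from MID.
    p1 = rays1 * depth1[:, None]
    p2 = rays2 * (mid * depth1)[:, None]
    return p2 - p1

# One pixel at the principal point, no image motion, moving 10% closer.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
pixels = np.array([[320.0, 240.0]])
flow = np.array([[0.0, 0.0]])
mid = np.array([0.9])
depth1 = np.array([10.0])

motion = lift_to_3d_motion(pixels, flow, mid, depth1, K)
print(motion)
```

In this example the point sits at depth 10 and MID is 0.9, so the displacement is one unit toward the camera along the optical axis with no lateral motion.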

Code & Models

Repositories

HanLingsgjk/CSCV
PyTorch · Official

Taxonomy

Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Image and Video Stabilization