DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild
Weicai Ye, Xinyu Chen, Ruohao Zhan, Di Huang, Xiaoshui Huang, Haoyi Zhu, Hujun Bao, Wanli Ouyang, Tong He, Guofeng Zhang

TL;DR
This paper introduces DATAP, a dynamic-aware point tracking method that improves camera trajectory estimation and dense point cloud reconstruction in challenging dynamic videos by leveraging consistent depth and global optimization.
Contribution
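The paper presents a dynamic-aware point tracking approach that improves structure from motion in dynamic scenes by combining consistent video depth with global bundle adjustment. It targets two weaknesses of earlier pipelines: cumulative error from chaining optical flow between adjacent frames, and scale ambiguity when motion segmentation relies on single-view depth.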
Findings
Achieves state-of-the-art camera pose estimation in dynamic scenes
Effective in complex real-world videos like DAVIS
Improves robustness over traditional optical flow-based methods
Abstract
This paper proposes a concise, elegant, and robust pipeline to estimate smooth camera trajectories and obtain dense point clouds for casual videos in the wild. Traditional frameworks, such as ParticleSfM~\cite{zhao2022particlesfm}, address this problem by sequentially computing the optical flow between adjacent frames to obtain point trajectories. They then remove dynamic trajectories through motion segmentation and perform global bundle adjustment. However, the process of estimating optical flow between two adjacent frames and chaining the matches can introduce cumulative errors. Additionally, motion segmentation combined with single-view depth estimation often faces challenges related to scale ambiguity. To tackle these challenges, we propose a dynamic-aware tracking any point (DATAP) method that leverages consistent video depth and point tracking. Specifically, our DATAP addresses…
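The cumulative-error argument in the abstract can be illustrated with a toy sketch. The code below is not the ParticleSfM or DATAP implementation; it is a minimal NumPy example, assuming dense flow fields stored as (H, W, 2) arrays and a hypothetical constant 0.1 px bias in every pairwise flow estimate. The function name `chain_flows` and all parameters are illustrative choices, not names from the paper.

```python
import numpy as np

def chain_flows(points, flows):
    """Chain per-frame optical flow fields into point trajectories.

    points: (N, 2) array of (x, y) positions in frame 0.
    flows:  list of (H, W, 2) arrays; flows[t][y, x] holds the displacement
            (dx, dy) from frame t to frame t + 1.
    Returns an array of shape (T + 1, N, 2) with the position per frame.
    """
    traj = [np.asarray(points, dtype=float)]
    for flow in flows:
        h, w = flow.shape[:2]
        cur = traj[-1]
        # Nearest-neighbour lookup of the flow at the current position,
        # rounded to the pixel grid and clamped to the image bounds.
        xi = np.clip(np.rint(cur[:, 0]).astype(int), 0, w - 1)
        yi = np.clip(np.rint(cur[:, 1]).astype(int), 0, h - 1)
        traj.append(cur + flow[yi, xi])
    return np.stack(traj)

# Toy setup (assumed values): true motion is 1 px/frame to the right,
# and every estimated flow overshoots by 0.1 px in x.
H, W, T = 32, 64, 20
true_flow = np.zeros((H, W, 2))
true_flow[..., 0] = 1.0
biased_flow = true_flow.copy()
biased_flow[..., 0] += 0.1

start = np.array([[5.0, 10.0], [8.0, 20.0]])
gt = chain_flows(start, [true_flow] * T)
est = chain_flows(start, [biased_flow] * T)

# Per-frame, per-point distance between chained estimate and ground truth.
drift = np.linalg.norm(est - gt, axis=-1)
```

In this toy setting the drift after t frames is 0.1 * t pixels, so a per-pair error that looks negligible grows with sequence length. Real flow errors are not a constant bias, but the same accumulation mechanism is what motivates tracking points over longer windows rather than chaining adjacent-frame matches.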
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization
Methods: Batch Normalization · 1x1 Convolution · Convolution · Thinned U-shape Module
