SceneTracker: Long-term Scene Flow Estimation Network
Bo Wang, Jian Li, Yang Yu, Li Liu, Zhenping Sun, Dewen Hu

TL;DR
SceneTracker is a novel network for long-term scene flow estimation that captures detailed 3D motion over time, addressing occlusion and noise, and demonstrating strong generalization on a new real-world dataset.
Contribution
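It introduces SceneTracker, the first network for long-term scene flow estimation (LSFE), which iteratively refines 3D trajectories and uses transformers to model long-range connections within and between them, together with LSFDriving, a new dataset for real-world evaluation.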
Findings
Outperforms existing methods in handling 3D spatial occlusion and depth noise
Shows strong generalization on the LSFDriving dataset
Effectively models long-term 3D trajectories
Abstract
Scene flow estimation focuses well in the spatial domain but lacks coherence in the temporal domain. This study therefore proposes long-term scene flow estimation (LSFE), a comprehensive task that simultaneously captures fine-grained and long-term 3D motion in an online manner. We introduce SceneTracker, the first LSFE network that adopts an iterative approach to approximate the optimal 3D trajectory. The network dynamically and simultaneously indexes and constructs appearance correlation and depth residual features. Transformers are then employed to explore and utilize long-range connections within and between trajectories. In detailed experiments, SceneTracker shows superior capabilities in handling 3D spatial occlusion and depth noise interference, making it well suited to the needs of the LSFE task. We build a real-world evaluation dataset, LSFDriving,…
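The iterative idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learned transformer updater is replaced by a hand-written rule, and every name here (`depth_residual`, `make_toy_updater`, `refine_trajectory`, the `(u, v, d)` point layout, the `step` parameter) is a hypothetical choice made for illustration. The sketch assumes a depth residual can be read as "observed depth at the point's pixel minus the current depth estimate", which is one plausible interpretation of the term rather than the paper's definition.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed layout: (u, v, d) = pixel column, pixel row, depth. One point per frame.
Point = Tuple[float, float, float]
Trajectory = List[Point]
DepthMap = Sequence[Sequence[float]]


def depth_residual(depth_map: DepthMap, point: Point) -> float:
    """Observed depth at the point's (clamped) pixel minus its current depth estimate."""
    u, v, d = point
    row = min(max(round(v), 0), len(depth_map) - 1)
    col = min(max(round(u), 0), len(depth_map[0]) - 1)
    return depth_map[row][col] - d


def make_toy_updater(
    depth_maps: Sequence[DepthMap], step: float = 0.5
) -> Callable[[Trajectory], Trajectory]:
    """Hand-written stand-in for a learned updater.

    It leaves (u, v) untouched and moves each depth a fraction `step`
    of the way toward the observed depth in that frame.
    """
    def update(traj: Trajectory) -> Trajectory:
        return [
            (0.0, 0.0, step * depth_residual(depth_map, point))
            for depth_map, point in zip(depth_maps, traj)
        ]
    return update


def refine_trajectory(
    init: Trajectory,
    update_fn: Callable[[Trajectory], Trajectory],
    num_iters: int = 4,
) -> Trajectory:
    """Repeatedly add the predicted per-point deltas to the current trajectory."""
    traj = list(init)
    for _ in range(num_iters):
        deltas = update_fn(traj)
        traj = [
            (u + du, v + dv, d + dd)
            for (u, v, d), (du, dv, dd) in zip(traj, deltas)
        ]
    return traj


if __name__ == "__main__":
    # Three frames of a 2x2 depth map with constant depth 10.0 (made-up data).
    maps = [[[10.0, 10.0], [10.0, 10.0]] for _ in range(3)]
    init = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (1.0, 1.0, 2.0)]
    print(refine_trajectory(init, make_toy_updater(maps), num_iters=4))
```

In this made-up setup the observed depth is 10.0 everywhere and each depth estimate starts at 2.0, so with `step=0.5` the residual halves on every pass (8, 4, 2, 1) and four iterations bring each estimate to 9.5. A real network would predict the deltas for all three coordinates from learned features instead of this fixed rule.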
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Image Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Multi-Head Attention · Softmax · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings
