GMSF: Global Matching Scene Flow
Yushan Zhang, Johan Edstedt, Bastian Wandt, Per-Erik Forssén, Maria Magnusson, Michael Felsberg

TL;DR
This paper introduces GMSF, a novel single-scale global matching approach for scene flow estimation from point clouds, utilizing a hybrid transformer architecture to achieve state-of-the-art accuracy efficiently.
Contribution
It proposes a simple, effective one-shot global matching method with a hybrid transformer for feature extraction, outperforming complex multi-stage methods in scene flow estimation.
Findings
Sets new state-of-the-art on multiple benchmarks.
Significantly reduces outlier percentage on FlyingThings3D.
Achieves superior performance on KITTI and Waymo datasets.
Abstract
We tackle the task of scene flow estimation from point clouds. Given a source and a target point cloud, the objective is to estimate a translation from each point in the source point cloud to the target, resulting in a 3D motion vector field. Previous dominant scene flow estimation methods require complicated coarse-to-fine or recurrent architectures as a multi-stage refinement. In contrast, we propose a significantly simpler single-scale one-shot global matching to address the problem. Our key finding is that reliable feature similarity between point pairs is essential and sufficient to estimate accurate scene flow. We thus propose to decompose the feature extraction step via a hybrid local-global-cross transformer architecture which is crucial to accurate and robust feature representations. Extensive experiments show that the proposed Global Matching Scene Flow (GMSF) sets a new…
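The one-shot global matching idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-point features have already been extracted (the hybrid transformer is not modelled here), that similarity is a scaled dot product, and that each source point is matched to a softmax-weighted average of all target coordinates. The function name `global_matching_flow` and its arguments are hypothetical.

```python
import math


def global_matching_flow(src_pts, tgt_pts, src_feats, tgt_feats):
    """Estimate one 3D flow vector per source point by soft global matching.

    Assumptions (not taken from the paper text above): similarity is a
    scaled dot product, and the correspondence of a source point is the
    softmax-weighted average of all target point coordinates.
    """
    dim = len(src_feats[0])
    flows = []
    for p, f in zip(src_pts, src_feats):
        # Similarity of this source feature to every target feature.
        scores = [
            sum(a * b for a, b in zip(f, g)) / math.sqrt(dim)
            for g in tgt_feats
        ]
        # Numerically stable softmax over all target points.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Soft correspondence: weighted average of target coordinates.
        matched = [
            sum(w * q[k] for w, q in zip(weights, tgt_pts)) for k in range(3)
        ]
        # Flow is the displacement from the source point to its match.
        flows.append([matched[k] - p[k] for k in range(3)])
    return flows
```

With well-separated features, each source point is matched almost entirely to its single most similar target point, so the estimated flow approaches the true displacement. This is the sense in which reliable feature similarity alone can be enough for a single-pass estimate.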
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Human Pose and Action Recognition
