GMA3D: Local-Global Attention Learning to Estimate Occluded Motions of Scene Flow

Zhiyang Lu; Ming Cheng

arXiv:2210.03296 · cs.CV · July 25, 2023

TL;DR

GMA3D introduces a transformer-based module that uses local and global semantic similarity to estimate scene flow for occluded points in 3D point clouds, improving accuracy, especially in real-world scenarios.

Contribution

This paper is the first to apply a transformer architecture to occlusion in scene flow estimation on point clouds, exploiting semantic self-similarity and motion consistency.

Findings

01. Achieved state-of-the-art results on the KITTI dataset for occluded scene flow.

02. Demonstrated the effectiveness of GMA3D on non-occluded datasets such as FlyingThings3D.

03. Improved scene flow estimation accuracy in real-world occlusion scenarios.

Abstract

Scene flow describes the motion of each point in a 3D point cloud. It is a vital method applied to many downstream tasks, such as motion segmentation and object tracking. However, some points are always occluded between two consecutive point clouds, whether because of sparse data sampling or real-world occlusion. In this paper, we address occlusion in scene flow by exploiting the semantic self-similarity and motion consistency of moving objects. We propose GMA3D, a module based on the transformer framework, which uses local and global semantic similarity to infer the motion of occluded points from the motion of local and global non-occluded points, respectively, and then combines the two with an offset aggregator. Our module is the first to apply a transformer-based architecture to address the scene flow occlusion problem on point…
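The local-global idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attend`, `k_nearest`, `infer_occluded_motion`) are hypothetical, similarity is a plain scaled dot product with no learned projections, occlusion flags are assumed to be given, and a simple average stands in for the learned offset aggregator. It only shows how an occluded point can borrow motion from semantically similar non-occluded points, once over a local neighbourhood and once over the whole cloud.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, motions, i, candidates):
    """Similarity-weighted average of the candidates' motions for point i."""
    d = len(features[i])
    scores = [sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(d)
              for j in candidates]
    weights = softmax(scores)
    dim = len(motions[0])
    return [sum(w * motions[j][c] for w, j in zip(weights, candidates))
            for c in range(dim)]

def k_nearest(points, i, candidates, k):
    """Indices of the k candidates closest to point i in 3D space."""
    return sorted(candidates, key=lambda j: math.dist(points[i], points[j]))[:k]

def infer_occluded_motion(points, features, motions, occluded, k=2):
    """Fill in motion for occluded points from non-occluded ones.

    Assumption: the local and global estimates are simply averaged here;
    the paper's offset aggregator is a learned module and is not reproduced.
    """
    visible = [j for j in range(len(points)) if not occluded[j]]
    result = [list(m) for m in motions]
    for i in range(len(points)):
        if not occluded[i]:
            continue
        global_est = attend(features, motions, i, visible)
        local_est = attend(features, motions, i, k_nearest(points, i, visible, k))
        result[i] = [(g + l) / 2 for g, l in zip(global_est, local_est)]
    return result

# Toy example: point 2 is occluded and shares features with points 0 and 1.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
motions = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
occluded = [False, False, True, False]
print(infer_occluded_motion(points, features, motions, occluded)[2])
```

In this toy case the occluded point ends up moving mostly along the x axis, like its two similar neighbours, with a smaller contribution from the distant, dissimilar point that only the global branch can see.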

Code & Models

Repositories

o-vigia/gma3d (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Human Motion and Animation