Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking

Yubo Cui; Jiayao Shan; Zuoxu Gu; Zhiheng Li; Zheng Fang

arXiv:2210.00519·cs.CV·October 4, 2022

TL;DR

This paper introduces a transformer-based framework that converts sparse 3D point clouds into a dense representation and uses attention to improve 3D single object tracking, especially in extremely sparse scenarios.

Contribution

It proposes a sparse-to-dense transformation combined with attention-based encoding for 3D tracking in sparse point clouds, targeting cases where previous methods usually fail, such as distant or partially occluded objects.
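The sparse-to-dense idea can be illustrated with a minimal sketch. It assumes a fixed grid and a hand-crafted per-pillar feature (point count and maximum height) rather than the learned pillar features the paper describes; the function name `points_to_bev`, the ranges, and the pillar size are hypothetical choices made for this example only.

```python
from collections import defaultdict


def points_to_bev(points, x_range=(0.0, 4.0), y_range=(0.0, 4.0), pillar_size=1.0):
    """Group 3D points into vertical pillars and flatten them to a dense 2D grid.

    Each bird's-eye-view cell holds (point_count, max_z); cells without points
    stay (0, 0.0). This is a hand-crafted stand-in for a learned pillar encoder.
    """
    nx = round((x_range[1] - x_range[0]) / pillar_size)
    ny = round((y_range[1] - y_range[0]) / pillar_size)

    # Collect the heights of all points that fall into the same pillar.
    pillars = defaultdict(list)
    for x, y, z in points:
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue  # drop points outside the grid
        ix = min(int((x - x_range[0]) / pillar_size), nx - 1)
        iy = min(int((y - y_range[0]) / pillar_size), ny - 1)
        pillars[(ix, iy)].append(z)

    # Dense grid: every cell exists, even where the point cloud is empty.
    bev = [[(0, 0.0) for _ in range(ny)] for _ in range(nx)]
    for (ix, iy), heights in pillars.items():
        bev[ix][iy] = (len(heights), max(heights))
    return bev


# Four toy points; the last one lies outside the grid and is discarded.
points = [(0.2, 0.3, 1.0), (0.7, 0.1, 1.5), (3.5, 2.2, 0.4), (9.0, 9.0, 0.0)]
bev = points_to_bev(points)
```

The result is a regular 4 x 4 grid regardless of how few points were observed, which is what lets dense 2D operations run on sparse input.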

Findings

01. Achieves promising results on the KITTI and NuScenes datasets.

02. Improves tracking performance in extremely sparse scenarios.

03. Utilizes multi-scale attention to compensate for information loss.

Abstract

3D single object tracking is a key task in 3D computer vision. However, the sparsity of point clouds makes it difficult to compute similarity and locate the object, which poses a major challenge for 3D trackers. Previous works improved tracking performance in common scenarios, but they usually fail in extremely sparse scenarios, such as when the object is far away or partially occluded. To address these problems, in this letter we propose a sparse-to-dense, transformer-based framework for 3D single object tracking. First, we transform the sparse 3D points into 3D pillars and then compress them into 2D bird's-eye-view (BEV) features to obtain a dense representation. Then, we propose an attention-based encoder that computes global similarity between the template and search branches, which alleviates the influence of sparsity. Meanwhile,…
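As a rough illustration of global similarity between the two branches, the sketch below lets every search token attend to every template token with scaled dot-product attention. It is not the paper's encoder: the choice of search tokens as queries and template tokens as keys and values, the absence of learned projections, and the names `softmax` and `cross_attention` are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(search_tokens, template_tokens):
    """Fuse template information into each search token.

    Queries are the search tokens; keys and values are the template tokens.
    Every search token is compared with every template token, so the
    similarity is global rather than limited to a local neighborhood.
    """
    dim = len(template_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in search_tokens:
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in template_tokens
        ]
        weights = softmax(scores)
        fused.append([
            sum(w * value[d] for w, value in zip(weights, template_tokens))
            for d in range(dim)
        ])
    return fused


# Toy 2-D features: two template tokens, three search tokens.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
fused = cross_attention(search, template)
```

A search token that resembles one template token is pulled toward it, while an empty (all-zero) token receives a uniform mix of the template, which hints at how global attention can still supply context in sparsely populated cells.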

Code & Models

Repositories

3bobo/smat (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Face recognition and analysis · Human Pose and Action Recognition