Real-time 3D Single Object Tracking with Transformer

Jiayao Shan; Sifan Zhou; Yubo Cui; Zheng Fang

arXiv:2209.00860·cs.CV·September 5, 2022

TL;DR

This paper introduces PTT-Net, a real-time 3D single object tracker that uses a Transformer-based module to improve feature extraction and tracking accuracy on sparse and occluded LiDAR point clouds.

Contribution

The paper proposes the Point-Track-Transformer (PTT) module and integrates it into a new 3D single object tracker (PTT-Net), which achieves state-of-the-art performance at real-time speed.

Findings

01 Surpasses the baseline by ~10% in the Car category on KITTI and nuScenes.

02 Improves tracking in sparse and occluded scenarios.

03 Runs at 40 FPS on an NVIDIA 1080Ti GPU.

Abstract

LiDAR-based 3D single object tracking is a challenging problem in robotics and autonomous driving. Existing approaches often struggle because distant objects tend to have very sparse or partially occluded point clouds, which makes the features extracted by the model ambiguous. Ambiguous features make it hard to locate the target object and ultimately lead to poor tracking results. To solve this problem, we utilize the powerful Transformer architecture and propose a Point-Track-Transformer (PTT) module for the point cloud-based 3D single object tracking task. Specifically, the PTT module generates fine-tuned attention features by computing attention weights, which guide the tracker to focus on the important features of the target and improve tracking in complex scenarios. To evaluate our PTT module, we embed PTT into the dominant method and construct…
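The idea of refining features through attention weights can be illustrated with a minimal sketch. This is not the authors' PTT implementation: it assumes plain scaled dot-product self-attention with identity query/key/value projections and a residual connection, and the names `self_attention`, `refine`, and the toy `points` features are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over per-point feature vectors.

    Assumption: queries, keys, and values are the raw features themselves
    (identity projections), which a learned module would replace with
    trained linear layers.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended, all_weights = [], []
    for query in features:
        # Similarity of this point to every point, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        # Weighted sum of all point features.
        output = [sum(w * feat[d] for w, feat in zip(weights, features))
                  for d in range(dim)]
        attended.append(output)
        all_weights.append(weights)
    return attended, all_weights


def refine(features):
    """Add the attended features back onto the inputs (residual connection)."""
    attended, weights = self_attention(features)
    refined = [[f + a for f, a in zip(feat, att)]
               for feat, att in zip(features, attended)]
    return refined, weights


# Toy per-point features: the first two points are similar, the third is not.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
refined, weights = refine(points)
```

In this toy input the first point assigns more weight to the second point than to the third, so similar points reinforce each other's features while dissimilar ones contribute less.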

Code & Models

Repositories

shanjiayao/ptt (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Absolute Position Encodings · Adam · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing