TL;DR
This paper introduces PTT-Net, a real-time 3D single object tracker that uses a Transformer architecture to improve feature extraction and tracking accuracy on sparse and occluded LiDAR point clouds.
Contribution
The paper proposes the Point-Track-Transformer (PTT) module and integrates it into a novel 3D single object tracker, PTT-Net, achieving state-of-the-art performance at real-time speed.
Findings
Surpasses the baseline by about 10% in the Car category on KITTI and nuScenes.
Improves tracking in sparse and occluded scenarios.
Runs at 40 FPS on an NVIDIA 1080Ti GPU.
Abstract
LiDAR-based 3D single object tracking is a challenging problem in robotics and autonomous driving. Existing approaches usually suffer because objects at long distance often have very sparse or partially occluded point clouds, which makes the features extracted by the model ambiguous. Ambiguous features make it hard to locate the target object and ultimately lead to poor tracking results. To solve this problem, we utilize the powerful Transformer architecture and propose a Point-Track-Transformer (PTT) module for the point cloud-based 3D single object tracking task. Specifically, the PTT module generates fine-tuned attention features by computing attention weights, which guide the tracker to focus on the important features of the target and improve tracking ability in complex scenarios. To evaluate our PTT module, we embed PTT into the dominant method and construct…
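The abstract describes the PTT module only at a high level: attention weights are computed over point features and used to produce refined features. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes plain single-head scaled dot-product self-attention with a linear projection of the xyz coordinates as position encoding and a residual connection; the function names, the `params` dictionary, and all weight shapes are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def init_params(channels, rng):
    """Create small random weights (hypothetical shapes, illustration only)."""
    def mat(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))

    return {
        "w_pos": mat(3, channels),         # xyz -> feature space (assumed encoding)
        "w_q": mat(channels, channels),    # query projection
        "w_k": mat(channels, channels),    # key projection
        "w_v": mat(channels, channels),    # value projection
        "w_out": mat(channels, channels),  # output projection
    }


def point_self_attention(features, coords, params):
    """Single-head self-attention over N points.

    features: (N, C) per-point features
    coords:   (N, 3) point coordinates
    Returns the refined features (N, C) and the attention weights (N, N).
    """
    # Add a coordinate-based position encoding to the input features.
    x = features + coords @ params["w_pos"]

    q = x @ params["w_q"]
    k = x @ params["w_k"]
    v = x @ params["w_v"]

    # Each row holds one point's attention distribution over all points.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)

    # Weighted sum of values, then a residual connection to the input.
    attended = weights @ v
    refined = features + attended @ params["w_out"]
    return refined, weights


# Toy usage: 64 points with 32-channel features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(64, 32))
xyz = rng.normal(size=(64, 3))
params = init_params(32, rng)
refined, attn = point_self_attention(feats, xyz, params)
```

In this sketch each row of `attn` sums to 1, so a point's refined feature is its original feature plus a weighted mixture of information from all other points. This is the sense in which attention weights can emphasize informative points when the cloud is sparse or partially occluded.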
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Absolute Position Encodings · Adam · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing
