3D Object Tracking with Transformer

Yubo Cui; Zheng Fang; Jiayao Shan; Zuoxu Gu; Sifan Zhou

arXiv:2110.14921·cs.CV·October 29, 2021·32 cites

Open Access · 1 Repo

TL;DR

This paper introduces a transformer-based feature fusion network for 3D object tracking in point clouds, leveraging self- and cross-attention mechanisms to improve similarity computation and achieve state-of-the-art results on KITTI.

Contribution

It presents a novel transformer architecture for feature fusion in 3D object tracking, enhancing similarity computation and tracking accuracy.

Findings

01. Achieves state-of-the-art performance on the KITTI dataset.

02. Uses self- and cross-attention effectively for point cloud feature fusion.

03. Provides an end-to-end framework that simplifies the 3D object tracking pipeline.

Abstract

Feature fusion and similarity computation are two core problems in 3D object tracking, especially for object tracking using sparse and disordered point clouds. Feature fusion could make similarity computing more efficient by including target object information. However, most existing LiDAR-based approaches directly use the extracted point cloud feature to compute similarity while ignoring the attention changes of object regions during tracking. In this paper, we propose a feature fusion network based on transformer architecture. Benefiting from the self-attention mechanism, the transformer encoder captures the inter- and intra- relations among different regions of the point cloud. By using cross-attention, the transformer decoder fuses features and includes more target cues into the current point cloud feature to compute the region attentions, which makes the similarity computing more…
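The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`attention`, `fuse`), the feature sizes, and the choice of search-region features as queries and template features as keys and values are assumptions made for illustration. Learned projections, multi-head attention, layer normalization, and feed-forward layers are all omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Single-head scaled dot-product attention.

    query: (n_q, d); key and value: (n_k, d).
    Returns the attended features (n_q, d) and the weights (n_q, n_k).
    """
    d = query.shape[-1]
    scores = query @ key.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ value, weights


def fuse(template_feat, search_feat):
    """Hypothetical fusion step: self-attention, then cross-attention."""
    # Encoder-like step: each point cloud attends to its own regions.
    t_self, _ = attention(template_feat, template_feat, template_feat)
    s_self, _ = attention(search_feat, search_feat, search_feat)
    template_enc = template_feat + t_self  # residual connection
    search_enc = search_feat + s_self

    # Decoder-like step (assumed direction): the current search features
    # query the template features, pulling target cues into the search area.
    cross, weights = attention(search_enc, template_enc, template_enc)
    return search_enc + cross, weights


# Toy per-point features: 64 template points, 128 search points, 32 channels.
rng = np.random.default_rng(0)
template = rng.standard_normal((64, 32))
search = rng.standard_normal((128, 32))

fused, cross_weights = fuse(template, search)
print(fused.shape, cross_weights.shape)  # (128, 32) (128, 64)
```

Each row of `cross_weights` sums to 1 and indicates how strongly one search-region point attends to each template point, which is one simple way to read the "region attentions" mentioned above.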

Code & Models

Repositories

3bobo/lttr (PyTorch, official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Infrared Thermography in Medicine