OST: Efficient One-stream Network for 3D Single Object Tracking in Point Clouds
Xiantong Zhao, Yinan Han, Shengjing Tian, Jian Liu, Xiuping Liu

TL;DR
This paper introduces OST, a novel one-stream network for 3D single object tracking in point clouds that reduces computation by avoiding correlation operations and effectively fuses spatial and semantic features.
Contribution
The paper proposes a new one-stream architecture with a Template-aware Transformer Module and Multi-scale Feature Aggregation, improving efficiency and generalization in 3D object tracking.
Findings
Achieves high accuracy on KITTI and nuScenes datasets.
Reduces computational effort compared to Siamese networks.
Effective for both class-specific and class-agnostic tracking.
Abstract
Although recent Siamese network-based trackers have achieved impressive perceptual accuracy for single object tracking in LiDAR point clouds, they usually utilize heavy correlation operations to capture category-level characteristics only, and overlook the inherent merit of arbitrariness in contrast to multiple object tracking. In this work, we propose a radically novel one-stream network with the strength of instance-level encoding, which avoids the correlation operations occurring in previous Siamese networks, thus considerably reducing the computational effort. In particular, the proposed method mainly consists of a Template-aware Transformer Module (TTM) and a Multi-scale Feature Aggregation (MFA) module capable of fusing spatial and semantic information. The TTM stitches the specified template and the search region together and leverages an attention mechanism to establish the…
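The stitching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' TTM implementation: it assumes the template and search region have already been encoded as per-point feature tokens, uses a single attention head with a residual connection, and all names (`one_stream_attention`, `w_q`, `w_k`, `w_v`) and sizes are hypothetical. It only shows how concatenating both token sets lets one self-attention pass relate template and search points without a separate correlation step.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def one_stream_attention(template, search, w_q, w_k, w_v):
    """Single-head self-attention over stitched template + search tokens.

    template: (Nt, C) hypothetical template point features
    search:   (Ns, C) hypothetical search-region point features
    w_q, w_k, w_v: (C, C) projection matrices (assumed, randomly initialised below)
    """
    # Stitch both token sets into one sequence: (Nt + Ns, C).
    tokens = np.concatenate([template, search], axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # Every token attends to every other token, so search points
    # see template points (and vice versa) in the same pass.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    attn = softmax(scores, axis=-1)
    fused = tokens + attn @ v  # residual connection
    n_t = template.shape[0]
    return fused[:n_t], fused[n_t:], attn

rng = np.random.default_rng(0)
C = 16
template = rng.normal(size=(8, C))   # 8 template tokens (illustrative size)
search = rng.normal(size=(32, C))    # 32 search tokens (illustrative size)
w_q, w_k, w_v = (rng.normal(size=(C, C)) * 0.1 for _ in range(3))

t_out, s_out, attn = one_stream_attention(template, search, w_q, w_k, w_v)
print(t_out.shape, s_out.shape, attn.shape)
```

The block `attn[8:, :8]` holds the search-to-template attention weights, which is the part of the matrix that plays the role a correlation operation would play in a two-branch Siamese design.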
Taxonomy
Topics: Video Surveillance and Tracking Methods · Optical Imaging and Spectroscopy Techniques · Infrared Thermography in Medicine
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Linear Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization · Residual Connection · Dropout
