Siamese-DETR for Generic Multi-Object Tracking

Qiankun Liu; Yichen Li; Yuqi Jiang; Ying Fu

arXiv:2310.17875 · cs.CV · January 8, 2025 · 1 citation

TL;DR

Siamese-DETR is a simple, training-efficient approach to generic multi-object tracking that builds on DETR's object queries, avoids complex data association, and surpasses existing methods on benchmarks such as GMOT-40.

Contribution

The paper proposes Siamese-DETR, a GMOT method built on DETR object queries and trained only on detection datasets, which simplifies tracking and improves performance.

Findings

01 Outperforms existing GMOT methods on the GMOT-40 dataset

02 Uses only detection datasets such as COCO for training

03 Simplifies online tracking with a query-based approach

Abstract

The ability to detect and track dynamic objects in different scenes is fundamental to real-world applications such as autonomous driving and robot navigation. However, traditional Multi-Object Tracking (MOT) is limited to objects belonging to pre-defined, closed-set categories. Recently, Open-Vocabulary MOT (OVMOT) and Generic MOT (GMOT) have been proposed to track objects of interest beyond pre-defined categories, given a text prompt and a template image, respectively. However, training OVMOT models requires an expensive, well pre-trained (vision-)language model and fine-grained category annotations. In this paper, we focus on GMOT and propose a simple but effective method, Siamese-DETR. Only commonly used detection datasets (e.g., COCO) are required for training. Unlike existing GMOT methods, which train a Single Object Tracking (SOT) based detector to detect…

Code & Models

Repositories

yumu-173/siamese-detr (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Video Surveillance and Tracking Methods

Methods: Multi-Head Attention · Attention Is All You Need · Label Smoothing · Linear Layer · Residual Connection · Byte Pair Encoding · Softmax · Dropout · Adam · Position-Wise Feed-Forward Layer