ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model

Lifan Jiang; Zhihui Wang; Siqi Yin; Guangxiao Ma; Peng Zhang; Boxi Wu

arXiv:2408.15548·cs.CV·August 29, 2024

TL;DR

ConsistencyTrack introduces a diffusion-based joint detection and tracking framework that improves noise resistance and reduces ID switches in multi-object tracking. It outperforms existing methods on MOT17 and DanceTrack and runs inference faster than DiffusionTrack.

Contribution

The paper presents a novel diffusion process-based joint detection and tracking framework with an innovative target association strategy for robust multi-object tracking.

Findings

01 Outperforms existing methods on MOT17 and DanceTrack datasets.

02 Achieves better inference speed than DiffusionTrack.

03 Significantly improves noise resistance and reduces ID switches.

Abstract

Multi-object tracking (MOT) is a critical technology in computer vision, designed to detect multiple targets in video sequences and assign each target a unique ID in every frame. Existing MOT methods excel at accurately tracking multiple objects in real time across various scenarios. However, these methods still face challenges such as poor noise resistance and frequent ID switches. In this research, we propose ConsistencyTrack, a novel joint detection and tracking (JDT) framework that formulates detection and association as a denoising diffusion process on perturbed bounding boxes. This progressive denoising strategy significantly improves the model's noise resistance. During the training phase, paired object boxes within two adjacent frames are diffused from ground-truth boxes to a random distribution, and the model then learns to detect and track by reversing this process. In inference, the…
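The training-time perturbation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the additive form x_t = x_0 + sigma * eps, the (cx, cy, w, h) normalized box format, and the function name `perturb_paired_boxes` are all assumptions made here for illustration. The sketch only shows how ground-truth boxes paired across two adjacent frames could be pushed toward a random distribution as the noise scale grows.

```python
import random


def perturb_paired_boxes(boxes_prev, boxes_curr, sigma, rng=None):
    """Add Gaussian noise of scale `sigma` to boxes paired across two frames.

    Assumptions (not taken from the paper): each box is a (cx, cy, w, h)
    tuple in normalized coordinates, and noise is purely additive,
    x_t = x_0 + sigma * eps with eps ~ N(0, 1).
    """
    rng = rng or random.Random()
    if len(boxes_prev) != len(boxes_curr):
        raise ValueError("boxes must be paired one-to-one across the two frames")

    def noisy(box):
        # Perturb every coordinate of one box independently.
        return tuple(v + sigma * rng.gauss(0.0, 1.0) for v in box)

    # Keep the pairing intact: the same object in frame t-1 and frame t
    # stays together so a model could learn association jointly.
    return [(noisy(a), noisy(b)) for a, b in zip(boxes_prev, boxes_curr)]


# Hypothetical ground-truth boxes for one object in two adjacent frames.
gt_prev = [(0.50, 0.50, 0.20, 0.40)]
gt_curr = [(0.52, 0.50, 0.20, 0.40)]

for sigma in (0.0, 0.1, 1.0):
    print(sigma, perturb_paired_boxes(gt_prev, gt_curr, sigma, random.Random(0)))
```

With `sigma = 0.0` the boxes come back unchanged; larger values move them progressively further from the ground truth, which is the signal a denoising model would be trained to reverse.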

Code & Models

Repositories

tankowa/consistencytrack (PyTorch, official)

Taxonomy

Topics: Data Management and Algorithms

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings