Tracking by Associating Clips

Sanghyun Woo, Kwanyong Park, Seoung Wug Oh, In So Kweon, Joon-Young Lee

arXiv:2212.10149 · cs.CV · December 21, 2022

TL;DR

This paper proposes a clip-wise matching approach for multi-object tracking, improving robustness to interruptions and enhancing long-range association by leveraging short video clips instead of frame-by-frame matching.

Contribution

It introduces a novel clip-wise matching framework that mitigates error propagation and utilizes multi-frame information for better long-term tracking.

Findings

1. Improved tracking accuracy on the TAO and MOT17 benchmarks.

2. Enhanced robustness to occlusions and abrupt scene changes.

3. Better long-range association compared to traditional frame-wise methods.

Abstract

The tracking-by-detection paradigm has become the dominant method for multi-object tracking. It works by detecting objects in each frame and then performing data association across frames. However, its sequential frame-wise matching fundamentally suffers from intermediate interruptions in a video, such as object occlusions, fast camera movements, and abrupt light changes. Moreover, it typically overlooks temporal information beyond the two frames being matched. In this paper, we investigate an alternative by treating object association as clip-wise matching. Our new perspective views a single long video sequence as multiple short clips, and then the tracking is performed both within and between the clips. The benefits of this new approach are twofold. First, our method is robust to tracking error accumulation or propagation, as the video chunking allows bypassing the…
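The clip-wise view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt does not specify how clips are formed or matched, so the fixed clip length, the mean-pooled appearance embedding per tracklet, the cosine similarity, the greedy one-to-one matching, and the names `split_into_clips`, `match_clips`, and `min_sim` are all assumptions made for illustration. Tracking within each clip is assumed to have already produced per-clip tracklets.

```python
from math import sqrt


def split_into_clips(num_frames, clip_len):
    """Split frame indices into consecutive (start, end) ranges, end exclusive.

    Fixed-length, non-overlapping chunking is an assumption of this sketch.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [(s, min(s + clip_len, num_frames))
            for s in range(0, num_frames, clip_len)]


def mean_embedding(frame_embeddings):
    """Average per-frame embeddings so a tracklet is summarized by one vector."""
    n = len(frame_embeddings)
    return [sum(col) / n for col in zip(*frame_embeddings)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def match_clips(prev_tracks, next_tracks, min_sim=0.5):
    """Greedily associate tracklets of two clips using multi-frame embeddings.

    prev_tracks / next_tracks: dict of track id -> list of per-frame embeddings.
    Returns a dict mapping each matched next-clip id to a previous-clip id.
    min_sim is a hypothetical threshold; unmatched tracklets start new tracks.
    """
    prev_mean = {k: mean_embedding(v) for k, v in prev_tracks.items()}
    next_mean = {k: mean_embedding(v) for k, v in next_tracks.items()}
    pairs = sorted(
        ((cosine(p, n), pid, nid)
         for pid, p in prev_mean.items()
         for nid, n in next_mean.items()),
        key=lambda t: t[0],
        reverse=True,
    )
    matches, used_prev = {}, set()
    for sim, pid, nid in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        if pid in used_prev or nid in matches:
            continue  # keep the assignment one-to-one
        matches[nid] = pid
        used_prev.add(pid)
    return matches
```

Because each tracklet is summarized over all of its frames, a single occluded or blurred frame has less influence on the match than it would in frame-to-frame association. An optimal assignment solver could replace the greedy loop if a globally best matching is preferred.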

Taxonomy

Topics: Video Surveillance and Tracking Methods · Image and Video Quality Assessment · Human Pose and Action Recognition

Methods: Contrastive Language-Image Pre-training