MOT FCG++: Enhanced Representation of Spatio-temporal Motion and Appearance Features

Yanzhao Fang

arXiv:2411.10028·cs.CV·November 22, 2024

TL;DR

This paper introduces MOT FCG++, a novel multi-object tracking method that enhances spatial-temporal motion and appearance feature representations, leading to improved tracking accuracy and robustness across multiple datasets.

Contribution

It proposes Diagonal Modulated GIoU and Mean Constant Velocity Modeling for better motion representation, and a dynamic appearance feature that incorporates confidence, advancing the state of the art in MOT.
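To give a feel for the motion side, here is a minimal sketch of windowed mean-velocity extrapolation: the next center of a track is predicted by averaging the per-frame displacements over the last few observations, so that noise in any single detection has less influence. This is only one plausible reading of the idea. The exact formulation of Mean Constant Velocity Modeling is not given in this summary, and the function name `mean_velocity_predict` and the `window` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_velocity_predict(centers: List[Point], window: int = 5) -> Point:
    """Predict the next box center from the mean of recent displacements.

    Assumption: this is a generic windowed average, not the paper's
    exact method. `centers` holds one (x, y) center per frame, oldest first.
    """
    if not centers:
        raise ValueError("need at least one observed center")
    if window < 1:
        raise ValueError("window must be at least 1")

    # Keep window + 1 points so that we get `window` displacements.
    recent = centers[-(window + 1):]
    if len(recent) < 2:
        # No displacement available yet: assume the object stays put.
        return (float(recent[-1][0]), float(recent[-1][1]))

    # Per-frame displacements between consecutive observations.
    dxs = [b[0] - a[0] for a, b in zip(recent, recent[1:])]
    dys = [b[1] - a[1] for a, b in zip(recent, recent[1:])]

    # Mean velocity over the window, in pixels per frame.
    vx = sum(dxs) / len(dxs)
    vy = sum(dys) / len(dys)

    # Constant-velocity step from the last observed center.
    return (recent[-1][0] + vx, recent[-1][1] + vy)
```

A larger `window` smooths more aggressively but reacts more slowly to real changes in direction, which matters for datasets with irregular motion.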

Findings

01 Achieved 63.1 HOTA on the MOT17 test set.

02 Improved MOTA and IDF1 scores over the baseline.

03 Performed competitively on the MOT20 and DanceTrack datasets.

Abstract

The goal of multi-object tracking (MOT) is to detect and track all objects in a scene across frames, while maintaining a unique identity for each object. Most existing methods rely on the spatial-temporal motion features and appearance embedding features of the detected objects in consecutive frames. Effectively and robustly representing the spatial and appearance features of long trajectories has become a critical factor affecting the performance of MOT. We propose a novel approach for appearance and spatial-temporal motion feature representation, improving upon the hierarchical clustering association method MOT FCG. For spatial-temporal motion features, we first propose Diagonal Modulated GIoU, which more accurately represents the relationship between the position and shape of the objects. Second, Mean Constant Velocity Modeling is proposed to reduce the effect of observation noise on…
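For context on the similarity measure that Diagonal Modulated GIoU builds on, the following is a minimal sketch of the standard Generalized IoU (GIoU) between two axis-aligned boxes. It does not include the diagonal modulation, whose definition is not given in this summary. The function name `giou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def giou(box_a: Box, box_b: Box) -> float:
    """Standard GIoU in [-1, 1]; plain GIoU only, no diagonal modulation."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Individual box areas (clamped so malformed boxes give zero area).
    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)

    # Intersection rectangle; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    hull_w = max(ax2, bx2) - min(ax1, bx1)
    hull_h = max(ay2, by2) - min(ay1, by1)
    hull = hull_w * hull_h

    if union <= 0.0 or hull <= 0.0:
        # Degenerate input: no meaningful overlap measure.
        return 0.0

    # IoU minus the fraction of the enclosing box not covered by the union.
    return inter / union - (hull - union) / hull
```

Unlike plain IoU, this value stays informative when boxes do not overlap: two unit squares separated by a one-unit gap score -1/3 instead of 0, so a tracker can still rank distant candidates during association.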

Taxonomy

Topics: Face recognition and analysis