Spatio-temporal Graph Learning on Adaptive Mined Key Frames for High-performance Multi-Object Tracking

Futian Wang; Fengxiang Liu; Xiao Wang

arXiv:2501.10129·cs.CV·January 20, 2025

Open Access

TL;DR

This paper introduces a novel multi-object tracking method that uses adaptive key frame mining and intra-frame graph-based feature fusion to improve tracking accuracy, especially under occlusion conditions.

Contribution

The paper proposes a reinforcement learning-based key frame extraction module and a graph convolutional network for intra-frame feature fusion. Together these improve object association and occlusion handling in tracking.
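
The listing names a graph convolutional network for intra-frame feature fusion but gives no architectural details. As a rough illustration of the general idea, the sketch below builds a graph over the detections in one frame and applies a single standard GCN propagation step, ReLU(D^-1/2 (A + I) D^-1/2 H W). Everything specific here is an assumption made for illustration and not the authors' implementation: the IoU-based adjacency, the function names `iou`, `build_adjacency`, `matmul` and `gcn_layer`, the single-layer design, and the toy boxes and features.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def build_adjacency(boxes, thresh=0.0):
    """Hypothetical rule: connect two detections when their boxes overlap."""
    n = len(boxes)
    return [[1.0 if i != j and iou(boxes[i], boxes[j]) > thresh else 0.0
             for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(adj, feats, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy frame: two overlapping detections and one isolated detection.
boxes = [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0), (10.0, 10.0, 12.0, 12.0)]
feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix
fused = gcn_layer(build_adjacency(boxes), feats, identity)
print(fused)
```

In this toy frame the two overlapping detections end up with the average of their features, while the isolated detection keeps its own. That is the kind of neighbour-aware mixing a graph-based fusion step can provide when objects occlude one another.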

Findings

01. Achieves 68.6 HOTA on the MOT17 dataset

02. Improves IDF1 to 81.0, reducing ID switches

03. Outperforms existing methods in occlusion scenarios
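
For readers unfamiliar with the two metrics quoted above, the sketch below shows their standard definitions. IDF1 is the F1 score over identity-matched detections, and HOTA at a single localisation threshold is the geometric mean of detection accuracy (DetA) and association accuracy (AssA). The reported HOTA averages that value over a range of thresholds. The function names `idf1` and `hota_at_threshold` and all input numbers are made up for illustration. They are not taken from the paper's results.

```python
import math

def idf1(idtp, idfp, idfn):
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), returned in [0, 1]."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0

def hota_at_threshold(det_a, ass_a):
    """HOTA at one localisation threshold: sqrt(DetA * AssA)."""
    return math.sqrt(det_a * ass_a)

# Illustrative counts only (not results from the paper).
print(idf1(idtp=8, idfp=2, idfn=2))
print(hota_at_threshold(det_a=0.64, ass_a=0.81))
```

Both metrics are usually reported as percentages, so a value of 0.8 from `idf1` corresponds to an IDF1 of 80.0.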

Abstract

In the realm of multi-object tracking, the challenge of accurately capturing the spatial and temporal relationships between objects in video sequences remains a significant hurdle. This is further complicated by frequent occurrences of mutual occlusions among objects, which can lead to tracking errors and reduced performance in existing methods. Motivated by these challenges, we propose a novel adaptive key frame mining strategy that addresses the limitations of current tracking approaches. Specifically, we introduce a Key Frame Extraction (KFE) module that leverages reinforcement learning to adaptively segment videos, thereby guiding the tracker to exploit the intrinsic logic of the video content. This approach allows us to capture structured spatial relationships between different objects as well as the temporal relationships of objects across frames. To tackle the issue of object…

Taxonomy

Topics: Video Surveillance and Tracking Methods · Fire Detection and Safety Systems · IoT-based Smart Home Systems
