Multi-Object Tracking and Segmentation via Neural Message Passing
Guillem Braso, Orcun Cetintas, Laura Leal-Taixe

TL;DR
This paper introduces a neural message passing framework that operates on graph representations for multi-object tracking and segmentation, enabling global reasoning and joint prediction of associations and masks, achieving state-of-the-art results.
Contribution
It presents a fully differentiable, graph-based approach using Message Passing Networks for joint MOT and MOTS, leveraging global reasoning and contextual features.
Findings
Achieves state-of-the-art results for both tracking and segmentation on several public datasets.
Jointly predicts data association and segmentation masks.
Operates directly on the graph domain, enabling global reasoning over all detections.
Abstract
Graphs offer a natural way to formulate Multiple Object Tracking (MOT) and Multiple Object Tracking and Segmentation (MOTS) within the tracking-by-detection paradigm. However, they also introduce a major challenge for learning methods, as defining a model that can operate on such a structured domain is not trivial. In this work, we exploit the classical network flow formulation of MOT to define a fully differentiable framework based on Message Passing Networks (MPNs). By operating directly on the graph domain, our method can reason globally over an entire set of detections and exploit contextual features. It then jointly predicts both final solutions for the data association problem and segmentation masks for all objects in the scene while exploiting synergies between the two tasks. We achieve state-of-the-art results for both tracking and segmentation in several publicly available…
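To make the idea of message passing on a detection graph concrete, here is a minimal NumPy sketch. It is not the authors' architecture: the single-layer updates, sum aggregation, random weights, toy graph, and every name in it (`message_passing_step`, `classify_edges`, `w_edge`, `w_node`, `w_cls`) are assumptions chosen for illustration. Nodes stand for detections, edges for candidate associations between detections in different frames, and each edge ends up with a score in (0, 1) that could be read as the probability that it links two detections of the same object.

```python
import numpy as np

def message_passing_step(nodes, edges, edge_index, w_edge, w_node):
    """One round of edge-then-node updates (hypothetical, simplified)."""
    src, dst = edge_index
    # Edge update: combine each edge embedding with its two endpoint embeddings.
    edge_in = np.concatenate([nodes[src], nodes[dst], edges], axis=1)
    new_edges = np.maximum(edge_in @ w_edge, 0.0)  # ReLU
    # Node update: sum the updated embeddings of all incident edges per node.
    agg = np.zeros((nodes.shape[0], new_edges.shape[1]))
    np.add.at(agg, src, new_edges)
    np.add.at(agg, dst, new_edges)
    node_in = np.concatenate([nodes, agg], axis=1)
    new_nodes = np.maximum(node_in @ w_node, 0.0)  # ReLU
    return new_nodes, new_edges

def classify_edges(edges, w_cls):
    """Map each edge embedding to an association score in (0, 1)."""
    logits = edges @ w_cls
    return 1.0 / (1.0 + np.exp(-logits))

# Toy graph (assumed): detections 0, 1 in one frame; 2, 3 in the next frame.
rng = np.random.default_rng(0)
num_nodes, dim = 4, 8
edge_index = np.array([[0, 0, 1, 1],
                       [2, 3, 2, 3]])
nodes = rng.normal(size=(num_nodes, dim))
edges = rng.normal(size=(edge_index.shape[1], dim))

# Untrained random weights, used only to show shapes and data flow.
w_edge = rng.normal(size=(3 * dim, dim)) * 0.1
w_node = rng.normal(size=(2 * dim, dim)) * 0.1
w_cls = rng.normal(size=(dim,)) * 0.1

# Repeated steps let information travel beyond immediate neighbours.
for _ in range(3):
    nodes, edges = message_passing_step(nodes, edges, edge_index, w_edge, w_node)

scores = classify_edges(edges, w_cls)
print(scores.shape)
```

In a trained model the weights would be learned from labelled associations, and the edge scores would then be turned into trajectories subject to the flow constraints of the underlying formulation; the sketch stops at the scoring step.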
Taxonomy
Topics: Video Surveillance and Tracking Methods · Air Quality Monitoring and Forecasting · Domain Adaptation and Few-Shot Learning
