Temporally Propagated Masks and Bounding Boxes: Combining the Best of Both Worlds for Multi-Object Tracking
Tomasz Stanczyk, Francois Bremond

TL;DR
This paper introduces McByte, a novel multi-object tracking method that combines bounding boxes and temporally propagated segmentation masks to improve robustness and generalizability across diverse datasets.
Contribution
McByte integrates temporally propagated segmentation masks with bounding boxes within a tracking-by-detection framework, improving multi-object tracking performance without per-sequence tuning.
Findings
Outperforms existing mask-based methods on multiple benchmarks
Demonstrates robustness and generalizability across diverse datasets
Achieves performance gains without per-sequence tuning
Abstract
Multi-object tracking (MOT) involves identifying and consistently tracking objects across video sequences. Traditional tracking-by-detection methods, while effective, often require extensive tuning and lack generalizability. On the other hand, segmentation mask-based methods are more generic but struggle with tracking management, making them unsuitable for MOT. We propose a novel approach, McByte, which incorporates a temporally propagated segmentation mask as a strong association cue within a tracking-by-detection framework. By combining bounding box and propagated mask information, McByte enhances robustness and generalizability without per-sequence tuning. Evaluated on four benchmark datasets - DanceTrack, MOT17, SoccerNet-tracking 2022, and KITTI-tracking - McByte demonstrates performance gain in all cases examined. At the same time, it outperforms existing mask-based methods.…
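To make the idea of a mask acting as an association cue alongside bounding boxes more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify how McByte combines the two cues, so the cost formula, the greedy matcher, and every name below (`box_iou`, `mask_in_box`, `associate`, `mask_weight`, `max_cost`) are assumptions made purely for illustration. Masks are represented as sets of pixel coordinates to keep the example free of dependencies.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Pixel = Tuple[int, int]                  # (x, y)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mask_in_box(mask: Set[Pixel], box: Box) -> float:
    """Fraction of a propagated mask's pixels that fall inside a detection box."""
    if not mask:
        return 0.0
    inside = sum(1 for x, y in mask if box[0] <= x < box[2] and box[1] <= y < box[3])
    return inside / len(mask)


def associate(
    track_boxes: List[Box],
    track_masks: List[Set[Pixel]],
    det_boxes: List[Box],
    mask_weight: float = 0.5,  # hypothetical weight of the mask cue
    max_cost: float = 0.9,     # hypothetical gate: pairs above this are never matched
) -> Dict[int, int]:
    """Greedy track-to-detection matching on an assumed combined cost.

    cost = (1 - IoU) - mask_weight * mask_in_box
    The mask term lowers the cost of pairs whose propagated mask lies
    inside the detection box, which can break ties between similar boxes.
    """
    candidates = []
    for t, tbox in enumerate(track_boxes):
        for d, dbox in enumerate(det_boxes):
            cost = 1.0 - box_iou(tbox, dbox) - mask_weight * mask_in_box(track_masks[t], dbox)
            if cost <= max_cost:
                candidates.append((cost, t, d))

    # Take the cheapest pairs first; each track and detection is used once.
    candidates.sort()
    matches: Dict[int, int] = {}
    used_dets: Set[int] = set()
    for _, t, d in candidates:
        if t not in matches and d not in used_dets:
            matches[t] = d
            used_dets.add(d)
    return matches


# Two tracks overlap one detection equally; only the mask tells them apart.
tracks = [(0, 0, 10, 10), (4, 0, 14, 10)]
masks = [{(0, 5), (1, 5)}, {(6, 5), (7, 5)}]
detections = [(2, 0, 12, 10)]
print(associate(tracks, masks, detections, mask_weight=0.0))  # box cue only
print(associate(tracks, masks, detections))                   # box + mask cue
```

In this toy case both tracks have the same IoU with the single detection, so the box cue alone falls back on track order, while adding the mask term assigns the detection to the track whose propagated mask actually lies inside it. A production tracker would more likely use an optimal assignment solver than a greedy pass; the greedy loop is used here only for brevity.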
Taxonomy
Topics: Video Surveillance and Tracking Methods
