An Exploration of Target-Conditioned Segmentation Methods for Visual Object Trackers
Matteo Dunnhofer, Niki Martinel, Christian Micheloni

TL;DR
This paper investigates how existing target-conditioned segmentation methods can be used to convert bounding-box trackers into segmentation trackers, enabling more precise object localization while running in quasi real time.
Contribution
It provides an extensive analysis of segmentation methods for transforming bounding-box trackers into segmentation trackers, demonstrating competitive performance with recent segmentation trackers.
Findings
Trackers extended with target-conditioned segmentation methods run in quasi real time.
Converted trackers compete with dedicated segmentation trackers.
The approach enhances object localization precision.
Abstract
Visual object tracking is the problem of predicting a target object's state in a video. Generally, bounding boxes have been used to represent states, and the community has devoted considerable effort to producing efficient causal algorithms capable of locating targets with such representations. As the field is moving towards binary segmentation masks to define objects more precisely, in this paper we extensively explore the target-conditioned segmentation methods available in the computer vision community, in order to transform any bounding-box tracker into a segmentation tracker. Our analysis shows that such methods allow trackers to compete with recently proposed segmentation trackers, while running in quasi real time.
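The conversion described in the abstract can be pictured as a two-stage loop: a bounding-box tracker localizes the target in each frame, and a target-conditioned segmentation method turns that box into a binary mask. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The names `segmentation_tracker`, `track_box`, and `threshold_segmenter` are hypothetical, and the threshold-based segmenter is a toy stand-in for the learned segmentation methods the paper explores.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height), assumed convention
Frame = List[List[float]]         # toy grayscale frame as nested lists
Mask = List[List[bool]]           # binary mask with the same shape as the frame


def threshold_segmenter(frame: Frame, box: Box, thr: float = 0.5) -> Mask:
    """Toy stand-in for a target-conditioned segmentation method.

    Only pixels inside the given box can be labeled as target; a real
    method would use a learned model conditioned on the box instead.
    """
    x, y, w, h = box
    height, width = len(frame), len(frame[0])
    mask = [[False] * width for _ in range(height)]
    for r in range(max(y, 0), min(y + h, height)):
        for c in range(max(x, 0), min(x + w, width)):
            mask[r][c] = frame[r][c] > thr
    return mask


def segmentation_tracker(
    frames: Iterable[Frame],
    track_box: Callable[[Frame], Box],
    segment: Callable[[Frame, Box], Mask],
) -> Iterator[Tuple[Box, Mask]]:
    """Wrap any box tracker so that it also outputs a mask per frame."""
    for frame in frames:
        box = track_box(frame)          # stage 1: bounding-box tracker
        yield box, segment(frame, box)  # stage 2: box-conditioned segmentation
```

Because the two stages communicate only through the predicted box, any bounding-box tracker can be plugged in as `track_box` without modification, which is the sense in which the abstract speaks of transforming "any" such tracker.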
