Tracking through Containers and Occluders in the Wild

Basile Van Hoorick; Pavel Tokmakov; Simon Stent; Jie Li; Carl Vondrick

arXiv:2305.03052 · cs.CV · May 5, 2023 · 2 cites

TL;DR

This paper introduces TCOW, a benchmark and model for tracking objects through heavy occlusion and containment in cluttered environments, highlighting current model limitations in understanding object permanence.

Contribution

The paper presents a new benchmark, built from a mixture of synthetic and annotated real datasets, for tracking through occlusion and containment, along with an evaluation of two recent transformer-based video models on this task.

Findings

01

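The two transformer-based video models evaluated can track targets surprisingly well under some settings of task variation, but their performance varies across occlusion conditions.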

02

A considerable performance gap remains in understanding object permanence.

03

The dataset supports both supervised learning and structured evaluation.

Abstract

Tracking objects with persistence in cluttered and dynamic environments remains a difficult challenge for computer vision systems. In this paper, we introduce TCOW, a new benchmark and model for visual tracking through heavy occlusion and containment. We set up a task where the goal is to, given a video sequence, segment both the projected extent of the target object, as well as the surrounding container or occluder whenever one exists. To study this task, we create a mixture of synthetic and annotated real datasets to support both supervised learning and structured evaluation of model performance under various forms of task variation, such as moving or nested containment. We evaluate two recent transformer-based video models and find that while they can be surprisingly capable of tracking targets under certain settings of task variation, there remains a considerable…
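To make the task described in the abstract concrete, the following is a minimal sketch of how per-frame segmentation outputs for a target and its occluder or container could be scored. Everything here is an assumption for illustration: the `(T, H, W)` boolean mask layout, the use of mean intersection-over-union as the score, the empty-mask convention for frames with no occluder, and the helper names `mask_iou` and `mean_iou_over_video`. None of it is taken from the TCOW codebase.

```python
import numpy as np


def mask_iou(pred, true):
    """Intersection over union of two boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    true = np.asarray(true, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Both masks are empty (e.g. no occluder in this frame).
        # Counting this as perfect agreement is a convention chosen here.
        return 1.0
    return float(np.logical_and(pred, true).sum() / union)


def mean_iou_over_video(pred_masks, true_masks):
    """Average per-frame IoU over two (T, H, W) stacks of masks."""
    return float(np.mean([mask_iou(p, t) for p, t in zip(pred_masks, true_masks)]))


# Toy video: T=2 frames of 4x4 pixels (hypothetical data).
T, H, W = 2, 4, 4

# Target: a 2x2 object present in both frames, even if hidden from view.
target_true = np.zeros((T, H, W), dtype=bool)
target_true[:, 1:3, 1:3] = True
target_pred = np.zeros((T, H, W), dtype=bool)
target_pred[0, 1:3, 1:3] = True  # frame 0: exact prediction
target_pred[1, 1:3, 1:2] = True  # frame 1: only half of the object recovered

# Occluder: absent in frame 0 (empty mask), present in frame 1.
occluder_true = np.zeros((T, H, W), dtype=bool)
occluder_true[1, 0:4, 0:2] = True
occluder_pred = occluder_true.copy()

target_score = mean_iou_over_video(target_pred, target_true)
occluder_score = mean_iou_over_video(occluder_pred, occluder_true)
print(target_score, occluder_score)
```

In this toy case the target score averages a perfect frame with a half-recovered one, while the occluder score stays perfect because the empty mask in frame 0 matches the empty ground truth. A real evaluation would need to follow whatever conventions the benchmark itself defines.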

Code & Models

Repositories

basilevh/tcow
PyTorch · Official

Taxonomy

Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Human Pose and Action Recognition