HODOR: High-level Object Descriptors for Object Re-segmentation in Video Learned from Static Images

Ali Athar; Jonathon Luiten; Alexander Hermans; Deva Ramanan; Bastian Leibe

arXiv:2112.09131·cs.CV·November 23, 2022·1 cites

Open Access · 1 repo

TL;DR

HODOR tackles video object segmentation (VOS) with high-level object descriptors learned from annotated static images. This reduces reliance on densely annotated video and achieves state-of-the-art results among methods trained without video annotations.

Contribution

HODOR learns high-level object descriptors from static image annotations and uses them to re-segment the same objects in other video frames, enabling effective VOS without densely annotated video training data.

Findings

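01 Achieves state-of-the-art performance on the DAVIS and YouTube-VOS benchmarks among methods trained without video annotations.

02 Can learn from single annotated video frames using cyclic consistency.

03 Requires no architectural modification to move from static-image training to training on video with single annotated frames.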

Abstract

Existing state-of-the-art methods for Video Object Segmentation (VOS) learn low-level pixel-to-pixel correspondences between frames to propagate object masks across video. This requires a large amount of densely annotated video data, which is costly to annotate, and largely redundant since frames within a video are highly correlated. In light of this, we propose HODOR: a novel method that tackles VOS by effectively leveraging annotated static images for understanding object appearance and scene context. We encode object instances and scene information from an image frame into robust high-level descriptors which can then be used to re-segment those objects in different frames. As a result, HODOR achieves state-of-the-art performance on the DAVIS and YouTube-VOS benchmarks compared to existing methods trained without video annotations. Without any architectural modification, HODOR can…
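The idea of encoding an object into a descriptor and then re-segmenting it in another frame can be illustrated with a deliberately simplified sketch. It is not the HODOR architecture: it assumes per-pixel feature vectors are already available, swaps in masked average pooling to build one descriptor per mask, and labels each target pixel by its highest dot-product score. The function names `pool_descriptor` and `resegment` and the toy feature maps are hypothetical.

```python
# Illustrative sketch only: masked average pooling plus dot-product matching.
# This is NOT the HODOR model; every name and value here is hypothetical.

def pool_descriptor(features, mask):
    """Average the feature vectors of all pixels where mask == 1."""
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, inside in zip(feat_row, mask_row):
            if inside:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


def resegment(features, descriptors):
    """Label each pixel with the index of the best-matching descriptor."""
    def score(vec, desc):
        return sum(v * d for v, d in zip(vec, desc))

    return [
        [max(range(len(descriptors)), key=lambda k: score(vec, descriptors[k]))
         for vec in row]
        for row in features
    ]


# Reference frame: 2x3 feature map, 2 channels; the object fills the left column.
ref_features = [[[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
                [[0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]]
object_mask = [[1, 0, 0],
               [1, 0, 0]]
background_mask = [[1 - m for m in row] for row in object_mask]

descriptors = [pool_descriptor(ref_features, background_mask),  # index 0: background
               pool_descriptor(ref_features, object_mask)]      # index 1: object

# Target frame: the object has moved to the right column.
target_features = [[[0.0, 1.0], [0.1, 0.9], [1.0, 0.0]],
                   [[0.0, 1.0], [0.0, 1.0], [0.8, 0.2]]]
predicted = resegment(target_features, descriptors)
```

In this toy setup the descriptor carries the object's appearance from the reference frame to the target frame, so no pixel-to-pixel correspondence between the two frames is computed.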

Code & Models

Repositories

Ali2500/HODOR (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications

Methods: VOS