Online Adaptation of Convolutional Neural Networks for Video Object Segmentation

Paul Voigtlaender; Bastian Leibe

arXiv:1706.09364 · cs.CV · August 2, 2017 · 40 cites

Open Access

TL;DR

This paper introduces OnAVOS, an online adaptive method for semi-supervised video object segmentation that keeps updating the network at test time to handle changes in object appearance, reaching a state-of-the-art 85.7% IoU on DAVIS.

Contribution

The paper proposes OnAVOS, an online adaptive extension to OSVOS, incorporating confidence-based updates and objectness pretraining for improved segmentation.

Findings

01 Achieves 85.7% IoU (intersection over union) on the DAVIS dataset.

02 Online adaptation significantly improves segmentation accuracy.

03 Pretraining on objectness enhances model robustness.
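
For readers unfamiliar with the metric, here is a minimal sketch of how IoU can be computed for a single pair of binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this illustration; they are not taken from the paper or from the DAVIS evaluation code.

```python
def intersection_over_union(pred, truth):
    """IoU of two binary masks given as equally sized 2D lists of 0/1 values."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1         # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


predicted = [[1, 1, 0],
             [0, 0, 0]]
ground_truth = [[0, 1, 1],
                [0, 0, 0]]
score = intersection_over_union(predicted, ground_truth)  # 1 shared pixel out of 3
```

A dataset-level score would typically average this value over frames and sequences; the exact averaging protocol is not described on this page.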

Abstract

We tackle the task of semi-supervised video object segmentation, i.e. segmenting the pixels belonging to an object in the video using the ground truth pixel mask for the first frame. We build on the recently introduced one-shot video object segmentation (OSVOS) approach which uses a pretrained network and fine-tunes it on the first frame. While achieving impressive performance, at test time OSVOS uses the fine-tuned network in unchanged form and is not able to adapt to large changes in object appearance. To overcome this limitation, we propose Online Adaptive Video Object Segmentation (OnAVOS) which updates the network online using training examples selected based on the confidence of the network and the spatial configuration. Additionally, we add a pretraining step based on objectness, which is learned on PASCAL. Our experiments show that both extensions are highly effective and…
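
The abstract says online training examples are selected "based on the confidence of the network and the spatial configuration" but does not give the rule. The sketch below shows one plausible reading, not the authors' implementation: pixels with a very confident foreground prediction become positives, pixels far from the previous frame's mask become negatives, and everything else is ignored. The function name, the thresholds `alpha` and `max_dist`, the label codes, and the choice to let the distance rule override the confidence rule are all assumptions for illustration.

```python
import math

POSITIVE, NEGATIVE, IGNORE = 1, 0, -1  # hypothetical label codes


def select_training_labels(probs, last_mask, alpha=0.97, max_dist=2.0):
    """Build a per-pixel label map for one online update step.

    probs:     2D list of predicted foreground probabilities in [0, 1].
    last_mask: 2D list of 0/1 values, the mask predicted for the previous frame.
    alpha:     assumed confidence threshold for positive examples.
    max_dist:  assumed distance (in pixels) beyond which a pixel is negative.
    """
    height, width = len(probs), len(probs[0])
    foreground = [(r, c) for r in range(height) for c in range(width)
                  if last_mask[r][c]]
    labels = [[IGNORE] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Brute-force distance to the nearest previous-mask pixel;
            # a real system would use a distance transform instead.
            far_away = bool(foreground) and min(
                math.hypot(r - fr, c - fc) for fr, fc in foreground
            ) > max_dist
            if far_away:
                labels[r][c] = NEGATIVE  # spatial rule wins (assumption)
            elif probs[r][c] > alpha:
                labels[r][c] = POSITIVE  # confident foreground prediction
    return labels


probs = [[0.99, 0.5, 0.2, 0.1, 0.99]]
last_mask = [[1, 0, 0, 0, 0]]
labels = select_training_labels(probs, last_mask)
# labels == [[1, -1, -1, 0, 0]]
```

In the example, the rightmost pixel is confidently predicted as foreground but lies four pixels from the previous mask, so this sketch treats it as a negative. Ignored pixels would simply contribute no loss during the online update.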

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Video Surveillance and Tracking Methods