Online Adaptation of Convolutional Neural Networks for Video Object Segmentation
Paul Voigtlaender, Bastian Leibe

TL;DR
This paper introduces OnAVOS, an online adaptive method for semi-supervised video object segmentation that updates the model during testing to handle appearance changes, achieving state-of-the-art results.
Contribution
The paper proposes OnAVOS, an online adaptive extension to OSVOS, incorporating confidence-based updates and objectness pretraining for improved segmentation.
Findings
Achieves an intersection-over-union (IoU) score of 85.7% on the DAVIS dataset.
Online adaptation significantly improves segmentation accuracy over using the fine-tuned network unchanged at test time.
Pretraining on objectness enhances model robustness.
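For readers unfamiliar with the IoU metric quoted in the findings, the following is a minimal sketch of how it could be computed for a single pair of binary masks. The function name `mask_iou` and the convention for two empty masks are assumptions made for illustration; the official DAVIS evaluation code may differ in its details.

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-union between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, gt).sum()
    return float(intersection / union)

# Toy example: 2 overlapping pixels out of 4 pixels covered by either mask.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
gt = np.array([[1, 0, 0],
               [0, 1, 1]])
print(mask_iou(pred, gt))
```

A dataset-level score would then typically be an average of such per-frame values, although the exact averaging scheme is not stated on this page.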
Abstract
We tackle the task of semi-supervised video object segmentation, i.e. segmenting the pixels belonging to an object in the video using the ground truth pixel mask for the first frame. We build on the recently introduced one-shot video object segmentation (OSVOS) approach which uses a pretrained network and fine-tunes it on the first frame. While achieving impressive performance, at test time OSVOS uses the fine-tuned network in unchanged form and is not able to adapt to large changes in object appearance. To overcome this limitation, we propose Online Adaptive Video Object Segmentation (OnAVOS) which updates the network online using training examples selected based on the confidence of the network and the spatial configuration. Additionally, we add a pretraining step based on objectness, which is learned on PASCAL. Our experiments show that both extensions are highly effective and…
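The abstract says that online training examples are selected using the network's confidence and the spatial configuration. One plausible reading is sketched below: very confident foreground pixels become positives, pixels far from the previous frame's mask become negatives, and everything else is ignored. The function names, the label values, the threshold values, the precedence rule, and the brute-force distance computation are all assumptions chosen for illustration and are not taken from the authors' implementation.

```python
import numpy as np

# Hypothetical label encoding for this sketch.
POSITIVE, NEGATIVE, IGNORE = 1, 0, 255

def distance_to_mask(mask):
    """Euclidean distance from each pixel to the nearest foreground pixel.

    Brute force, so only suitable for small toy inputs.
    """
    h, w = mask.shape
    fg = np.argwhere(mask)  # (M, 2) coordinates of foreground pixels
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    grid = np.stack([rows, cols], axis=-1).reshape(-1, 1, 2)
    dists = np.sqrt(((grid - fg[None, :, :]) ** 2).sum(axis=-1))
    return dists.min(axis=1).reshape(h, w)

def select_online_labels(prob, prev_mask, conf_thresh=0.97, dist_thresh=2.0):
    """Build a per-pixel training target for one online update step.

    prob:      foreground probabilities predicted for the current frame
    prev_mask: binary mask predicted for the previous frame
    Both thresholds are placeholder values, not tuned settings.
    """
    prob = np.asarray(prob, dtype=float)
    prev_mask = np.asarray(prev_mask, dtype=bool)
    labels = np.full(prob.shape, IGNORE, dtype=np.uint8)
    if not prev_mask.any():
        # Assumption: with no previous object there is nothing to anchor on,
        # so every pixel is ignored and the update is effectively skipped.
        return labels
    far = distance_to_mask(prev_mask) > dist_thresh
    labels[far] = NEGATIVE
    # Design choice of this sketch: far-away pixels stay negative even if
    # the network is confident about them.
    labels[(prob > conf_thresh) & ~far] = POSITIVE
    return labels

prob_demo = np.array([[0.99, 0.60, 0.20, 0.10, 0.05]])
prev_demo = np.array([[1, 1, 0, 0, 0]])
labels_demo = select_online_labels(prob_demo, prev_demo, dist_thresh=1.5)
print(labels_demo)
```

In a full pipeline the resulting label map would feed a loss that skips the ignored pixels, so the network is fine-tuned only on examples it is reasonably sure about.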
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
