Temporally stable video segmentation without video annotations

Aharon Azulay; Tavi Halperin; Orestis Vantzos; Nadav Borenstein; Ofir Bibi

arXiv:2110.08893·cs.CV·March 18, 2022

TL;DR

This paper presents an unsupervised method to adapt still image segmentation models for stable, temporally consistent video segmentation by leveraging optical flow and a consistency measure validated against human judgment.

Contribution

The paper introduces an unsupervised approach that trains a multi-input multi-output decoder with an optical flow-based consistency loss, improving the stability of video segmentation without requiring video annotations.

Findings

01 Enhanced temporal stability in video segmentation results

02 Minimal loss of accuracy compared to image-based models

03 Validated consistency measure correlates well with human judgment

Abstract

Temporally consistent dense video annotations are scarce and hard to collect. In contrast, image segmentation datasets (and pre-trained models) are ubiquitous, and easier to label for any novel task. In this paper, we introduce a method to adapt still image segmentation models to video in an unsupervised manner, by using an optical flow-based consistency measure. To ensure that the inferred segmented videos appear more stable in practice, we verify that the consistency measure is well correlated with human judgement via a user study. Training a new multi-input multi-output decoder using this measure as a loss, together with a technique for refining current image segmentation datasets and a temporal weighted-guided filter, we observe stability improvements in the generated segmented videos with minimal loss of accuracy.
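
To make the idea of an optical flow-based consistency measure concrete, the following is a minimal sketch, not the paper's actual measure or loss. It assumes a dense backward flow field (each pixel of the current frame points to its source location in the previous frame), hard label masks, and nearest-neighbour warping; the function names `warp_with_flow` and `temporal_inconsistency` are hypothetical. The score is the fraction of pixels whose label changes after the previous mask is aligned to the current frame.

```python
import numpy as np


def warp_with_flow(prev_mask, flow):
    """Backward-warp the previous frame's mask into the current frame.

    Assumption: flow[y, x] = (dx, dy) points from pixel (x, y) in the
    current frame to its source location in the previous frame.
    Nearest-neighbour sampling keeps the labels discrete.
    """
    h, w = prev_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    # Pixels whose source falls outside the previous frame are ignored.
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    warped = np.zeros_like(prev_mask)
    warped[valid] = prev_mask[src_y[valid], src_x[valid]]
    return warped, valid


def temporal_inconsistency(prev_mask, curr_mask, flow):
    """Fraction of valid pixels whose label differs after flow alignment.

    0.0 means the two segmentations agree once motion is compensated;
    larger values indicate flicker that the motion does not explain.
    """
    warped, valid = warp_with_flow(prev_mask, flow)
    if not valid.any():
        return 0.0
    return float(np.mean(warped[valid] != curr_mask[valid]))
```

A differentiable variant of such a score (for example, bilinear warping of soft class probabilities) would be needed to use it as a training loss; that choice is left open here because the abstract does not specify it.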

Videos

Temporally stable video segmentation without video annotations · YouTube

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis