Lucid Data Dreaming for Video Object Segmentation

Anna Khoreva; Rodrigo Benenson; Eddy Ilg; Thomas Brox; Bernt Schiele

arXiv:1703.09554 · cs.CV · March 15, 2019 · 24 cites

TL;DR

This paper introduces a training strategy, Lucid Data Dreaming, that synthesizes in-domain training data from the first-frame annotation of each video. It reaches state-of-the-art video object segmentation results on three evaluation datasets while using 20x to 1000x less annotated data than competing methods.

Contribution

The paper presents a data synthesis approach that reduces the need for large annotated datasets: it uses the annotation provided on the first frame of each video to generate plausible future frames, which then serve as per-video training data.

Findings

01. Achieves state-of-the-art results across three evaluation datasets while using 20x to 1000x less annotated data than competing methods.

02. Works for both single and multiple object segmentation.

03. Training on in-domain synthesized data outperforms training on large generic datasets.

Abstract

Convolutional networks reach top quality in pixel-level video object segmentation but require a large amount of training data (1k~100k) to deliver such results. We propose a new training strategy which achieves state-of-the-art results across three evaluation datasets while using 20x~1000x less annotated data than competing methods. Our approach is suitable for both single and multiple object segmentation. Instead of using large training sets in the hope of generalizing across domains, we generate in-domain training data using the provided annotation on the first frame of each video to synthesize ("lucid dream") plausible future video frames. In-domain per-video training data allows us to train high-quality appearance- and motion-based models, as well as tune the post-processing stage. This approach makes it possible to reach competitive results even when training from only a single annotated frame,…
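The abstract describes synthesizing future frames from the first-frame annotation, but this page does not give the procedure. The sketch below is only an illustration of the general idea, not the authors' pipeline: it cuts the annotated object out of the first frame, fills the hole with the mean background color (a crude stand-in for proper inpainting), moves the object by a small random translation, and applies a global brightness change. The names `shift` and `synthesize_frame`, the `max_shift` parameter, and the brightness range are all assumptions made for this example.

```python
import numpy as np


def shift(arr, dy, dx):
    """Translate arr by (dy, dx) pixels, filling uncovered areas with zeros."""
    out = np.zeros_like(arr)
    h, w = arr.shape[:2]
    if abs(dy) >= h or abs(dx) >= w:
        return out  # shifted entirely out of view
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = arr[src_y, src_x]
    return out


def synthesize_frame(frame, mask, rng, max_shift=4):
    """Build one synthetic (frame, mask) pair from an annotated first frame.

    frame: HxWx3 uint8 image; mask: HxW boolean object mask.
    Hypothetical helper for illustration only.
    """
    mask = mask.astype(bool)

    # Crude background: overwrite object pixels with the mean background color.
    background = frame.copy()
    bg_pixels = frame[~mask]
    fill = bg_pixels.mean(axis=0) if bg_pixels.size else np.zeros(frame.shape[2])
    background[mask] = fill.astype(frame.dtype)

    # Move the object (pixels and mask together) by a random translation.
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    new_mask = shift(mask, dy, dx)
    new_fg = shift(frame, dy, dx)

    # Paste the moved object over the filled background.
    out = np.where(new_mask[..., None], new_fg, background)

    # Global brightness change (assumed range) to vary illumination.
    gain = rng.uniform(0.9, 1.1)
    out = np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return out, new_mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
    mask = np.zeros((16, 16), dtype=bool)
    mask[6:10, 6:10] = True
    new_frame, new_mask = synthesize_frame(frame, mask, rng)
    print(new_frame.shape, int(new_mask.sum()))
```

Calling `synthesize_frame` repeatedly with the same first frame and mask yields a small per-video training set in which each synthetic frame comes with a pixel-accurate mask for free. A more faithful version would presumably use richer transformations and real inpainting; those details are not stated on this page.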

Taxonomy

Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Advanced Neural Network Applications