WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations

Peidong Liu; Zibin He; Xiyu Yan; Yong Jiang; Shutao Xia; Feng Zheng; Maowei Hu

arXiv:2107.03088 · cs.CV · August 5, 2021

Open Access

TL;DR

WeClick introduces a weakly-supervised video semantic segmentation method using click annotations, leveraging memory flow knowledge distillation to improve accuracy and achieve real-time performance with minimal annotation effort.

Contribution

The paper presents a novel weakly-supervised segmentation pipeline that uses click annotations and memory flow distillation to enhance video segmentation accuracy.

Findings

01

Outperforms state-of-the-art methods on Cityscapes and CamVid, improving on the baseline by 10.24% mIoU.

02

Achieves real-time inference with low-cost click annotations.

03

Effectively exploits temporal information through memory flow knowledge distillation.

Abstract

Compared with tedious per-pixel mask annotating, it is much easier to annotate data by clicks, which costs only several seconds for an image. However, applying clicks to learn a video semantic segmentation model has not been explored before. In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, for saving laborious annotating effort by segmenting an instance of the semantic class with only a single click. Since detailed semantic information is not captured by clicks, directly training with click labels leads to poor segmentation predictions. To mitigate this problem, we design a novel memory flow knowledge distillation strategy to exploit temporal information (named memory flow) in abundant unlabeled video frames, by distilling the neighboring predictions to the target frame via estimated motion. Moreover, we…
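
The idea of distilling neighboring predictions to the target frame via estimated motion can be illustrated with a small sketch. This is not the authors' implementation: the excerpt does not specify the motion estimator, the warping scheme, or the loss, so the nearest-neighbor backward warping, the KL-divergence loss, and the names `warp_with_flow` and `distillation_loss` are all assumptions made for illustration.

```python
import numpy as np


def warp_with_flow(probs, flow):
    """Backward-warp class probabilities from a neighboring frame.

    probs: (C, H, W) per-class probabilities predicted on the neighbor frame.
    flow:  (2, H, W) offsets (dx, dy) pointing from each target-frame pixel
           to its source location in the neighbor frame (assumed convention).
    """
    _, h, w = probs.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Nearest-neighbor sampling, clamped to the image border (an assumption;
    # bilinear sampling would be a common alternative).
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, h - 1)
    return probs[:, src_y, src_x]


def distillation_loss(target_probs, warped_probs, eps=1e-8):
    """Mean per-pixel KL(warped || target).

    The warped neighbor prediction acts as the teacher signal and the
    target-frame prediction as the student (hypothetical loss choice).
    """
    kl = warped_probs * (np.log(warped_probs + eps) - np.log(target_probs + eps))
    return float(kl.sum(axis=0).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(3, 4, 5))
    neighbor = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    flow = np.zeros((2, 4, 5))
    flow[0] = 1.0  # every target pixel looks one pixel to the right
    warped = warp_with_flow(neighbor, flow)
    print(distillation_loss(neighbor, warped))
```

In this sketch the flow is given; in practice it would come from whatever motion estimator the pipeline uses, and the loss would be applied only during training, on unlabeled frames, so it adds no cost at inference time.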

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques

Methods: Knowledge Distillation