WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations
Peidong Liu, Zibin He, Xiyu Yan, Yong Jiang, Shutao Xia, Feng Zheng, Maowei Hu

TL;DR
WeClick introduces a weakly-supervised video semantic segmentation method using click annotations, leveraging memory flow knowledge distillation to improve accuracy and achieve real-time performance with minimal annotation effort.
Contribution
The paper presents a novel weakly-supervised segmentation pipeline that uses click annotations and memory flow distillation to enhance video segmentation accuracy.
Findings
Outperforms state-of-the-art methods on Cityscapes and CamVid, and improves on the baseline by 10.24% mIoU.
Achieves real-time inference with low-cost click annotations.
Effectively exploits temporal information through memory flow knowledge distillation.
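For readers unfamiliar with the metric quoted above, here is a minimal sketch of mean intersection-over-union (mIoU) for a single pair of label maps. The function name `mean_iou` and the ignore label 255 are assumptions made for illustration, not details from the paper. Benchmark evaluations usually accumulate intersections and unions over the whole validation set rather than averaging per image.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Average IoU over the classes that occur in pred or target.

    pred, target: integer label maps of identical shape.
    Pixels whose target equals ignore_index are excluded (assumed convention).
    """
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")

pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
score = mean_iou(pred, target, num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12.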
Abstract
Compared with tedious per-pixel mask annotating, it is much easier to annotate data by clicks, which costs only several seconds for an image. However, applying clicks to learn a video semantic segmentation model has not been explored before. In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, for saving laborious annotating effort by segmenting an instance of the semantic class with only a single click. Since detailed semantic information is not captured by clicks, directly training with click labels leads to poor segmentation predictions. To mitigate this problem, we design a novel memory flow knowledge distillation strategy to exploit temporal information (named memory flow) in abundant unlabeled video frames, by distilling the neighboring predictions to the target frame via estimated motion. Moreover, we…
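The two training signals described in the abstract can be illustrated with a rough sketch: a cross-entropy term computed only at clicked pixels, and a distillation term that warps a neighboring frame's prediction onto the target frame with an estimated flow field. This is not the authors' implementation. The function names, the flow convention (displacement from target pixel to neighbor pixel), nearest-neighbor sampling, the KL divergence, and the ignore label 255 are all assumptions chosen to keep the example small.

```python
import numpy as np

def warp_by_flow(neighbor_probs, flow):
    """Backward-warp neighbor-frame class probabilities onto the target frame.

    neighbor_probs: (H, W, C) per-pixel class probabilities.
    flow: (H, W, 2) displacement (dx, dy) from each target pixel to its
          matching neighbor-frame location (assumed convention).
    Nearest-neighbor sampling; coordinates are clamped to the image border.
    """
    h, w, _ = neighbor_probs.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return neighbor_probs[src_y, src_x]

def distillation_loss(target_probs, warped_probs, eps=1e-8):
    """Mean per-pixel KL(warped || target), using the warped neighbor
    prediction as a soft label (KL is an assumed choice of divergence)."""
    kl = warped_probs * (np.log(warped_probs + eps) - np.log(target_probs + eps))
    return float(kl.sum(axis=-1).mean())

def click_cross_entropy(target_probs, click_labels, ignore_index=255, eps=1e-8):
    """Cross-entropy at clicked pixels only; unclicked pixels carry ignore_index."""
    mask = click_labels != ignore_index
    if not mask.any():
        return 0.0
    probs_at_clicks = target_probs[mask]   # (K, C)
    labels = click_labels[mask]            # (K,)
    picked = probs_at_clicks[np.arange(labels.size), labels]
    return float(-np.log(picked + eps).mean())

# Toy 2x3 frame with two classes.
probs = np.zeros((2, 3, 2))
probs[..., 0] = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
probs[..., 1] = 1.0 - probs[..., 0]

shift_right = np.zeros((2, 3, 2))
shift_right[..., 0] = 1.0                  # every pixel samples one column to the right
warped = warp_by_flow(probs, shift_right)

clicks = np.full((2, 3), 255)
clicks[0, 0] = 1                           # a single click labeling pixel (0, 0) as class 1

total = click_cross_entropy(probs, clicks) + distillation_loss(probs, warped)
```

In a real pipeline the flow would come from an optical flow estimator, sampling would typically be bilinear, and the two terms would be weighted. None of those choices are specified in the text above.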
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
Methods: Knowledge Distillation
