DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation
Chuang Niu, Shenghan Ren, Jimin Liang

TL;DR
DASNet is an architecture that uses detection models to reduce the pixel-level annotation needed for instance and semantic segmentation, training on a small set of mask annotations plus a large set of box annotations.
Contribution
The paper introduces DASNet, a detection-based segmentation framework built from three modules (detection, attention, and segmentation) that reduces reliance on pixel-level annotations while targeting high-quality instance and semantic segmentation.
Findings
Outperforms existing semi-supervised methods on PASCAL VOC 2012.
Reduces the number of pixel-level mask annotations required by supplementing a small mask-annotated set with a large set of box annotations.
Demonstrates the benefit of detection-guided segmentation in limited annotation scenarios.
Abstract
Pixel-level annotation demands expensive human effort and limits the performance of deep networks, which usually benefit from more such training data. In this work, we aim to achieve high-quality instance and semantic segmentation results using a small set of pixel-level mask annotations and a large set of box annotations. The basic idea is to explore detection models to simplify the pixel-level supervised learning task and thus reduce the required amount of mask annotations. Our architecture, named DASNet, consists of three modules: detection, attention, and segmentation. The detection module detects all classes of objects, the attention module generates multi-scale class-specific features, and the segmentation module recovers the binary masks. Our method demonstrates substantially improved performance compared to existing semi-supervised approaches on the PASCAL VOC 2012 dataset.
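The three-module flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the functions `detect`, `attend`, `segment`, and `run_pipeline` are hypothetical stand-ins (hard-coded boxes, a plain crop, and a threshold replace the learned networks), and the rule for resolving overlapping masks by detection score is an assumption made only for this example. It shows how per-box binary masks could be combined into both a semantic map and an instance map.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive
    class_id: int                   # 1-based; 0 is reserved for background
    score: float


def detect(image) -> List[Detection]:
    """Hypothetical stand-in for the detection module: returns fixed boxes."""
    return [Detection((0, 0, 3, 3), 1, 0.9), Detection((2, 2, 6, 5), 2, 0.8)]


def attend(image, det: Detection):
    """Hypothetical stand-in for the attention module: crops the box region.

    A real attention module would produce multi-scale class-specific features.
    """
    x0, y0, x1, y1 = det.box
    return [row[x0:x1] for row in image[y0:y1]]


def segment(features, threshold: float = 0.5):
    """Hypothetical stand-in for the segmentation module: thresholds to a binary mask."""
    return [[1 if v > threshold else 0 for v in row] for row in features]


def run_pipeline(image):
    """Combine per-box binary masks into semantic and instance label maps."""
    h, w = len(image), len(image[0])
    semantic = [[0] * w for _ in range(h)]
    instance = [[0] * w for _ in range(h)]
    # Assumption: higher-scoring detections claim contested pixels first.
    dets = sorted(detect(image), key=lambda d: d.score, reverse=True)
    for inst_id, det in enumerate(dets, start=1):
        mask = segment(attend(image, det))
        x0, y0, _, _ = det.box
        for dy, row in enumerate(mask):
            for dx, m in enumerate(row):
                y, x = y0 + dy, x0 + dx
                if m and instance[y][x] == 0:
                    semantic[y][x] = det.class_id
                    instance[y][x] = inst_id
    return semantic, instance


# Toy 6x5 "image": every pixel is foreground except one hole at (x=1, y=1).
image = [[1.0] * 6 for _ in range(5)]
image[1][1] = 0.0
semantic, instance = run_pipeline(image)
```

Because each mask is binary and tied to one detected box, the class label comes from the detection rather than from the mask itself, which is one way to read the abstract's claim that detection simplifies the pixel-level learning task.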
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
