Visual Boundary Knowledge Translation for Foreground Segmentation
Zunlei Feng, Lechao Cheng, Xinchao Wang, Xiang Wang, Yajie Liu, Xiangtong Du, Mingli Song

TL;DR
This paper introduces Boundary Knowledge Translation (BKT) and a Translation Segmentation Network (Trans-Net) to enable effective foreground segmentation of unseen categories with minimal labeled data, mimicking human boundary recognition.
Contribution
The paper proposes a novel BKT task and a Trans-Net model that transfers boundary knowledge from labeled to novel categories using limited supervision.
Findings
Trans-Net approaches fully supervised performance using only a few labeled samples per novel category.
Boundary-aware self-supervision improves segmentation accuracy.
Adversarial boundary discriminators enhance generalization to unseen categories.
Abstract
When confronted with objects of unknown types in an image, humans can effortlessly and precisely tell their visual boundaries. This recognition mechanism and its underlying generalization capability seem to contrast with state-of-the-art image segmentation networks, which rely on large-scale category-aware annotated training samples. In this paper, we make an attempt towards building models that explicitly account for visual boundary knowledge, in the hope of reducing the training effort on segmenting unseen categories. Specifically, we investigate a new task termed Boundary Knowledge Translation (BKT). Given a set of fully labeled categories, BKT aims to translate the visual boundary knowledge learned from the labeled categories to a set of novel categories, each of which is provided with only a few labeled samples. To this end, we propose a Translation Segmentation Network (Trans-Net), which…
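To make the notion of a "visual boundary" concrete, the sketch below derives a boundary map from a binary foreground mask. It is only an illustration and not the paper's method: the function name boundary_from_mask, the 4-neighbourhood rule, and the toy 5x5 mask are all assumptions introduced here.

```python
import numpy as np


def boundary_from_mask(mask: np.ndarray) -> np.ndarray:
    """Mark foreground pixels that touch the background (4-neighbourhood).

    Hypothetical helper for illustration; pixels outside the image are
    treated as background, so foreground on the image edge counts as boundary.
    """
    fg = mask.astype(bool)
    padded = np.pad(fg, 1, mode="constant", constant_values=False)
    # A pixel is interior when it and all four direct neighbours are foreground.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1]   # neighbour above
        & padded[2:, 1:-1]    # neighbour below
        & padded[1:-1, :-2]   # neighbour to the left
        & padded[1:-1, 2:]    # neighbour to the right
    )
    # Boundary = foreground pixels that are not interior.
    return fg & ~interior


# Toy example: a 3x3 foreground square inside a 5x5 image.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
boundary = boundary_from_mask(mask)
```

In this toy case only the centre pixel of the square is interior, so the boundary map is the ring of pixels around it. A boundary map of this kind is category-agnostic, which is the property the BKT task relies on when moving from labeled to novel categories.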
Taxonomy
Topics: Visual Attention and Saliency Detection · Remote-Sensing Image Classification · Advanced Image and Video Retrieval Techniques
