Zero-Shot Semantic Segmentation via Spatial and Multi-Scale Aware Visual Class Embedding
Sungguk Cha, Yooseung Wang

TL;DR
This paper introduces SM-VCENet, a zero-shot semantic segmentation framework that enriches visual class embeddings without language models and reports improved generalization and robustness, including on the newly proposed PASCAL2COCO benchmark.
Contribution
The paper proposes a language-model-free zero-shot segmentation method with multi-scale and spatial attention, and introduces a new benchmark for evaluation.
Findings
Outperforms state-of-the-art zero-shot segmentation methods on PASCAL-5i.
Demonstrates robustness and generalization on the new PASCAL2COCO benchmark.
Enriches class embeddings with multi-scale and spatial visual information.
Abstract
Fully supervised semantic segmentation has brought a paradigm shift in scene understanding. However, the expensive labeling cost remains a challenge. To address this cost, recent studies have proposed language-model-based zero-shot semantic segmentation (L-ZSSS) approaches. In this paper, we point out that L-ZSSS is limited in generalization, which is a virtue of zero-shot learning. To tackle this limitation, we propose a language-model-free zero-shot semantic segmentation framework, the Spatial and Multi-scale aware Visual Class Embedding Network (SM-VCENet). Furthermore, by leveraging a vision-oriented class embedding, SM-VCENet enriches the visual information of the class embedding through multi-scale attention and spatial attention. We also propose a novel benchmark (PASCAL2COCO) for zero-shot semantic segmentation, which provides generalization evaluation by domain adaptation and…
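The abstract describes a class embedding built from visual features rather than from a language model. As a rough illustration of that general idea only, the sketch below pools support-image features under a class mask at several scales and labels query pixels by cosine similarity to the pooled vector. This is not the authors' architecture: the learned multi-scale and spatial attention modules are replaced here by plain masked average pooling, and every function name, the `factors` parameter, and the 0.5 threshold are assumptions made for the example.

```python
import numpy as np


def masked_average_pool(features, mask):
    """Average the (C, H, W) features over pixels weighted by the (H, W) mask."""
    weights = mask.astype(float)
    total = weights.sum()
    if total == 0:
        raise ValueError("mask selects no pixels")
    return (features * weights[None]).sum(axis=(1, 2)) / total


def downsample(array, factor):
    """Average-pool the last two axes by an integer factor."""
    *lead, h, w = array.shape
    h2, w2 = h // factor, w // factor
    trimmed = array[..., : h2 * factor, : w2 * factor]
    blocks = trimmed.reshape(*lead, h2, factor, w2, factor)
    return blocks.mean(axis=(-3, -1))


def multi_scale_class_embedding(features, mask, factors=(1, 2)):
    """Hypothetical stand-in for a multi-scale visual class embedding:
    pool the masked features at each scale, then average the results."""
    embeddings = [
        masked_average_pool(downsample(features, f), downsample(mask.astype(float), f))
        for f in factors
    ]
    return np.mean(embeddings, axis=0)


def cosine_similarity_map(features, embedding, eps=1e-8):
    """Cosine similarity between the class embedding and every pixel feature."""
    c, h, w = features.shape
    flat = features.reshape(c, -1)
    numerator = embedding @ flat
    denominator = np.linalg.norm(embedding) * np.linalg.norm(flat, axis=0) + eps
    return (numerator / denominator).reshape(h, w)


# Toy support image: left half belongs to the class, right half is background.
support_features = np.zeros((2, 4, 4))
support_features[0, :, :2] = 1.0   # class pixels carry feature [1, 0]
support_features[1, :, 2:] = 1.0   # background pixels carry feature [0, 1]
support_mask = np.zeros((4, 4), dtype=bool)
support_mask[:, :2] = True

class_embedding = multi_scale_class_embedding(support_features, support_mask)
# Reuse the support features as a query image for the toy example.
similarity = cosine_similarity_map(support_features, class_embedding)
prediction = similarity > 0.5      # assumed threshold for illustration
```

In this toy setup the predicted mask matches the support mask because the two regions have orthogonal features; real feature maps would come from a trained backbone and the attention modules the paper describes.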
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
Methods: Attentive Walk-Aggregating Graph Neural Network
