Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation
Qiming Huang, Hao Ai, Jianbo Jiao

TL;DR
This paper introduces a structure-aware feature rectification method using region adjacency graphs to improve open-vocabulary semantic segmentation by refining CLIP features with local structural priors.
Contribution
It proposes a novel approach that incorporates instance-specific local structural priors via RAGs to enhance CLIP-based segmentation without additional training.
Findings
Reduces segmentation noise and improves accuracy.
Enhances region-level consistency in segmentation.
Achieves state-of-the-art results on multiple benchmarks.
Abstract
Benefiting from the inductive biases learned from large-scale datasets, open-vocabulary semantic segmentation (OVSS) leverages the power of vision-language models, such as CLIP, to achieve remarkable progress without requiring task-specific training. However, due to CLIP's pre-training nature on image-text pairs, it tends to focus on global semantic alignment, resulting in suboptimal performance when associating fine-grained visual regions with text. This leads to noisy and inconsistent predictions, particularly in local areas. We attribute this to a dispersed bias stemming from its contrastive training paradigm, which is difficult to alleviate using CLIP features alone. To address this, we propose a structure-aware feature rectification approach that incorporates instance-specific priors derived directly from the image. Specifically, we construct a region adjacency graph (RAG) based on…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Advanced Neural Network Applications · Natural Language Processing Techniques
