Unify the Views: View-Consistent Prototype Learning for Few-Shot Segmentation
Hongli Liu, Yu Wang, Shengjie Zhao

TL;DR
This paper introduces VINE, a unified framework for few-shot segmentation that models structural consistency and view-invariance to improve prototype quality under viewpoint variations.
Contribution
VINE jointly models structural and view-invariant features using a spatial-view graph and foreground discrimination, enhancing prototype robustness in few-shot segmentation.
Findings
VINE outperforms existing methods on multiple benchmarks.
It demonstrates robustness under large viewpoint shifts.
The framework effectively integrates structural and view-invariant features.
Abstract
Few-shot segmentation (FSS) has gained significant attention for its ability to generalize to novel classes with limited supervision, yet remains challenged by structural misalignment and cross-view inconsistency under large appearance or viewpoint variations. This paper tackles these challenges by introducing VINE (View-Informed NEtwork), a unified framework that jointly models structural consistency and foreground discrimination to refine class-specific prototypes. Specifically, VINE introduces a spatial-view graph on backbone features, where the spatial graph captures local geometric topology and the view graph connects features from different perspectives to propagate view-invariant structural semantics. To further alleviate foreground ambiguity, we derive a discriminative prior from the support-query feature discrepancy to capture category-specific contrast, which reweights SAM…
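The spatial-view graph described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the 4-neighbour grid connectivity, the nearest-cosine rule for cross-view edges, the single round of mean aggregation, and all function names (`grid_edges`, `view_edges`, `propagate`) are assumptions chosen only to show how spatial edges and view edges might be combined over feature nodes.

```python
import math

def grid_edges(h, w):
    """4-neighbour edges on an h x w grid; node id = r * w + c (assumed layout)."""
    edges = []
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:
                edges.append((i, i + 1))   # right neighbour
            if r + 1 < h:
                edges.append((i, i + w))   # bottom neighbour
    return edges

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)

def view_edges(feats_a, feats_b, offset):
    """Link each node in view A to its most similar node in view B (assumed rule)."""
    edges = []
    for i, fa in enumerate(feats_a):
        j = max(range(len(feats_b)), key=lambda k: cosine(fa, feats_b[k]))
        edges.append((i, offset + j))
    return edges

def propagate(feats, edges):
    """One round of mean aggregation over undirected edges, self included."""
    neighbours = {i: [i] for i in range(len(feats))}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dim = len(feats[0])
    return [
        [sum(feats[n][d] for n in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(len(feats))
    ]

# Two toy 2x2 "views" with 2-dimensional features (hypothetical data).
view_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
view_b = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
feats = view_a + view_b
n = len(view_a)

# Spatial edges inside each view, plus view edges across the two views.
spatial = grid_edges(2, 2) + [(i + n, j + n) for i, j in grid_edges(2, 2)]
cross = view_edges(view_a, view_b, offset=n)
updated = propagate(feats, spatial + cross)
```

In this toy setup, each node averages over its grid neighbours and its cross-view match, so features that agree across views reinforce each other. A real model would presumably use learned edge weights and backbone features rather than fixed averaging.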
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
