RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
Haodi He, Yuhui Yuan, Xiangyu Yue, Han Hu

TL;DR
RankSeg introduces an adaptive segmentation framework that decomposes the task into multi-label classification and rank-adaptive pixel classification, improving performance across various segmentation tasks by focusing on relevant label subsets.
Contribution
The paper proposes a novel two-stage segmentation approach that dynamically selects relevant labels based on confidence scores, enhancing scalability and accuracy in large-category scenarios.
Findings
Achieved +0.8% on ADE20K panoptic segmentation
Achieved +0.7% on YouTubeVIS 2019 video instance segmentation
Achieved +0.7% on VSPW video semantic segmentation
Abstract
The segmentation task has traditionally been formulated as a complete-label pixel classification task: predict a class for each pixel from a fixed number of predefined semantic categories shared by all images or videos. Yet, following this formulation, standard architectures will inevitably encounter various challenges under more realistic settings where the number of categories scales up (e.g., beyond the level of 1k). On the other hand, in a typical image or video, only a few categories, i.e., a small subset of the complete label set, are present. Motivated by this intuition, in this paper, we propose to decompose segmentation into two sub-problems: (i) image-level or video-level multi-label classification and (ii) pixel-level rank-adaptive selected-label classification. Given an input image or video, our framework first conducts multi-label classification over the complete label set, then…
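The two-stage decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the general idea of ranking categories by image-level confidence, keeping a subset, and classifying each pixel over that subset alone. The function names, the fixed `k`, and the toy scores are all hypothetical, and the rank-adaptive part of the method is not modeled here. Plain Python lists are used so the example has no dependencies.

```python
from typing import List, Sequence

def select_top_k_labels(image_scores: Sequence[float], k: int) -> List[int]:
    """Stage 1 (assumed form): rank categories by image-level confidence
    and keep the indices of the k highest-scoring ones."""
    ranked = sorted(range(len(image_scores)),
                    key=lambda c: image_scores[c], reverse=True)
    return ranked[:k]

def classify_pixels(pixel_logits: Sequence[Sequence[Sequence[float]]],
                    selected: Sequence[int]) -> List[List[int]]:
    """Stage 2 (assumed form): for each pixel, pick the best category
    among the selected labels only. pixel_logits is indexed [class][y][x];
    the returned map holds original category indices."""
    height = len(pixel_logits[0])
    width = len(pixel_logits[0][0])
    return [
        [max(selected, key=lambda c: pixel_logits[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]

# Toy example: 4 categories, a 2x2 image (all numbers are made up).
image_scores = [0.9, 0.1, 0.7, 0.05]
pixel_logits = [
    [[2.0, 0.1], [0.3, 1.5]],  # category 0
    [[0.5, 3.0], [0.2, 0.1]],  # category 1
    [[1.0, 0.4], [0.9, 0.2]],  # category 2
    [[0.0, 0.0], [5.0, 0.0]],  # category 3
]

selected = select_top_k_labels(image_scores, k=2)
label_map = classify_pixels(pixel_logits, selected)
print(selected)   # [0, 2]
print(label_map)  # [[0, 2], [2, 0]]
```

In this toy input, categories 1 and 3 have the highest logits at two of the pixels, but they are excluded because their image-level scores are low. That is the intended effect of restricting pixel classification to the selected subset.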
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
