Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt
Hao Li, Dingwen Zhang, Nian Liu, Lechao Cheng, Yalun Dai, Chao Zhang, Xinggang Wang, Junwei Han

TL;DR
This paper introduces an unsupervised pre-training method that uses saliency prompts to improve query-based end-to-end instance segmentation (QEIS) models in low-data regimes, bringing their performance to a level comparable with CNN-based models.
Contribution
The paper presents an unsupervised pre-training approach that turns saliency-derived pseudo masks into prompts for the queries/kernels of QEIS models, improving them when training data is limited.
Findings
Significant performance boost on three datasets
Achieves similar convergence speed to CNN models in low-data regimes
Enhances QEIS models with unsupervised pre-training
Abstract
Recently, inspired by DETR variants, query-based end-to-end instance segmentation (QEIS) methods have outperformed CNN-based models on large-scale datasets. Yet they would lose efficacy when only a small amount of training data is available since it's hard for the crucial queries/kernels to learn localization and shape priors. To this end, this work offers a novel unsupervised pre-training solution for low-data regimes. Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models by giving Saliency Prompt for queries/kernels. Our method contains three parts: 1) Saliency Masks Proposal is responsible for generating pseudo masks from unlabeled images based on the saliency mechanism. 2) Prompt-Kernel Matching transfers pseudo masks into prompts and injects the corresponding localization and shape priors to the best-matched…
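To make the matching idea in part 2 concrete, here is a minimal sketch, not the authors' implementation. It assumes that prompts and kernels are plain vectors of equal length, that "best-matched" means highest cosine similarity, and that injection is a simple addition; all three are illustrative assumptions, and the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def match_prompts_to_kernels(prompts, kernels):
    """For each prompt, return the index of the most similar kernel.

    Assumption: similarity is cosine and matching is a per-prompt argmax.
    """
    return [
        max(range(len(kernels)), key=lambda k: cosine(p, kernels[k]))
        for p in prompts
    ]

def inject_prompts(prompts, kernels):
    """Return new kernels with each prompt added to its best match.

    Assumption: additive injection; the input kernels are not modified.
    """
    updated = [list(k) for k in kernels]
    for p, k in zip(prompts, match_prompts_to_kernels(prompts, kernels)):
        updated[k] = [u + x for u, x in zip(updated[k], p)]
    return updated

# Toy data: three 2-D kernels and two prompts derived from pseudo masks.
kernels = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
prompts = [[0.9, 0.1], [0.1, 2.0]]
matches = match_prompts_to_kernels(prompts, kernels)
updated = inject_prompts(prompts, kernels)
```

In this toy setup the first prompt pairs with the first kernel and the second prompt with the second, while the third kernel receives no prompt and stays unchanged.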
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Dense Connections · Layer Normalization · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Dropout · Softmax · Convolution · Absolute Position Encodings
