WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models
Xinjian Wu, Ruisong Zhang, Jie Qin, Shijie Ma, Cheng-Lin Liu

TL;DR
WPS-SAM is a weakly-supervised part segmentation framework built on the Segment Anything Model (SAM). It is trained with only bounding-box or point labels and reaches 68.93% mIoU on PartImageNet.
Contribution
The paper proposes a Weakly-supervised Part Segmentation (WPS) setting and WPS-SAM, an end-to-end method built on SAM that extracts prompt tokens directly from images and segments part regions at the pixel level using only weak supervision.
Findings
WPS-SAM achieves 68.93% mIoU on PartImageNet.
It outperforms fully supervised methods (trained with pixel-level annotations) by about 4% in mIoU.
Training uses only bounding boxes or points as labels, which reduces annotation effort.
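The mIoU figures above average the intersection-over-union between predicted and ground-truth regions across part classes. Below is a minimal sketch of that metric, assuming per-pixel integer class labels and skipping classes that appear in neither map. The paper's exact evaluation protocol (for example, how background is handled) may differ, and `mean_iou` is an illustrative name, not a function from the authors' code.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean IoU over classes present in the prediction or the ground truth."""
    intersections = [0] * num_classes
    unions = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts toward both intersection and union of class p.
                intersections[p] += 1
                unions[p] += 1
            else:
                # Pixel disagrees: counts toward the union of both classes involved.
                unions[p] += 1
                unions[t] += 1
    # Assumption: classes absent from both maps are excluded from the average.
    ious = [i / u for i, u in zip(intersections, unions) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
print(round(mean_iou(pred, target, num_classes=2), 4))  # 0.5833 (IoUs 1/2 and 2/3)
```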
Abstract
Segmenting and recognizing diverse object parts is crucial in computer vision and robotics. Despite significant progress in object segmentation, part-level segmentation remains underexplored due to complex boundaries and scarce annotated data. To address this, we propose a novel Weakly-supervised Part Segmentation (WPS) setting and an approach called WPS-SAM, built on the large-scale pre-trained vision foundation model, Segment Anything Model (SAM). WPS-SAM is an end-to-end framework designed to extract prompt tokens directly from images and perform pixel-level segmentation of part regions. During its training phase, it only uses weakly supervised labels in the form of bounding boxes or points. Extensive experiments demonstrate that, through exploiting the rich knowledge embedded in pre-trained foundation models, WPS-SAM outperforms other segmentation models trained with pixel-level…
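To illustrate the kind of weak labels the abstract refers to, the sketch below derives a bounding box and a single point from a binary part mask. This is only an assumption about how such labels could be produced from existing masks, not the authors' pipeline; `mask_to_box` and `mask_to_point` are hypothetical helpers.

```python
from typing import List, Optional, Sequence, Tuple

Mask = Sequence[Sequence[int]]


def _foreground(mask: Mask) -> List[Tuple[int, int]]:
    """Return (row, col) coordinates of all non-zero pixels."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def mask_to_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Tight box (x_min, y_min, x_max, y_max) around the part, or None if empty."""
    coords = _foreground(mask)
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(cols), min(rows), max(cols), max(rows))


def mask_to_point(mask: Mask) -> Optional[Tuple[int, int]]:
    """One (x, y) point inside the part: the foreground pixel nearest the centroid."""
    coords = _foreground(mask)
    if not coords:
        return None
    mean_r = sum(r for r, _ in coords) / len(coords)
    mean_c = sum(c for _, c in coords) / len(coords)
    # Snapping to a foreground pixel keeps the point inside non-convex parts.
    r, c = min(coords, key=lambda rc: (rc[0] - mean_r) ** 2 + (rc[1] - mean_c) ** 2)
    return (c, r)


part_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
print(mask_to_box(part_mask))    # (1, 1, 2, 2)
print(mask_to_point(part_mask))  # (1, 1)
```

A box or point like this costs far less to annotate than a full pixel mask, which is the annotation saving the weakly-supervised setting targets.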
Taxonomy
Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Natural Language Processing Techniques
