ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction

Danhui Chen; Ziquan Liu; Chuxi Yang; Dan Wang; Yan Yan; Yi Xu; Xiangyang Ji

arXiv:2507.15803·cs.CV·July 22, 2025

TL;DR

This paper introduces ConformalSAM, a semi-supervised segmentation framework that uses conformal prediction to calibrate foundation models, improving label reliability and segmentation performance with limited labeled data.

Contribution

ConformalSAM is the first to integrate conformal prediction with foundation models for semi-supervised segmentation, enhancing label quality and model generalization.

Findings

1. Outperforms recent semi-supervised segmentation methods on standard benchmarks
2. Effectively filters unreliable labels using conformal prediction
3. Boosts performance of existing methods as a plug-in

Abstract

Pixel-level vision tasks, such as semantic segmentation, require extensive, high-quality annotated data, which is costly to obtain. Semi-supervised semantic segmentation (SSSS) has emerged as a way to reduce the labeling burden by leveraging both labeled and unlabeled data through self-training techniques. Meanwhile, foundational segmentation models pre-trained on massive data have shown the potential to generalize effectively across domains. This work explores whether a foundational segmentation model can address label scarcity in pixel-level vision tasks by acting as an annotator for unlabeled images. Specifically, we investigate the efficacy of using SEEM, a Segment Anything Model (SAM) variant fine-tuned for textual input, to generate predictive masks for unlabeled data. To address the shortcomings of using SEEM-generated masks as supervision, we propose…
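To make the idea of filtering unreliable labels with conformal prediction concrete, here is a minimal sketch of generic split conformal prediction applied to per-pixel class probabilities. It is an illustration under stated assumptions, not the ConformalSAM procedure itself, whose details are not given on this page. The function names, the `IGNORE_INDEX` value, and the rule of keeping only pixels whose prediction set contains exactly one class are all hypothetical choices made for this example.

```python
import numpy as np

IGNORE_INDEX = 255  # hypothetical "do not supervise" label, chosen for this sketch


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal quantile of the scores 1 - p(true class).

    cal_probs:  (n, K) class probabilities for n labeled calibration pixels.
    cal_labels: (n,) integer ground-truth classes.
    """
    n = cal_labels.shape[0]
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return 1.0  # too few calibration points: every class enters the set
    return float(np.sort(scores)[k - 1])


def filter_pseudo_labels(probs, qhat):
    """Keep a pixel's argmax label only if its prediction set is a single class.

    probs: (H, W, K) class probabilities on an unlabeled image.
    """
    in_set = probs >= (1.0 - qhat)   # classes whose score is at most qhat
    set_size = in_set.sum(axis=-1)
    labels = probs.argmax(axis=-1)
    # A one-element set always contains the argmax class, so it is safe to keep.
    return np.where(set_size == 1, labels, IGNORE_INDEX)
```

In this sketch the threshold is calibrated on pixels from the labeled subset, and pixels whose prediction set is empty or ambiguous are mapped to `IGNORE_INDEX` so that a downstream loss can skip them. A smaller `alpha` yields larger prediction sets and therefore fewer retained pseudo-labels.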
