TL;DR
UniSemAlign is a semi-supervised histopathology segmentation framework that injects class-level structure into pixel-wise learning through prototype-level and text-level alignment, improving segmentation performance when labeled data is limited.
Contribution
A dual-modal alignment approach, built on a pathology-pretrained Transformer encoder, that adds prototype-level and text-level alignment branches in a shared embedding space to improve semi-supervised segmentation accuracy.
Findings
Achieves up to 2.6% Dice improvement on GlaS with 10% labels.
Achieves up to 8.6% Dice improvement on CRAG with 10% labels.
Outperforms recent semi-supervised baselines on two datasets.
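The gains above are reported in Dice. As a reminder of what that metric computes, here is a minimal sketch for binary masks. The function name `dice_score` and the empty-mask convention are illustrative choices, not taken from the paper, and the paper's exact evaluation protocol (for example, how scores are averaged across images or classes) is not stated in this summary.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty. Treating this as perfect agreement is a
        # convention assumed here, not something the paper specifies.
        return 1.0
    return 2.0 * intersection / denom


# Two predicted foreground pixels, one of which overlaps the single
# ground-truth foreground pixel: 2 * 1 / (2 + 1).
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
score = dice_score(pred, target)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels.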
Abstract
Semi-supervised semantic segmentation in computational pathology remains challenging due to scarce pixel-level annotations and unreliable pseudo-label supervision. We propose UniSemAlign, a dual-modal semantic alignment framework that enhances visual segmentation by injecting explicit class-level structure into pixel-wise learning. Built upon a pathology-pretrained Transformer encoder, UniSemAlign introduces complementary prototype-level and text-level alignment branches in a shared embedding space, providing structured guidance that reduces class ambiguity and stabilizes pseudo-label refinement. The aligned representations are fused with visual predictions to generate more reliable supervision for unlabeled histopathology images. The framework is trained end-to-end with supervised segmentation, cross-view consistency, and cross-modal alignment objectives. Extensive experiments on the…
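To make the fusion idea in the abstract more concrete, the following is a rough sketch and not the authors' implementation. It assumes that alignment is scored by cosine similarity between pixel embeddings and per-class prototype or text embeddings, that the three sources are combined with fixed convex weights, and that pseudo-labels are kept only above a confidence threshold. None of these details (the similarity function, `temperature`, `weights`, `threshold`) are given in the abstract, and every name below is hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def fused_pseudo_labels(visual_logits, pixel_emb, prototypes, text_emb,
                        temperature=0.1, weights=(0.5, 0.25, 0.25),
                        threshold=0.9):
    """Fuse visual, prototype, and text predictions into pseudo-labels.

    visual_logits: (N, C) segmentation-head logits for N pixels.
    pixel_emb:     (N, D) pixel embeddings in the shared space.
    prototypes:    (C, D) one visual prototype per class (assumed form).
    text_emb:      (C, D) one text embedding per class (assumed form).
    weights:       assumed convex weights (visual, prototype, text).
    """
    pix = l2_normalize(pixel_emb)
    p_visual = softmax(visual_logits)
    # Cosine similarity to each class, sharpened by a temperature.
    p_proto = softmax(pix @ l2_normalize(prototypes).T / temperature)
    p_text = softmax(pix @ l2_normalize(text_emb).T / temperature)

    w_v, w_p, w_t = weights
    fused = w_v * p_visual + w_p * p_proto + w_t * p_text

    labels = fused.argmax(axis=1)
    # Keep only confident pixels as supervision for unlabeled images.
    mask = fused.max(axis=1) >= threshold
    return fused, labels, mask


# Toy example: 2 pixels, 2 classes, 2-dimensional embeddings.
class_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
pixel_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
visual_logits = np.array([[5.0, 0.0], [0.0, 5.0]])
fused, labels, mask = fused_pseudo_labels(
    visual_logits, pixel_emb, class_emb, class_emb
)
```

Because the weights sum to one, each row of `fused` remains a probability distribution, so the threshold can be read as a confidence level. In an actual training loop the masked labels would feed the loss on unlabeled images alongside the supervised, consistency, and alignment objectives the abstract lists.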
