TL;DR
This paper introduces a multi-encoder U-Net for 3D prostate lesion segmentation from biparametric MRI (bp-MRI). It combines text-guided alignment, heatmap calibration, and localized boundary refinement to improve segmentation accuracy.
Contribution
It proposes a multi-encoder U-Net trained with two additional losses (a text-image alignment loss and a heatmap calibration loss) and a final-stage confidence-gated refiner, and reports state-of-the-art results in prostate lesion segmentation.
Findings
Outperforms previous methods on the PI-CAI dataset
Enhances multi-modal fusion with text-guided alignment
Improves localized boundary accuracy with confidence gating
Abstract
Automated 3D segmentation of prostate lesions from biparametric MRI (bp-MRI) is essential for reliable algorithmic analysis, but achieving high precision remains challenging. Volumetric methods must combine multiple modalities while ensuring anatomical consistency, but current models struggle to integrate cross-modal information reliably. While vision-language models (VLMs) are replacing the currently used architectural designs, they still lack the fine-grained, lesion-level semantics required for effective localized guidance. To address these limitations, we propose a new multi-encoder U-Net architecture incorporating three key innovations: (1) an alignment loss that enhances foreground text-image similarity to inject lesion semantics; (2) a heatmap loss that calibrates the similarity map and suppresses spurious background activations; and (3) a final-stage, confidence-gated multi-head…
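The first two loss terms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: cosine similarity between per-voxel features and a single text embedding, an alignment term defined as one minus the mean foreground similarity, and a heatmap term defined as binary cross-entropy between the temperature-scaled similarity map and the lesion mask. The function names and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def cosine_similarity_map(voxel_feats, text_emb, eps=1e-8):
    """Cosine similarity between each voxel feature and one text embedding.

    voxel_feats: array of shape (..., C), one C-dim feature per voxel.
    text_emb:    array of shape (C,).
    Returns an array of shape (...) with values in [-1, 1].
    """
    v = voxel_feats / (np.linalg.norm(voxel_feats, axis=-1, keepdims=True) + eps)
    t = text_emb / (np.linalg.norm(text_emb) + eps)
    return v @ t

def alignment_loss(sim, mask, eps=1e-8):
    """Assumed form: 1 - mean similarity over foreground (lesion) voxels."""
    fg = mask.astype(sim.dtype)
    return float(1.0 - (sim * fg).sum() / (fg.sum() + eps))

def heatmap_loss(sim, mask, temperature=0.1, eps=1e-7):
    """Assumed form: BCE between sigmoid(sim / temperature) and the mask.

    Penalizes high similarity on background voxels as well as low
    similarity on foreground voxels.
    """
    prob = 1.0 / (1.0 + np.exp(-sim / temperature))
    prob = np.clip(prob, eps, 1.0 - eps)
    target = mask.astype(sim.dtype)
    bce = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float(bce.mean())

# Toy volume: 2x2x2 voxels with 4-dim features (random data for illustration only).
rng = np.random.default_rng(0)
text_emb = rng.normal(size=4)
mask = np.zeros((2, 2, 2), dtype=bool)
mask[0, 0, :] = True  # two "lesion" voxels

# Foreground features point along the text embedding, background features against it.
feats = np.where(mask[..., None], text_emb, -text_emb)
sim = cosine_similarity_map(feats, text_emb)

total = alignment_loss(sim, mask) + heatmap_loss(sim, mask)
```

In this toy setup both terms are close to zero because the similarity map already matches the mask. Swapping the mask for its complement drives both terms up, which is the behavior a foreground-alignment term and a background-suppression term are meant to have.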
