A Dual-Mode ViT-Conditioned Diffusion Framework with an Adaptive Conditioning Bridge for Breast Cancer Segmentation
Prateek Singh, Moumita Dholey, P.K. Vinod

TL;DR
This paper introduces a dual-mode diffusion framework with an adaptive conditioning bridge and ViT encoder for improved breast cancer lesion segmentation in ultrasound images, achieving state-of-the-art results.
Contribution
It proposes a novel conditional diffusion model with an adaptive multi-scale fusion mechanism and a topological loss for accurate, anatomically consistent segmentation.
Findings
Achieved Dice scores of 0.96, 0.90, and 0.97 on three public datasets.
Validated the effectiveness of each model component through ablation studies.
Produced segmentation results that are both accurate and anatomically plausible.
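For readers unfamiliar with the metric reported above, the Dice score measures the overlap between a predicted mask and a ground-truth mask as 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal NumPy illustration of that standard definition, not the authors' evaluation code. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks (hypothetical helper).

    Assumption: two empty masks count as a perfect match (1.0).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    # Number of pixels marked as lesion in both masks.
    intersection = np.logical_and(pred, target).sum()
    # Sum of the lesion pixel counts of the two masks.
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)

# Toy 2x2 example: one shared lesion pixel, three lesion pixels in total.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_score(pred_mask, true_mask)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no lesion pixels.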
Abstract
In breast ultrasound images, precise lesion segmentation is essential for early diagnosis; however, low contrast, speckle noise, and unclear boundaries make this difficult. Although deep learning models have shown promise, standard convolutional architectures often fail to capture enough global context, resulting in anatomically inconsistent segmentations. To overcome these drawbacks, we propose a flexible, conditional Denoising Diffusion Model that combines an enhanced UNet-based generative decoder with a Vision Transformer (ViT) encoder for global feature extraction. We introduce three primary innovations: 1) an Adaptive Conditioning Bridge (ACB) for efficient, multi-scale fusion of semantic features; 2) a novel Topological Denoising Consistency (TDC) loss component that regularizes training by penalizing structural inconsistencies during denoising;…
Taxonomy
Topics: AI in cancer detection · Generative Adversarial Networks and Image Synthesis · Ultrasound Imaging and Elastography
