Multi-dimensional Fusion and Consistency for Semi-supervised Medical Image Segmentation
Yixing Lu, Zhaoxin Fan, Min Xu

TL;DR
This paper presents a semi-supervised medical image segmentation framework that fuses ViT and CNN architectures with multi-scale text-aware features and uses multi-axis consistency to generate robust pseudo labels. The authors report strong performance on several widely used datasets.
Contribution
The paper introduces a novel multi-scale text-aware ViT-CNN fusion scheme and a multi-axis consistency framework for improved semi-supervised medical image segmentation.
Findings
Effective fusion of ViT and CNN improves segmentation accuracy.
Multi-axis consistency enhances pseudo-label robustness.
Experiments on several widely used datasets support the effectiveness of the approach.
Abstract
In this paper, we introduce a novel semi-supervised learning framework tailored for medical image segmentation. Central to our approach is the innovative Multi-scale Text-aware ViT-CNN Fusion scheme. This scheme adeptly combines the strengths of both ViTs and CNNs, capitalizing on the unique advantages of both architectures as well as the complementary information in vision-language modalities. Further enriching our framework, we propose the Multi-Axis Consistency framework for generating robust pseudo labels, thereby enhancing the semi-supervised learning process. Our extensive experiments on several widely used datasets unequivocally demonstrate the efficacy of our approach.
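The abstract does not say how Multi-Axis Consistency is computed, so the following is only a generic sketch of consistency-based pseudo-labeling, not the authors' method. It assumes two sets of per-pixel class probabilities for the same unlabeled image (for instance from two branches or two augmented views) and keeps a pseudo label only where both predictions agree and their averaged confidence clears a threshold. The function name, the `IGNORE` marker, and the 0.8 threshold are hypothetical choices made for illustration.

```python
IGNORE = -1  # hypothetical marker: pixel is excluded from the unsupervised loss


def argmax(values):
    """Index of the largest entry in a list of numbers."""
    return max(range(len(values)), key=values.__getitem__)


def agreement_pseudo_labels(probs_a, probs_b, threshold=0.8):
    """Generic agreement filter (an assumption, not the paper's algorithm).

    probs_a, probs_b: one list of class probabilities per pixel,
    produced by two different predictions for the same image.
    Returns one class index per pixel, or IGNORE where the two
    predictions disagree or the averaged confidence is below threshold.
    """
    labels = []
    for pa, pb in zip(probs_a, probs_b):
        mean = [(x + y) / 2.0 for x, y in zip(pa, pb)]
        cls = argmax(mean)
        agree = argmax(pa) == argmax(pb)
        labels.append(cls if agree and mean[cls] >= threshold else IGNORE)
    return labels


# Three pixels, two classes (toy values).
probs_a = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4]]
probs_b = [[0.95, 0.05], [0.7, 0.3], [0.7, 0.3]]
pseudo = agreement_pseudo_labels(probs_a, probs_b)
print(pseudo)
```

In this toy input only the first pixel is kept: the second is dropped because the two predictions disagree, and the third because the averaged confidence falls below the threshold.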
Taxonomy
Topics: AI in cancer detection · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
