Multimodal Fusion at Three Tiers: Physics-Driven Data Generation and Vision-Language Guidance for Brain Tumor Segmentation
Mingda Zhang

TL;DR
This paper introduces a three-tier multimodal fusion architecture for brain tumor segmentation that integrates physical modeling, Transformer-based feature fusion, and semantic guidance from GPT-4V, achieving state-of-the-art accuracy on the BraTS 2020, 2021, and 2023 datasets.
Contribution
A three-tier fusion framework that combines physics-based data augmentation, multimodal feature fusion, and semantic guidance to improve brain tumor segmentation accuracy and boundary localization.
Findings
Achieved Dice scores of 0.8665, 0.9014, and 0.8912 on the BraTS 2020, 2021, and 2023 datasets, respectively.
Reduced the Hausdorff distance by an average of 6.57 mm compared to the baseline.
Validated effectiveness across multiple datasets and modalities.
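For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and a symmetric Hausdorff distance can be computed for binary masks. It is not the paper's evaluation code: the function names, the `spacing` parameter, and the use of all foreground voxels rather than surface voxels are assumptions made for illustration. The summary does not state which Hausdorff variant (for example, a percentile-based one) the paper uses.

```python
import numpy as np

def dice_score(pred, target, eps=1e-8):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def hausdorff_distance(pred, target, spacing=1.0):
    """Symmetric Hausdorff distance between the foreground voxels of two masks.

    Simplification (assumption): uses every foreground voxel, not only the
    surface, and a brute-force pairwise distance matrix, so it is only
    suitable for small masks. `spacing` is a hypothetical voxel size in mm.
    """
    p = np.argwhere(np.asarray(pred, dtype=bool)) * spacing
    t = np.argwhere(np.asarray(target, dtype=bool)) * spacing
    if len(p) == 0 or len(t) == 0:
        return float("inf")
    # Pairwise Euclidean distances, shape (len(p), len(t)).
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    # Largest nearest-neighbour distance in either direction.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Toy example: two 4x4 squares offset by one pixel along each axis.
pred = np.zeros((8, 8), dtype=int)
target = np.zeros((8, 8), dtype=int)
pred[2:6, 2:6] = 1
target[3:7, 3:7] = 1
dice = dice_score(pred, target)
hd = hausdorff_distance(pred, target)
```

A higher Dice score indicates better overlap, while a lower Hausdorff distance indicates that the worst-case boundary error is smaller, which is why the two are reported together.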
Abstract
Accurate brain tumor segmentation is crucial for neuro-oncology diagnosis and treatment planning. Deep learning methods have made significant progress, but automatic segmentation still faces challenges, including tumor morphological heterogeneity and complex three-dimensional spatial relationships. This paper proposes a three-tier fusion architecture that achieves precise brain tumor segmentation. The method processes information progressively at the pixel, feature, and semantic levels. At the pixel level, physical modeling extends magnetic resonance imaging (MRI) to multimodal data, including simulated ultrasound (US) and synthetic computed tomography (CT). At the feature level, the method performs Transformer-based cross-modal feature fusion through multi-teacher collaborative distillation, integrating three expert teachers (MRI, US, CT). At the semantic level, clinical textual knowledge…
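To make the feature-level idea concrete, here is a minimal sketch of a generic multi-teacher distillation loss, in which a student is trained to match a weighted mixture of the softened predictions of several teachers. This is an assumption-laden illustration, not the paper's formulation: the abstract excerpt does not specify the loss, the teacher weighting, or the temperature, so the uniform weights, the KL-divergence objective, `temperature=2.0`, and the function names below are all hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_teacher_kd_loss(student_logits, teacher_logits, weights=None,
                          temperature=2.0):
    """KL(mixture of teachers || student) on temperature-softened outputs.

    teacher_logits: dict mapping a teacher name (e.g. "mri", "us", "ct")
    to an array with the same shape as student_logits.
    weights: optional dict of per-teacher weights summing to 1; uniform
    weighting is assumed when omitted.
    """
    names = sorted(teacher_logits)
    if weights is None:
        weights = {n: 1.0 / len(names) for n in names}
    t = temperature
    # Weighted mixture of the teachers' softened class distributions.
    target = sum(weights[n] * softmax(teacher_logits[n] / t) for n in names)
    log_student = np.log(softmax(student_logits / t) + 1e-12)
    kl = (target * (np.log(target + 1e-12) - log_student)).sum(axis=-1)
    # The t**2 factor is the usual gradient-scale correction in distillation.
    return float(kl.mean() * t * t)

# Toy example: 4 voxels, 3 classes, three hypothetical modality teachers.
rng = np.random.default_rng(0)
student = rng.normal(size=(4, 3))
teachers = {name: rng.normal(size=(4, 3)) for name in ("mri", "us", "ct")}
loss = multi_teacher_kd_loss(student, teachers)
```

The loss is zero when the student already reproduces the teacher mixture and grows as the distributions diverge. In practice such a term would typically be added to a supervised segmentation loss, though how the paper combines them is not stated in this excerpt.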
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Medical Image Segmentation Techniques
