Multimodal Information Interaction for Medical Image Segmentation

Xinxin Fan; Lin Liu; Haoran Zhang

arXiv:2404.16371 · cs.CV · April 26, 2024 · 2 cites

TL;DR

This paper introduces MicFormer, a multimodal transformer architecture that fuses features across medical imaging modalities and reports improved segmentation accuracy on multimodal medical images.

Contribution

The paper proposes MicFormer, a dual-stream Cross Transformer combined with a deformable Transformer architecture, to improve multimodal feature integration in medical image segmentation.

Findings

01 Achieved a DICE score of 85.57 on whole-heart segmentation.

02 Outperformed existing methods by margins of 2.83 in DICE and 4.23 in MIoU.

03 Demonstrated effective multimodal feature communication and fusion.
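
For readers unfamiliar with the two metrics quoted in the findings, the sketch below shows how Dice and mean IoU are commonly computed from integer label maps. It is an illustration only, not the paper's evaluation code: the function name `dice_and_iou`, the choice to skip class 0 as background, and the toy arrays are all assumptions. The reported figures (for example 85.57) are presumably these ratios expressed on a 0 to 100 scale.

```python
import numpy as np

def dice_and_iou(pred, target, num_classes, eps=1e-7):
    """Mean Dice and mean IoU over foreground classes of two label maps."""
    dice_scores, iou_scores = [], []
    # Assumption: class 0 is background and is excluded from the average.
    for c in range(1, num_classes):
        p = pred == c
        t = target == c
        inter = np.logical_and(p, t).sum()
        p_sum, t_sum = p.sum(), t.sum()
        union = p_sum + t_sum - inter
        # Dice = 2|P∩T| / (|P| + |T|), IoU = |P∩T| / |P∪T|
        dice_scores.append((2 * inter + eps) / (p_sum + t_sum + eps))
        iou_scores.append((inter + eps) / (union + eps))
    return float(np.mean(dice_scores)), float(np.mean(iou_scores))

# Toy 3x3 label maps with classes {0, 1, 2} (hypothetical data).
pred = np.array([[0, 1, 1],
                 [0, 2, 2],
                 [0, 0, 2]])
target = np.array([[0, 1, 1],
                   [0, 1, 2],
                   [0, 0, 2]])

mean_dice, mean_iou = dice_and_iou(pred, target, num_classes=3)
print(f"Dice: {100 * mean_dice:.2f}  mIoU: {100 * mean_iou:.2f}")
```

Dice counts the overlap twice in the numerator, so for the same prediction it is never lower than IoU, which is why margins in the two metrics are reported separately.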

Abstract

The use of multimodal data in assisted diagnosis and segmentation has emerged as a prominent area of interest in current research. However, one of the primary challenges is how to effectively fuse multimodal features. Most of the current approaches focus on the integration of multimodal features while ignoring the correlation and consistency between different modal features, leading to the inclusion of potentially irrelevant information. To address this issue, we introduce an innovative Multimodal Information Cross Transformer (MicFormer), which employs a dual-stream architecture to simultaneously extract features from each modality. Leveraging the Cross Transformer, it queries features from one modality and retrieves corresponding responses from another, facilitating effective communication between bimodal features. Additionally, we incorporate a deformable Transformer architecture to…
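
The query-and-retrieve idea described in the abstract can be illustrated with plain scaled dot-product cross-attention, where queries come from one modality and keys and values from the other. This is a minimal NumPy sketch under stated assumptions, not MicFormer itself: the names `cross_attention`, `make_weights`, `tokens_a`, and `tokens_b`, the token counts, and the random projections are hypothetical, and the deformable Transformer component is omitted entirely.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Queries from one modality attend over keys/values from the other."""
    q = query_feats @ w_q            # (n_query, dim)
    k = context_feats @ w_k          # (n_context, dim)
    v = context_feats @ w_v          # (n_context, dim)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)   # each query row sums to 1
    return weights @ v, weights

def make_weights(rng, dim):
    # Hypothetical random projections standing in for learned parameters.
    return tuple(rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

rng = np.random.default_rng(0)
dim = 8
tokens_a = rng.normal(size=(16, dim))   # features from modality A (assumed shape)
tokens_b = rng.normal(size=(12, dim))   # features from modality B (assumed shape)

# Dual-stream use: each modality queries the other with its own projections.
a_from_b, attn_ab = cross_attention(tokens_a, tokens_b, *make_weights(rng, dim))
b_from_a, attn_ba = cross_attention(tokens_b, tokens_a, *make_weights(rng, dim))

print(a_from_b.shape, b_from_a.shape)
```

Each output row is a weighted mix of the other modality's value vectors, so every stream keeps its own token count while receiving information from its counterpart.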

Code & Models

Repositories

fxxjuses/micformer (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: Attention Is All You Need · Dropout · Dense Connections · Label Smoothing · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Linear Layer · Byte Pair Encoding · Absolute Position Encodings