MedMIX: Modality-Internal Expert Fusion for Multimodal Medical Diagnosis
Seungik Cho, Anqi Li, Wei Qiu

TL;DR
MedMIX is a multimodal medical prediction framework that combines intra-modality expert fusion, learned inter-modality fusion, and training-only large-small model collaboration to stay robust to missing modalities and sample-specific variation in modality contributions.
Contribution
Introduces MedMIX, a novel framework unifying intra-modality expert fusion, learned inter-modality fusion, and large-small model collaboration for improved medical prediction.
Findings
Consistently strong performance across three heterogeneous benchmarks (OpenI, MIMIC-IV-MM, and MMIST-ccRCC).
Robust under controlled missing-modality perturbations.
Maintains robustness under cross-cohort shift on MIMIC-III.
Abstract
Multimodal clinical prediction faces three challenges: multiple foundation models (FMs) with complementary strengths per modality, pervasive missing modalities at training and test time, and sample-specific variation in modality contributions. We introduce MedMIX, a multimodal framework that combines intra-modality expert fusion, learned inter-modality fusion, and training-only large-small model collaboration for robust medical prediction under incomplete modalities. Within each modality, MedMIX aggregates complementary embeddings from multiple small expert models; across modalities, it performs learned fusion over available modalities; and during training, it leverages large teacher models to improve deployed representations without additional inference cost. Across three heterogeneous benchmarks (OpenI, MIMIC-IV-MM, and MMIST-ccRCC), MedMIX achieves consistently strong performance…
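The two fusion stages described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes softmax gating over expert embeddings within a modality, and a softmax restricted to the available modalities across modalities. The function names, the fixed gate logits (which would be learned in practice), and the toy vectors are all hypothetical. The training-only teacher collaboration is not covered here.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(vectors, weights):
    # Convex combination of equal-length vectors.
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]

def fuse_experts(expert_embeddings, expert_logits):
    # Intra-modality step (assumed form): gate several expert embeddings
    # of the same modality into a single vector.
    return weighted_sum(expert_embeddings, softmax(expert_logits))

def fuse_modalities(modality_embeddings, modality_logits):
    # Inter-modality step (assumed form): normalise weights only over
    # modalities that are present, so a missing one (None) gets no weight.
    present = [name for name, emb in modality_embeddings.items() if emb is not None]
    if not present:
        raise ValueError("at least one modality must be available")
    weights = softmax([modality_logits[name] for name in present])
    fused = weighted_sum([modality_embeddings[name] for name in present], weights)
    return fused, dict(zip(present, weights))

# Toy example: two image experts with equal gate logits, text modality missing.
image = fuse_experts([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
fused, weights = fuse_modalities(
    {"image": image, "text": None},
    {"image": 0.3, "text": 1.2},
)
```

In this toy case the missing text modality is excluded before normalisation, so the image modality carries all of the fusion weight even though its logit is lower.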
