Enhanced Masked Image Modeling to Avoid Model Collapse on Multi-modal MRI Datasets
Linxuan Han, Sa Xiao, Zimeng Li, Haidong Li, Xiuchao Zhao, Yeqing Han, Fumin Guo, Xin Zhou

TL;DR
This paper introduces an enhanced masked image modeling approach with novel strategies to prevent model collapse in multi-modal MRI datasets, leading to more stable training and improved downstream task performance.
Contribution
It proposes the hybrid mask pattern and pyramid Barlow twins modules to address complete and dimensional collapse in self-supervised MRI modeling.
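For readers unfamiliar with the Barlow Twins objective that the pyramid module builds on, the following is a minimal NumPy sketch of the standard Barlow Twins loss, not the paper's pyramid variant. The function name `barlow_twins_loss`, the parameter `lambda_offdiag`, and its default value are illustrative assumptions. The loss pushes the cross-correlation matrix of two views' embeddings toward the identity, which penalizes redundant (dimensionally collapsed) features.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-12):
    """Standard Barlow Twins loss for two (batch, dim) embedding arrays.

    Illustrative sketch only: names and the default weight are assumptions,
    and this is not the pyramid variant proposed in the paper.
    """
    n = z_a.shape[0]
    # Standardize each feature dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: diagonal entries should equal 1.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: off-diagonal entries should equal 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return on_diag + lambda_offdiag * off_diag

rng = np.random.default_rng(0)
z = rng.standard_normal((256, 8))
# Same embeddings in both views, features roughly decorrelated.
loss_decorrelated = barlow_twins_loss(z, z)
# Dimensionally collapsed embeddings: every feature is a copy of the first.
z_collapsed = np.repeat(z[:, :1], 8, axis=1)
loss_collapsed = barlow_twins_loss(z_collapsed, z_collapsed)
```

In this toy comparison the collapsed embeddings incur a larger loss than the decorrelated ones, because every off-diagonal correlation equals 1 and is penalized by the redundancy-reduction term.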
Findings
Prevents complete and dimensional collapse in multi-modal MRI MIM.
Improves stability and performance of downstream segmentation and classification.
Validated on three multi-modal MRI datasets.
Abstract
Multi-modal magnetic resonance imaging (MRI) provides information about lesions from different views for computer-aided diagnosis. Deep learning algorithms are well suited to identifying specific anatomical structures, segmenting lesions, and classifying diseases. Manual labels are limited because they are expensive to obtain, which hinders further improvement in accuracy. Self-supervised learning, particularly masked image modeling (MIM), has shown promise in utilizing unlabeled data. However, we observe model collapse when applying MIM to multi-modal MRI datasets. Downstream task performance does not improve when starting from the collapsed model. To solve model collapse, we analyze and address two types of collapse: complete collapse and dimensional collapse. We find complete collapse occurs because the collapsed loss value in multi-modal MRI datasets falls below the normally converged loss value.…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis · Machine Learning and ELM
Methods: Mutual Information Machine/Mask Image Modeling · Barlow Twins
