Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness
Qiangqiang Wu, Grace McIlvain, Zhou Yu, Junhao Wen

TL;DR
Pan-FM is a multimodal foundation model pre-trained on imaging from seven organs. It uses saliency-guided masking to improve robustness to missing organs and reduce bias in whole-body medical imaging tasks.
Contribution
Introduces Pan-FM, a unified backbone that handles missing-organ data and uses saliency-guided masking to enhance multi-organ representation learning.
Findings
Outperforms single-organ and multi-organ baselines on disease prediction tasks.
Achieves higher prediction accuracy than these baselines across 13 disease categories.
Demonstrates improved robustness under missing-organ scenarios.
Abstract
Foundation models (FMs) have shown great promise in medical imaging, but most FMs are trained on unimodal data within isolated domains, such as brain MRI alone. Human aging and disease arise through coordinated biological processes across organs, which motivates multimodal FMs that learn whole-body representations. A key challenge, however, is that real-world multimodal biomedical data are often missing not at random, which can reduce power, limit generalizability, and introduce bias. We propose Pan-FM, a pan-organ foundation model pre-trained on imaging from seven organs (Brain, Heart, Adipose, Liver, Kidney, Spleen, and Pancreas) under realistic missing-organ scenarios. Pan-FM uses a unified backbone that handles organ missingness during both training and inference, and is pre-trained with masking-based self-distillation. We find that naive multimodal pre-training leads to…
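The abstract names saliency-guided masking and missing-organ handling but, as excerpted here, does not say how either is implemented. The following is a minimal sketch of one way the two ideas could interact, not the authors' method. It assumes masking is applied at the organ level (the real model may well mask image patches instead), that each organ has a precomputed non-negative saliency score, and that missing organs are simply excluded from the candidates. The function name `saliency_guided_mask`, the `mask_ratio` parameter, and the score values are all hypothetical.

```python
import random

# The seven organs listed in the abstract.
ORGANS = ["brain", "heart", "adipose", "liver", "kidney", "spleen", "pancreas"]


def saliency_guided_mask(saliency, present, mask_ratio=0.5, seed=0):
    """Choose which available organs to mask, favouring high-saliency ones.

    saliency:   dict organ -> non-negative score (hypothetical; the source
                does not state how saliency is computed).
    present:    dict organ -> bool; False means the organ's imaging is missing.
    mask_ratio: fraction of the *available* organs to mask (assumption).
    Returns a set of organ names. Missing organs are never selected, and at
    least one available organ is always left visible.
    """
    rng = random.Random(seed)
    available = [o for o in ORGANS if present.get(o, False)]

    # Number of organs to mask, always leaving at least one visible.
    n_mask = max(0, min(int(len(available) * mask_ratio), len(available) - 1))

    masked = set()
    candidates = list(available)
    for _ in range(n_mask):
        # Weighted sampling without replacement: a higher saliency score
        # gives a higher chance of being masked.
        weights = [saliency[o] + 1e-8 for o in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        masked.add(pick)
        candidates.remove(pick)
    return masked


if __name__ == "__main__":
    # Example subject with spleen and heart imaging missing.
    present = {o: True for o in ORGANS}
    present["spleen"] = False
    present["heart"] = False
    scores = {o: 1.0 for o in ORGANS}
    scores["brain"] = 5.0  # illustrative value only
    print(sorted(saliency_guided_mask(scores, present)))
```

In a masking-based self-distillation setup, the masked organs would be hidden from the student network while a teacher sees the full set of available organs. That pairing is also an assumption here, since the excerpt does not describe the training objective in detail.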
