OmniMRI: A Unified Vision-Language Foundation Model for Generalist MRI Interpretation
Xingxin He, Aurora Rofena, Ruimin Feng, Haozhe Liao, Zhaoye Zhou, Albert Jang, and Fang Liu

TL;DR
OmniMRI is a comprehensive vision-language foundation model that unifies multiple MRI interpretation tasks within a single architecture, trained on extensive heterogeneous data to enhance generalizability and clinical utility.
Contribution
The paper introduces OmniMRI, a unified model that integrates vision and language for end-to-end MRI analysis. It is trained on a heterogeneous corpus curated from 60 public datasets (over 220,000 MRI volumes and 19 million MRI slices) so that one model can serve multiple clinical tasks.
Findings
Successfully performs MRI reconstruction, segmentation, detection, and report generation.
Demonstrates strong cross-task generalization and instruction-following capabilities.
Consolidates multiple MRI workflows into a single scalable framework.
Abstract
Magnetic Resonance Imaging (MRI) is indispensable in clinical practice but remains constrained by fragmented, multi-stage workflows encompassing acquisition, reconstruction, segmentation, detection, diagnosis, and reporting. While deep learning has achieved progress in individual tasks, existing approaches are often anatomy- or application-specific and lack generalizability across diverse clinical settings. Moreover, current pipelines rarely integrate imaging data with complementary language information that radiologists rely on in routine practice. Here, we introduce OmniMRI, a unified vision-language foundation model designed to generalize across the entire MRI workflow. OmniMRI is trained on a large-scale, heterogeneous corpus curated from 60 public datasets, over 220,000 MRI volumes and 19 million MRI slices, incorporating image-only data, paired vision-text data, and…
