TL;DR
This paper introduces CineMA, a foundation model trained on 15 million cine cardiac magnetic resonance (CMR) images. It performs multiple cardiac image analysis tasks with high accuracy, efficiency, and fairness, outperforms conventional CNNs, and applies across diverse populations.
Contribution
CineMA is the first multi-task, multi-view transformer-based foundation model for cine cardiac MRI analysis, demonstrating superior performance and generalizability over existing CNN models.
Findings
CineMA outperforms CNNs in ventricular segmentation and ejection fraction estimation.
CineMA retains its performance advantage even when fine-tuned on only half of the fine-tuning data.
CineMA can detect systemic disease-related cardiac changes and predict mortality.
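As background for the ejection fraction finding above, here is a minimal sketch of how ejection fraction is conventionally derived from ventricular volumes: EF = (EDV - ESV) / EDV, expressed as a percentage. This summary does not describe CineMA's own estimation pipeline, so the function name `ejection_fraction` and the example volumes are illustrative assumptions, not taken from the paper.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Return ejection fraction (%) from end-diastolic and end-systolic volumes in mL.

    Hypothetical helper for illustration; not part of the CineMA codebase.
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and the end-diastolic volume")
    # Stroke volume (EDV - ESV) as a fraction of EDV, scaled to percent.
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Illustrative volumes only (not results from the paper).
ef_percent = ejection_fraction(edv_ml=100.0, esv_ml=40.0)
print(f"EF = {ef_percent:.1f}%")
```

In a segmentation-based workflow, the two volumes would typically come from the segmented ventricular cavity at end-diastole and end-systole, which is why segmentation quality and ejection fraction accuracy are usually reported together.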
Abstract
Here we present a versatile foundation model that can perform a range of clinically relevant image analysis tasks, including segmentation, landmark localisation, diagnosis, and prognostication. A multi-view convolution-transformer masked autoencoder, named CineMA, was trained on 15 million cine images from 74,916 subjects. The model was validated on multiple image analysis tasks and compared to existing models on >4,500 images from eight independent datasets with diverse population characteristics, representing the largest benchmark study for cine CMR so far. CineMA consistently outperformed conventional convolutional neural networks (CNNs) in delineating ventricular boundaries and estimating ejection fraction, a key measure of cardiac function. The improved performance was preserved even when the model used only half of the fine-tuning data. CineMA also surpassed CNNs in disease…
