MyoNet: Deep Learning-Based Myocardial Strain Quantification from Cine Cardiac MRI
Dayeong An, Andrew Nencka, Patrick Clarysse, Pierre Croisille, Carmen Bergom, El-Sayed Ibrahim

TL;DR
MyoNet is a deep learning tool that improves the accuracy of measuring heart muscle strain from MRI images, outperforming existing methods.
Contribution
MyoNet introduces a novel deep learning network for myocardial strain quantification with superior performance over existing models.
Findings
MyoNet outperformed ResMyoNet in myocardial strain measurement with high SSIM, ICC, and Pearson CC values.
MyoNet demonstrated high consistency with SinMod-derived reference strains for both circumferential and radial strains.
The model's accuracy and efficiency were validated through comprehensive statistical analyses.
Abstract
To develop and assess MyoNet, a deep learning (DL)-based network for measuring myocardial regional function from cine cardiac magnetic resonance (CMR) images, and compare its efficacy with ResMyoNet as an efficient alternative to SinMod-derived reference. MyoNet was tested alongside ResMyoNet on datasets from Dahl salt-sensitive rat models undergoing radiation therapy (RT). Both networks were designed to extract displacement maps from cine images, were specifically optimized for detailed myocardial deformation, employed advanced convolution operations with alternating kernel sizes for spatial and temporal analysis, and robust loss functions. MyoNet demonstrated superior performance in myocardial strain measurement, achieving high consistency with the SinMod-derived reference strains. It outperformed ResMyoNet, achieving higher performance metrics, including SSIM of 0.961 and 0.960, ICC…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsCardiovascular Function and Risk Factors · Cardiac Imaging and Diagnostics · Cardiac Fibrosis and Remodeling
1. Introduction
The significance of regional cardiac function analysis, especially using cardiac resonance (CMR) myocardial strain analysis, has gained increased attention. Myocardial strain allows for a detailed understanding of myocardial contractility compared to conventional global function measures. Traditionally, the evaluation of cardiac function has mainly relied on measuring left ventricular ejection fraction (LVEF), a valuable yet limited tool that primarily provides global information about cardiac function without the ability to detect localized myocardial contractility abnormalities or dysfunctions [1]. Therefore, LVEF can mask underlying subclinical cardiac dysfunction, thus delaying the detection of heart diseases, as in thoracic radiation-induced cardiotoxicity [1,2]. Radiation therapy (RT), essential in cancer treatment, can inadvertently damage the heart, leading to long-term complications including ischemic heart disease, fibrosis, arrhythmias, cardiomyopathy, valvular abnormalities, and pericarditis, particularly in breast and lung cancer treatments [3,4,5,6,7]. These complications often emerge subclinically, making early detection crucial [6,8].
Animal models, such as Dahl salt-sensitive (SS) rats, provide valuable insights into cardiotoxic implications post-RT [9,10], helping understand disease progression and potential interventions. Cardiac magnetic resonance (CMR) has emerged as integral in measuring regional cardiac function, offering non-invasive evaluation without ionizing radiation. CMR sequences, like tagging [11] and CMR feature tracking (CMR-FT) [12], have been developed for this purpose. However, CMR tagging is not widely adopted in clinical practice due to the necessary longer acquisition times and the need for special post-processing techniques [13,14,15], while CMR-FT lacks intramyocardial markers, affecting accuracy in regional strain analysis [16,17]. These challenges are exacerbated in small animal models due to differences in image contrast and magnetic field strengths in pre-clinical and clinical MRI studies [18,19].
In this study, we propose the integration of deep learning (DL) with cine CMR to generate fast, accurate, and reproducible myocardial strain analysis without the need for tagging. Specifically, we use supervised learning to allow DL models to learn from tagged examples and predict strain from cine images. By leveraging the detailed displacement fields from the tagged images, our goal is to design, train, and compare two DL networks, MyoNet and ResMyoNet, to create an accurate, rapid algorithm for generating myocardial strain measurements from routinely acquired cine images. We hypothesize that these networks will allow for effective and efficient estimation of myocardial strain, therefore providing innovative solutions for early detection of subclinical cardiac function in cardiotoxicity.
2. Materials and Methods
2.1. Study Population and Data Preprocessing
This study (Figure 1) was approved by the institutional animal care committee of the Medical College of Wisconsin and used data from 22 Dahl SS rats undergoing RT, yielding 64 short-axis slices, each containing 20 cine and corresponding tagged timeframes of cardiac MRI images, as previously described [20,21]. The initial preprocessing involved a manual segmentation of the left ventricle (LV) in cine images to highlight the LV myocardium by identifying the endocardium and epicardium contours to generate binary images. All images were then cropped to uniform dimensions of 128 × 128.
For tagged images, the SinMod technique (InTag, Lyon, France) [14,22] was utilized to generate x- and y-displacement fields, which served as the reference standard (ground truth) for network training. These displacement fields were aligned with the segmented cine images and rescaled to an intensity range between 32 and 255 for clear differentiation of the image background (set to 0). This scaling was an invertible affine normalization used only to bound network outputs. Predicted values were transformed back to signed displacement in the original units using the inverse affine mapping, and all displacement error metrics and strain calculations were performed after this inverse transform. These displacement fields were also standardized to dimensions of 128 × 128 through cropping to match cine images.
2.2. MyoNet
MyoNet is a novel deep learning network tailored for myocardial strain analysis using spatiotemporal data in CMR. It utilizes the capabilities of DL for processing spatial (in-slice tissue motion) and temporal (tissue motion across consecutive timeframes) dimensions, aiming to improve the accuracy and efficiency of strain analysis.
MyoNet, similar to the conventional Unet [23,24] (Figure 2), includes five encoding and decoding layers, capturing both local features through the down-sampling path (encoder) and larger context through the up-sampling path (decoder).
In the encoding stage, MyoNet employs sequential 3D convolution operations followed by a max pooling operation with alternating kernel sizes (1 × 3 × 3 and 3 × 1 × 1) for distinct spatial and temporal feature capture. Dilation in deeper layers, combined with varied kernel sizes and alternating dilation rates (1 × 2 × 2 and 2 × 1 × 1), expands the receptive field, allowing global feature learning across a wider spatiotemporal extent without compromising resolution. The decoder, symmetrical to the encoder, uses up-sampling and convolutional blocks, enriched by skip connections to preserve high-frequency details. The network concludes with a 1 × 1 × 1 convolution and a sigmoid activation function, normalizing the output [25].
MyoNet is optimized for straightforward and efficient feature extraction, minimizing overfitting and enhancing interpretability. Its architecture focuses on context understanding within the data, crucial for myocardial strain analysis.
2.3. ResMyoNet
ResMyoNet exhibits significant similarities to MyoNet, especially when handling spatiotemporal data (Figure 3), and distinguishes itself with its modified ResUnet architecture [26]. It incorporates residual connections for efficient gradient flow during training. It employs a series of layers with custom residual blocks, each comprising a shortcut path and a main path with spatiotemporal convolutions, followed by batch normalization and ReLU activation. The tanh activation function is applied post-convolution, aiding in gradient propagation and training stability [27,28].
ResMyoNet’s residual connections not only facilitate faster convergence and mitigate the vanishing gradient problem but also contribute to its ability for deeper and more intricate feature extraction.
2.4. Experimental Setup
Our dataset included 64 short-axis slices from 22 scans, each comprising 20 cine images and corresponding SinMod-derived x- and y-displacement fields. The input data were standardized by normalizing to ensure consistency for the network’s learning process. Additionally, data augmentation techniques were applied to enhance diversity and assist in model generalization. The dataset was divided at the scan level (to prevent data leakage between sets) into distinct training (80%), validation (10%), and testing sets (10%). The training set underwent further data augmentation through random flip techniques to enhance the model’s robustness and adaptability. The validation set was crucial for hyperparameter tuning and performance evaluation, while the testing set served as the final model assessment on unseen data.
Model implementation was conducted using PyTorch version 2.0.1 and Python 3.11 on a high-performance computing system equipped with an NVIDIA Quadro GV100 GPU. Training utilized the RMSprop optimizer with an initial learning rate of 1 × 10^−3^. The batch size was set to 4. Both networks were trained for 100 epochs, with model selection based on the best validation loss.
2.5. Optimization and Loss Functions
During the training, a combination of loss functions was utilized to effectively guide the optimization process. First, the mean squared error (MSE) loss was used to penalize large errors and improve accuracy in estimating x- and y-displacement fields. The MSE is defined as follows:
Here and signify the normalized pixel values representing x- and y-displacement fields generated by tagging, while and are the predicted x- and y-displacement fields from our models at the time frame and is the number of frames. Simultaneously, the smooth L1 loss was introduced, functioning akin to the L2 loss for small differences and resembling the L1 loss for larger differences, which is formulated as follows:
Here is the pixel location while is the time frame. This combination fosters a balance that enhances robustness to outliers, maintaining the capacity to underscore significant inconsistencies. Beyond these, a specialized loss was crafted exclusively for this application. This custom loss emphasizes errors where the segmented areas of the x- and y-displacement fields were generated by tagging.
As mentioned, the x- and y-displacement fields were rescaled to pixel values ranging from 32 to 255. After normalization, this range is transformed to 0.125 to 1.
The total loss function was computed as a combination of the three components:
2.6. Statistical Analysis
The model performance of MyoNet and ResMyoNet was evaluated using several metrics. For displacement field assessment, we computed the structural similarity index (SSIM) [29] for image quality, root mean squared error (RMSE), and mean endpoint error (EPE) for vector displacement accuracy. Additionally, strains derived from MyoNet and ResMyoNet were compared with SinMod-derived reference values using the intraclass correlation coefficients (ICC) [30], Pearson correlation coefficient [31], and coefficient of variation (CV). To evaluate the agreement between strains acquired from MyoNet, ResMyoNet, and SinMod, a Bland–Altman plot analysis was also performed [32]. Training accuracy was defined as 1 minus the normalized mean squared error between predicted and target displacement fields.
2.7. Strain Computation
Circumferential (Ecc) and radial (Err) strains were computed from the predicted x- and y-displacement fields using a Lagrangian formulation. For each timeframe, the 2D displacement vector was converted to a deformation gradient , where spatial gradients were calculated on the grid using central finite differences within the myocardium mask. The Green-Lagrange strain tensor was computed as [33]. For segment analysis, pixel-wise strains were averaged within each of the six AHA short-axis segments [34].
3. Results
3.1. Strain Analysis by MyoNet and ResMyoNet
Figure 4 displays the x- and y-displacement fields obtained from SinMod, MyoNet, and ResMyoNet. The accompanying error maps provide insight into the differences between these displacement fields, underscoring the differences and potential implications for the fidelity and accuracy of each approach in evaluating myocardial displacements. Following this, bar plots in Figure 5 were generated to depict the circumferential (Ecc) and radial (Err) strains obtained through three distinct imaging techniques: tagging (gold standard), MyoNet, and ResMyoNet. These measurements were specifically performed for each of the six segments per slice as defined by the American Heart Association (AHA) segmentation model [34]. These bar plots by the three methods (SinMod-derived reference, MyoNet, and ResMyoNet) show remarkably analogous patterns. In addition, a global analysis was conducted to provide a comprehensive assessment of the strain metrics across all segments. Both Ecc and Err for each of six segments per slice underwent statistical analysis using Student’s t-test with Bonferroni correction for multiple comparisons. No segments showed a statistically significant difference after correction (p < 0.05). Without correction, one segment (anterior) in Err from MyoNet and one segment (inferoseptal) in Err from ResMyoNet were statistically different when compared to SinMod-derived values (Table 1).
Bland–Altman plots were constructed (Figure 6) to compare the strain measures between SinMod vs. MyoNet and SinMod vs. ResMyoNet. Bland–Altman analysis demonstrated minimal systematic bias between methods. For Ecc, MyoNet showed a mean bias of −1.9%, while ResMyoNet showed a mean bias of −2.0%. For Err, MyoNet demonstrated a mean bias of 1.0%, while ResMyoNet showed a mean bias of 2.0%. These small biases suggest clinically acceptable agreement between deep learning methods and the SinMod reference. This consistency is especially noteworthy as the data points encompass all segments from the test dataset.
3.2. Model Performance and Computational Efficiency
The performance consistency of both MyoNet and ResMyoNet networks was further assessed through an analysis of accuracy and loss over the span of the training epochs. As demonstrated in Figure 7, while both models underwent training for 100 epochs, ResMyoNet notably reached high and stable accuracy levels along with low and consistent loss values within just a few initial epochs. This rapid stabilization contrasts with MyoNet, which required more epochs to achieve similar performance metrics.
A comparative analysis was conducted to discern the efficiency of the MyoNet and ResMyoNet in the context of training, characterized by evaluating their computation times across 10 and 100 epochs. MyoNet required a computation time of 158 s for 10 epochs, escalating to 1546 s for 100 epochs. In contrast, ResMyoNet requires only 124 s for 10 epochs and 1177 s for 100 epochs. For all measurements, ResMyoNet consistently exhibited shorter computation durations when compared to MyoNet, emphasizing its relative efficiency in the training phase.
3.3. Performance Metrics of MyoNet and ResMyoNet
To assess and compare the performance of the MyoNet and ResMyoNet networks for generating Ecc and Err, we utilized multiple metrics (Table 2). For displacement field accuracy, MyoNet achieved RMSE of 1.06 mm in x and 0.94 mm in y-displacement fields, while ResMyoNet achieved RMSE of 1.24 mm in x and 1.00 mm in y-displacement fields. Examining the SSIM values, MyoNet achieved higher values (0.961, 0.960) compared to ResMyoNet (0.937, 0.934). As for ICC, MyoNet displayed slightly higher than ResMyoNet. For the Pearson CC analysis, both networks showed strong correlations. Lastly, concerning the CV metrics, MyoNet’s values were presented as higher than ResMyoNet’s.
4. Discussion
We demonstrate that the developed MyoNet and ResMyoNet each show unique characteristics and differ in their performance metrics, underscoring the distinct capabilities and strengths of each network.
In contrast to existing methodologies [35], which focused on mice with a small cohort size (n = 8), the proposed algorithm is validated on a larger and more diverse dataset. This enhances the generalizability of the results. Moreover, while the approach in [35] requires manual localization of the LV blood pool center point, it automates this step, thereby reducing potential sources of human error and increasing efficiency. The Siamese architecture’s bidirectional motion learning [36] offers a dual pathway for processing input data, allowing the simultaneous analysis of different frames. This approach ensures that the temporal relationships between consecutive frames are well = established, fostering a more precise capture of myocardial motion dynamics. However, bidirectional motion learning may overlook subtle myocardial motion patterns that do not manifest across both pathways, and this might make the model more sensitive to irregularities or noise in input data sequences.
The use of the 3D convolutional approach (spatiotemporal) over the 2D (spatial) with subsequent 1D (temporal) strategy (2D + 1D) [37] is multifaceted. The 3D convolution inherently integrates spatial and temporal information in a unified manner, capturing the intricate relationships across three dimensions. In contrast, the 2D + 1D approach would first process spatial data and subsequently add the temporal dimension, potentially missing interactions between spatial and temporal features that are crucial for myocardial strain analysis. Importantly, the proposed networks alternate kernel sizes that focus on the spatial and temporal dimensions alternatively. MyoNet and ResMyoNet, with their unique designs, emphasize the proficient extraction of both spatial and temporal features, illustrated by kernel size alternations. Traditional methodologies (non-alternating kernel size) often grapple with the complexity of parsing data that spans both spatial and temporal dimensions. To address this, the proposed networks leverage deep learning’s inherent strengths in processing spatial and temporal information synergistically, aiming to augment the accuracy and efficiency of strain analysis in cardiac imaging. This indicates a dedicated mechanism for assimilating spatial and temporal variations within the input data. Moreover, the convolutional operations employ dilation, expanding the receptive field without compromising spatial resolution. This has the capability to capture extensive spatiotemporal patterns, possibly elucidating the superior accuracy metrics associated with MyoNet and ResMyoNet.
ResMyoNet integrates the potency of residual connections, known for circumventing the vanishing gradient problem, ensuring efficient backpropagation even in deeper architectures [28]. This feature is likely pivotal in facilitating a rapid training convergence for ResMyoNet, a probable explanation for its reduced training computational time. Yet, it is paramount to acknowledge inherent constraints. While MyoNet’s focus on detail improves accuracy, it concurrently demands more computational resources. ResMyoNet, although optimized for efficiency, might present a subtle compromise on intricate detail extraction in certain cases. These tradeoffs underscore the balance between accuracy and efficiency in deep learning models. If optimized accuracy and attention to detail are the priority, then MyoNet is the preferred choice for the user. However, when rapid training and computational efficiency are crucial, ResMyoNet is the more appropriate choice.
The selection of loss functions is crucial to balance accuracy with robustness. The mean squared error (MSE) ensures a foundational penalty for discrepancies in predicted x- and y-displacement fields. However, to address potential outliers such as anomalous displacement fields, inaccurate predictions, and data acquisition inconsistencies, the smooth L1 loss was employed. This loss transitions between penalizing small and large discrepancies, enhancing model robustness. Furthermore, the introduction of a custom loss function focuses on regions of pronounced cardiac strain. By emphasizing errors in segmented displacement areas and setting thresholds for significant displacements, this custom loss function ensures that the networks prioritize myocardial contractility patterns, merging precision with specificity.
Our study has some limitations. First, while there are minimal variations in the imaging parameters between cine and tagging sequences, it is important to note that both sequences capture the same spatial (slice) and temporal (timeframes) dimensions. This can potentially manifest as systematic biases in the strain evaluations determined by our developed models, thereby influencing the direct comparability of results derived from both sequences. Notably, the results illustrated that, in the context of Err strains, MyoNet’s interpretation of the anterior segments and ResMyoNet’s interpretation of the septal segments showed statistical differences when compared to tagging. This divergence might indicate that certain myocardial segments may be more susceptible to variations in the imaging sequences for different networks, or there might be differences between these specific segments that the networks process differently. Addressing such specific inconsistencies is important for the practical implementation of deep learning techniques in myocardial strain analysis. Another limitation is that our dataset was limited to 64 cases, each composed of 20 timeframes. While this dataset has been instrumental for drawing preliminary conclusions, the robustness and generalizability of the findings would undoubtedly benefit from a larger dataset. Another limitation of the proposed algorithms is that they focus on analyzing short-axis images, providing circumferential and radial strains. Future work includes generalizing algorithms to include analysis of long-axis slices. The demonstrated efficacy and insights provided by both MyoNet and ResMyoNet signify a promising step forward for pre-clinical cardiac imaging and strain analysis. Their capabilities, particularly in accurately interpreting spatiotemporal data, could offer researchers a more comprehensive and in-depth understanding of myocardial contractility patterns, potentially improving personalized diagnosis and treatment strategies. MyoNet provides superior quantitative metrics, while ResMyoNet offers a more efficient network. As these networks undergo further refinement and are validated on larger, more diverse datasets, translation to clinical practices becomes increasingly promising, marking a significant advancement in artificial intelligence (AI)-augmented cardiac care.
5. Conclusions
Our study demonstrates the significant advancements MyoNet brings to myocardial strain analysis in cine CMR, offering a notable improvement over the traditional tagging methods. Its innovative DL architecture enables precise and efficient analysis of myocardial strain directly from cine images, thereby reducing the need for tagged acquisitions and shortening acquisition times. This progress not only makes CMR more cost-effective in research and pre-clinical settings but also minimizes patient discomfort and scan times, expanding the utility of CMR in research applications. While ResMyoNet also contributes to the field in terms of training efficiency, MyoNet stands out for its comprehensive analysis capabilities. As MyoNet and ResMyoNet continue to evolve, validating them with comprehensive patient datasets will be crucial for their future clinical translation.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Kongbundansuk S. Hundley W.G. Noninvasive imaging of cardiovascular injury related to the treatment of cancer JACC Cardiovasc. Imaging 2014782483810.1016/j.jcmg.2014.06.00725124015 PMC 4183055 · doi ↗ · pubmed ↗
- 2Anthony F.Y. Ky B. Roadmap for biomarkers of cancer therapy cardiotoxicity Heart 20161024254302667498910.1136/heartjnl-2015-307894 PMC 4769926 · doi ↗ · pubmed ↗
- 3Henson K. Mc Gale P. Taylor C. Darby S. Radiation-related mortality from heart disease and lung cancer more than 20 years after radiotherapy for breast cancer Br. J. Cancer 201310817918210.1038/bjc.2012.57523257897 PMC 3553540 · doi ↗ · pubmed ↗
- 4Carlson L.E. Watt G.P. Tonorezos E.S. Chow E.J. Yu A.F. Woods M. Lynch C.F. John E.M. Mellemkjӕr L. Brooks J.D. Coronary artery disease in young women after radiation therapy for breast cancer: The WECARE study Cardio Oncol.2021338139210.1016/j.jaccao.2021.07.008PMC 846373134604798 · doi ↗ · pubmed ↗
- 5Darby S.C. Ewertz M. Mc Gale P. Bennet A.M. Blom-Goldman U. Brønnum D. Correa C. Cutter D. Gagliardi G. Gigante B. Risk of ischemic heart disease in women after radiotherapy for breast cancer N. Engl. J. Med.201336898799810.1056/NEJ Moa 120982523484825 · doi ↗ · pubmed ↗
- 6Erven K. Florian A. Slagmolen P. Sweldens C. Jurcut R. Wildiers H. Voigt J.-U. Weltens C. Subclinical cardiotoxicity detected by strain rate imaging up to 14 months after breast radiation therapy Int. J. Radiat. Oncol. Biol. Phys.2013851172117810.1016/j.ijrobp.2012.09.02223149005 · doi ↗ · pubmed ↗
- 7Ibrahim E.-S.H. Baruah D. Croisille P. Stojanovska J. Rubenstein J.C. Frei A. Schlaak R.A. Lin C.-Y. Pipke J.L. Lemke A. Cardiac magnetic resonance for early detection of radiation therapy-induced cardiotoxicity in a small animal model Cardio Oncol.2021311313010.1016/j.jaccao.2020.12.006PMC 807884633912843 · doi ↗ · pubmed ↗
- 8An D. Ibrahim E.-S. Elucidating Early Radiation-Induced Cardiotoxicity Markers in Preclinical Genetic Models Through Advanced Machine Learning and Cardiac MRIJ. Imaging 20241030810.3390/jimaging 1012030839728205 PMC 11677573 · doi ↗ · pubmed ↗
