GENRE-CMR: Generalizable Deep Learning for Diverse Multi-Domain Cardiac MRI Reconstruction

Kian Anvari Hamedani; Narges Razizadeh; Shahabedin Nabavi; Mohsen Ebrahimi Moghaddam

arXiv:2508.20600 · eess.IV · October 30, 2025

TL;DR

GENRE-CMR introduces a GAN-based deep learning framework with novel loss functions and a residual unrolled architecture to improve cardiac MRI reconstruction quality and generalization across diverse acquisition settings.

Contribution

It presents a new generative adversarial network architecture with residual unrolled reconstruction and specialized loss functions for enhanced multi-domain CMR image reconstruction.

Findings

01 Outperforms state-of-the-art methods on unseen data

02 Achieves high SSIM and PSNR scores across various acceleration factors

03 Demonstrates robustness and generalization in diverse acquisition protocols

Tables

Table 1. State-of-the-art MRI reconstruction approaches versus the proposed method. Metrics include SSIM, PSNR, and NMSE. The best values are shown in bold.

Method	Training SSIM	Training PSNR (dB)	Training NMSE	Unseen SSIM	Unseen PSNR (dB)	Unseen NMSE
PromptMR [31]	0.9685	41.80	0.0129	0.9450	37.85	0.0198
SR-GAN [10]	0.9702	42.05	0.0120	0.9473	38.01	0.0191
PromptMR+ [11]	0.9728	42.40	0.0115	0.9498	38.22	0.0187
GENRE-CMR	0.9743	42.64	0.0111	0.9552	38.90	0.0160

Table 2. Ablation study results demonstrating the impact of different components in the proposed method, evaluated on unseen distributions. Evaluation metrics include SSIM, PSNR, and NMSE, with the best scores highlighted in bold.

Ablation	SSIM	PSNR (dB)	NMSE
Baseline	0.9473	38.01	0.0191
Without SDA Loss	0.9500	38.25	0.0183
Without EAR Loss	0.9515	38.43	0.0178
Without Residual Connections	0.9523	38.21	0.0171
The Proposed Method	0.9552	38.90	0.0160

Equations

$\mathcal{L}_{\text{total}}=\lambda_{1}\mathcal{L}_{\text{Fidelity}}+\lambda_{2}\mathcal{L}_{\text{EAR}}+\lambda_{3}\mathcal{L}_{\text{SDA}}$

$M=G_{x}^{2}+G_{y}^{2}$

$B_{i,j}=\begin{cases}1 & \text{if } M_{s}(i,j)\geq\tau\\ 0 & \text{otherwise}\end{cases}$

$\mathcal{L}_{\text{EAR}}=1-\text{SSIM}(\tilde{I}_{\text{rec}},\tilde{I}_{\text{gt}})$

$\mathcal{L}_{\text{SDA}}^{(l)}=\frac{1}{10}\sum_{i<j}\left[D_{\text{KL}}\left(\mathcal{N}(\mu_{i}^{l},\Sigma_{i}^{l})\,\|\,\mathcal{N}(\mu_{j}^{l},\Sigma_{j}^{l})\right)+D_{\text{KL}}\left(\mathcal{N}(\mu_{j}^{l},\Sigma_{j}^{l})\,\|\,\mathcal{N}(\mu_{i}^{l},\Sigma_{i}^{l})\right)\right]$

$\mathcal{L}_{\text{SDA}}=\sum_{l=1}^{16}\mathcal{L}_{\text{SDA}}^{(l)}$

Full text

Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran

Email: {k.anvarihamedani,n.razizade}@mail.sbu.ac.ir

Email: {s_nabavi,m_moghadam}@sbu.ac.ir

https://github.com/kiananvari/GENRE-CMR

Abstract

Accelerated Cardiovascular Magnetic Resonance (CMR) image reconstruction remains a critical challenge due to the trade-off between scan time and image quality, particularly when generalizing across diverse acquisition settings. We propose GENRE-CMR, a generative adversarial network (GAN)-based architecture employing a residual deep unrolled reconstruction framework to enhance reconstruction fidelity and generalization. The architecture unrolls iterative optimization into a cascade of convolutional subnetworks, enriched with residual connections to enable progressive feature propagation from shallow to deeper stages. To further improve performance, we integrate two loss functions: (1) an Edge-Aware Region (EAR) loss, which guides the network to focus on structurally informative regions and helps prevent common reconstruction blurriness; and (2) a Statistical Distribution Alignment (SDA) loss, which regularizes the feature space across diverse data distributions via a symmetric KL divergence formulation. Extensive experiments confirm that GENRE-CMR surpasses state-of-the-art methods on training and unseen data, achieving 0.9552 SSIM and 38.90 dB PSNR on unseen distributions across various acceleration factors and sampling trajectories. Ablation studies confirm the contribution of each proposed component to reconstruction quality and generalization. Our framework presents a unified and robust solution for high-quality CMR reconstruction, paving the way for clinically adaptable deployment across heterogeneous acquisition protocols.

1 Introduction

Cardiovascular Magnetic Resonance (CMR) imaging plays a pivotal role in non-invasive cardiovascular assessment, offering high-resolution, multiparametric information without ionising radiation [1]. It is considered the gold standard for evaluating cardiac function, perfusion, viability, fibrosis, and congenital abnormalities [2]. However, widespread clinical adoption remains constrained by the inherently slow acquisition process, driven by sequential k-space sampling and the need for high spatial and temporal resolution. This makes CMR sensitive to physiological motion (cardiac, respiratory), which can introduce artifacts and compromise image quality [3]. Recent studies have also investigated deformable registration and motion estimation strategies to better capture spatiotemporal cardiac dynamics [4, 5, 6]. Additionally, the complex cardiac anatomy and diverse imaging protocols, spanning various contrasts (Cine, T1/T2 Mapping, LGE), trajectories (e.g., uniform, 3D k-t Gaussian/Radial), and anatomical views (e.g., long-axis, short-axis, aortic), prolong scan time and increase reconstruction complexity [7, 8]. Differences in scanner hardware and software further hinder robust and consistent reconstruction across sites [9].

Recent advances in deep learning (DL) have shown promise in accelerating CMR image reconstruction by learning direct mappings from undersampled k-space to high-quality images [10, 11, 12, 13, 14]. Leading solutions from the CMRxRecon challenges [15, 8] have achieved superior performance over traditional methods [18, 19, 20], as reviewed in [16, 17]. However, DL models often fail to generalize well in clinical settings due to domain shifts arising from variability in acquisition protocols, scanner vendors, anatomical coverage, and patient characteristics [9]. Addressing this generalization gap, through robust domain adaptation and domain-robust reconstruction strategies, is critical for safe and reliable deployment of DL-based reconstructions in real-world clinical practice [21, 22].

Several recent works have explored domain-robust MRI reconstruction. One approach [23] demonstrated that networks trained with natural image datasets containing synthesized phase information can generalize across unseen contrasts and anatomies. A self-supervised method [24] introduced Robust SSDU, which is capable of recovering clean reconstructions even from noisy and sub-sampled training data. Another framework [25] proposed MRPD, which leverages large latent diffusion models pre-trained on natural images and adapts them to MRI reconstruction, demonstrating unprecedented cross-domain generalizability in unsupervised settings. More recently, LowRank-CGNet [26] was presented, integrating low-rank tensor modeling with conjugate gradient data consistency to handle diverse anatomy, contrast, and undersampling artifacts. While these approaches provide important advances, many rely on heavy diffusion models, external natural image priors, or structural assumptions, and do not jointly integrate unrolled optimization, adversarial learning, and explicit domain alignment.

To address these challenges, we propose a generalizable CMR reconstruction framework based on a generative adversarial network (GAN) architecture. The generator incorporates a residual deep unrolled network that mimics compressed sensing-based iterative optimization, progressively refining intermediate k-space estimates from undersampled input data. To improve anatomical fidelity and reduce blurring, we introduce an Edge-Aware Reconstruction (EAR) loss that emphasizes recovery of clinically relevant boundaries. To address domain variability, we include a Statistical Distribution Alignment (SDA) loss that aligns latent features across different CMR domains. Furthermore, inspired by recent work in prompt-based MRI reconstruction [31], we integrate prompt learning to enable adaptive reconstruction across diverse contrasts, trajectories, and anatomical views within a unified model. Our main contributions are:

• We present a generalizable residual deep unrolled reconstruction framework for CMR, which integrates compressed sensing-based inverse problem solving into the GAN generator. This design enables accurate reconstruction from highly undersampled k-space data. Experiments on the CMRxRecon 2025 dataset demonstrate that our method outperforms existing state-of-the-art approaches in both image quality and generalization across out-of-distribution scenarios.

• We propose the EAR loss, which enhances the reconstruction of fine anatomical details and mitigates the blurring artifacts common in deep learning-based methods.

• To address the domain shift caused by variability in contrasts, trajectories, anatomical coverage, and scanner settings, we incorporate an SDA loss that explicitly reduces distributional discrepancies across different CMR domains, promoting robust generalization.

2 Materials and Methods

2.1 Dataset

We evaluated our method on the CMRxRecon 2025 dataset [27], a large-scale, multi-center, multi-vendor benchmark specifically designed to assess the robustness of cardiac MRI reconstruction models across clinically diverse settings. The dataset comprises over 600 subjects collected from multiple institutions and scanner vendors, including GE, Philips, Siemens, and UIH. It covers a broad patient population, including healthy volunteers, individuals with various cardiac pathologies such as cardiomyopathies, myocardial infarction, and arrhythmias, as well as pediatric cases. It features a wide range of CMR modalities (e.g., Cine, T1/T2 Mapping, LGE, and perfusion), along with varying sampling trajectories (Cartesian, Radial, and Gaussian) and magnetic field strengths (1.5T, 3T, and 5T). Each case includes fully sampled and undersampled k-space data, sampling masks, and ground truth reconstructions.

2.2 A Residual Deep Unrolled Reconstruction Framework

Building on our 2024 all-in-one Patch-GAN reconstruction model [10], this framework introduces residual cascaded unrolling with improved connectivity and augments the original loss formulation with the proposed EAR and SDA terms. The architecture consists of a generator and an unrolled discriminator trained adversarially. The generator follows a cascaded structure that unrolls an iterative reconstruction process, progressively refining the undersampled k-space at each step. We extend the earlier unrolled model by introducing residual connections between consecutive cascaded modules, which enables more effective feature propagation and mitigates vanishing gradients during training. These residual paths pass feature maps directly from shallow reconstructors to deeper ones, encouraging the reuse of early-stage representations and promoting better convergence and reconstruction accuracy. The model accepts multi-coil k-space data from various vendors (Philips, Siemens, UIH, and GE) and different acquisition protocols, accounting for real-world domain shifts across centers. The network learns a sensitivity map and uses coil-combination layers to generate intermediate and final reconstructions. The unrolled discriminator enhances reconstruction quality through adversarial training.

The complete procedure for our proposed GENRE-CMR framework is formalized in Algorithm 1, which integrates the residual unrolling, adversarial training, and the proposed EAR and SDA losses into a unified reconstruction pipeline.

The total loss combines three components: a data fidelity loss ( $\mathcal{L}_{\text{Fidelity}}$ ), an edge-aware reconstruction loss ( $\mathcal{L}_{\text{EAR}}$ ), and a statistical distribution alignment loss ( $\mathcal{L}_{\text{SDA}}$ ), each weighted by hyperparameters $\lambda_{1},\lambda_{2},$ and $\lambda_{3}$ , respectively. In our 2024 work [10], we combined physical k-space consistency and SSIM losses. In 2025, we extend this formulation by introducing $\mathcal{L}_{\text{EAR}}$ for sharper edge preservation and $\mathcal{L}_{\text{SDA}}$ for explicit cross-domain alignment, resulting in the composite loss defined in Eq. 1.

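$$\mathcal{L}_{\text{total}}=\lambda_{1}\mathcal{L}_{\text{Fidelity}}+\lambda_{2}\mathcal{L}_{\text{EAR}}+\lambda_{3}\mathcal{L}_{\text{SDA}}\tag{1}$$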

The fidelity loss $\mathcal{L}_{\text{Fidelity}}$ corresponds to the total loss introduced in our previous work [10], which combines image-domain and physical k-space domain consistency terms to ensure high-quality reconstruction. To better preserve fine anatomical structures and prevent blurring of high-frequency details in cardiac MR images, we propose EAR loss. This loss focuses on the local regions around the edges, where diagnostic features are most critical, isolating edge information and comparing ground-truth and reconstructed images only within these regions. First, we extract horizontal and vertical gradients from the ground-truth image $I_{\text{gt}}$ using $3\times 3$ Sobel filters $S_{x}$ and $S_{y}$ , producing gradient maps $G_{x}=S_{x}*I_{\text{gt}}$ and $G_{y}=S_{y}*I_{\text{gt}}$ , where $*$ denotes convolution. The edge magnitude map is computed as

$$M=G_{x}^{2}+G_{y}^{2}\tag{2}$$

To expand the influence area around edges, we convolve the edge map $M$ with a $5\times 5$ averaging kernel $A$ , resulting in a smoothed map $M_{s}=A*M$ . After thresholding $M_{s}$ at $\tau=0$ , a binary mask $B\in\{0,1\}^{H\times W}$ is generated as:

$$B_{i,j}=\begin{cases}1 & \text{if } M_{s}(i,j)\geq\tau\\ 0 & \text{otherwise}\end{cases}\tag{3}$$

This binary mask is then applied to both the reconstructed image $I_{\text{rec}}$ and the ground-truth $I_{\text{gt}}$ , producing masked versions containing the regions of edges: $\tilde{I}_{\text{rec}}=B\odot I_{\text{rec}}$ and $\tilde{I}_{\text{gt}}=B\odot I_{\text{gt}}$ , where $\odot$ denotes element-wise multiplication. Finally, the EAR loss is defined using the Structural Similarity Index (SSIM) loss between the masked images:

$$\mathcal{L}_{\text{EAR}}=1-\text{SSIM}(\tilde{I}_{\text{rec}},\tilde{I}_{\text{gt}})\tag{4}$$

This targeted design ensures that the model is penalized for structural degradation around edge regions, leading to sharper and more diagnostically valuable reconstructions. The EAR loss computation is illustrated in Figure 2.
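The EAR steps described above (Sobel gradients, edge magnitude map, 5×5 smoothing, thresholding, masked SSIM) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`filter2d_same`, `edge_mask`, `global_ssim`, `ear_loss`) are hypothetical, the SSIM is a simplified single-window version rather than the usual sliding-window SSIM, and the mask uses a strict inequality $M_{s}>\tau$ because, with $\tau=0$ and a non-negative map, a non-strict comparison would select every pixel.

```python
import numpy as np

SOBEL_X = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def filter2d_same(image, kernel):
    """Zero-padded, same-size 2-D filtering (cross-correlation).

    The kernel flip of a true convolution only changes the sign of the
    Sobel response, which disappears once the gradients are squared.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def edge_mask(gt, tau=0.0):
    """Binary mask B covering a band around the edges of the ground truth."""
    gx = filter2d_same(gt, SOBEL_X)
    gy = filter2d_same(gt, SOBEL_Y)
    m = gx ** 2 + gy ** 2                                   # edge magnitude map M
    m_s = filter2d_same(m, np.full((5, 5), 1.0 / 25.0))     # smoothed map M_s
    # Assumption: strict '>' so that tau = 0 does not select the whole image.
    return (m_s > tau).astype(float)


def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed over the whole image as a single window."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


def ear_loss(rec, gt, tau=0.0):
    """1 - SSIM between the edge-masked reconstruction and ground truth."""
    b = edge_mask(gt, tau)
    return 1.0 - global_ssim(b * rec, b * gt)
```

Because the mask is derived from the ground truth only, the loss ignores errors in flat regions; those remain the responsibility of the fidelity term.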

The SDA loss encourages the intermediate feature representations of inputs from different domains to align in distribution. Suppose that the training data are drawn from five distinct distributions $\mathcal{D}_{1},\mathcal{D}_{2},\dots,\mathcal{D}_{5}$ , and are processed sequentially so that each group of five consecutive samples $\{x_{t}^{(1)},x_{t}^{(2)},\dots,x_{t}^{(5)}\}$ includes one sample from each distribution. Each sample is passed through a residual deep unrolled network consisting of 16 reconstructors. For each sample $x_{t}^{(i)}$ , we extract feature vectors $\mathbf{f}_{t}^{(i,l)}\in\mathbb{R}^{d}$ from the output of reconstructor $l\in\{1,\dots,16\}$ . For each reconstructor, we collect the set $\mathcal{F}_{l}=\{\mathbf{f}_{t}^{(1,l)},\dots,\mathbf{f}_{t}^{(5,l)}\}$ , whose elements are assumed to be samples from domain-specific feature distributions.

Assuming these are multivariate Gaussian distributions $\mathcal{N}(\mu_{i}^{l},\Sigma_{i}^{l})$ [28], we define SDA loss using symmetric KL-divergence:

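$$\mathcal{L}_{\text{SDA}}^{(l)}=\frac{1}{10}\sum_{i<j}\left[D_{\text{KL}}\left(\mathcal{N}(\mu_{i}^{l},\Sigma_{i}^{l})\,\|\,\mathcal{N}(\mu_{j}^{l},\Sigma_{j}^{l})\right)+D_{\text{KL}}\left(\mathcal{N}(\mu_{j}^{l},\Sigma_{j}^{l})\,\|\,\mathcal{N}(\mu_{i}^{l},\Sigma_{i}^{l})\right)\right]\tag{5}$$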

The total SDA loss is obtained by summing over all subnetwork layers:

$$\mathcal{L}_{\text{SDA}}=\sum_{l=1}^{16}\mathcal{L}_{\text{SDA}}^{(l)}\tag{6}$$

To maintain temporal domain diversity, a sliding window mechanism is used during training: after the initial 5 samples (one per domain), for each new input sample, the SDA loss is calculated by comparing it with the 4 most recent samples from the training sequence. This enforces local alignment of domain-specific features and promotes domain-invariant representation learning.
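As a rough illustration of the SDA computation and the sliding window, the sketch below averages the symmetric KL divergence over all pairs in a window of five samples; with five domains there are $\binom{5}{2}=10$ pairs, which matches the $\frac{1}{10}$ normalization. Several points are assumptions rather than details taken from the paper: the covariances are treated as diagonal, the per-sample statistics $(\mu,\sigma^{2})$ are taken as given (how they are estimated from the feature vectors is not specified here), all pairs inside the window are compared, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations

import numpy as np


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) ) in closed form."""
    return 0.5 * np.sum(np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def symmetric_kl(p, q):
    """Sum of both KL directions; p and q are (mean, variance) tuples."""
    return kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p)


def sda_layer_loss(stats):
    """Average symmetric KL over all pairs i < j for one reconstructor.

    stats: list of (mean, variance) tuples, one per sample in the window.
    """
    pairs = list(combinations(stats, 2))
    return sum(symmetric_kl(p, q) for p, q in pairs) / len(pairs)


def sda_loss(stats_per_layer):
    """Sum the per-reconstructor losses over all layers."""
    return sum(sda_layer_loss(stats) for stats in stats_per_layer)


def sliding_window_sda(stats_stream, window=5):
    """Per-step layer loss once the window holds `window` samples.

    Each new sample is compared together with the window - 1 most recent ones.
    """
    recent = deque(maxlen=window)
    losses = []
    for stats in stats_stream:
        recent.append(stats)
        if len(recent) == window:
            losses.append(sda_layer_loss(list(recent)))
    return losses
```

The `deque` with `maxlen=5` drops the oldest sample automatically, so every step after the first five reuses four previous samples and one new one.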

2.3 Implementation Details

In our approach, the generator and discriminator were optimized with AdamW (learning rate 0.002, weight decay 0.1, gradient clipping 0.1, step scheduler with step size 11 and gamma 0.1). The generator uses 16 reconstructor modules, 16 auto-calibration lines, and an adjacent k-space length of 5. We used acceleration factors of 8, 16, and 24 with k-t uniform, k-t Gaussian, and k-t radial trajectories. Curriculum learning [29] was applied by starting with lower acceleration factors, and training ran for 20 epochs with batch size 1. Model performance was evaluated using SSIM, PSNR, and NMSE. To balance loss terms, we adopted Coefficient of Variation (CoV) weighting [30], which dynamically adjusts each loss weight ( $\lambda_{1}$ , $\lambda_{2}$ , $\lambda_{3}$ ) based on the ratio of standard deviation to mean, giving higher priority to more variable losses. The evolution of these weights is shown in Figure 3.
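The idea behind CoV weighting can be sketched as follows: each loss receives a weight proportional to its standard deviation divided by its mean over the observed history, normalized so the weights sum to one. This is a simplified illustration that works on raw loss histories; the scheme in [30] may differ in detail (for example, in how running statistics are maintained), and the function name `cov_weights` and the uniform fallback are assumptions made for this example.

```python
import numpy as np


def cov_weights(loss_history, eps=1e-8):
    """Loss weights proportional to each loss's coefficient of variation.

    loss_history: array-like of shape (steps, n_losses) with observed loss values.
    Returns an array of n_losses weights that sum to 1.
    """
    history = np.asarray(loss_history, dtype=float)
    mean = history.mean(axis=0)
    std = history.std(axis=0)
    cov = std / (mean + eps)          # ratio of standard deviation to mean
    total = cov.sum()
    if total < eps:
        # Assumption: fall back to uniform weights when no loss varies.
        return np.full(history.shape[1], 1.0 / history.shape[1])
    return cov / total


# Columns: fidelity, EAR, SDA. Only the second loss varies in this toy
# history, so it receives the largest weight.
weights = cov_weights([[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 3.0, 1.0]])
```

A loss that has stopped changing gets a weight near zero under this rule, which is the sense in which more variable losses are given priority.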

3 Experimental Results and Discussion

To evaluate the performance of our proposed method, we conducted extensive experiments using both seen (training) and unseen distributions. The quantitative results are summarized in Table 1, and the effectiveness of individual architectural components is examined through ablation studies in Table 2. A qualitative comparison of reconstructed images is also provided in Figure 4 under three distinct sampling trajectories at an acceleration factor of 16.

Table 1 presents a comparative evaluation of our method, GENRE-CMR, against several state-of-the-art approaches, including PromptMR [31], SR-GAN [10] and PromptMR+ [11]. Our model achieves the best scores across all three metrics (highest SSIM and PSNR, lowest NMSE) in both training and unseen distributions. Specifically, GENRE-CMR attains a PSNR of 42.64 dB and SSIM of 0.9743 on training distributions, and maintains strong generalization with 38.90 dB PSNR and 0.9552 SSIM on unseen data. These results indicate that GENRE-CMR learns effective representations from the training data while maintaining robust performance on unseen data distributions. The consistent reduction in NMSE further demonstrates the model’s ability to preserve fine image details and suppress reconstruction errors.
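For readers less familiar with the reported metrics, the sketch below shows one common way to compute PSNR and NMSE for real-valued magnitude images. The exact conventions used in the paper (for example, the choice of data range for PSNR, or whether metrics are computed per slice or per volume) are not stated in this text, so the defaults here are assumptions.

```python
import numpy as np


def nmse(gt, pred):
    """Normalized mean squared error: ||gt - pred||^2 / ||gt||^2 (lower is better)."""
    return np.linalg.norm(gt - pred) ** 2 / np.linalg.norm(gt) ** 2


def psnr(gt, pred, data_range=None):
    """Peak signal-to-noise ratio in dB (higher is better).

    Assumption: the data range defaults to the maximum of the ground truth.
    """
    if data_range is None:
        data_range = gt.max()
    mse = np.mean((gt - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)


# Toy check: a uniform 10% intensity error on a constant image gives
# an NMSE of about 0.01 and a PSNR of about 20 dB.
reference = np.ones((4, 4))
estimate = 0.9 * reference
example_nmse = nmse(reference, estimate)
example_psnr = psnr(reference, estimate)
```

NMSE is an error measure, so the best method in Table 1 has the lowest value, whereas for SSIM and PSNR higher is better.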

As illustrated in Figure 4, the proposed method provides visually superior reconstructions compared to existing techniques under various undersampling trajectories. The reconstructed images by GENRE-CMR show sharper anatomical boundaries and reduced aliasing artifacts, particularly in challenging regions such as myocardial borders and small vessels. This visual fidelity underscores the benefit of integrating residual unrolling with edge-aware reconstruction.

To assess the contribution of individual components in our architecture, we performed a systematic ablation study. Table 2 reports the performance on unseen distributions when certain modules are removed from the full pipeline. Removing the SDA loss lowered SSIM to 0.9500 and PSNR to 38.25 dB, confirming its role in improving cross-distribution generalization. Similarly, eliminating the EAR loss reduced SSIM to 0.9515 and PSNR to 38.43 dB, indicating its effectiveness in enhancing edge preservation during reconstruction. Disabling residual connections between cascaded subnetworks also degraded performance (SSIM 0.9523, PSNR 38.21 dB), confirming that feature propagation across depths is crucial for effective hierarchical representation learning. In the full configuration, the proposed method outperforms all ablated variants, achieving the highest SSIM (0.9552), the highest PSNR (38.90 dB), and the lowest NMSE (0.0160). This demonstrates the synergistic benefit of combining residual connections, EAR loss, and SDA loss in our unrolled reconstruction framework. GENRE-CMR also consistently outperforms our 2024 all-in-one model [10] under both in-distribution and out-of-distribution evaluation. The ablation results indicate that $\mathcal{L}_{\text{EAR}}$ and $\mathcal{L}_{\text{SDA}}$ , absent in the 2024 version, are responsible for substantial gains in edge fidelity and domain robustness.

4 Conclusion

In this work, we proposed a generalizable deep learning framework for CMR reconstruction, built upon a GAN architecture. The generator is driven by a residual deep unrolled architecture that mimics iterative optimization steps, incorporating compressed sensing concepts to effectively address the underlying inverse problem. To enhance the preservation of clinically relevant anatomical structures, we introduced the EAR loss, which explicitly promotes sharper boundary reconstruction and reduces common blurring artifacts. To address the domain shift that arises from differences in imaging centers and devices, image contrast, sampling patterns, anatomical variability, and acquisition protocols, we integrated the SDA loss, which improves robustness across diverse distributions. Comprehensive experiments and ablation studies on the CMRxRecon 2025 dataset demonstrated that our method consistently outperforms state-of-the-art approaches.

While effective, the framework can be computationally demanding during training, requiring powerful GPUs to achieve optimal performance. This is a common challenge in advanced deep learning methods, and future work may explore more efficient architectures or training strategies to reduce resource requirements. As an important future direction, we plan to conduct clinical evaluations with expert radiologists to assess diagnostic accuracy and real-world applicability, moving closer to clinical integration.

Bibliography

[1] Arnold, J. & McCann, G. Cardiovascular magnetic resonance: applications and practical considerations for the general cardiologist. Heart. 106, 174-181 (2020)
[2] Vasquez, M. & Nagel, E. Clinical indications for cardiovascular magnetic resonance. Heart. 105, 1755-1762 (2019)
[3] Nabavi, S., Simchi, H., Moghaddam, M., Abin, A. & Frangi, A. A generalised deep meta-learning model for automated quality control of cardiovascular magnetic resonance images. Computer Methods and Programs in Biomedicine. 242, 107770 (2023)
[4] Zakeri, A., Hokmabadi, A., Bi, N., Wijesinghe, I., Nix, M., Petersen, S., Frangi, A., Taylor, Z. & Gooya, A. DragNet: Learning-based deformable registration for realistic cardiac MR sequence generation from a single frame. Medical Image Analysis. 83, 102678 (2023)
[5] Bi, N., Zakeri, A., Xia, Y., Cheng, N., Taylor, Z., Frangi, A. & Gooya, A. SegMorph: concurrent motion estimation and segmentation for cardiac MRI sequences. IEEE Transactions on Medical Imaging. (2024)
[6] Kebriti, S., Nabavi, S. & Gooya, A. FractMorph: A Fractional Fourier-Based Multi-Domain Transformer for Deformable Image Registration. arXiv preprint arXiv:2508.12445. (2025)
[7] Enders, J., Zimmermann, E., Rief, M., Martus, P., Klingebiel, R., Asbach, P., Klessen, C., Diederichs, G., Bengner, T., Teichgräber, U. & others. Reduction of claustrophobia during magnetic resonance imaging: methods and design of the "CLAUSTRO" randomized controlled trial. BMC Medical Imaging. 11, 1-15 (2011)
[8] Wang, Z., Wang, F., Qin, C., Lyu, J., Ouyang, C., Wang, S., Li, Y., Yu, M., Zhang, H., Guo, K. & others. CMRxRecon2024: A multimodality, multiview k-space dataset boosting universal machine learning for accelerated cardiac MRI. Radiology: Artificial Intelligence. 7, e240443 (2025)