TL;DR
This paper introduces a scalable, learning-based method for optimizing sampling masks in dynamic MRI, significantly reducing computational costs while maintaining high-quality image reconstruction from undersampled data.
Contribution
It presents a novel stochastic greedy algorithm for designing optimal sampling masks, addressing scalability issues in dynamic MRI compressed sensing.
Findings
Reduces computational burden by nearly 200 times.
Maintains reconstruction performance comparable to existing methods.
Provides a deterministic optimal sampling mask solution.
Table 1: Running times and empirically measured speedups of the greedy variants.

| Algorithm | Setting | G-v1 time | SG-v1 time | SG-v1 speedup | SG-v2 time | SG-v2 speedup |
|---|---|---|---|---|---|---|
| KTF | 152×152×17×3 | | | | | |
| KTF | 256×256×10×2 | | | | | |
| IST | 152×152×17×3 | | | | | |
| ALOHA | 152×152×17×3 | | | | | |

Table 2: Computational times required to obtain the LB-VD parameters through grid search.

| Algo. | Setting | | Time |
|---|---|---|---|
| KTF | 152×152×17×3 | 1200 | 38 h |
| KTF | 256×256×10×2∗ | 2400 | 64 h |
| IST | 152×152×17×3 | 1200 | 38 h |
| ALOHA | 152×152×17×3 | 1200 | 38 d h |
Scalable Learning-Based Sampling Optimization for
Compressive Dynamic MRI
Abstract
Compressed sensing applied to magnetic resonance imaging (MRI) reduces the scanning time by enabling images to be reconstructed from highly undersampled data. In this paper, we tackle the problem of designing a sampling mask for an arbitrary reconstruction method and a limited acquisition budget. Namely, we look for an optimal probability distribution from which a mask with a fixed cardinality is drawn. We demonstrate that this problem admits a compactly supported solution, which leads to a deterministic optimal sampling mask. We then propose a stochastic greedy algorithm that (i) provides an approximate solution to this problem, and (ii) resolves the scaling issues of [1, 2]. We validate its performance on in vivo dynamic MRI with retrospective undersampling, showing that our method preserves the performance of [1, 2] while reducing the computational burden by a factor close to 200. Our implementation is available at https://github.com/t-sanchez/stochasticGreedyMRI.
Index Terms— Magnetic resonance imaging, compressive sensing (CS), learning-based sampling.
1 Introduction
Dynamic Magnetic Resonance Imaging (dMRI) is a powerful tool in medical imaging, which allows for non-invasive monitoring of tissues over time. A main challenge to the quality of dMRI examinations is the inefficiency of data acquisition that limits temporal and spatial resolutions. In the presence of moving tissues, such as in cardiac MRI, the trade-off between spatial and temporal resolution is further complicated by the need to perform breath-holds to minimize motion artifacts [3].
In the last decade, the rise of Compressed Sensing (CS) has significantly contributed to overcoming these problems. CS allows for a successful reconstruction from undersampled measurements, provided that they are incoherent [4, 5] and that the data can be sparsely represented in some domain. In dMRI, samples are acquired in the $k$-$t$ space (spatial frequency and time domain), and can be sparsely represented in the $x$-$f$ domain (image and temporal Fourier transform domain). Many algorithms have exploited this framework with great success (see [6, 7, 8, 9, 10, 11, 12, 13, 14] and the references therein).
While CS theory mostly focuses on fully random measurements [15], practical implementations have generally exploited random variable-density sampling, which draws random samples from a parametric distribution (typically polynomial or Gaussian) that reasonably imitates the energy distribution in the $k$-space [16, 17]. These approaches make it possible to quickly design masks that greatly improve on the fully random sampling prescribed by CS theory, but they (i) remain largely heuristic; (ii) ignore the anatomy of interest; (iii) ignore the reconstruction algorithm; (iv) require careful tuning of their various parameters; and (v) do not necessarily use a fixed number of readouts per frame.
In the present work, we show that the problem of finding an optimal mask sampling distribution, for masks containing $n$ out of $P$ possible locations, admits a solution compactly supported on $n$ elements. This demonstrates that our previously proposed framework in [1, 2], which searches for an approximately optimal sampling mask, is in fact looking for a solution to the more general problem of finding an optimal measurement distribution. In addition, we propose a scalable learning-based framework for dMRI. Our proposed stochastic greedy method preserves the performance of [1, 2] while reducing the computational burden by a factor close to 200.
Numerical evidence shows that our framework can successfully find sampling patterns for a broad range of decoders, from k-t FOCUSS [7] to ALOHA [13], outperforming state-of-the-art model-based sampling methods over nearly all sampling rates considered.
2 Theory
2.1 Signal Acquisition
In the compressed sensing (CS) problem [5], one desires to retrieve a signal that is known to be sparse in some basis using only a small number of linear measurements. In the case of dynamic MRI, we consider a signal $x \in \mathbb{C}^P$ (i.e. a vectorized video of size $H \times W$ with $T$ frames, so that $P = HWT$), and the subsampled Fourier measurements are

$$b = P_\omega \Psi x, \tag{1}$$

where $\Psi$ is the spatial Fourier transform operator applied to the vectorized signal, and $P_\omega$ is a subsampling operator that selects the rows of $\Psi$ according to the indices in the set $\omega \subseteq \{1, \ldots, P\}$ with $|\omega| = n$ and $n \ll P$. We refer to $\omega$ as the sampling pattern or mask. We assume the signal $x$ to be sparse in the basis $\Phi$, which typically is a temporal Fourier transform across frames. Given the samples $b$, along with $\omega$, a reconstruction algorithm or decoder $g$ forms an estimate $\hat{x} = g(\omega, b)$ of $x$.
The quality of the reconstruction is then evaluated using a performance metric $\eta(x, \hat{x})$, which could typically be the Peak Signal-to-Noise Ratio (PSNR), the negative Mean Square Error (MSE), or the Structural Similarity Index Measure (SSIM) [18].
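The acquisition model and the performance metric can be illustrated on a toy example. The following is a minimal sketch, not the paper's implementation: it assumes a real-valued random video, a Cartesian mask made of full phase-encoding lines, a zero-filled inverse FFT as a stand-in decoder, and PSNR with the peak magnitude of the reference as peak value. The function names `sample_kspace`, `zero_filled_decoder` and `psnr` are hypothetical.

```python
import numpy as np

def sample_kspace(video, mask):
    """Masked spatial Fourier transform of a (T, H, W) video."""
    kspace = np.fft.fft2(video, axes=(-2, -1), norm="ortho")  # 2D FFT of each frame
    return kspace * mask                                      # unsampled entries set to zero

def zero_filled_decoder(masked_kspace):
    """Toy decoder (assumption): inverse FFT of the zero-filled k-space."""
    return np.fft.ifft2(masked_kspace, axes=(-2, -1), norm="ortho")

def psnr(reference, estimate):
    """PSNR in dB, with the peak magnitude of the reference as peak value."""
    mse = np.mean(np.abs(reference - estimate) ** 2)
    peak = np.max(np.abs(reference))
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
video = rng.standard_normal((4, 8, 8))        # T = 4 frames of size 8 x 8
mask = np.zeros((4, 8, 8), dtype=bool)
mask[:, [0, 1, 7], :] = True                  # 3 low-frequency lines per frame
recon = zero_filled_decoder(sample_kspace(video, mask))
full_recon = zero_filled_decoder(sample_kspace(video, np.ones_like(mask)))
print(psnr(video, recon), psnr(video, full_recon))
```

Any decoder with the same input and output shapes could be substituted for `zero_filled_decoder`; the mask design problem treats it as a black box.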
2.2 Sampling mask design
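We model the mask design process as finding a probability mass function (PMF) $f \in S^{P-1}$, where $S^{P-1} := \{f \in [0,1]^P : \sum_{i=1}^P f_i = 1\}$ is the standard simplex in $\mathbb{R}^P$. The PMF $f$ assigns to each location in the $k$-space a probability of being acquired. The mask is then constructed by drawing without replacement from $f$ until the cardinality constraint $|\omega| = n$ is met. The problem of finding the optimal sampling distribution is subsequently formulated as

$$\max_{f \in S^{P-1}} \; \mathbb{E}_{\omega(f,n),\, x \sim p(x)} \left[ \eta\big(x, g(\omega, P_\omega \Psi x)\big) \right], \tag{2}$$

where the index set $\omega \subset [P]$ is generated from $f$ and $[P] := \{1, \ldots, P\}$. This problem corresponds to finding the probability distribution $f$ that maximizes the expected performance metric with respect to the data $p(x)$ and the masks $\omega(f,n)$ drawn from this distribution. To ease the notation, we will use $\eta(x; \omega) := \eta\big(x, g(\omega, P_\omega \Psi x)\big)$.
In practice, we do not have access to $p(x)$ and instead have at hand the training images $\{x_i\}_{i=1}^m$ drawn independently from $p(x)$. We therefore maximize the empirical performance by solving

$$\max_{f \in S^{P-1}} \eta_f, \qquad \eta_f := \frac{1}{m} \sum_{i=1}^m \mathbb{E}_{\omega(f,n)} \left[ \eta(x_i; \omega) \right]. \tag{3}$$

Given that Problem (3) looks for masks that are constructed by sampling $n$ times without replacement from $f$, the following holds.
Proposition 1.
There exists a maximizer of Problem (3) that is supported on an index set of size at most $n$.
Proof.
Let the distribution $\hat{f}$ be a maximizer of Problem (3). We are interested in finding the support of $\hat{f}$. Because $\sum_{|\omega| = n} \Pr[\omega \mid f] = 1$, note that

$$\begin{aligned}
\max_{f \in S^{P-1}} \eta_f &= \max_{f \in S^{P-1}} \sum_{|\omega| = n} \Pr[\omega \mid f] \cdot \frac{1}{m} \sum_{i=1}^m \eta(x_i; \omega) \\
&\le \max_{f \in S^{P-1}} \max_{|\omega| = n} \frac{1}{m} \sum_{i=1}^m \eta(x_i; \omega) \\
&= \max_{|\omega| = n} \frac{1}{m} \sum_{i=1}^m \eta(x_i; \omega).
\end{aligned} \tag{4}$$

Let $\hat{\omega}$ be an index set of size $n$ that maximizes the last line above. The above holds with equality when $\Pr[\hat{\omega} \mid f] = 1$ and $\Pr[\omega \mid f] = 0$ for $\omega \neq \hat{\omega}$ and $f = \hat{f}$. This in turn happens when $\hat{f}$ is supported on $\hat{\omega}$. That is, there exists a maximizer of Problem (3) that is supported on an index set of size $n$. ∎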
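The inequality used in this proof can be checked numerically on a tiny instance. The sketch below is an illustration under stated assumptions, not part of the paper: the per-mask scores are random numbers standing in for the empirical performance of each mask, the PMF is drawn from a Dirichlet distribution so that all entries are strictly positive, and `mask_probability` is a hypothetical helper that enumerates all draw orders.

```python
import itertools
import numpy as np

def mask_probability(f, mask):
    """Probability that sequential draws without replacement from f give `mask`.

    Assumes every entry of f is strictly positive, so the renormalizer never hits zero.
    """
    total = 0.0
    for order in itertools.permutations(mask):   # every order in which the set can be drawn
        prob, remaining = 1.0, 1.0
        for idx in order:
            prob *= f[idx] / remaining           # renormalize over locations not yet drawn
            remaining -= f[idx]
        total += prob
    return total

rng = np.random.default_rng(0)
P, n = 5, 2                                       # 5 locations, masks of cardinality 2
masks = list(itertools.combinations(range(P), n))
scores = {mask: rng.uniform() for mask in masks}  # stand-in for the empirical metric of each mask
f = rng.dirichlet(np.ones(P))                     # an arbitrary strictly positive PMF

probs = {mask: mask_probability(f, mask) for mask in masks}
expected_score = sum(probs[mask] * scores[mask] for mask in masks)
best_score = max(scores.values())
print(expected_score <= best_score)
```

The expected score is a convex combination of the per-mask scores, so it cannot exceed the score of the best single mask, which is the content of the inequality.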
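While this observation does not indicate how to find this maximizer, it nonetheless allows us to further simplify Problem (3). More specifically, the observation that a maximizing distribution has a compact support of size $n$ implies the following:
Proposition 2.

$$\max_{f \in S^{P-1}} \frac{1}{m} \sum_{i=1}^m \mathbb{E}_{\omega(f,n)} \left[ \eta(x_i; \omega) \right] = \max_{|\omega| = n} \frac{1}{m} \sum_{i=1}^m \eta(x_i; \omega). \tag{5}$$

Proof.
Proposition 1 tells us that a solution of Problem (3) is supported on a set of size at most $n$, which implies, with $\eta_f$ denoting the objective of Problem (3),

$$\max_{f \in S^{P-1}} \eta_f = \max_{f \in S^{P-1} :\, |\mathrm{supp}(f)| = n} \eta_f. \tag{6}$$

That is, we only need to search over compactly supported distributions $f$. Let $S_\Gamma$ denote the standard simplex on a support $\Gamma \subset [P]$. It holds that

$$\begin{aligned}
\max_{f \in S^{P-1} :\, |\mathrm{supp}(f)| = n} \eta_f &= \max_{|\Gamma| = n} \max_{f \in S_\Gamma} \eta_f \\
&= \max_{|\Gamma| = n} \max_{f \in S_\Gamma} \frac{1}{m} \sum_{i=1}^m \mathbb{E}_{\omega(f,n)} \left[ \eta(x_i; \omega) \right] \\
&= \max_{|\Gamma| = n} \max_{f \in S_\Gamma} \frac{1}{m} \sum_{i=1}^m \eta(x_i; \Gamma) \\
&= \max_{|\Gamma| = n} \frac{1}{m} \sum_{i=1}^m \eta(x_i; \Gamma).
\end{aligned} \tag{7}$$

To obtain the second and third equalities, one observes that all masks have a common support $\Gamma$ with $n$ elements, i.e. $f \in S_\Gamma$ allows only for a single mask $\omega$ with $n$ elements, namely $\omega = \Gamma$. ∎
The framework of Problem (3) captures most variable-density based approaches of the literature that are defined in a data-driven fashion [19, 20, 21, 22, 23, 24, 25], and Proposition 2 shows that Problem (7), which we tackled in [1, 2] and develop here, also aims at solving the same problem as these probabilistic approaches. Note that while the present theory considers sampling points in the Fourier space, it is readily applicable to the Cartesian case, where full lines are added to the mask at once.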
3 Stochastic greedy mask design
In line with the approach that we previously proposed in [1], we seek an approximate solution to Problem (5) using a greedy algorithm, since Problem (5) is inherently combinatorial. The previous greedy method of [1, 2] suffers from three main drawbacks: (i) it scales quadratically with the total number of lines, (ii) it scales linearly with the size of the dataset, and (iii) it does not construct masks with a fixed number of readouts per frame. While [2] partially deals with (i), our proposed stochastic greedy approach addresses all three issues while preserving the benefits of [1]. Notably, it still preserves the nestedness and ordering of the acquisition: critical locations are acquired first, and the resulting masks are nested (i.e. the mask at a given sampling rate includes all sampling locations of the masks at lower rates).
Let us introduce the set $\mathcal{S}$ of all lines that can be acquired, which is a set of subsets of $\{1, \ldots, P\}$. A feasible Cartesian mask takes the form $\omega = \bigcup_j S_j$ with $S_j \in \mathcal{S}$, i.e. it consists of a union of lines. Both the greedy method of [1] and our stochastic method are detailed in Algorithm 1. Our stochastic greedy method (SG-v2) addresses the three main limitations of the greedy method of [1] (G-v1). Issue (i) is solved by picking uniformly at random, at each iteration, a batch of possible lines of size $k$ from a given frame $t$, instead of considering the full set of possible lines (line 3 in Alg. 1); (ii) is addressed by considering a fixed batch of training data of size $l$ instead of the whole training set of size $m$ at each iteration (line 4 in Alg. 1); (iii) is solved by iterating through the lines to be added from each frame sequentially (lines 1, 3 and 10 in Alg. 1). These improvements are inspired by the refinements made to the standard greedy algorithm in the field of submodular optimization [26]. They reduce the computational cost, since each iteration evaluates $k$ candidate lines on $l$ training images instead of all remaining candidate lines on all $m$ training images. Our results show that this is achieved without sacrificing any reconstruction quality.
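The three modifications described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the decoder is a zero-filled inverse FFT, the metric is the negative MSE, the mask is stored as a boolean array over (frame, phase-encoding line), and the names `neg_mse` and `stochastic_greedy` are hypothetical. It assumes the number of lines per frame does not exceed the number of available lines.

```python
import numpy as np

def neg_mse(video, line_mask):
    """Negative MSE of a zero-filled reconstruction; line_mask has shape (T, H)."""
    kspace = np.fft.fft2(video, axes=(-2, -1), norm="ortho")
    recon = np.fft.ifft2(kspace * line_mask[:, :, None], axes=(-2, -1), norm="ortho")
    return -np.mean(np.abs(video - recon) ** 2)

def stochastic_greedy(train, lines_per_frame, k, l, rng):
    """train: (m, T, H, W) array. Returns the (T, H) line mask and the acquisition order."""
    m, T, H, _ = train.shape
    mask = np.zeros((T, H), dtype=bool)
    order = []
    for _ in range(lines_per_frame):
        for t in range(T):                                    # frames visited in turn: fixed budget per frame
            free = np.flatnonzero(~mask[t])
            cand = rng.choice(free, size=min(k, free.size), replace=False)  # random batch of candidate lines
            batch = rng.choice(m, size=min(l, m), replace=False)            # random batch of training videos
            best_line, best_score = None, -np.inf
            for h in cand:
                mask[t, h] = True                             # tentatively add the candidate line
                score = np.mean([neg_mse(train[i], mask) for i in batch])
                mask[t, h] = False
                if score > best_score:
                    best_line, best_score = h, score
            mask[t, best_line] = True                         # keep the best candidate of this batch
            order.append((t, int(best_line)))
    return mask, order

rng = np.random.default_rng(0)
train = rng.standard_normal((3, 4, 8, 8))                     # m = 3 toy videos, T = 4, 8 x 8 frames
mask, order = stochastic_greedy(train, lines_per_frame=3, k=4, l=2, rng=rng)
print(mask.sum(axis=1), order[:4])
```

Because lines are only ever added, truncating `order` at any point yields the mask for a lower sampling rate, which is the nestedness property mentioned above.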
4 Numerical Experiments
4.1 Implementation details
Reconstruction algorithms: We consider three reconstruction algorithms, namely $k$-$t$ FOCUSS (KTF) [7], IST, and ALOHA [13]. Their parameters were selected to maintain a good empirical performance across all sampling rates considered.
Mask selection baselines:
- Coherence-VD [16]: We consider a random variable-density sampling mask with Gaussian density and optimize its parameters to minimize coherence.
- LB-VD [1, 2]: Instead of minimizing the coherence as in Coherence-VD, we perform a grid search on the parameters using the training set to optimize reconstruction according to the same performance metric as our method.
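The LB-VD idea of tuning variable-density parameters by grid search on training data can be sketched as follows. This is an illustrative sketch only: the Gaussian density over phase-encoding lines, the small probability offset, the always-acquired low-frequency lines, the zero-filled decoder and the negative-MSE metric are assumptions, and the names `gaussian_vd_mask` and `grid_search` are hypothetical.

```python
import numpy as np

def neg_mse(video, line_mask):
    """Negative MSE of a zero-filled reconstruction; line_mask has shape (T, H)."""
    kspace = np.fft.fft2(video, axes=(-2, -1), norm="ortho")
    recon = np.fft.ifft2(kspace * line_mask[:, :, None], axes=(-2, -1), norm="ortho")
    return -np.mean(np.abs(video - recon) ** 2)

def gaussian_vd_mask(T, H, lines_per_frame, sigma, n_center, rng, offset=1e-3):
    """Random (T, H) line mask with a Gaussian density and a fixed budget per frame."""
    freqs = np.fft.fftfreq(H)                                  # DC component sits at index 0
    density = np.exp(-0.5 * (freqs / sigma) ** 2) + offset     # offset keeps high frequencies reachable
    center = np.argsort(np.abs(freqs), kind="stable")[:n_center]  # lowest-frequency lines, always kept
    mask = np.zeros((T, H), dtype=bool)
    mask[:, center] = True
    for t in range(T):
        free = np.flatnonzero(~mask[t])
        p = density[free] / density[free].sum()
        extra = rng.choice(free, size=lines_per_frame - n_center, replace=False, p=p)
        mask[t, extra] = True
    return mask

def grid_search(train, lines_per_frame, sigmas, centers, n_draws, rng):
    """Return ((sigma, n_center), score) maximizing the average training metric."""
    _, T, H, _ = train.shape
    best_params, best_score = None, -np.inf
    for sigma in sigmas:
        for n_center in centers:
            draws = []
            for _ in range(n_draws):                           # average over random mask realizations
                mask = gaussian_vd_mask(T, H, lines_per_frame, sigma, n_center, rng)
                draws.append(np.mean([neg_mse(v, mask) for v in train]))
            score = float(np.mean(draws))
            if score > best_score:
                best_params, best_score = (sigma, n_center), score
    return best_params, best_score

rng = np.random.default_rng(0)
train = rng.standard_normal((3, 4, 16, 16))
params, score = grid_search(train, 5, sigmas=[0.1, 0.2, 0.4], centers=[1, 3], n_draws=2, rng=rng)
print(params, score)
```

Unlike the greedy approaches, the search space here is restricted to the parametric family, which is the modelling limitation discussed in this paper.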
Data sets: Our dynamic data were acquired in seven adult volunteers with a balanced steady-state free precession (bSSFP) pulse sequence on a whole-body Siemens 3T scanner using a 34-element matrix coil array. Several short-axis cine images were obtained during a breath-hold scan. Fully sampled Cartesian data were acquired using a grid with frames, then combined and cropped to a single coil image. The details of the parameters used are provided in the supplementary material [27]. In the experiments, we used three volumes for training and four for testing.
4.2 Comparison of greedy algorithms
We first compare the performance of G-v1 with SG-v1 and SG-v2, and show the results in Figure 1. We are specifically interested in determining the sensitivity of our algorithm to the sampling batch size $k$ and training batch size $l$ (for SG-v2, we use unless stated differently). We see that using a small batch size (e.g. ) yields a drop in performance, while even improves performance compared to G-v1, with respectively times less computation for SG-v1 and less computation for SG-v2. One should also note that using a batch of training images (SG-v2) does not reduce the performance compared to SG-v1, while greatly reducing computation. Also, additional results (in the supplementary material [27]) show that using larger batches yields similar results as for . The fact that SG-v2 with outperforms G-v1 could be surprising, but it originates in the lack of structure of the problem, where introducing noise in the computations through random batches of samples improves the overall performance of the method. In the sequel, we use and for SG-v2.
4.3 Single coil results
The comparison to baselines is shown on Figures 2 and 3, where we see that the SG-v2 method yields masks that consistently improve the results compared to all variable-density methods used.
We notice in Figure 3 that comparing the reconstruction algorithms with VD methods does not allow for a faithful performance comparison of the reconstruction algorithms: the performance difference between the reconstruction methods is very small. In contrast, considering the reconstruction algorithm jointly with a sampling pattern optimized with our model-free approach makes the performance difference much more noticeable: ALOHA with its corresponding mask clearly outperforms KTF, and this conclusion could not be reached by looking solely at reconstructions with VD-based masks. Note that extended results, along with multi-coil experiments, are available in our supplementary material [27].
4.4 Large scale static results
This last experiment shows the scalability of our method to very large datasets. We used the fastMRI dataset [28] consisting of knee volumes and trained the mask for reconstructing the most central slices of size , which yielded a training set containing slices. For the sake of brevity, we only report computations performed using total variation (TV) minimization with NESTA [29]. For mask design, we used the SG-v2 method with and (2500 times fewer computations compared to G-v1). The LB-VD method was trained using representative slices and optimizing the parameters with a similar computational budget as SG-v2. The result in Figure 4 shows a uniform improvement of our method over the LB-VD approach.
5 Discussion and Conclusion
We presented a scalable sampling optimization method for dMRI, which largely addresses the scalability issues of [1, 2]. Reducing the resources used by G-v1 by a factor close to 200 was shown to have no negative impact on the quality of reconstruction achieved within our framework. Our method was demonstrated to successfully scale to very large datasets such as fastMRI [28], which the previous greedy method [1] could not achieve.
The masks obtained bring significant image quality improvements over the baselines. The results suggest that VD-based methods limit the performance of CS applied to MRI through their underlying model. They are consistently outperformed by our model-free and data-adaptive method on different in vivo datasets, across several decoders, fields of view and resolutions. Our findings highlight that sampling design should not be considered in isolation from the data and the reconstruction algorithm, as using a mask that is not specifically optimized can considerably hinder the performance of the algorithm.
More importantly, our theoretical results show that the generic non-convex Problem (3) aiming at finding a probability mass function under a cardinality constraint from which a mask is subsequently sampled, is equivalent to the discrete Problem (7) of looking for the support of this PMF. This connection opens the door to rigorously leveraging techniques from combinatorial optimization for the problem of designing optimal, data-driven sampling masks for MRI.
Appendix A Detailed description of the datasets
Cardiac dataset. The data set was acquired in seven healthy adult volunteers with a balanced steady-state free precession (bSSFP) pulse sequence on a whole-body Siemens 3T scanner using a 34-element matrix coil array. Several short-axis cine images were acquired during a breath-hold scan. Fully sampled Cartesian data were acquired using a grid, with relevant imaging parameters including field of view (FoV), slice thickness, spatial resolution, temporal resolution, TE/TR, [math] flip angle, /px readout bandwidth. There were phase encodes acquired for a frame during one heartbeat, for a total of frames after the scan.
The Cartesian cardiac scans were then combined to single coil data from the initial size, using adaptive coil combination \citeapndxwalsh2000adaptive, griswold2002use, which keeps the image complex. This single coil image was then cropped to a image. This is done because a large portion of the periphery of the images is static or void, and also to enable greater computational efficiency.
Vocal dataset. The vocal dataset that we used in the experiments F comprised vocal tract scans with a 2D HASTE sequence (T2 weighted single-shot turbo spin-echo) on a 3T Siemens Tim Trio using a 4-channel body matrix coil array. The study was approved by the local institutional review board, and informed consent was obtained from all subjects prior to imaging. Fully sampled Cartesian data were acquired using a grid, with field of view (FoV), slice thickness, spatial resolution, TE/TR, [math] flip angle, /px readout bandwidth, echo spacing ( turbo factor). There was a total of frames acquired, which were recombined to single coil data using adaptive coil combination as well \citeapndxwalsh2000adaptive, griswold2002use.
fastMRI. The fastMRI dataset was obtained from the NYU fastMRI initiative [28]. The anonymized dataset comprises raw k-space data from more than 1,500 fully sampled knee MRIs obtained on 3 and 1.5 Tesla magnets. The dataset includes coronal proton density-weighted images with and without fat suppression.
Appendix B Extended literature review
The most widely used approach for the design of the sampling pattern is random variable-density sampling, which was originally proposed by Lustig et al. [16] for static MRI and adapted to dynamic MRI by Jung et al. [17]. It offers a compromise between incoherent measurements, required by the theory of CS, and the structure that can be found in the k-space, where most of the energy is concentrated in the low frequency end of the spectrum. This classical approach draws random samples according to a parametric distribution mimicking the energy distribution of the k-space, favoring low-frequency samples. The distribution considered is typically either polynomial [16, 22] \citeapndxkim2012accelerated,tremoulheac2014dynamic, or Gaussian [7, 8, 11, 12, 13, 14]. In these setups, a slight offset is often added in order to prevent the distribution from having extremely small probabilities at high-frequencies, and a few low-frequency k-space samples are acquired at the Nyquist rate.
The variable-density based methods commonly used in dMRI perform well, but have several weaknesses, already highlighted in [1] for static MRI. They require parameters to be tuned, such as decay rate of the polynomial, the standard deviation of the Gaussian distribution or the number of central phase encodes and arbitrarily constrain the sampling patterns to a model without any theoretical justification. Moreover, it is unclear which sampling density will be most effective for a given anatomy and reconstruction rule. Also, the idea of randomizing the acquisition is in itself questionable, as in practice, one would desire to design a fixed sampling pattern that we will know to perform well for a specific anatomy across many subjects. Finally, some variable-density methods, such as Poisson Disc Sampling \citeapndxvasanawala2011practical, do not use a fixed number of readouts per frame, which complicates their hardware implementation for dynamic MRI \citeapndxahmad2015variable. Indeed, undersampling some frames more heavily than others might result in missing critical temporal information.
Recently, several articles have focused on improved design of spatiotemporal sampling patterns for dMRI, and we hereafter detail two particularly relevant methods. A recent method devised for this purpose is the variable density incoherent spatiotemporal acquisition (VISTA) \citeapndxahmad2015variable that maximizes Riesz energy on a spatiotemporal grid, and has the notable advantage of generating patterns with high levels of incoherence, and maintaining uniform sampling density across frames. Another important technique proposed by Li et al. \citeapndxli2018dynamic develops a method for Cartesian sampling exploiting the golden ratio, with the aim of generating incoherent measurements and maintaining uniform sampling density across frames. (This approach is different from the commonly used golden-angle scheme in radial sampling.)
Other relevant undersampling works include, in the non-Cartesian setting, fully random radial sampling \citeapndxjung2010radial, tremoulheac2014dynamic, as well as golden-angle radial sampling, where spokes separated by the golden angle are continuously acquired \citeapndxwinkelmann2007optimal, feng2014golden,feng2016xd. These results exploit the inherent advantage of radial over Cartesian sampling that each spoke goes through the center of the k-space and can thus contain low-frequency as well as high-frequency information. More recent works also leverage variable-density approaches in the non-Cartesian setting \citeapndxboyer2016generation,lazarus2018variable. Also, in static MRI, several methods exploiting training signals have been proposed: in \citeapndxknoll2011adapted, zhang2014energy,vellagoundar2015robust, a distribution from which random samples are drawn is constructed, and in \citeapndxseeger2010optimization,ravishankar2011adaptive,liu2012under, haldar2019oedipus, a single image is used at a time to determine the sampling mask. Very recently, deep-learning based methods have enabled active mask design paired with online reconstruction and shown very promising results \citeapndxjin2019self,zhang2019reducing,weiss2019learning. However, to the best of our knowledge, none of these methods have been extended to dynamic MRI.
Appendix C Influence of the batch size on the mask design
In this appendix, we discuss the tuning of the batch size used in SG-v1, to specifically study the effect of different batch sizes. We ran SG-v1 with different batch sizes in the same setting as in the numerical experiment of Section 4.3 and report in Figure 5 the PSNR of the reconstructions for SG-v1. We only considered KTF for brevity. We see that very small batch sizes yield poor results, and the PSNR reaches the result from G-v1 with as few as 38 samples (out of samples overall). Unless the batch size is extremely small (less than to of all phase encoding lines at each greedy iteration), the results suggest that the masks obtained with SG-v1 or SG-v2 yield satisfactory reconstruction quality, i.e. the same quality as G-v1 or even an increase.
Figure 6 shows the different masks obtained for the batch sizes considered, and several observations can be made. First of all, as expected, taking a batch size of yields a totally random mask, and taking a batch size of yields a mask that is more centered towards low frequencies than the one with but it still has a large variance. Then, as the batch size increases, the resulting masks seem to converge to very similar designs, but those are slightly different from the ones obtained with G-v1.
Appendix D Computational costs
We report here the computational costs for the different variations of the greedy methods used in the single coil experiment 4.3 as well as the computational costs for the Appendix F. Table 1 provides the running times and empirically measured speedup for the greedy variation, and Table 2 provides the computational times required to obtain the learning-based variable density (LB-VD) parameters through an extensive grid-search. The empirical speedup is computed as
[TABLE]
The main point of these tables is to show that the computational improvement is very significant in terms of resources, and that our approach improves greatly the efficiency of the method of [1]. This ratio might differ from the predicted speedup factor of due to computational considerations. Table 1 shows that we have roughly a factor between the predicted and the measured speedup, mainly due to the communication between the multiple processes as well as I/O operations.
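A rough way to reason about the predicted speedup is to count decoder calls. The sketch below is a back-of-the-envelope illustration, not the formula used in the paper: it assumes that the plain greedy method evaluates every remaining candidate line on every training image at each iteration, that the stochastic variant evaluates a fixed batch of candidate lines on a fixed batch of training images, and that every decoder call costs the same. The function names and the numbers in the example are hypothetical.

```python
def decoder_calls_greedy(n_candidates, n_selected, m):
    """Calls made by a plain greedy search: all remaining lines on all m images."""
    return m * sum(n_candidates - j for j in range(n_selected))

def decoder_calls_stochastic(n_selected, k, l):
    """Calls made by a stochastic greedy search: k candidate lines on l images per iteration."""
    return n_selected * k * l

# Hypothetical setting: 100 candidate lines, 25 selected, 6 training images,
# batches of 10 candidate lines and 2 training images.
greedy = decoder_calls_greedy(n_candidates=100, n_selected=25, m=6)
stochastic = decoder_calls_stochastic(n_selected=25, k=10, l=2)
print(greedy, stochastic, greedy / stochastic)
```

Such a count ignores inter-process communication and I/O, which is one reason a measured speedup can fall short of the predicted one.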
Appendix E Multicoil experiments
For the multicoil experiment, we used the previously described cardiac dataset but we did not crop the images. We took the first frames for all subjects, and selected coils that cover the region of interest. Each image was then normalized in order for the resulting sum-of-squares image to have at most unit intensity. When required, the coil sensitivities were self-calibrated according to the idea proposed in \citeapndxfeng2013highly, which averages the signal acquired over time in the k-space and subsequently performs adaptive coil combination \citeapndxwalsh2000adaptive,griswold2002use.
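The normalization step described above can be sketched as follows. This is a minimal illustration assuming the coil images are stored with the coil index on the first axis and that a single global scale factor is applied to all coils; the function names are hypothetical.

```python
import numpy as np

def sum_of_squares(coil_images):
    """Combine complex coil images of shape (C, ...) into one magnitude image."""
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=0))

def normalize_to_unit_sos(coil_images):
    """Scale all coils by one factor so that the sum-of-squares image peaks at 1."""
    peak = sum_of_squares(coil_images).max()
    return coil_images / peak if peak > 0 else coil_images

rng = np.random.default_rng(0)
coils = rng.standard_normal((4, 10, 16, 16)) + 1j * rng.standard_normal((4, 10, 16, 16))
normalized = normalize_to_unit_sos(coils)     # 4 coils, 10 frames of size 16 x 16
print(sum_of_squares(normalized).max())
```

Using one factor for all coils preserves the relative coil sensitivities, which matters when the sensitivities are later estimated from the data.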
The advantage of using self-calibration is that the greedy optimization procedure can simultaneously take into account the need for accurate coil estimation as well as accurate reconstruction, thus potentially eliminating the need for a calibration scan prior to the acquisition. A more complete discussion of the accuracy of self-calibrated coil sensitivities is presented in \citeapndxfeng2013highly.
We used $k$-$t$ SPARSE-SENSE \citeapndxotazo2010combination and ALOHA [13] for reconstruction. While the first requires coil sensitivities, the second reconstructs the images directly in k-space before combining the reconstructed data. We also introduce an additional mask design baseline, namely golden ratio Cartesian sampling \citeapndxli2018dynamic, that we will use in the sequel. We will refer to it as golden.
Appendix F Additional single-coil results with SG-v1
While the main paper focused on SG-v2, using a batch of training samples instead of the whole training set, we focus here on results with SG-v1. SG-v1 accelerated G-v1 by a factor , and we contend that due to the small dataset used in our case, using a batch of training data instead of the whole set should not affect the performance.
F.1 Comparison to baselines
The comparison to baselines is shown on Figures 2 and 3, where we see that the learning-based method yields masks which consistently improve the results compared to all variable-density methods used. Even though some variable-density techniques are able to provide good results for some sampling rates and algorithms, our learning-based technique is able to consistently provide improvement over this baseline. Compared to Coherence-VD, there is always at least dB improvement at any sampling rate, and it can be as much as dB at sampling rate for ALOHA. For golden, there is an improvement larger than dB prior to rate, and around dB after for all decoders. Figure 2 also clearly indicates that the benefits of our learning-based framework become more apparent towards higher sampling rates, where the performance improvement over LB-VD reaches up to dB. Towards lower sampling rates, with much fewer degrees of freedom for mask design, the greedy method and LB-VD yield similar performance as expected. As shown in Figure 3, the learning-based masks tend to conserve better the sharp contrast transition compared to the variable-density techniques.
F.2 Cross-performances of performance measures
Up to here, we used PSNR as the performance measure, and we now compare it with the results of the greedy algorithm paired with SSIM, a metric that more closely reflects perceptual similarity. For brevity, we only consider ALOHA in this section. In the case where we optimized for SSIM, we noticed that unless a low-frequency initial mask is given, the reconstruction quality would mostly stagnate. This is why we chose to start the greedy algorithm with low-frequency phase encodes at each frame in the SSIM case.
The reconstructions for PSNR and SSIM are shown on Figure 10, where we see that the learning-based masks outperform the baselines across all sampling rates except at in the SSIM case. The quality of the results is very close for both masks, but each tends to perform slightly better with the performance metric for which it was trained. The fact that the ALOHA-SSIM result at has a very low SSIM is due to the fact that we impose phase encodes across all frames, and the resulting sampling mask at is a low pass mask in this case.
A visual reconstruction is provided in Figure 10, where we see that there is almost no difference in reconstruction quality, and that the masks remain very similar. Overall, we observe in this case that the performance metric selection does not have a dramatic effect on the quality of reconstruction, and our greedy framework is still able to produce masks that outperform the baselines when optimizing SSIM instead of PSNR.
F.3 Experiments with different anatomies
In these last experiments, we consider both the single coil cardiac dataset as well as the vocal imaging dataset both of size . The cardiac dataset was trained on samples and tested on , using only the first ten frames of each scan, whereas the vocal one used training samples and testing samples. In this setup, the k-space of the cardiac dataset tends to vary more from one sample to another than the vocal one, making the generalization of the mask more complicated. This issue would require more training samples, but imposing SG-v1 algorithm to start with central phase encoding lines on each frame was found to be sufficient to acquire the peaks in the k-space across the whole dataset. SGv1-Cardiac refers to the greedy algorithm using cardiac data, and SGv1-Vocal is its vocal counterpart. The algorithm used a batch of size at each iteration, and the results were obtained using only KTF.
The results are reported in Figures 11 and 12, and we see that, for both datasets, the greedy approach provides superior results compared with VD sampling methods across all sampling rates. It is striking that, in this setting, the SG-v1 approach outperforms all the baselines even more convincingly: the LB-VD approach is outperformed by more than dB by SG-v1 in this case, whereas it remained very competitive in the other settings. This difference is clear in the temporal fidelity of both reconstructions in Figure 12, where we see that the LB-VD approach loses sharpness and accuracy compared to SG-v1.
F.4 Comparison across anatomies
The main complication in applying the masks across anatomies is that the shape of the k-space spectrum might vary heavily across datasets: the vocal spectrum is very sharply peaked, while the cardiac one is much broader. Comparing the cross-performances in Figures 12, we see that the and SGv1-Vocal masks generalize much better to the cardiac dataset than the other way around. This can be explained by the differences in the spectra: because the cardiac spectrum is more spread out, the cardiac mask captures the very low frequencies of the k-space less faithfully. These frequencies are crucial to a successful reconstruction on the vocal dataset, so missing them hinders the reconstruction quality. We also see that the trained mask must be paired with its own anatomy to obtain the best performance.
F.5 Additional visual reconstructions for cardiac and vocal dataset
The present appendix provides further results for experiments F.3 and F.4. We show in Figures 14 and 14 reconstructions at different frames, which provide clearer visual information about the quality of reconstruction than the temporal profiles.
For these images, the PSNR and SSIM are computed with respect to each individual frame, showing the quality of the reconstruction in a much more detailed fashion than before, where we considered each dynamic scan as a whole. Generally, as previously observed, the mask trained for a specific anatomy most faithfully captures the sharp contrast transitions in the dynamic regions of the images. For the vocal images, we see that sampling the first frame more heavily is important in order to avoid the very large PSNR discrepancy observed for the other masks. The PSNR otherwise remains quite stable across the frames.
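The per-frame evaluation can be sketched as follows. This is a minimal illustration, not the evaluation code used for the figures: it assumes the frames lie along the last axis, compares magnitude images, and takes the peak value to be the maximum magnitude of each reference frame, which is one common convention among several.

```python
import numpy as np

def psnr_per_frame(reference, reconstruction):
    """Compute one PSNR value (in dB) per frame of a dynamic scan.
    Both inputs have shape (..., n_frames); magnitudes are compared.
    The peak is the maximum magnitude of each reference frame
    (an assumption; other normalizations are possible)."""
    ref = np.abs(np.asarray(reference))
    rec = np.abs(np.asarray(reconstruction))
    if ref.shape != rec.shape:
        raise ValueError("reference and reconstruction must have the same shape")
    n_frames = ref.shape[-1]
    values = np.empty(n_frames)
    for t in range(n_frames):
        mse = np.mean((ref[..., t] - rec[..., t]) ** 2)
        peak = ref[..., t].max()
        # A perfect frame has zero error, hence infinite PSNR.
        values[t] = np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
    return values
```

Plotting the returned vector against the frame index makes discrepancies such as a poorly reconstructed first frame directly visible, which a single PSNR over the whole scan would average out.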
F.6 Noisy experiments
In order to test the robustness of our framework to noise, we artificially added bivariate, circularly symmetric complex Gaussian noise to the normalized complex images, with a standard deviation for both the real and imaginary components. We then tested whether the greedy framework is able to adapt to the level of noise by prescribing a different sampling pattern than in the previous experiments.
We chose to use V-BM4D (Maggioni et al., 2012) as the denoiser, with its default suggested mode using Wiener filtering and the low-complexity profile, and provided the algorithm with the standard deviation of the noise as the denoising parameter. The comparison between the fully sampled denoised images and the original ones yields an average PSNR of dB across the whole dataset. Because none of the reconstruction algorithms that we used incorporates a denoising parameter, we simply apply V-BM4D separately to the real and the imaginary parts of the reconstruction result. The results that we obtain are presented in Figures 16 and 16.
It is interesting to notice in Figure 16 that the learning-based framework outperforms the baselines that are not learning-based by a larger margin than in the noiseless case, and this is again especially true at low sampling rates. In this case, however, the difference between the SG-v1 and LB-VD methods is much smaller. This might be explained by the fact that noise corrupts the high-frequency samples, so the masks concentrate more around low frequencies, leaving less room for designs that differ substantially.
We see a clear adaptation of the resulting learning-based masks, as shown by comparing Figures 3 and 16: the masks SGv1-KTF and SGv1-ALOHA, which are trained on the noisy data, are closer to low-pass masks, because the high-frequency details are lost to noise and hence no very high frequency samples are added to the mask.
Also, note that even if the discrepancy in PSNR is only around dB between the golden ratio sampling and the optimized one, the temporal details are much more faithfully preserved by the learning-based approach, which is crucial in dynamic applications. The inadequacy of coherence-based sampling is highlighted in this case, as very little temporal information is captured in the reconstruction with both decoders. Also, for both decoders, there is a clear improvement in the preservation of the temporal profile when using learning-based masks compared to the baselines; the improvement of the SGv1-ALOHA mask of around dB also shows how well our framework is able to adapt to this noisy situation, whereas Coherence-VD yields results of unacceptable quality.
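The noise model described above can be sketched in a few lines. The function name and the `seed` argument are illustrative, and `sigma` stands for the per-component standard deviation, whose value is not reproduced here.

```python
import numpy as np

def add_complex_gaussian_noise(images, sigma, seed=None):
    """Add circularly symmetric complex Gaussian noise to complex images.
    The real and imaginary parts receive independent zero-mean Gaussian
    noise, each with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    images = np.asarray(images, dtype=complex)
    noise = (rng.normal(0.0, sigma, images.shape)
             + 1j * rng.normal(0.0, sigma, images.shape))
    return images + noise
```

Since the two components are independent with equal variance, the noise magnitude has variance 2·sigma², which is worth keeping in mind when relating `sigma` to a noise level measured on magnitude images.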
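The post-processing step, in which a real-valued denoiser is applied independently to the real and imaginary parts of a complex reconstruction, can be sketched as below. V-BM4D itself is not implemented here: `temporal_mean_filter` is a deliberately simple, hypothetical stand-in with the same call shape (data plus noise level), used only so that the example runs.

```python
import numpy as np

def denoise_complex(volume, denoise_real, sigma):
    """Apply a real-valued denoiser separately to the real and imaginary
    parts of a complex volume and recombine the two results."""
    volume = np.asarray(volume)
    return denoise_real(volume.real, sigma) + 1j * denoise_real(volume.imag, sigma)

def temporal_mean_filter(x, sigma):
    """Hypothetical stand-in for a video denoiser: a 3-tap moving average
    along the last (time) axis with edge replication. It ignores sigma,
    which a real denoiser such as V-BM4D would use."""
    pad_width = [(0, 0)] * (x.ndim - 1) + [(1, 1)]
    padded = np.pad(x, pad_width, mode="edge")
    return (padded[..., :-2] + padded[..., 1:-1] + padded[..., 2:]) / 3.0
```

Treating the two components separately is a convenient way to reuse a denoiser designed for real-valued video, at the cost of ignoring any correlation between the real and imaginary parts of the signal.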
References
- [1] B. Gözcü, R. K. Mahabadi, Y.-H. Li, E. Ilıcak, T. Çukur, J. Scarlett, and V. Cevher, “Learning-based compressive MRI,” IEEE Transactions on Medical Imaging, 2018.
- [2] B. Gözcü, T. Sanchez, and V. Cevher, “Rethinking sampling in parallel MRI: A data-driven approach,” in 27th European Signal Processing Conference, 2019.
- [3] M. Saeed, T. A. Van, R. Krug, S. W. Hetts, and M. W. Wilson, “Cardiac MR imaging: current status and future direction,” Cardiovascular Diagnosis and Therapy, vol. 5, no. 4, p. 290, 2015.
- [4] E. J. Candes, J. K. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate measurements,” Communications on Pure and Applied Mathematics, vol. 59, no. 8, pp. 1207–1223, 2006.
- [5] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
- [6] M. Lustig, J. M. Santos, D. L. Donoho, and J. M. Pauly, “k-t SPARSE: High frame rate dynamic MRI exploiting spatio-temporal sparsity,” in Proc. of the 13th Annual Meeting of ISMRM, Seattle, vol. 2420, 2006.
- [7] H. Jung, K. Sung, K. S. Nayak, E. Y. Kim, and J. C. Ye, “k-t FOCUSS: A general compressed sensing framework for high resolution dynamic MRI,” Magnetic Resonance in Medicine, vol. 61, no. 1, pp. 103–116, 2009.
- [8] R. Otazo, D. Kim, L. Axel, and D. K. Sodickson, “Combination of compressed sensing and parallel imaging for highly accelerated first-pass cardiac perfusion MRI,” Magnetic Resonance in Medicine, vol. 64, no. 3, pp. 767–776, 2010.
