Build-A-FLAIR: Synthetic T2-FLAIR Contrast Generation through Physics Informed Deep Learning

Andrew S. Nencka; Andrew Klein; Kevin M. Koch; Sean D. McGarry; Peter S. LaViolette; Eric S. Paulson; Nikolai J. Mickevicius; L. Tugan Muftuler; Brad Swearingen; Michael A. McCrea

arXiv:1901.04871 · physics.med-ph · January 16, 2019

TL;DR

This paper presents a physics-informed deep learning approach to generate synthetic T2-FLAIR MRI images from other contrast images, demonstrating high similarity to real images and emphasizing the importance of physically relevant inputs.

Contribution

The study introduces a neural network model that leverages physical relationships between MRI contrasts to accurately synthesize T2-FLAIR images, highlighting the role of feature engineering.

Findings

1. Best model achieved a structural similarity index of 0.909.

2. Synthetic images had lower noise and increased smoothness.

3. Physically relevant inputs significantly improved performance.

Tables

Table 1: Models evaluated in the Build-A-FLAIR framework. Ten separate models were developed with varying inputs to generate synthetic T2-FLAIR contrast. The models included subsets of input contrasts including T2-weighted, T1-weighted, mean diffusivity (MD), fractional anisotropy (FA), and non-diffusion weighted images (S0). Each row of this table represents one model tested in this work. Contrasts used in each model are marked Included, and contrasts not used as inputs are marked Omitted. Models are listed in order of increasing performance, as measured by the structural similarity index (SSIM) between the synthetic T2-FLAIR image and the acquired T2-FLAIR image in the subject who was fully removed from the training process (V1).

Model	T₂-Weighted	T₁-Weighted	MD	FA	S0	SSIM
1	Included	Omitted	Omitted	Omitted	Omitted	0.69358
2	Omitted	Included	Omitted	Omitted	Omitted	0.76663
3	Omitted	Omitted	Included	Included	Included	0.81578
4	Included	Included	Omitted	Omitted	Omitted	0.81756
5	Omitted	Included	Included	Included	Included	0.87856
6	Omitted	Included	Included	Omitted	Included	0.88312
7	Included	Omitted	Included	Included	Included	0.89064
8	Included	Omitted	Included	Omitted	Included	0.90043
9	Included	Included	Included	Included	Included	0.90584
10	Included	Included	Included	Omitted	Included	0.90881

Taxonomy

Topics: MRI in cancer diagnosis · Radiomics and Machine Learning in Medical Imaging · Advanced MRI Techniques and Applications

Full text

Andrew S. Nencka, PhD 1,2,*, Andrew Klein, MD 1, Kevin M. Koch, PhD 1,2, Sean D. McGarry 1, Peter S. LaViolette, PhD 1, Eric S. Paulson, PhD 3, Nikolai J. Mickevicius, PhD 3, L. Tugan Muftuler, PhD 4,2, Brad Swearingen 4, Michael McCrea, PhD 4

1 Department of Radiology, Medical College of Wisconsin, Milwaukee, WI, USA

2 Center for Imaging Research, Medical College of Wisconsin, Milwaukee, WI, USA

3 Department of Radiation Oncology, Medical College of Wisconsin, Milwaukee, WI, USA

4 Department of Neurosurgery, Medical College of Wisconsin, Milwaukee, WI, USA

* Corresponding author:

Andrew S. Nencka

Department of Radiology, Center for Imaging Research

Medical College of Wisconsin

8701 Watertown Plank Road

Milwaukee, WI 53226, USA

E-mail: [email protected]

Manuscript word count: 2795

Abstract word count: 200

Abstract

Purpose: Magnetic resonance imaging (MRI) exams include multiple series with varying contrast and redundant information. For instance, T2-FLAIR contrast is based upon tissue T2 decay and the presence of water, also present in T2- and diffusion-weighted contrasts. T2-FLAIR contrast can be hypothetically modeled through deep learning models trained with diffusion- and T2-weighted acquisitions.

Methods: Diffusion-, T2-, T2-FLAIR-, and T1-weighted brain images were acquired in 15 individuals. A convolutional neural network was developed to generate a T2-FLAIR image from other contrasts. Two datasets were withheld from training for validation.

Results: Inputs with physical relationships to T2-FLAIR contrast most significantly impacted performance. The best model yielded results similar to acquired T2-FLAIR images, with a structural similarity index of 0.909, and reproduced pathology excluded from training. Synthetic images qualitatively exhibited lower noise and increased smoothness compared to acquired images.

Conclusion: These results suggest that, with optimal inputs, deep learning based contrast generation performs well in creating synthetic T2-FLAIR images. Feature engineering on neural network inputs, based upon the physical basis of contrast, impacts the generation of synthetic contrast images. A larger, prospective clinical study is needed.

Keywords: Machine Learning, MRI, Synthetic Contrast, T2-FLAIR, Style Transfer

1 Introduction

A benefit of magnetic resonance imaging (MRI) is the availability of varying contrasts based upon physical properties of the tissue being imaged [1]. Thus, MRI exams include the acquisition of several different imaging series with varying contrasts [2]. With each series requiring multiple minutes of acquisition time, this leads to long exam times. Long exam times yield increased imaging costs, decreased patient tolerance, and decreased access to MRI. Much work is being done to reduce MRI exam duration to address these challenges, thereby improving the value of MRI.

One way to reduce exam duration is to shorten each acquired series. Series acquisitions can be accelerated by acquiring less data, either reducing anatomical coverage [3; 4; 5; 6] or spatial resolution. Because of Fourier encoding used in MRI, there is a Nyquist sampling criterion which must be met to enable the reconstruction of artifact free images [1]. Parallel imaging, utilizing coil array sensitivities to spatially encode the image, allows model-based reconstruction of images sampled below the Nyquist limit [7; 8; 9]. Sampling beyond the Nyquist limit can additionally be achieved using the mathematics of image compression and iterative reconstruction algorithms through compressed sensing reconstruction [10]. Beyond compressed sensing reconstruction, empirical methods using artificial intelligence and machine learning are also now pushing the boundaries of parallel imaging forward [11].

Exam duration can be further reduced by decreasing the number of acquired series. Multiple thick slice 2D acquisitions with matching contrast and orthogonal planes of acquisition are being replaced by single 3D acquisitions that can be reformatted in arbitrary planes and include higher signal-to-noise ratio to enable further use of parallel imaging [12]. Exam duration can also be reduced by including the reconstruction of multiple contrasts from specialized acquisitions, like synthetic MRI [13; 14] and MR fingerprinting [15].

This work aims to improve MRI value by reducing the number of series acquired in an MRI exam through the synthetic generation of contrasts from other series acquired in the standard of care. The complexity of the underlying biology and physics makes the mapping of a set of MRI contrast weighted images to a different contrast weighted image a difficult analytical problem. In such a case, a complicated non-linear transform could be highly unstable and highly dependent upon system settings including various transmit and receive gains which vary from patient-to-patient. Instead of developing such an analytical solution, we present a deep convolutional neural network which generates an image with new contrast based upon input images of other contrasts.

This study tests two hypotheses. First, we determine whether a convolutional neural network can be developed to yield accurate T2-FLAIR images from other contrasts. Second, we test the impact of input contrast selection on neural network performance, hypothesizing that inclusion of contrasts arising from similar physical properties as the desired output improves model performance.

2 Methods

The data used for this proof of concept implementation were acquired as part of a large-scale study of sports related concussion [16; 17; 18]. This source was selected because it includes many high-resolution 3D image acquisitions across a number of imaging contrasts with well controlled acquisition parameters. A subset of the data acquired in this protocol, including sessions from 15 male collegiate athletes acquired in the fall of 2017 on a GE Healthcare Discovery MR750 running software release DV26R01, were extracted for this study.

The one-hour exam in the larger study included T1-weighted MPRAGE, T2-weighted 3D variable flip angle fast spin echo (Cube), T2-FLAIR Cube, multi-echo susceptibility weighted (SWAN), diffusion tensor, arterial spin labeling, and task-free fMRI. Here, an MPRAGE (184 sagittal slices, 1 mm³ isotropic resolution, TE 2 ms, TR 4700 ms, TI 1060 ms), T2-weighted Cube (180 sagittal slices, 1 mm³ isotropic resolution, TE 93 ms, TR 2505 ms, ETL 140), T2-FLAIR Cube (180 sagittal slices, 1 mm³ isotropic resolution, TE 118 ms, TR 6002 ms, TI 1600 ms, ETL 230), and diffusion tensor imaging (DTI; 47 sagittal slices, 3 mm³ isotropic resolution, TE 67 ms, TR 5250 ms, 30 directions at b = 1000 s/mm², 30 directions at b = 2000 s/mm²) series were analyzed.

Images were converted from DICOM to NIfTI format using dcm2niix [19]. Diffusion images were processed using a standard pipeline in FSL [20] to yield fractional anisotropy (FA), mean diffusivity (MD), and baseline T2-weighted images (S0). Images within each subject were registered to the MPRAGE volume using FLIRT [21]. Registered NIfTI images were read into Python 3.6 [22] as NumPy arrays [23] using the NiBabel package [24]. Images derived from the diffusion acquisition were resampled with third order splines to 1 mm³ resolution using functions in NiBabel.
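
As a point of reference for the diffusion metrics named above, the following minimal sketch computes MD and FA from the three eigenvalues of a diffusion tensor using their standard textbook definitions. It is not taken from the processing pipeline used in this work, which is not described at this level of detail; the function names and the example eigenvalues are hypothetical and chosen only for illustration.

```python
import math


def mean_diffusivity(l1, l2, l3):
    """Mean of the three diffusion tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0


def fractional_anisotropy(l1, l2, l3):
    """Standard FA definition: normalized spread of the eigenvalues."""
    md = mean_diffusivity(l1, l2, l3)
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0.0:
        return 0.0  # no measurable diffusion; FA is defined as zero here
    return math.sqrt(1.5 * num / den)


# Hypothetical eigenvalues in mm^2/s, for illustration only
print(mean_diffusivity(1.7e-3, 0.3e-3, 0.3e-3))
print(fractional_anisotropy(1.7e-3, 0.3e-3, 0.3e-3))
```

MD depends only on the overall magnitude of water diffusion, while FA depends only on its directionality, which is the distinction drawn on when the two metrics are compared as network inputs.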

The Build-A-FLAIR deep neural network was developed using PyTorch [25]. The model was a patch-to-voxel convolutional neural network, modeling each output voxel value through a dense neural network whose inputs were the 3-dimensional neighborhoods ($5\times 5\times 5$ voxels) about that voxel in each of the input contrast images. The network included ten densely connected layers, with the number of neurons decreasing linearly from the number of input voxels to one across the network. Each artificial neuron was activated with a rectified linear unit (ReLU) [26]. Dropout layers with 50% dropout were added after the second, fourth, sixth, and eighth layers during training to reduce over-fitting [27]. A graphical representation of this Build-A-FLAIR network is shown in Figure 1.
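
The patch-to-voxel input described above can be sketched as follows, assuming the contrast volumes are already co-registered NumPy arrays on the same 1 mm grid. The function name, the random stand-in volumes, and the requirement that the center voxel lie at least two voxels from every edge are assumptions of this sketch, not details from the original implementation.

```python
import numpy as np

PATCH = 5          # neighborhood edge length in voxels, from the text
HALF = PATCH // 2  # voxels on each side of the center voxel


def extract_input_vector(volumes, center):
    """Concatenate the flattened 5x5x5 neighborhoods around `center`
    from each co-registered contrast volume into one input vector.

    Assumes `center` is at least HALF voxels away from every edge.
    """
    x, y, z = center
    patches = [
        vol[x - HALF:x + HALF + 1,
            y - HALF:y + HALF + 1,
            z - HALF:z + HALF + 1].ravel()
        for vol in volumes
    ]
    return np.concatenate(patches)


rng = np.random.default_rng(0)
# Hypothetical stand-ins for registered T2-weighted, MD, and S0 volumes
volumes = [rng.random((16, 16, 16)) for _ in range(3)]
vec = extract_input_vector(volumes, center=(8, 8, 8))
print(vec.shape)  # (375,): 3 contrasts x 125 voxels each
```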

Data for thirteen participants were used for network training. One participant, who exhibited an asymptomatic white matter hyperintensity in the frontal lobe, was excluded from the training process and reserved for algorithm validation (referred to as V1 below). Images from this participant are shown in Figure 2, with the white matter hyperintensity shown in the participant’s left anterior frontal lobe. Because a key feature of synthetic contrast generation is the reproduction of pathology, this particular subject was selected for algorithm validation. The other 14 subjects were free from observed pathology. A random subject was also excluded from the training dataset for validation to reduce the probability of over-fitting (referred to as V2 below).

Training was performed on a computer with an Intel i7-3770K processor and an NVIDIA Titan V graphical processing unit. Training included 500 epochs and batch sizes of 100 estimated voxels. For each epoch, images from one subject of the training set were randomly selected and 2,000 voxels to estimate were randomly extracted without resampling from within the brain from that subject. An Adam optimizer was used with a learning rate of 1e-4 [28], and the mean squared error of the estimated voxel values was minimized. Following each epoch, 2,000 voxels from the validation dataset V2 were estimated and the model was saved if the mean squared error on the validation dataset was reduced.
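
The per-epoch sampling and the save-on-improvement rule can be sketched as below. This is a minimal illustration only: it interprets "without resampling" as sampling without replacement, uses a hypothetical cubic brain mask, and omits the network and optimizer entirely. The function names are not from the original code.

```python
import numpy as np

VOXELS_PER_EPOCH = 2000  # voxels drawn from one subject per epoch
BATCH_SIZE = 100         # estimated voxels per batch


def sample_epoch_batches(brain_mask, rng):
    """Draw 2,000 distinct in-brain voxel coordinates and split them
    into batches of 100 coordinates each."""
    coords = np.argwhere(brain_mask)  # shape (n_brain_voxels, 3)
    picked = rng.choice(len(coords), size=VOXELS_PER_EPOCH, replace=False)
    return np.split(coords[picked], VOXELS_PER_EPOCH // BATCH_SIZE)


def should_save(val_mse, best_mse):
    """Keep the model only if the validation error decreased."""
    return val_mse < best_mse


# Hypothetical brain mask: a 20x20x20 cube inside a 32x32x32 volume
mask = np.zeros((32, 32, 32), dtype=bool)
mask[6:26, 6:26, 6:26] = True

batches = sample_epoch_batches(mask, np.random.default_rng(0))
print(len(batches), batches[0].shape)
```

Each coordinate returned here would then be expanded into its multi-contrast neighborhood before being passed to the network.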

Ten models were tested. These models were designed to evaluate the hypothesis that performance is optimized with input contrasts physically related to the output contrast. The models are described in Table 1. While the network architecture and training procedure were controlled from model-to-model, the number of inputs, and thus number of neurons in each layer, varied with each model. For each input contrast, 125 voxels corresponding to the $5\times 5\times 5$ voxel neighborhood around the voxel to be generated were included in the input. Thus, the first layer included 125 neurons for a network including only the T2-weighted acquisition for an input while that layer included 250 neurons for a model including both T2-weighted and T1-weighted inputs.
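
To make the dependence of layer size on the number of input contrasts concrete, the sketch below computes one plausible set of layer widths for a ten-layer network that decreases linearly from the input size to one. The rounding of intermediate widths to the nearest integer is an assumption of this sketch; the text specifies only the linear decrease and the first-layer sizes.

```python
VOXELS_PER_CONTRAST = 5 * 5 * 5  # 125 voxels in each 5x5x5 neighborhood
N_LAYERS = 10


def layer_widths(n_contrasts, n_layers=N_LAYERS):
    """Widths decreasing linearly from the input size down to one.

    Intermediate widths are rounded to the nearest integer, which is
    an assumption made for illustration.
    """
    n_in = n_contrasts * VOXELS_PER_CONTRAST
    step = (n_in - 1) / (n_layers - 1)
    return [round(n_in - i * step) for i in range(n_layers)]


# One, two, and four input contrasts (for example, models 1, 4, and 10)
for n in (1, 2, 4):
    print(n, layer_widths(n))
```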

Following training, fit models were applied to the V1 validation dataset to generate a full synthetic T2-FLAIR volume. The structural similarity index (SSIM) [29] was calculated over the brain between the synthetic and acquired T2-FLAIR image volumes.
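
For readers unfamiliar with the metric, the following sketch evaluates the SSIM formula once over all masked voxels, using the conventional stabilizing constants K1 = 0.01 and K2 = 0.03. This is a simplification: SSIM is usually computed in local windows and then averaged, and the windowing used in this work is not stated, so values from this sketch would not be expected to match Table 1. The function name and the random test data are hypothetical.

```python
import numpy as np


def global_ssim(x, y, mask, data_range):
    """SSIM computed from global statistics over the masked voxels.

    Simplified, single-window form of the index; standard
    implementations average the same expression over local windows.
    """
    x = x[mask].astype(np.float64)
    y = y[mask].astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


rng = np.random.default_rng(0)
acquired = rng.random((16, 16, 16))  # hypothetical acquired volume
synthetic = acquired + rng.normal(0.0, 0.1, acquired.shape)
brain = np.ones(acquired.shape, dtype=bool)  # hypothetical brain mask

print(global_ssim(acquired, acquired, brain, data_range=1.0))
print(global_ssim(acquired, synthetic, brain, data_range=1.0))
```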

3 Results

Performances of the models are shown in Table 1. Models with single contrast inputs (T2-, T1-, or diffusion-weighted) were among the worst performing models, with the three lowest SSIM metrics. Models built upon only T2-weighted and T1-weighted anatomical acquisitions performed nearly as poorly as the model built with only diffusion data as an input.

Models including both high resolution anatomical images and diffusion imaging metrics yielded superior performance. If only one high resolution anatomical imaging dataset was included in the model, performance was better if it was T2-weighted. With diffusion metrics included, models with inputs of both T1-weighted and T2-weighted images yielded marginally improved results over a model without T1-weighted images. In all cases, including FA yielded poorer performance compared to equivalent models without FA.
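
The FA comparison can be checked directly against the SSIM values in Table 1. The short sketch below pairs each model that includes FA with the otherwise identical model that omits it; the values are copied from the table, and the helper name is only for illustration.

```python
# SSIM values copied from Table 1, keyed by model number
ssim = {5: 0.87856, 6: 0.88312, 7: 0.89064, 8: 0.90043,
        9: 0.90584, 10: 0.90881}

# (model with FA, otherwise identical model without FA)
fa_pairs = [(5, 6), (7, 8), (9, 10)]


def fa_effect(with_fa, without_fa):
    """Change in SSIM attributable to adding FA as an input."""
    return ssim[with_fa] - ssim[without_fa]


for with_fa, without_fa in fa_pairs:
    delta = fa_effect(with_fa, without_fa)
    print(f"model {with_fa} vs {without_fa}: {delta:+.5f}")

# Adding T1-weighted input to the best model without T1 (8 -> 10)
print(f"model 10 vs 8: {ssim[10] - ssim[8]:+.5f}")
```

All three FA differences are negative, while the T1-weighted difference is positive but small, in line with the observations above.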

Synthetic FLAIR images resulting from the best performing Build-A-FLAIR network applied to validation subject V1 are shown in Figure 3, with acquired images in panels (a-c) and synthetic images in panels (d-f). The aforementioned hyperintensity is visible on all cross sections in the acquired images, and was reproduced in the synthetic T2-FLAIR image derived from the Build-A-FLAIR network. The synthetic images qualitatively exhibit more smoothness and less noise than the acquired T2-FLAIR images.

4 Discussion

This study demonstrates the utility of a convolutional neural network for generating synthetic T2-FLAIR images from conventionally acquired images of different contrast. The model reproduces pathology not present in training data. Importantly, optimal network performance results from the inclusion of physically relevant contrasts relating to T2-FLAIR contrast. In fact, inclusion of images in the training that were not directly physically relevant to T2-FLAIR contrast was detrimental to model performance. While models employing deep learning are widely considered to be “black boxes,” these results shed some light on the underlying mechanisms of such models.

Using deep learning to transform image contrast is not new. In the field of artificial intelligence, there has been great progress in developing techniques for style transfers [30]. With style transfers, the characteristics of the desired output style are learned from an image with the desired output style and the characteristics of the output content are learned from an image with the desired output content. The resulting convolutional neural networks are merged to yield a network which generates an image with the style of one input image applied to the content of a second input image.

It is clear that style transfer networks are related to the presented work in only the most general way, as both yield outputs with different contrast and same gross structure of input images. It is conceivable that the T2-FLAIR style could be learned from a set of T2-FLAIR images from a set of patients and the gross anatomical structure content learned from an anatomical image with different, say T1-weighted, contrast from the patient of interest. Importantly, the texture of the output image is based upon the texture profile of the image used to train the style portion of the network. As the clinical interpretation of images is often related to the texture of the diagnostic image, this may be suboptimal.

The Build-A-FLAIR network, conversely, approaches the problem of a style transfer in MRI as a non-linear regression model that generates a new contrast image based upon the relative intensities of a set of multi-contrast input images. Thus, while style transfer networks require unique training with each desired content, the Build-A-FLAIR network is trained on one set of multi-contrast images and the model is applied to novel images in the synthesis process. This makes the Build-A-FLAIR network dependent upon consistency of input image contrast between the training dataset and the subsequent images on which the neural network is used for inference. Additionally, while style transfer methods work on the scale of a full output image at once, the Build-A-FLAIR network performs inference on an output voxel-by-voxel basis. In doing so, output texture is based only upon the local neighborhoods of a voxel in the input multi-contrast images.

Models including FA as an input performed worse than models not including FA. An ideal machine learning algorithm should theoretically perform no worse with an additional input than the equivalent model without it, because it could assign zero weights to the additional input data. It is apparent, however, that the implemented algorithm reaches a local minimum with non-zero weights. T2-FLAIR contrast, being dependent upon free water and tissue T2 relaxation rate, should not depend upon the anisotropy of water diffusion. This result is consistent with the hypothesis that the inclusion of input contrasts with similar physical mechanisms of contrast as the desired output is optimal.

Qualitatively, the generated T2-FLAIR images exhibit less noise and more smoothness than the acquired T2-FLAIR images. The process of regressing the intensity of each voxel as a function of a series of filters applied to the neighborhood of the voxel is likely responsible for this result. The denoising characteristic of this method is similar to non-local means denoising [31] wherein an output voxel is modeled as a weighted combination of regions with similar structure. With a deep convolutional neural network, the fit filters function as the library of similar structures used in non-local means. This neighborhood dependence, as well, is likely responsible for the perceived increase in spatial smoothness.

Recent work has included the generation of synthetic contrast images from individual image contrasts using generative adversarial networks (SUSAN, [32]). In that work, contrast changes were meant for data augmentation in machine learning for image segmentation, rather than for the elimination of a given series in a clinical exam. With this different goal, SUSAN is less dependent upon individual image outputs, making the compromises that come from using a single image contrast as input less detrimental.

A weakness in the development of deep convolutional neural networks is the need for a large volume of training data and significant computational resources for full image training and synthesis. The Build-A-FLAIR network was generated as a local convolutional neural network to perform a regression of a single output voxel value as a function of the neighborhood of voxels in the input multi-contrast images to address both of these challenges. By developing the model as a voxel-wise regression, the hundreds of thousands of voxels in each exam could be used as unique training and validation sets. Further, by modeling input patches rather than full images, the training step can hold more data sets in memory. This allows larger batches in training and improved training convergence, especially when finite memory resources are available [33]. The inference problem for each voxel is distinct, allowing the GPU architecture to rapidly perform image synthesis.

With machine learning in medical imaging, the burden of proof to indicate true success that impacts the clinical workflow is high. A concern with contrast generated from artificial intelligence is that the trained model may not represent a pathology which was excluded from training. In this work, a case study of one dataset where a pathology was identified a priori is shown, wherein the network reproduced the T2-FLAIR white matter hyperintensity even though it was not included in training. This is necessary but not sufficient for translational acceptance. For this method to be implemented in the clinic, a much larger scale prospective study must be performed with proper blinded radiologist reader scoring. Such work is ongoing.

While the work herein shows the example of generating synthetic T2-FLAIR images from T2-weighted and diffusion-weighted images, the described network is generalizable. As an example, the network was trained to generate a T2-weighted image from input T2-FLAIR, T1-weighted, and diffusion-weighted images. The result of this model implemented on a dataset not included in the training cohort is shown in Figure 4. While the input diffusion-weighted images include a low resolution T2-weighted image, the output synthetic T2-weighted image retains reasonable spatial resolution. As with the generation of T2-FLAIR images, the inclusion of FA as an input does not improve network performance, and the inclusion of a T1-weighted and T2-FLAIR acquisition as input does not drastically improve performance compared to the inclusion of only the T2-FLAIR acquisition. It is expected that other contrasts could be used for training to yield yet other physically related output contrasts. For instance, it is likely that individual echo images from a multiple gradient echo acquisition (like susceptibility weighted imaging) and a T2-weighted spin echo acquisition could yield synthetic T2-weighted Dixon [34] fat and water images. Such extensions remain as future potential continuations of this work.

5 Conclusions

We demonstrated that T2-FLAIR images can be generated from other standard neuroimaging contrasts. We further showed that optimal performance was achieved by the inclusion of contrasts physically related to T2-FLAIR signal in the training dataset. Inputs of T2-weighted images and mean diffusivity maps most significantly impacted synthetic T2-FLAIR contrast generation because T2-FLAIR contrast is physically related to both T2 tissue decay rates and the presence of free water.

The presented results are a first step toward the consideration of this synthetically generated contrast to be used to improve MRI value. With T2-FLAIR acquisitions exhibiting reduced signal-to-noise ratio and, thus, requiring increased scan duration, the synthetic generation of T2-FLAIR images from other contrasts could improve MRI value by eliminating the need for a long series in an imaging session. While this proof of concept development is promising, further utility of the Build-A-FLAIR model to replace clinically acquired series requires further, large-scale validation.

6 Acknowledgments

This work was supported by the Defense Health Program under the Department of Defense Broad Agency Announcement for Extramural Medical Research through Award No. W81XWH-14-1-0561. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the Department of Defense (DHP funds).

7 Figures and Tables

Bibliography

[1] Bernstein M, King K, Zhou X. Handbook of MRI Pulse Sequences. Elsevier Science, 2004.
[2] Bushberg J. The Essential Physics of Medical Imaging. Lippincott Williams & Wilkins, 2002.
[3] Feinberg DA, Hoenninger J, Crooks L, Kaufman L, Watts J, Arakawa M. Inner Volume MR Imaging: Technical Concepts and Their Application. Radiology 1985;156(3):743–747.
[4] Conturo TE, Price RR, Beth AH. Rapid Local Rectangular Views and Magnifications: Reduced Phase Encoding of Orthogonally Excited Spin Echoes. Magnetic Resonance in Medicine 1988;6(4):418–429.
[5] Wheeler-Kingshott CA, Parker GJ, Symms MR, Hickman SJ, Tofts PS, Miller DH, Barker GJ. ADC Mapping of the Human Optic Nerve: Increased Resolution, Coverage, and Reliability with CSF-suppressed ZOOM-EPI. Magnetic Resonance in Medicine 2002;47(1):24–31.
[6] Pauly J, Spielman D, Macovski A. Echo-Planar Spin-Echo and Inversion Pulses. Magnetic Resonance in Medicine 1993;29(6):776–782.
[7] Pruessmann KP, Weiger M, Scheidegger MB, Boesiger P. SENSE: Sensitivity Encoding for Fast MRI. Magnetic Resonance in Medicine 1999;42(5):952–962.
[8] Griswold MA, Jakob PM, Heidemann RM, Nittka M, Jellus V, Wang J, Kiefer B, Haase A. Generalized Autocalibrating Partially Parallel Acquisitions (GRAPPA). Magnetic Resonance in Medicine 2002;47(6):1202–1210.