DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography

Itay Benou; Tammy Riklin-Raviv

arXiv:1812.05129·cs.CV·October 18, 2019


TL;DR

DeepTract introduces a deep learning framework that estimates white matter fiber orientations from diffusion weighted images without relying on predefined diffusion models, enabling both deterministic and probabilistic tractography.

Contribution

It presents a novel recurrent neural network-based approach for fiber reconstruction that is data-driven and model-agnostic, outperforming traditional methods in certain evaluations.

Findings

01

Competitive performance with state-of-the-art algorithms

02

Effective probabilistic and deterministic tractography

03

Qualitative bundle-specific tractography results

Abstract

We present DeepTract, a deep-learning framework for estimating white matter fiber orientations and performing streamline tractography. We adopt a data-driven approach for fiber reconstruction from diffusion-weighted images (DWI), which does not assume a specific diffusion model. We use a recurrent neural network for mapping sequences of DWI values into probabilistic fiber orientation distributions. Based on these estimations, our model facilitates both deterministic and probabilistic streamline tractography. We quantitatively evaluate our method using the Tractometer tool, demonstrating competitive performance with state-of-the-art classical and machine-learning-based tractography algorithms. We further present qualitative results of bundle-specific probabilistic tractography obtained using our method. The code is publicly available at: https://github.com/itaybenou/DeepTract.git.

Tables

Table 1: Tractometer evaluation results. Up-arrow represents higher-is-better metrics, while down arrow represents lower-is-better ones. Best scores are in bold font. VC, IC and NC are connection scores (%); VB and IB are bundle counts; OL, OR and F1 are coverage scores (%).

Model	VC (%) ↑	IC (%) ↓	NC (%) ↓	VB ↑	IB ↓	OL (%) ↑	OR (%) ↓	F1 (%) ↑
ISMRM mean results	53.6	19.7	25.2	21.4	281	31.0	23.0	44.2
Poulin et al. [21]	41.6	45.6	12.8	23	130	64.4	35.4	64.5
Wegmayr et al. [23]	72	-	-	23	57	16.0	28.0	-
MITK (supervisor)	59.1	27.8	13.1	24	69	47.2	31.2	52.5
Proposed (GT supervision)	70.6	19.5	9.9	25	56	69.3	22.7	70.1
Proposed (MITK supervision)	40.5	32.6	22.9	23	51	34.4	17.3	44.2

Equations

$$\mathit{fODF}_{p_{j}}(d)=\mathbb{E}_{S_{i}}\left[\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d)\right]=\sum_{i}prob\left(S_{i}\right)\,\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d)$$

$$L_{i}=-\frac{1}{n_{i}}\sum_{j=1}^{n_{i}}\sum_{m=1}^{M+1}y_{p_{j}}(d_{m})\log\left(\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d_{m})\right)$$

$$y_{smooth}(d)=\delta\left(d-d_{label}\right)*G(d)=\frac{1}{Z}\exp\left(-\frac{\measuredangle\left(d,d_{label}\right)}{\tau}\right)$$


Code & Models

Repositories

itaybenou/DeepTract (Official)


Full text

Itay Benou, Tammy Riklin Raviv

(1) Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel

(2) The Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev, Beer-Sheva, Israel


1 Introduction

Tractography based on diffusion MRI (dMRI) is an important tool in the study of white matter (WM) structures in the brain, allowing researchers to visualize and analyze complex neural tracts in brain connectivity studies [7] and in the investigation of neurological disorders [4, 5, 9]. Standard tractography pipelines usually consist of a diffusion modeling stage, in which local fiber orientations are estimated from diffusion-weighted images (DWI), followed by a tracking stage in which these orientations are translated into WM streamlines. At the heart of the modeling stage lies the problem of finding the local configuration of WM fibers that gave rise to the measured DWI signal. Since a single brain voxel can contain tens of thousands of differently oriented fibers, accurate reconstruction of fiber orientations is a very challenging task.

Current tractography algorithms can be roughly divided into deterministic approaches, which provide a single streamline orientation in each voxel [2, 17], probabilistic approaches [6], and global approaches [12]. Nevertheless, all of these methods are based on specific, pre-defined mathematical models for mapping dMRI signals into fiber orientation estimates. Among others, these models include the diffusion tensor model [3], Q-ball imaging [11] and spherical deconvolution [22]. Despite the remarkable progress made, current model-based methods are not without limitations [14]. Each such model makes specific assumptions regarding WM tissue properties and the dMRI signal, which may vary substantially depending on the subject and the data acquisition process [20]. Some models also impose specific requirements on the data quality and acquisition protocol, e.g., a large number of gradient directions. Therefore, from the user’s point of view, choosing a suitable model may prove to be a non-trivial task that requires a high level of expertise.

Machine learning (ML) and deep learning (DL) techniques have demonstrated remarkable abilities in tackling complex problems in a data-driven rather than model-based manner, in a wide variety of domains. Recently, such approaches have been applied to the task of WM tractography, aiming to directly learn the mapping between input DWI scans and output WM tractography streamlines. By not assuming a specific diffusion model, data-driven algorithms can reduce the dependence on data acquisition schemes and require less user intervention. Neher et al. [20] pioneered this line of work, proposing a supervised ML tractography algorithm based on a random forest (RF) classifier. The RF classifier was trained to predict a local fiber orientation from a discrete set of possible directions, based on the surrounding dMRI values. More recently, Poulin et al. [21] suggested a DL model for fiber tractography, examining fully-connected (FC) and recurrent neural network (RNN) architectures. In contrast to [20], streamline tractography was addressed as a regression problem by predicting continuous tracking directions based on sequences of dMRI values. A similar regression approach was presented by Wegmayr et al. [23] using a multi-layer perceptron (MLP) network. We note that all of these methods perform deterministic tractography, outputting a single streamline direction in each tracking step. Other DL works have focused strictly on fiber orientation estimation. For example, [16] presented a deep convolutional neural network for estimating discrete fiber orientation distribution functions (fODFs) from dMRI scans. A variation of this idea was presented in [15], predicting spherical harmonics coefficients for continuous fODF estimation. These works, however, do not perform fiber tractography.

In this work we present DeepTract, a novel DL framework addressing both fiber orientation estimation and streamline tractography from DWI scans. To exploit the sequential nature of tractography data, we address the problem as a sequential classification task by training an RNN model to predict local fiber orientations (i.e., classes) along tractography streamlines. Unlike other DL-based tractography algorithms, our model does not output a single deterministic fiber orientation in each tracking step. Instead, it provides a probabilistic estimation of the local fiber orientation distribution in the form of a discrete probability density function. This enables our model to perform deterministic streamline tracking, as well as probabilistic tractography by randomly sampling directions from the estimated distributions. We quantitatively evaluate our method using the Tractometer tool [10], demonstrating improved or competitive results compared to state-of-the-art tractography algorithms. We further present qualitative results of high-quality probabilistic tractograms generated by our method.

2 Methods

In the following sections we describe the proposed DeepTract framework: (1) the input model, (2) how the network learns to predict fiber orientations from DWI data, (3) how new tractograms are generated from unseen data, and (4) the implementation details of the neural network’s architecture.

2.1 Input Model

The training data consists of two sets: a DWI set $\displaystyle\mathcal{D}$ and its corresponding whole-brain tractography $\displaystyle\mathbf{T}=\left\{S_{i}\right\}_{i=1}^{N}$ with $\displaystyle N$ streamlines. Each streamline is represented by a sequence of equi-distant 3D coordinates, i.e., $\displaystyle S_{i}=\left\{p_{j}\right\}_{j=1}^{n_{i}}$ .

Pre-Processing: To handle datasets acquired with different gradient schemes, we first resample the DWI set into $\displaystyle\mathit{K}$ pre-defined gradient directions evenly distributed on the unit hemisphere, using spherical harmonics (we use $\displaystyle K$ =100). Each DWI volume is then centred according to its mean and normalized by the $\displaystyle b_{0}$ (non-diffusion weighted) volume.

Sequential Input Model: To make our model invariant to spatial transformations, we feed it with sequences of DWI values instead of directly using the 3D coordinates of the streamlines. Formally, given a DWI dataset $\displaystyle\mathcal{D}$ and a streamline $\displaystyle S_{i}$ , the input to our model is the series of DWI vectors measured along the streamline, i.e. $\displaystyle\left\{\mathbf{D}(p_{1}),...,\mathbf{D}(p_{n_{i}})\right\}$ . Each input entry $\displaystyle\mathbf{D}(p_{j})$ is a vector of $\displaystyle K$ DWI values measured at location $\displaystyle p_{j}$ , such that a single streamline of length $\displaystyle n_{i}$ corresponds to an input tensor of size $\displaystyle n_{i}\times K$ .
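The sequential input described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the released implementation: the function name `dwi_sequence`, the voxel-coordinate convention, and the nearest-neighbour lookup (rather than interpolation) are assumptions made here for brevity.

```python
import numpy as np

def dwi_sequence(dwi, streamline):
    """Return the (n_i, K) array of DWI vectors measured along one streamline.

    dwi        : array of shape (X, Y, Z, K), already resampled to K directions
    streamline : array of shape (n_i, 3), points given in voxel coordinates
    Nearest-neighbour lookup is an assumption of this sketch.
    """
    idx = np.rint(streamline).astype(int)
    # Clamp indices so points on the volume border stay valid.
    idx = np.clip(idx, 0, np.array(dwi.shape[:3]) - 1)
    return dwi[idx[:, 0], idx[:, 1], idx[:, 2]]

# Toy volume with K = 100 values per voxel and a 3-point streamline.
rng = np.random.default_rng(0)
dwi = rng.random((8, 8, 8, 100))
streamline = np.array([[1.0, 1.0, 1.0], [1.4, 1.2, 1.0], [1.8, 1.4, 1.1]])
seq = dwi_sequence(dwi, streamline)  # one row per streamline point
```

Because only the DWI values enter the model, translating or rotating the streamline coordinates together with the volume leaves `seq` unchanged, which is the invariance the paragraph refers to.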

2.2 Fiber Orientation Estimation

Aiming to facilitate both deterministic and probabilistic tractography, we require our model to provide a probabilistic estimation of local fiber orientations prior to tracking. For this purpose, we use a discrete representation of an fODF by sampling the unit sphere at $\displaystyle M$ evenly-distributed points $\displaystyle\mathbf{d}=\left\{d_{m}\right\}_{m=1}^{M}$ , each representing a possible fiber orientation. We, therefore, address the problem as a classification task where each orientation is considered as a separate “class”. An additional “end-of-fiber” (EoF) class is used for labeling streamline termination points. Given an input streamline, our model predicts a probability density function $\displaystyle P(d)$ of $\displaystyle M+1$ class probabilities at each point along the streamline. We note that this formulation poses a tradeoff between higher angular resolution, achieved by increasing the number of classes, and the complexity of the classification problem. We used $\displaystyle M=724$ , providing an angular resolution of $\displaystyle\sim 3.5^{\circ}$ .
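As a rough illustration of the discrete class set, the sketch below builds $M$ approximately evenly spaced unit vectors and maps each step of a streamline to its nearest class, with index $M$ reserved for EoF. The text does not say how the sphere is sampled, so the Fibonacci lattice used here is an assumption, as are the function names.

```python
import numpy as np

def fibonacci_sphere(m):
    """m roughly evenly spaced unit vectors (Fibonacci lattice; an assumption)."""
    k = np.arange(m) + 0.5
    z = 1.0 - 2.0 * k / m                  # uniform spacing in z
    phi = np.pi * (1.0 + 5.0 ** 0.5) * k   # golden-ratio increments in azimuth
    r = np.sqrt(1.0 - z ** 2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def direction_labels(streamline, directions):
    """Class index per point: nearest direction to each step, EoF for the last point."""
    steps = np.diff(streamline, axis=0)
    unit = steps / np.linalg.norm(steps, axis=1, keepdims=True)
    nearest = np.argmax(unit @ directions.T, axis=1)  # max cosine = min angle
    eof = len(directions)                             # extra class with index M
    return np.append(nearest, eof)

directions = fibonacci_sphere(724)
straight = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 2.0]])
labels = direction_labels(straight, directions)
```

Increasing `m` lowers the quantization error of each label but enlarges the softmax output, which is the tradeoff noted above.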

Conditional fODFs: Standard fODFs represent the total orientation distribution function at a voxel location $\displaystyle p_{j}$ , independent of other voxels, i.e. $\displaystyle\mathit{fODF}_{p_{j}}(d)=P(d)$ . Being a sequence-based model, DeepTract has the advantage of utilizing the “history” of DWI values along an input streamline $\displaystyle S_{i}=\left\{p_{1},...,p_{n_{i}}\right\}$ . Accordingly, it yields a conditional estimation of the fODF at location $\displaystyle p_{j}\in S_{i}$ , i.e. $\displaystyle\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d)=P\left(d\mid\mathbf{D}\left(p_{j}\right),\mathbf{D}\left(p_{j-1}\right),...,\mathbf{D}\left(p_{1}\right)\right)$ . We note that a direct relation between CfODFs and total (standard) fODFs can be obtained. Let $\displaystyle prob\left(S_{i}\right)$ denote the probability of reaching the point $\displaystyle p_{j}$ via path $\displaystyle S_{i}$ , out of all streamlines passing through $\displaystyle p_{j}$ . Using the total probability theorem, we have:

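$$\mathit{fODF}_{p_{j}}(d)=\mathbb{E}_{S_{i}}\left[\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d)\right]=\sum_{i}prob\left(S_{i}\right)\,\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d)$$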
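A small numeric example may help make the total-probability relation concrete. The numbers are invented for illustration: two streamlines pass through the same point, each with its own conditional distribution over three directions.

```python
import numpy as np

# One row per streamline S_i through p_j, one column per direction d_m.
cfodfs = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.8]])
# Hypothetical probabilities of reaching p_j via each streamline (sum to 1).
prob_s = np.array([0.75, 0.25])

# Total fODF at p_j: the prob-weighted average of the conditional fODFs.
fodf = prob_s @ cfodfs
```

Here `fodf` evaluates to `[0.55, 0.175, 0.275]`, which is again a valid distribution because both the weights and each row sum to one.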

2.3 Streamline Tractography

During training, the predicted CfODF at location $\displaystyle p_{j}$ is compared to the “true” orientation defined by $\displaystyle d_{label}(p_{j})=(p_{j+1}-p_{j})/\left\|p_{j+1}-p_{j}\right\|$ (see Fig. 1(a)). The corresponding class label $\displaystyle y_{p_{j}}(d)$ is the orientation $\displaystyle d_{m}\in\mathbf{d}$ which is closest to $\displaystyle d_{label}(p_{j})$ , represented by a “1-hot” vector. Once the model is trained, streamline tractography can be performed on unseen DWI scans in an iterative process as illustrated in Fig. 1(b). Given an initial seed point $\displaystyle\hat{p}_{1}$ , the corresponding DWI vector $\displaystyle\mathbf{D}(\hat{p}_{1})$ is fed into the network, which in turn provides $\displaystyle\mathit{CfODF}(\hat{p}_{1})$ as output. Deterministic tracking is performed by stepping in the most likely fiber orientation, i.e. $\displaystyle\hat{d}(\hat{p}_{1})=\arg\underset{d\in\mathbf{d}}{\max}\mathit{CfODF}(\hat{p}_{1})$ . Alternatively, probabilistic tracking can be performed by randomly sampling a direction from the CfODF. Either way, the streamline is propagated iteratively according to $\displaystyle\hat{p}_{j+1}=\hat{p}_{j}+\alpha\hat{d}(\hat{p}_{j})$ , where $\displaystyle\alpha$ is the step size. The process is repeated until the EoF class is predicted.
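The iterative tracking procedure can be sketched as a short loop. Everything named here is hypothetical: `dwi_at` stands in for the DWI lookup, `predict_cfodf` stands in for the trained network and is simply handed the whole history at each step (a real RNN would carry a hidden state instead), and taking the argmax over all $M+1$ classes so that EoF can win is an assumption of this sketch.

```python
import numpy as np

def track(seed, dwi_at, predict_cfodf, directions, step=0.5,
          probabilistic=False, max_steps=1000, rng=None):
    """Propagate one streamline from `seed` until the EoF class is chosen."""
    if rng is None:
        rng = np.random.default_rng()
    eof = len(directions)                  # class index M marks end-of-fiber
    points = [np.asarray(seed, dtype=float)]
    history = []
    for _ in range(max_steps):
        history.append(dwi_at(points[-1]))
        cfodf = predict_cfodf(np.stack(history))    # shape (M + 1,)
        if probabilistic:
            cls = int(rng.choice(len(cfodf), p=cfodf))  # sample a class
        else:
            cls = int(np.argmax(cfodf))                 # most likely class
        if cls == eof:
            break
        points.append(points[-1] + step * directions[cls])
    return np.stack(points)

# Toy setup: six axis-aligned directions and a stub model that steps along +x
# three times and then predicts EoF.
directions = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                       [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)

def toy_model(history):
    p = np.zeros(len(directions) + 1)
    p[0 if len(history) < 4 else -1] = 1.0
    return p

path = track([0.0, 0.0, 0.0], lambda p: np.zeros(100), toy_model, directions)
```

Switching `probabilistic=True` changes only the class selection line, which is why a single trained model can serve both tracking modes.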

2.4 Network Architecture and Loss Function

We implement our model using an RNN, specifically a Gated Recurrent Unit (GRU) [8]. The proposed network consists of five stacked GRU layers, each containing 1000 units. We use ReLU activations for all layers but the last one, which is a fully-connected (FC) layer followed by a softmax operation. The loss function for a single input streamline $\displaystyle S_{i}$ is the mean cross-entropy between the predicted CfODFs and the true labels along the streamline:

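$$L_{i}=-\frac{1}{n_{i}}\sum_{j=1}^{n_{i}}\sum_{m=1}^{M+1}y_{p_{j}}(d_{m})\log\left(\mathit{CfODF}_{\left(p_{j}\mid S_{i}\right)}(d_{m})\right)$$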
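The per-streamline loss is an ordinary cross-entropy averaged over the $n_{i}$ points. A minimal NumPy sketch follows; the function name and the small `eps` added for numerical safety are assumptions of this example.

```python
import numpy as np

def streamline_loss(cfodf_pred, y_true, eps=1e-12):
    """Mean cross-entropy along one streamline.

    cfodf_pred, y_true : arrays of shape (n_i, M + 1); each row sums to 1.
    """
    per_point = -np.sum(y_true * np.log(cfodf_pred + eps), axis=1)
    return float(np.mean(per_point))

# Three points, four classes: a uniform prediction against 1-hot labels.
pred = np.full((3, 4), 0.25)
y = np.eye(4)[[0, 2, 3]]
loss = streamline_loss(pred, y)  # equals log(4) for a uniform prediction
```

The same function accepts smoothed (non-1-hot) label rows unchanged, since it only requires each row of `y_true` to be a distribution.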

2.4.1 Label Smoothing:

Unlike traditional classification tasks, here the classes are geometrically structured with a well-defined angular metric. This implies that different classification errors should be weighted according to this metric, thus a 1-hot label (i.e., a delta function) is not suitable. To account for the spatial structure of the classes, we propose to smooth the true labels by convolving them with a Gaussian kernel $\displaystyle G$ of width $\displaystyle\tau$ on the unit sphere:

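$$y_{smooth}(d)=\delta\left(d-d_{label}\right)*G(d)=\frac{1}{Z}\exp\left(-\frac{\measuredangle\left(d,d_{label}\right)}{\tau}\right)$$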

where $\displaystyle{\measuredangle\left(d,d_{label}\right)}$ is the angle between a direction $\displaystyle d$ and the ground truth direction $\displaystyle d_{label}$ , and $\displaystyle Z$ is a normalization constant (see Fig. 2).
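The smoothed label can be computed directly from the angles between the ground-truth direction and every class direction. The sketch below follows the expression as printed (angle divided by $\tau$); the function name, the value of `tau`, and the choice to leave the EoF class out of the smoothing are assumptions of this example.

```python
import numpy as np

def smooth_label(d_label, directions, tau):
    """Angularly smoothed label over the M direction classes (EoF not included)."""
    cos = np.clip(directions @ d_label, -1.0, 1.0)
    angle = np.arccos(cos)          # angle between each class and d_label, in radians
    weights = np.exp(-angle / tau)
    return weights / weights.sum()  # dividing by the sum plays the role of 1/Z

# Six axis-aligned classes; the ground-truth direction is +x.
directions = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                       [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
y_smooth = smooth_label(np.array([1.0, 0.0, 0.0]), directions, tau=0.5)
```

In this toy case the mass peaks at the +x class, the four orthogonal classes share an equal smaller weight, and the opposite direction receives the least, so a near miss is penalized less than a gross error.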

2.4.2 Entropy-Based Tracking Termination:

During the generative process of RNNs at test time, accumulated error may divert predictions from the training data distribution [13], towards “unfamiliar” input values. This may result in an increased uncertainty of our model’s predictions, manifesting as more isotropic CfODF estimations and erroneous tracking steps. To alleviate this problem, we introduce an entropy-based tracking termination criterion, in addition to the EoF class. Specifically, we terminate the tracking process whenever the entropy of a predicted CfODF exceeds the following dynamic threshold: $\displaystyle E_{th}\left(t\right)=a\exp\left(-t/b\right)+c$ , where $\displaystyle t$ is the sequence time step, and $\displaystyle a$ , $\displaystyle b$ and $\displaystyle c$ are hyperparameters.
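A sketch of the entropy-based stopping rule is given below. The default values of `a`, `b` and `c` are the ones reported in Section 3.1; the use of the natural logarithm for the entropy and the function names are assumptions, since the text does not specify them.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a discrete distribution (natural log; an assumption)."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))

def entropy_threshold(t, a=3.0, b=10.0, c=4.5):
    """Dynamic threshold E_th(t) = a * exp(-t / b) + c."""
    return a * np.exp(-t / b) + c

def should_terminate(cfodf, t, **params):
    """True when the predicted CfODF is too uncertain at time step t."""
    return entropy(cfodf) > entropy_threshold(t, **params)

uniform = np.full(725, 1.0 / 725)   # fully isotropic prediction over M + 1 classes
peaked = np.zeros(725)
peaked[0] = 1.0                     # fully confident prediction
```

Under these assumptions the threshold starts at 7.5, above the maximum possible entropy of about 6.59 for 725 classes, so early steps are never cut off, and it decays towards 4.5 so that isotropic predictions later in the sequence stop the tracking.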

3 Experiments and Results

To test the performance of the proposed method we used the ISMRM tractography challenge DWI phantom dataset [18]. Our experiments include: 1) Quantitative evaluation of whole brain tractography using the Tractometer tool. 2) Qualitative (visual) demonstration of bundle-specific probabilistic tractography performed by our model.

Pre-Processing: DWIs were denoised [19] and corrected for eddy currents and head motion. For supervision, whole brain tractography was performed using Q-ball reconstruction [1] followed by probabilistic tracking using the MITK diffusion tool. The resulting streamlines were divided into training and validation sets using a 90%-10% split. Data augmentation was performed by reversing the orientation of all streamlines in the training set, resulting in $\displaystyle\sim$ 400K streamlines.
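The split and the reversal augmentation can be sketched as follows. The random shuffling before the split, the seed, and the function name are assumptions of this example; only the 90%-10% ratio and the reversal of training streamlines come from the text.

```python
import numpy as np

def split_and_augment(streamlines, val_fraction=0.1, seed=0):
    """Split into train/validation sets, then add reversed copies to the train set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(streamlines))
    n_val = int(round(val_fraction * len(streamlines)))
    val = [streamlines[i] for i in order[:n_val]]
    train = [streamlines[i] for i in order[n_val:]]
    # Reversing the point order flips the tracking direction of each streamline.
    reversed_train = [s[::-1].copy() for s in train]
    return train + reversed_train, val

# Ten toy streamlines of different lengths, each of shape (n, 3).
toy = [np.arange(3 * n, dtype=float).reshape(n, 3) for n in range(2, 12)]
train, val = split_and_augment(toy)
```

Reversal doubles the training set, and it lets the model see every fiber traversed from both endpoints, so tracking can start from either side of a bundle.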

Training Procedure: Training was performed using the Adam optimizer with a batch size of 32 streamlines. To avoid overfitting, dropout was used with a drop probability of 0.3. Gradient clipping was applied to avoid exploding gradients.

3.1 Whole Brain Tractography Evaluation

We first evaluate our method using Tractometer, a publicly available online tool for assessment of whole brain tractography algorithms.

Tractography: We used our trained model to perform whole brain deterministic tractography. Tracking was initialized by randomly placing 200K seed points within the DWI volume. Using the validation set, the step size was set to $\displaystyle\alpha=0.5$ (in voxels), and the entropy-based stopping parameters were set to $\displaystyle a=3$ , $\displaystyle b=10$ and $\displaystyle c=4.5$ . Tracking was terminated for high-curvature steps (over $\displaystyle 60^{\circ}$ ), and output streamlines longer than 200 mm or shorter than 20 mm were discarded.
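The curvature and length criteria can be expressed as two small checks. The function names are hypothetical, and the voxel size passed in the example (2 mm) is an assumption needed to convert voxel-unit steps into millimetres; it is not stated in the text.

```python
import numpy as np

def turn_angle_deg(prev_step, next_step):
    """Angle in degrees between two consecutive tracking steps."""
    cos = np.dot(prev_step, next_step) / (
        np.linalg.norm(prev_step) * np.linalg.norm(next_step))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def length_mm(streamline, voxel_size_mm):
    """Total streamline length, for points given in voxel coordinates."""
    steps = np.diff(streamline, axis=0) * voxel_size_mm
    return float(np.linalg.norm(steps, axis=1).sum())

def keep_streamline(streamline, voxel_size_mm, min_mm=20.0, max_mm=200.0):
    """Keep streamlines whose length lies within [min_mm, max_mm]."""
    return min_mm <= length_mm(streamline, voxel_size_mm) <= max_mm

# A straight streamline of 21 points with a 0.5-voxel step size.
line = np.zeros((21, 3))
line[:, 0] = 0.5 * np.arange(21)
```

During tracking, a step would be rejected when `turn_angle_deg` exceeds 60, while `keep_streamline` is applied once to each finished streamline.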

Evaluation: Tractometer evaluates a whole brain tractogram by comparing it to 25 gold standard streamline bundles, outputting the following scores: three measures for the correctness of streamline connectivity, i.e. percentage of valid connections (VC), invalid connections (IC) and non-connections (NC); two measures for the correctness of identified bundles, i.e. number of valid and invalid bundles (VB and IB); and three scores for the correctness of voxel coverage, i.e. overlap (OL), overreach (OR) and $\displaystyle F_{1}$ scores. Please refer to [10] for more details.

Results: The whole brain tractography generated by the proposed DeepTract method is shown in Fig. 3, along with those of MITK and the ISMRM challenge gold standard. The Tractometer scores of our method, when supervised by MITK and by the gold-standard tractography, are summarized in Table 1. We compare these results to the performance of MITK (supervisor), the average ISMRM challenge submission, and two other DL tractography methods [21, 23]. The results demonstrate that DeepTract is competitive with or outperforms the examined methods on most metrics. Specifically, when using the gold standard tractography for supervision, DeepTract demonstrated the best voxel coverage performance, i.e. the highest OL and F1 scores. In addition, DeepTract achieved the lowest rates of erroneous connections (lowest IC and NC), and was the only method to successfully detect all 25 bundles, demonstrating the upper-limit capability of the proposed method. We note that our VC score is slightly lower than that of [23]; however, [23] did not report IC, NC and F1 scores, so a complete comparison to their work is not possible. We further note that even when using the MITK tractography for supervision, our method demonstrated good overall performance, including the best false-positive rates (lowest OR and IB). This result is probably due to our entropy-based termination criterion, which prevents streamlines from straying off coherent bundle structures.

3.2 Probabilistic Tracking

We further demonstrate DeepTract’s ability to perform probabilistic tractography, using the phantom DWI dataset of the ISMRM challenge. Bundle-specific tracking was performed by seeding from the endpoints of a gold-standard bundle, and then sampling iteratively from the predicted CfODFs. The process was repeated $\displaystyle T=20$ times to create a probabilistic map counting the number of “visits” in each voxel. Results for the Frontopontine tract and the Uncinate Fasciculus are shown in Fig. 4, alongside the ground truth bundles. Visual evaluation shows that the resulting bundles are in line with the ground truth tractograms. Also note that higher probabilities were assigned to the bundles’ core, gradually decreasing as fibers diverge towards their endpoints due to higher uncertainty.
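The visit-count map can be sketched as an accumulation over sampled streamlines. Counting each voxel at most once per streamline, rounding points to the nearest voxel, and the function name are assumptions of this example; points are assumed to lie inside the volume.

```python
import numpy as np

def visit_map(streamlines, shape):
    """Count, for every voxel, how many streamlines pass through it."""
    counts = np.zeros(shape, dtype=int)
    for s in streamlines:
        # Unique voxel indices, so one streamline adds at most 1 per voxel.
        idx = np.unique(np.rint(s).astype(int), axis=0)
        counts[idx[:, 0], idx[:, 1], idx[:, 2]] += 1
    return counts

# Two toy streamlines (voxel coordinates) that share the voxel (1, 0, 0).
s1 = np.array([[0.0, 0.0, 0.0], [0.4, 0.0, 0.0], [1.0, 0.0, 0.0]])
s2 = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
counts = visit_map([s1, s2], shape=(4, 4, 4))
```

Dividing `counts` by the number of sampled streamlines would turn the map into a per-voxel visitation frequency, one common way to display such results.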

4 Summary and Discussion

We presented DeepTract, the first deep learning method capable of performing both deterministic and probabilistic tractography from DWI data. We showed that by combining an RNN-based sequential approach with a discrete classification framework, our model provides reliable probabilistic fiber orientation estimations. In a quantitative evaluation, the proposed method outperformed or was competitive with state-of-the-art classical and DL tractography algorithms. While larger dMRI and tractography datasets are needed to further progress the research of data-driven tractography, the results obtained in this work demonstrate the potential of DL methods for WM tractography applications.

Bibliography

[1] I. Aganj, C. Lenglet, and G. Sapiro. ODF reconstruction in Q-ball imaging with solid angle consideration. In ISBI, pages 1398–1401, 2009.
[2] P. Basser. Fiber-tractography via diffusion tensor MRI (DT-MRI). In Proceedings of the 6th Annual Meeting ISMRM, 1998.
[3] P. Basser, J. Mattiello, and D. Le Bihan. Estimation of the effective self-diffusion tensor from the NMR spin echo. JMR, pages 247–254, 1994.
[4] I. Benou, R. Veksler, A. Friedman, and T. Riklin-Raviv. Fiber-flux diffusion density for white matter tracts analysis: Application to mild anomalies localization. In CDMRI: MICCAI Workshop, page 191, 2018.
[5] I. Benou, R. Veksler, A. Friedman, and T. Riklin-Raviv. Combining white matter diffusion and geometry for tract-specific alignment and variability analysis. NeuroImage, 2019.
[6] J. Berman, S. Chung, P. Mukherjee, et al. Probabilistic streamline Q-ball tractography using the residual bootstrap. NeuroImage, pages 215–222, 2008.
[7] E. Bullmore and O. Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience, page 186, 2009.
[8] J. Chung, C. Gulcehre, K. Cho, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.