ABCD Neurocognitive Prediction Challenge 2019: Predicting individual residual fluid intelligence scores from cortical grey matter morphology

Neil P. Oxtoby; Fabio S. Ferreira; Agoston Mihalik; Tong Wu; Mikael Brudfors; Hongxiang Lin; Anita Rau; Stefano B. Blumberg; Maria Robu; Cemre Zor; Maira Tariq; Maria Del Mar Estarellas Garcia; Baris Kanber; Daniil I. Nikitichev; Janaina Mourao-Miranda

arXiv:1905.10834·q-bio.NC·May 28, 2019·MICCAI


TL;DR

This study attempted to predict residual fluid intelligence scores from cortical grey matter morphology using graph-theory metrics derived from MRI data, but found limited predictive power, indicating these features offer little insight into intelligence variation.

Contribution

The paper introduces a method using morphological similarity and graph-theory metrics from MRI data to predict residual fluid intelligence, highlighting the limited predictive value of these features.

Findings

01. Minimal improvement over baseline prediction

02. Structural covariance networks provide little information about residual fluid intelligence

03. Support vector regression trained on MRI features showed limited success


Tables

Table 1: Descriptive values for 26 SCN graph-theory features across training, validation, and test sets. Values are: mean (std). Missing data was due to feature generation failure (see Methods): training set 96% complete; validation 94%; test 92%. Notes: Centrality = Betweenness centrality; Clustering = Clustering coefficient; mad = median absolute deviation.

Feature	Training (N=3579 of 3739)	Validation (N=390 of 415)	Test (N=4156 of 4515)
Whole Network Features
Small-world	1.68 (0.03)	1.68 (0.02)	1.68 (0.02)
Rich Club – median	0.29 (0.01)	0.29 (0.01)	0.29 (0.01)
Rich Club – mad	0.11 (0.03)	0.11 (0.01)	0.11 (0.01)
Path Length – median	2.48 (0.03)	2.48 (0.01)	2.48 (0.01)
Path Length – std	1.15 (0.03)	1.15 (0.02)	1.16 (0.02)
Degree – median	1050 (45)	1052 (45)	1053 (40)
Degree – mad	295 (19)	294 (15)	294 (15)
Centrality – median	6584 (171)	6590 (150)	6578 (152)
Centrality – mad	5153 (157)	5157 (118)	5158 (117)
Clustering – median	0.53 (0.01)	0.53 (0.01)	0.53 (0.01)
Clustering – mad	0.063 (0.005)	0.063 (0.005)	0.063 (0.005)
Community 1/2/3 features
Avg. Degree – 1	995 (233)	1004 (232)	995 (239)
Avg. Degree – 2	996 (240)	997 (235)	999 (239)
Avg. Degree – 3	1019 (242)	1004 (243)	1014 (238)
Avg. degree z-score (all)	0.2 (0.1)	0.2 (0.1)	0.2 (0.1)
Avg. path length (all)	1.5 (0.1)	1.5 (0.1)	1.5 (0.1)
Centrality – 1	6020 (3750)	6180 (3800)	6100 (3780)
Centrality – 2	6290 (3790)	6030 (3740)	6290 (3770)
Centrality – 3	6660 (3830)	6550 (3760)	6520 (3810)
Clustering – 1	0.53 (0.06)	0.53 (0.06)	0.52 (0.06)
Clustering – 2	0.53 (0.06)	0.53 (0.06)	0.53 (0.06)
Clustering – 3	0.53 (0.06)	0.53 (0.06)	0.52 (0.06)

Table 2: Mean-squared error (MSE) and correlation for the predictive models. For reference, the variance of the training set was 85.85 and that of the validation set was 71.53.

Prediction method	Training MSE	Training correlation	Validation MSE	Validation correlation	Test MSE
SVR	85.82	0.02	71.19	0.01	93.8335
EBM+KRR	85.46	0.001	71.58	0.003	N/A


Full text

Neil P. Oxtoby (1)*, Fabio S. Ferreira (1, 2)*, Agoston Mihalik (1, 2), Tong Wu (1, 2), Mikael Brudfors (1, 3), Hongxiang Lin (1), Anita Rau (1, 3), Stefano B. Blumberg (1), Maria Robu (1, 3), Cemre Zor (1, 2), Maira Tariq (1), Maria Del Mar Estarellas Garcia (1), Baris Kanber (5), Daniil I. Nikitichev (1, 3), Janaina Mourao-Miranda (1, 2)

(1) Centre for Medical Image Computing (CMIC), Department of Computer Science & Department of Medical Physics and Biomedical Engineering
(2) Max Planck UCL Centre for Computational Psychiatry and Ageing Research
(3) The Wellcome Centre for Human Neuroimaging
(4) Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS)
(5) Department of Clinical and Experimental Epilepsy, Queen Square Institute of Neurology

University College London, Gower Street, London, WC1E 6BT, United Kingdom

* These authors contributed equally to this work.

Abstract

We predicted residual fluid intelligence scores from T1-weighted MRI data available as part of the ABCD NP Challenge 2019, using morphological similarity of grey-matter regions across the cortex. Individual structural covariance networks (SCN) were abstracted into graph-theory metrics averaged over nodes across the brain and in data-driven communities/modules. Metrics included degree, path length, clustering coefficient, centrality, rich club coefficient, and small-worldness. These features derived from the training set were used to build various regression models for predicting residual fluid intelligence scores, with performance evaluated both using cross-validation within the training set and using the held-out validation set. Our predictions on the test set were generated with a support vector regression model trained on the training set. We found minimal improvement over predicting a zero residual fluid intelligence score across the sample population, implying that structural covariance networks calculated from T1-weighted MR imaging data provide little information about residual fluid intelligence.

Keywords:

Support Vector Regression · Fluid Intelligence · MRI · Structural Covariance Networks · Graph theory features

1 Introduction

Establishing the neurobiological mechanisms underlying intelligence is a key area of research in neuroscience [1]. A strong correlation has been observed between cognitive ability measured at a very young age and socioeconomic status [2], as well as longevity and health [3], at an older age. Moreover, intelligence has been shown to be very stable from young to old age in the same individuals [4][5]. Thus, understanding the mechanisms of cognitive abilities has implications for the health of the general population and can be used to enhance such abilities, for example through education or environment [6].

Neuroimaging plays a key role in advancing our knowledge of the neurological mechanisms of intelligence. Several brain-imaging studies have shown links between brain features and intelligence, including a positive correlation with cortical volume and thickness, specifically in the frontal and temporal regions [7, 8, 9, 10, 11]. A link has also been observed between intelligence and the structural integrity of white matter [12] and the functional integrity of the temporal, frontal and parietal cortices [13]. Studies have involved both adults and children [14, 15]. The ABCD NP Challenge asks the question “How predictable is fluid intelligence from brain imaging data?” To answer this, we took a data-driven, exploratory approach of trying many models and image-based features, starting with a hackathon led by the UCL Centre for Medical Image Computing (CMIC). CMIC aims to make an impact on key medical challenges facing 21st century society through performing world-leading research on problems in medical imaging and image-analysis. Our expertise extends from feature extraction/generation through to image-based modelling [16, 17], machine learning [18, 19], and beyond. The hackathon took place one afternoon in February 2019 and involved researchers across research groups in UCL CMIC, in addition to colleagues from the affiliated UCL Wellcome Centre for Human Neuroimaging, UCL Department of Clinical and Experimental Epilepsy, and Max Planck UCL Centre for Computational Psychiatry and Ageing Research. Regular follow-up progress meetings were held after the hackathon.

The brain is a complex organ widely touted as operating as a cliquish small-world network [20], although this may not be the whole story [21]. The ABCD NP Challenge lacks the diffusion MRI data necessary to estimate anatomical connectivity via tractography. However, it is possible to quantify morphological similarity of an individual’s cortex using a graph called a “structural covariance network” (SCN), which can be used to distinguish between clinical groups [22]. We calculate SCNs for each individual in the ABCD NP Challenge data set and input them as features to train predictive models of residual fluid intelligence (rFIQ).

The paper is structured as follows. Section 2 describes the challenge data and our methods. Section 3 presents our results, which we discuss in Section 4 before concluding in Section 5.

2 Methods

2.1 Data

The ABCD NP Challenge data consists of a cross-section of imaging data and intelligence scores for children aged 9–10 years. The T1-weighted MRI data was acquired using the protocol detailed on the challenge website [23] and in [24], and split into training ( $N=3739$ ), validation ( $N=415$ ), and test ( $N=4515$ ) sets. The training and validation sets also include scores of fluid intelligence, which the ABCD Study measures using the NIH Toolbox Neurocognition battery [25]. For the challenge, fluid intelligence was residualized to remove dependence upon brain volume, data collection site, age at baseline, sex at birth, race/ethnicity, highest parental education, parental income, and parental marital status. While we understand the motivation — the challenge is to predict intelligence from imaging — this pre-residualization choice in the challenge design is somewhat limiting because it completely removes any ability to include covariance of these factors with image-based features. The MRI data provided was already in pre-processed form. Pre-processing included skull-stripping, removing noise, correcting for field inhomogeneities [26, 27] and affine alignment of all images to the SRI24 adult brain atlas [28]. The SRI24 segmentations and corresponding volumes were also provided. Unsurprisingly, the regional volumes were not predictive of a target that had been adjusted for total brain volume.
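The effect of residualization can be illustrated with a minimal sketch. It assumes an ordinary least-squares adjustment on two made-up covariates; the challenge organisers' exact residualization procedure is not described here, and the function `residualize` and the simulated variables are hypothetical.

```python
import numpy as np

def residualize(y, covariates):
    """Return y minus its least-squares fit on the covariates (plus an intercept)."""
    X = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Simulated data: the covariates and coefficients are illustrative assumptions.
rng = np.random.default_rng(0)
n = 500
brain_volume = rng.normal(size=n)
age = rng.normal(size=n)
score = 2.0 * brain_volume + 0.5 * age + rng.normal(size=n)

residual = residualize(score, np.column_stack([brain_volume, age]))

# The residual is linearly uncorrelated with every covariate it was adjusted for.
corr_with_volume = np.corrcoef(residual, brain_volume)[0, 1]
```

In this sketch the residual has zero mean and zero linear correlation with brain volume, which is consistent with the observation that regional volumes carried no predictive signal for the adjusted target.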

2.2 Structural Covariance Network Features

It has been shown that cortical morphology is predictive of cognitive deficits in individuals with Alzheimer’s disease [22]. We wanted to explore whether the same could be said for predicting intelligence, so we generated a structural covariance network (SCN) following [29] (code available on GitHub) for each individual in the ABCD NP Challenge data set. The SCN is a graph where the nodes are small cortical regions (3 voxels cubed) and the edges quantify structural similarity (morphology) between nodes. From each SCN we generated nodal graph-theory features using the Brain Connectivity Toolbox [30], which were then averaged across the brain and also within each of the largest three modules (communities) of the graph. We also considered measures of variation in these features (standard deviation and median-absolute deviation). Our 26 features include small-worldness, rich club coefficient, path length, node degree, clustering coefficient, and betweenness centrality (Table 1). See Figure 1 for a graphical representation of the pipeline.
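To make the feature-generation idea concrete, the following is a simplified sketch rather than the pipeline of [29] or the Brain Connectivity Toolbox. It assumes that similarity between two 3x3x3 cubes is the Pearson correlation of their voxel values and that the graph is binarised with an arbitrary threshold; both choices, and all function names, are illustrative assumptions.

```python
import numpy as np

def cube_similarity_graph(cubes, threshold):
    """Binary graph: nodes are cubes; edges join cubes whose voxel values correlate above threshold."""
    flat = cubes.reshape(len(cubes), -1)      # (n_nodes, 27) for 3x3x3 cubes
    corr = np.corrcoef(flat)
    adj = (corr > threshold).astype(int)
    np.fill_diagonal(adj, 0)                  # no self-connections
    return adj

def degree(adj):
    return adj.sum(axis=1)

def clustering_coefficient(adj):
    """Fraction of each node's neighbour pairs that are themselves connected."""
    k = degree(adj)
    triangles = np.diag(adj @ adj @ adj) / 2  # triangles through each node
    possible = k * (k - 1) / 2
    return np.divide(triangles, possible, out=np.zeros(len(k)), where=possible > 0)

def median_and_mad(x):
    """Median and median absolute deviation of a nodal metric."""
    med = np.median(x)
    return med, np.median(np.abs(x - med))

# Random values stand in for grey-matter cubes (assumption for illustration only).
rng = np.random.default_rng(0)
cubes = rng.normal(size=(40, 3, 3, 3))
adj = cube_similarity_graph(cubes, threshold=0.2)
features = {
    "degree": median_and_mad(degree(adj)),
    "clustering": median_and_mad(clustering_coefficient(adj)),
}
```

Each nodal metric is reduced to a pair of summary values per subject, mirroring the "median" and "mad" rows of Table 1.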

Generating approximately ten thousand SCNs and corresponding graph-theory features is a computationally intensive task. When the pipeline failed for a given individual, or time did not permit (such as with the late addition of 868 test subjects), this resulted in missing data. For these few individuals ( $\leq 8\%$ : Table 1) we inserted a prediction of zero (nominally the mean).
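The fallback for subjects without features amounts to the following sketch, in which missing predictions are represented as NaN; that representation and the helper name `fill_missing_predictions` are assumptions made for illustration.

```python
import numpy as np

def fill_missing_predictions(predictions):
    """Replace NaN predictions (subjects without SCN features) with zero."""
    return np.where(np.isnan(predictions), 0.0, predictions)

preds = np.array([1.2, np.nan, -0.7, np.nan])
filled = fill_missing_predictions(preds)
```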

2.3 Predictive Models

We trained two models to predict rFIQ from features based on morphological similarity. The first was the event-based model (EBM) of progression [17, 31]. The second was support-vector regression (SVR) [32]. We trained each model on data from the training set, and assessed performance using MSE on the validation set (Table 2). The best-performing model (SVR) was used to generate our submission to the challenge: predictions for the test set.

The EBM learns a discrete sequence of progression events from normal/low state to abnormal/high. It was designed for neurodegenerative diseases but can be applied to any monotonic phenomenon. Here we define low rFIQ as more than one standard deviation (std) below the mean and high rFIQ as more than one std above the mean. If rFIQ is a monotonic function of structural covariance, then the EBM should be able to find a probabilistic sequence of events that represents this function. “Events” are structural covariance graph-theory features, and they must differ statistically between low-rFIQ and high-rFIQ for them to be included in the model; otherwise they contain no “signal” for this trajectory. We excluded features that “did not pass” ( $p>0.10$ ) the Mann-Whitney U test of the null hypothesis that the distributions (low/high rFIQ) are equal. EBM stage and rFIQ score were input into a Kernel Ridge Regression model (default parameters, scikit-learn: [33]) to make the predictions.
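The feature filter described above can be sketched as follows, using `scipy.stats.mannwhitneyu` on simulated data. The two-sided alternative, the function name `select_features`, and the synthetic features are assumptions; the text does not specify these details.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def select_features(features, target, alpha=0.10):
    """Keep columns whose low- and high-target groups differ (Mann-Whitney U, p <= alpha)."""
    mu, sd = target.mean(), target.std()
    low = features[target < mu - sd]    # more than one std below the mean
    high = features[target > mu + sd]   # more than one std above the mean
    kept = []
    for j in range(features.shape[1]):
        p = mannwhitneyu(low[:, j], high[:, j], alternative="two-sided").pvalue
        if p <= alpha:
            kept.append(j)
    return kept

# Simulated example: column 0 tracks the target, column 1 is pure noise.
rng = np.random.default_rng(0)
target = rng.normal(size=400)
informative = target + 0.5 * rng.normal(size=400)
noise = rng.normal(size=400)
kept = select_features(np.column_stack([informative, noise]), target)
```

With a threshold of 0.10, roughly one in ten pure-noise features would be expected to pass by chance, so a feature passing the filter is weak evidence on its own.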

The SVR was run in PRoNTo version 3 (Pattern Recognition for Neuroimaging Toolbox) [34, 35] — a software toolbox of pattern recognition techniques for the analysis of neuroimaging data. Model performance on the training set was assessed using 5-fold nested cross-validation (i.e. the internal and external loops had 5 folds) to optimise the penalty parameter C (we use 6 different logarithmically-spaced values: 0.01, 0.1, 1, 10, 100 and 1000) and compute the MSE per fold, which were averaged across folds to compute the final prediction error (Table 2).
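For readers without PRoNTo, the nested cross-validation scheme can be approximated in scikit-learn. This is a sketch under stated assumptions, not the authors' implementation: the linear kernel, the feature standardisation, the shuffled folds, and the simulated data are all choices made here for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

C_GRID = [0.01, 0.1, 1, 10, 100, 1000]  # the six values quoted in the text

def nested_cv_mse(X, y, seed=0):
    """Per-fold MSE from 5x5 nested cross-validation, tuning C in the inner loop."""
    inner = KFold(n_splits=5, shuffle=True, random_state=seed)
    outer = KFold(n_splits=5, shuffle=True, random_state=seed + 1)
    model = make_pipeline(StandardScaler(), SVR(kernel="linear"))
    search = GridSearchCV(model, {"svr__C": C_GRID},
                          scoring="neg_mean_squared_error", cv=inner)
    scores = cross_val_score(search, X, y,
                             scoring="neg_mean_squared_error", cv=outer)
    return -scores  # scikit-learn reports negated MSE

# Small simulated problem so that the sketch runs quickly.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X[:, 0] + 0.1 * rng.normal(size=60)
fold_mse = nested_cv_mse(X, y)
mean_mse = fold_mse.mean()
```

Tuning C only inside the inner loop keeps the outer-fold error an unbiased estimate of performance on unseen data.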

3 Results

We included 26 SCN graph-theory features that represent morphological similarity across the cortex. Table 1 summarises the features we derived from the T1 images, and the level of completeness in each challenge data set (see Section 2.2). For the EBM, only three features passed through our Mann-Whitney U test filter (see Methods): small-worldness, betweenness centrality (median), and degree. Even for these features, there was very little difference between the low- and high-rFIQ groups (see Table 1), with Cohen’s d effect sizes of $-0.11/0.06$ (small-world), $0.07/-0.10$ (degree), and $-0.09/0.009$ (centrality) in the training/validation sets. In light of the opposing effect direction (signs), the model’s poor generalisation performance is unsurprising (see Table 2).
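For reference, Cohen's d can be computed as in the sketch below. It assumes the common pooled-standard-deviation definition; the text does not state which variant was used, and the example values are made up.

```python
import numpy as np

def cohens_d(a, b):
    """Standardised mean difference between two groups, using the pooled sample std."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Toy groups: means differ by one pooled standard deviation.
d = cohens_d([1, 2, 3], [2, 3, 4])
```

Values of around 0.1 in magnitude, as reported above, correspond to group means that differ by about one tenth of a standard deviation.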

For the SVR model, two features were most important: small-worldness (weight $w=11.42$ ); and clustering coefficient in community 2 ( $w=6.04$ ). Among the next most important were average path length and other clustering coefficients.

Table 2 shows our prediction results for both models: mean-squared errors and Pearson’s squared correlation coefficient for training and validation. It is clear that neither approach generalised well under validation. Our submission to the challenge (SVR) was positioned near the middle of the testing leaderboard with $\mathrm{MSE}=93.8335$ .

4 Discussion

The ABCD NP Challenge was certainly challenging. Our MSE for predicting residual fluid intelligence was only nominally better than simply predicting zero, i.e., the mean. This implies that the residual fluid intelligence is not explainable by graph theory features derived from structural covariance networks. We found similar results for all combinations of models and features attempted during and after our hackathon, from basic regression to deep learning. Moreover, the validation leaderboard (see challenge website) demonstrated that other entries into the challenge had similarly meagre performance improvement over simply predicting the mean.
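The size of the improvement can be worked out directly from the numbers in Table 2. The sketch below treats each set's variance as the MSE of a constant mean prediction, which is an approximation assumed here; the helper name `relative_improvement` is hypothetical.

```python
def relative_improvement(mse, baseline_variance):
    """Fraction of baseline error removed by the model (negative means worse than baseline)."""
    return 1.0 - mse / baseline_variance

# Values taken from Table 2 and its caption.
svr_train = relative_improvement(85.82, 85.85)
svr_validation = relative_improvement(71.19, 71.53)
ebm_validation = relative_improvement(71.58, 71.53)
```

Under this approximation the SVR removes less than 0.1% of the baseline error on the training set and less than 0.5% on the validation set, while the EBM+KRR model is marginally worse than the baseline on validation.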

While the residualization process precluded the use of models that include covariance of the residualization factors [18] with image-based features, it is difficult to say whether or not this would have improved the results dramatically. Including variables in the residualization procedure that are correlated with the predicted variable is likely to remove important variability in the data leading to predictive models with low performance [36].

5 Conclusion

Based on our results, and those on the validation leaderboard for the challenge, we are inclined to conclude that structural imaging is probably incapable of predicting more than a couple of points’ worth of residual fluid intelligence.

Bibliography


[1] Goriounova, N.A., Mansvelder, H.D.: Genes, cells and brain areas of intelligence. Frontiers in Human Neuroscience 13, 44 (2019). https://doi.org/10.3389/fnhum.2019.00044
[2] Foverskov, E., Mortensen, E.L., Holm, A., Pedersen, J.L.M., Osler, M., Lund, R.: Socioeconomic position across the life course and cognitive ability later in life: The importance of considering early cognitive ability. Journal of Aging and Health (2017). https://doi.org/10.1177/0898264317742810, PMID: 29254458
[3] Lam, N.H., Borduqui, T., Hallak, J., Roque, A.C., Anticevic, A., Krystal, J.H., Wang, X.J., Murray, J.D.: Effects of Altered Excitation-Inhibition Balance on Decision Making in a Cortical Circuit Model. bioRxiv 100347 (2017). https://doi.org/10.1101/100347
[4] Deary, I.J., Strand, S., Smith, P., Fernandes, C.: Intelligence and educational achievement. Intelligence 35(1), 13–21 (2007). https://doi.org/10.1016/j.intell.2006.02.001
[5] Deary, I.J., Pattie, A., Starr, J.M.: The Stability of Intelligence From Age 11 to Age 90 Years: The Lothian Birth Cohort of 1921. Psychological Science 24(12), 2361–2368 (2013). https://doi.org/10.1177/0956797613486487
[6] Gottfredson, L.S.: Why g matters: The complexity of everyday life. Intelligence 24(1), 79–132 (1997). https://doi.org/10.1016/S0160-2896(97)90014-3
[7] Hulshoff Pol, H.E., Schnack, H.G., Posthuma, D., Mandl, R.C.W., Baaré, W.F., Van Oel, C., Van Haren, N.E., Collins, D.L., Evans, A.C., Amunts, K., Bürgel, U., Zilles, K., De Geus, E., Boomsma, D.I., Kahn, R.S., Vogt, O.: Genetic contributions to human brain morphology and intelligence. Journal of Neuroscience 26(40), 10235–10242 (2006). https://doi.org/10.1523/JNEUROSCI.1312-06.2006
[8] Narr, K.L., Woods, R.P., Thompson, P.M., Szeszko, P., Robinson, D., Dimtcheva, T., Gurbani, M., Toga, A.W., Bilder, R.M.: Relationships between IQ and Regional Cortical Gray Matter Thickness in Healthy Adults. Cerebral Cortex 17(9), 2163–2171 (2007). https://doi.org/10.1093/cercor/bhl125