Latent space-based network analysis for brain–behavior linking in neuroimaging
Selena Wang, Xinzhi Zhang, Yunhe Liu, Wanwan Xu, Xinyuan Tian, Yize Zhao

TL;DR
LatentSNA is a new method for analyzing brain imaging data that improves the detection of biomarkers and their links to behavior by using a Bayesian network approach.
Contribution
LatentSNA introduces a Bayesian framework for network analysis that enhances statistical power and reduces type II errors in neuroimaging biomarker detection.
Findings
LatentSNA improves biomarker detection accuracy by 110–150% compared to existing methods.
The method increases replicability by 153% in moderate to large datasets.
LatentSNA is broadly applicable across multiple imaging modalities and diverse participant cohorts.
Abstract
We propose a latent space-based statistical network analysis (LatentSNA) method that implements network science in a generative Bayesian framework, preserves neurologically meaningful brain topology and improves statistical power for imaging biomarker detection. LatentSNA (1) addresses the lack of power and inflated type II errors in current analytic approaches when detecting imaging biomarkers, (2) allows unbiased estimation of the influence of biomarkers on behavioral variants, (3) quantifies uncertainty and evaluates the likelihood of estimated biomarker effects against chance and (4) improves brain–behavior prediction in new samples as well as the clinical utility of neuroimaging findings. LatentSNA is broadly applicable across multiple imaging modalities and outcome measures in developing, aging and transdiagnostic cohorts, totaling 8,003 to 11,861 participants. LatentSNA achieves…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8- —Alzheimer's Disease Data Initiative, 2025 William H. Gates Sr. Fellowship National Institutes of Health (NIH) grants P30AG072976.
- —National Institutes of Health (NIH) grants R01AG068191, RF1AG081413, R01EB034720 and P30AG021342.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFunctional Brain Connectivity Studies · Mental Health Research Topics · Face Recognition and Perception
Main
Neuroimaging encompasses techniques that provide in vivo depiction of the anatomy and function of the central nervous system to study the human brain in a noninvasive manner. Some imaging techniques focus on the structure of the brain (for example, computerized axial tomography and diffusion tensor imaging (DTI)), while others allow us to characterize brain activity or function, for example, functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). A major hurdle for modeling neuroimaging data is the highly correlated and connected nature of measurements throughout the brain, not dissimilar to networks^1^, which contributes to low statistical power for identifying brain–behavior links. Given the networked nature of the brain, a marriage between network science, a complexity-driven discipline focused on the shared architecture of networks emerging across physical, biological and social domains^2^, and neuroimaging analysis is needed.
Neuroimaging connectivity models recognize and select meaningful patterns from neuroimages that explain individual differences in behavior, cognition and other outcomes. For example, case–control comparisons measure differences in connectivity between healthy individuals and patients to identify markers of dysfunction^3^. Univariate and marginal association analyses calculate associations between connectivity and outcomes to identify links^4^. By vectorizing unique pairwise edges from symmetric functional connectomes, connectome-based predictive modeling (CPM)^5^ achieves functional imaging biomarker detection using a multivariate regression model controlling overfitting with cross-validation. Machine learning algorithms such as Ridge^6^, least absolute shrinkage and selection operator (Lasso)^6^, support vector machines (SVM)^7^, random forest (RF)^8^ and convolutional neural networks (CNNs)^9^ are integrated to improve the predictability of the connectivity model for individual outcomes.
A critical challenge remains: connectivity edges are treated as independent observations, whereas evidence supports the dependent organization of brain networks as informative neurobiological indicators^10^. Graphical models, consisting of both undirected Gaussian graphical models^11^ and directed acyclic graphs^12^, describe the conditional dependence among random variables and directly address the violation of the independence assumption. A key task of graphical models, when applied to neuroimaging data, is to estimate and create brain connectivity networks based on whether signals from two brain regions are conditionally independent of each other^13^. Although individual behaviors and outcomes can be incorporated in graphical models, they are often used to influence the estimation of brain connectivity networks^14^. By contrast, LatentSNA aims to understand the structure and property of brain networks (not their estimation) and how its structure is related to individual behaviors and outcomes. For further exploration of the differences between these two methodologies, we refer to these discussions^2,15^.
What differentiates a network science-driven analytic approach is that it draws on insights regarding the universality of the communicative structures of real-world networks^1^. Characteristics such as the small-world property and sparsity are universal properties found in social networks^15^, political networks^16^, the World Wide Web^17^ and human connectomes^18^. Shared network architectures, as a result of being governed by universal principles^15^, allow us to use a common set of mathematical and statistical instruments for network modeling. Network science is characterized by mathematical investigations about the universal principles of network generation: what mathematical principles define the generations of network with power law degree distributions^1^. This discipline may have overlap with neuroimaging connectivity analysis, although they are not the same. For example, CPM analyzes neuroimaging connectivity data (is a neuroimaging connectivity analysis method), but it does not incorporate the networked (dependent) characteristics of the brain when modeling brain connectivity edges: it assumes one connectivity edge to be an independent observation from another.
LatentSNA, an inference-focused generative Bayesian framework capturing universal network topologies and leveraging latent space estimation techniques, is designed to analyze human connectomes and identify meaningful neuroimaging biomarkers of individual outcomes (Fig. 1). It comprises an integrated workflow containing three modules: networked connectome modeling (preserving transitivity and modularity), psychometric behavior profiling and two-way brain–behavior linking. We achieve robust neuroimaging biomarker detection with markedly improved statistical power, demonstrating generalizability of the method across seven neuroimaging landmark studies: Alzheimer’s Disease Neuroimaging Initiative (ADNI) Grand Opportunities, ADNI Phase 2 (ADNI-GO/2) and ADNI Phase 3 (ref. ^19^), Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease (A4)^20^, the Human Connectome Project in Aging (HCP-A)^21^, Adolescent Brain Cognitive Development Study Baseline (ABCD-B) and second release (ABCD-2)^22^ and transdiagnostic data collected at Yale^23^. These studies involve eight different imaging modalities and 20 outcome measures with a total of 8,003 to 11,861 participants. LatentSNA consistently improves model fit performance over nine established methods, including three deep learning techniques (SVM, RF and CNN), two network-based brain analysis methods including penalized graph classification (GC^24^) and tensor network factorization analysis (TNFA^25^) and four popular brain–behavior linking approaches such as CPM, ridge CPM, Lasso and canonical correlation analysis (CCA). It enhances the predictability (an average of 110% improvement over TNFA and an average of 150% improvement over CPM) and replicability (averaging 153% improvement over CPM) of various imaging techniques, including fMRI, T1-weighted structural MRI (sMRI), DTI and PET. Moreover, it is generalizable to different outcome measures, including but not limited to cognition, emotion, assessment of mental disorders, focal tau PET ([^18^F]flortaucipir) standardized uptake value ratio (SUVR) metrics and different participant demographics across developing, aging and transdiagnostic populations.Fig. 1. Schematic diagram of LatentSNA.The LatentSNA Bayesian diagram demonstrates a holistic model for multivariate outcomes Y and brain networks X. Neuroimaging and multivariate behavior data are input into the LatentSNA model, which subsequently goes through an iterative MCMC algorithm that estimates the model parameters theorizing the data generation process of three interconnected components. a–c, These three interconnected components consist of psychometric behavior profiling (a), latent space network modeling (b) and brain–behavior linking (c). a, LatentSNA allows multivariate modeling of a latent behavior variable (for example, internalizing psychopathology) with multiple variables (for example, anxious–depressed, withdrawn–depressed and somatic complaints) to improve precision. The observed psychopathology is generated following a modified version of a psychometric Rasch model^61^, in which outcomes are decomposed into item and person components. b, LatentSNA uses the symmetric bilinear interaction effect to capture network topology (transitivity, balance and clusterability)^38^. c, LatentSNA infers relationships between the brain and behavior, for example, internalizing psychopathology and functional connectivity. We propose a joint latent variable model, in which we allow the latent connectivity variables Z and the latent behavior variables θ to covary with a shared covariance matrix, Σ. L, left; R, right.
As a result, our proposed method can substantially improve the interpretability of current neuroimaging connectivity studies, for example, providing a view of how brain network topology is implicated in brain–behavior relationships, exemplified by the ABCD study. Large-scale disruptions in the functional communicative patterns of brain connectomes across multiple interconnected functional systems are found to explain differences in internalizing symptoms among children^26^. Starlike topological architectures, known for their efficiency in information dissemination, resiliency with local transmission failure and affiliation with congestion^27^, are identified as the fingerprints of internalizing psychopathology and its deterioration in children. Overall, LatentSNA demonstrates high-quality fit to various imaging data, generates scientific insights and enriches discussions surrounding existing neuroscience questions.
Results
Conceptual framework
Motivated by the need to enhance the power for identifying neuroimaging biomarkers, we propose LatentSNA as a generative statistical network analysis (SNA) model to identify significant links between brain networks and behavioral traits (Fig. 1). Existing SNA models often analyze brain–behavior links as one-sided regression models. These models either analyze (reduced-dimension) brain connectivity as predictors in a regression with behavior as the response^28,29^ or they analyze connectivity as the response in a matrix-response regression to quantify behavioral covariate effects^30^. However, both types of models lack the ability to capture the mutual variations between behavioral profiles and brain variations, that is, brain development influences children’s behavior and abnormal behaviors potentially reinforce brain abnormalities due to brain plasticity^31^. By contrast, LatentSNA allows connectivity differences to inform behavior–outcome variations and vice versa: both brain connectivity (Fig. 1b) and individual outcomes (Fig. 1a) are the targeted modeling interests. LatentSNA is ideal for detecting complicated and potentially noisy and weak signals hidden in high-dimensional functional connectivity data, for example, high heterogeneity and strong motion artifacts in children’s fMRI data^32^. LatentSNA reinforces potentially weak signals in connectivity with a two-way cross-sectional brain–behavior linking module (Fig. 1c) that allows true connectivity signals and true internalizing signals to mutually inform each other, thus strengthening connectivity signals. Additionally, LatentSNA partials out random noise variations from true signal variations to further reinforce potentially weak connectivity signals.
Second, focused on inferring the relationships between brain networks and behaviors, LatentSNA is, philosophically, an inference model (also called explanatory model), not a prediction model^33,34^. LatentSNA provides uncertainty quantification for biomarker detection and robust statistical inference under the Bayesian framework (Fig. 1a–c). Inference models are built to describe how potential predictors and explanatory variables explain individual differences in responses, while prediction models ignore this process and focus on accurately predicting future responses. Inference models rely on statistical theories such as the central limit theorem and large sample properties to derive unbiased estimates of the significant effect coefficients with controlled type I error, while prediction models often introduce biases to improve prediction. Inference models are more optimal for detecting imaging biomarkers, as they allow us to quantify the uncertainty associated with the identification of imaging biomarkers, which is not possible with prediction models. With a large enough sample size, our model can, in an unbiased manner, identify true mutual relationships between the connectivity of each region and individual outcomes with high enough power and controlled type I error. Meanwhile, machine learning methods such as Lasso^35^ do not offer unbiased quantification of the relationships and suffer from low power and inflated type II errors.
Third, LatentSNA builds on the statistical network modeling literature and preserves the topological structure of the brain network. Higher-order dependencies in real-world networks are defined as dependencies among three (triad) or more nodes^36^. Common examples of higher-order dependencies in real-world networks include homophily, balance and clusterability^37^. Homophily is often associated with the transitive property of a network, explaining how new connections are established based on existing connections, also known as transitivity. Balance suggests a state of harmony, in which positive connections are found among nodes with similar attributes and negative connections are found among nodes with divergent attributes. Clusterability represents a more relaxed criteria for harmony than balance^38^. With balanced cycles among triads, the entire network can be divided into cohesive groups with xu,v > 0 if nodes u and v are in the same group and xu,v < 0 if they are in opposite groups^38^. Therefore, the presence of higher-order dependencies such as balance contributes to relational patterns and topology across the whole network, including higher-order dependencies. By modeling higher-order dependencies, the proposed LatentSNA captures relational patterns across the entire network.
Bilinear effects account for transitive, balanced and clusterable network structures^39^. Vector product-based latent space models, which include bilinear effect models, capture higher-order dependencies such as homophily, balance and clusterability^39^. Furthermore, such models show satisfactory model fit for networks with varying degrees of transitivity and clusterability^40^. Given that brain functional networks possess small-world properties^18^, likely exhibiting both transitivity and clusterability, it is optimal for us to use bilinear effects to model higher-order dependence structures. Consequently, LatentSNA captures how network topology is implicated in brain–behavior relationships.
Finally, LatentSNA offers powerful predictions of both connectivity and behavioral variants. We provide a predictive mechanism for behavior based on connectivity, which simultaneously serves as a predictive mechanism for connectivity based on behavioral variants. Accurate prediction is achieved by incorporating latent variables to separate signal from noise, using joint modeling frameworks and allowing information communication between behavior and connectivity during model estimation. Additionally, preserving the topology of brain networks and capturing complex dependence structures is not possible with simple linear additive models.
Assessment of LatentSNA using generated data
We compared LatentSNA to CPM, Lasso and CCA, a multivariate method exploring possible dependencies between datasets. The comparison was conducted with varying sample sizes, network sizes, signal-to-noise ratios in brain connectivity and different levels of relationships (signal proportions) between connectivity and behavior (Fig. 2). Based on both power and specificity, LatentSNA shows the highest success rate for recovering true relationships and true null relationships, making it the most sensitive and accurate method for identifying imaging biomarkers. The relatively low power observed via CPM reflects the general challenges associated with identifying imaging biomarkers when fMRI data are noisy and relationships between connectivity and behavior are sparse. To reduce prediction error, Lasso introduces a penalty term in the loss function, inducing downward bias in the coefficient estimates, and, unsurprisingly, reports the lowest power. The high specificity of Lasso is likely a byproduct of the downward bias in parameter estimation. By comparison, CCA exhibits higher power than Lasso and CPM when there are more relationships between connectivity and behavior. Using CCA, we find linear combinations of variables on both sides that maximize the dependence between the two, making CCA more powerful when the dependence is strong. Meanwhile, CCA reports low specificity when the signal proportions are large, suggesting that CCA tends to overidentify effects with high type I error when the relationships between connectivity and behavior are numerous.Fig. 2. LatentSNA improves statistical power for biomarker detection.a,b, Bar plots comparing the specificity (a) and power (b) of LatentSNA with CPM, Lasso and CCA in different data situations across 100 replications; replication is generated data. From left to right, the sample size increases from 500 to 1,000 to 2,000. From top to bottom, we include small (V = 20) and large (V = 70) networks as well as signal-to-noise ratios (SNR) of 0.5. We show 25–75% quantiles as error bars to reflect the uncertainty. c, The recovery of biomarkers versus null effects using LatentSNA versus marginal correlation tests. Box plots show the centra and 25% and 75% quantiles of the estimated effects using LatentSNA versus marginal association analysis.
To assess whether the improved power for detection translates to better prediction accuracy of behavior, we report the estimated correlation between the predicted and observed behavior in randomly sampled test data (Supplementary Fig. 3). LatentSNA demonstrates the highest prediction accuracy for behavior across various data scenarios, with accuracy increasing as the relationship between brain connectivity and behavior strengthens and as the sample size grows. Additionally, we provide the prediction accuracy of connectivity using LatentSNA (Supplementary Figs. 1–3). With LatentSNA’s dual-predictive capability, we can robustly predict the connectivity of each testing sample based on behavior information. By contrast, the comparison averaging method uses the sample average connectivity as a prediction for a new participant’s connectivity. In both prediction tasks, LatentSNA reliably predicts connectivity networks and behavior in new samples, particularly when connectivity and behavior are strongly related.
Assessment of LatentSNA using real-world data
To demonstrate the generalizability of our method across studies, we have focused on two aspects. First, our method fits well to a range of datasets with strong out-of-sample prediction accuracy. Second, our method can show robust and replicable results with consistent effect estimation across random samplings within the same study. For both predictability and replicability, we applied LatentSNA to seven different datasets, involving eight different imaging modalities and 20 outcome measures, with information from a total of 8,003–11,861 participants included. We demonstrate our method via an investigation of the ABCD study, which offers neurological insights and a view of how network topology is implicated in brain–behavior relationships during neurodevelopment.
Improved model performance is observed across imaging modalities, outcome measures and population demographics
To evaluate the accuracy of predicting individual outcomes in independent samples, we apply LatentSNA to multiple landmark neuroimaging studies (Fig. 3). We demonstrate broad applicability of the model in predicting various types of outcome measures, including cognition, emotion, assessments of mental disorders and focal tau PET SUVR metrics, using imaging modalities such as structural imaging measuring fiber density, number of fibers and fiber length as well as resting and task state fMRIs. The proposed method consistently shows improvement in model fit across different data scenarios compared to seven available connectivity models: penalized GC, TNFA, CPM, ridge CPM, SVM, RF and CNN. Across varied imaging modalities, outcome measures and populations, our method consistently outperforms existing alternatives, demonstrating, for example, an average improvement of 110% over TNFA and 150% over CPM (Supplementary Figs. 1–3). These results validate LatentSNA as an interesting adaptation of statistical network analysis concepts and methods for linking real-world networks to brain and behavior.Fig. 3. LatentSNA shows substantially improved predictive performance over existing approaches.a, Model performance observed across imaging modalities, outcome measures and population demographics. We include data from ADNI Grand Opportunities, ADNI-GO/2, ADNI Phase 3, A4, HCP-A and transdiagnostic data collected at Yale. The model is fitted to each imaging modality and outcome measure with sample size outlined in Supplementary Table 1. Box plots show the centra and 25% and 75% quantiles of prediction accuracy using LatentSNA versus other methods. ADAS, Alzheimer’s Disease Assessment Scale; ECog, Everyday Cognition Scale; BSI, Brief Symptom Inventory; PACCN, Preclinical Alzheimer’s Cognitive Composite; nBack, emotional N-back task. b, Box plots show the centra and 25% and 75% quantiles of correlations between observed and predicted internalizing psychopathology for RS (blue), MID (light blue), SST (yellow) and EN-Back (red) for 100 test participants across ten runs. From left to right, predictive correlations based on CPM, LatentSNA (Z) and LatentSNA (θ) are reported. c, Bar plots display correlations between observed and predicted connectivity for 100 test participants during resting and task conditions (same color scheme as before). Error bars represent the range between the 25th and 75th quantiles of prediction accuracy. From left to right, predictive correlations based on full networks, networks of the top ten internalizing regions and networks of the top five internalizing regions are reported. d, Scatterplot predicts internalizing values in independent samples using connectivity via the marginal correlation test. e, Scatterplot predicts internalizing values in independent samples using connectivity via LatentSNA.
The lack of robustness and replicability of current fMRI studies is a well-known challenge^41,42^. We investigated the robustness and replicability performance of our proposed method by comparing covariance effect estimation across random samplings of the test data, that is, replicability with the same data. We calculated the absolute correlation of the estimated effects between replications (Fig. 4). Our model-estimated effects (covariance and/or correlations between brain and behaviors) were consistent across independent replications when we randomly split the data into 90% training and 10% test samples. The CPM, on the other hand, shows lower replicability and robustness. LatentSNA shows consistently higher replicability (with correlations above 0.75) across datasets, while CPM shows substantially lower and more variable reproducibility (correlations ranging from 0.25 to 0.75).Fig. 4. Reproducibility analysis of CPM and LatentSNA methods across different behaviors, fMRI conditions and datasets.Box plots show the centra and 25% and 75% quantiles of reproducibility using LatentSNA versus other methods. Each box plot summarizes reproducibility performance through correlations across ten independent replications. We use the absolute correlation between the estimated effects (regression coefficient estimates in CPM (green) and covariance estimates in LatentSNA (red)) across replications to represent the replicability and robustness of both methods. Higher correlation values indicate better reproducibility. f, resting state functional connectivity.
LatentSNA accurately predicts internalizing psychopathology and connectivity in independent samples
We apply LatentSNA to multivariate internalizing profiles and functional connectivity during the emotional n-back task (EN-back), the stop signal task (SST) and the monetary incentive delay (MID) task conditions as well as the resting state (RS) for 5,000 to 7,000 children in the baseline ABCD study. Our aim is to uncover functional fingerprints under different cognitive states for childhood internalization, replicate the results and investigate alterations in the fingerprints between different task and resting conditions. We show the prediction accuracy results for the ABCD baseline study (Fig. 3b). We randomly split the dataset into training and test sets with ten random splits, each with a test sample size of 100 to maintain consistency across task and resting conditions. LatentSNA (θ) shows median correlation above 0.9 between observed and predicted internalizing information in all four cognitive states, and LatentSNA (Z) shows correlations between 0.6 and 0.8. On the other hand, CPM only provides correlations around or below 0.1. This strongly supports the advantage of LatentSNA in dissecting reliable predictive information from functional connectivity under each cognitive state. Through those constructed joint learning mechanisms by LatentSNA, we can effectively predict internalizing profiles for new participants based on the available functional connectivity data.
To assess the prediction accuracy of functional connectivity, we fitted LatentSNA to training data and calculated the correlation between the observed and the estimated average connectivity (Fig. 3c). For 100 test participants, LatentSNA reports a median correlation of 0.502 for the recovery of whole-brain connectivity, 0.557 for connectivity among the top ten risk-internalizing regions and 0.707 for connectivity among the top five risk-internalizing regions. This result shows that LatentSNA provides sufficiently accurate prediction of the connectivity measurements, posing a unique opportunity to uncover brain connectivity for new participants incorporating their internalizing measures.
We compared model-identified biomarkers with null effects based on how much they can explain individual differences in internalizing psychopathology in new samples (Fig. 3d,e). We show scatterplots predicting internalizing values in independent samples using the sum of significant connectivity edges via marginal univariate tests (Fig. 3d) and connectivity via LatentSNA (Fig. 3e). The marginal test cannot differentiate the imaging biomarker from the null effect, as both predictive models show close to zero predictability for internalizing values. By contrast, LatentSNA is able to differentiate the imaging biomarker from the null effect based on predictive R^2^ values. This result demonstrates that the estimated connectivity in LatentSNA differentiates biomarkers from null effects. The information flow between connectivity and internalizing, data integration, allows model-estimated brain connectivity to be informed by internalizing in a data-driven manner.
LatentSNA learns whether a relationship exists between the functional connectivity of a brain region and internalizing psychopathology and correctly uses or ignores that relationship depending on whether it exists. In this manner, estimated latent connectivity variables contain varying degrees of internalizing information, and a connectivity region contains more internalizing information when it is significantly linked with internalizing psychopathology and less when it is not linked with internalizing psychopathology.
Within the LatentSNA framework, we modeled internalizing psychopathology as an abstract latent construct (or variable) underlying three observed dimensions of internalizing psychopathology: the anxious–depressed, withdrawn–depressed and somatic complaint dimensions^43^. These three dimensions represent three conceptually distinct but complementary manifestations of internalizing psychopathology. Thus, by directly incorporating these dimensions into LatentSNA, we allowed more information to be included than if the internalizing psychopathology were simply modeled as the sum scores of the three dimensions, as is common in the current literature. To assess whether the incorporation of dimensions has improved modeling, we also modeled internalizing as the sum scores using LatentSNA and compared the fit. We see improvements in predicting internalizing from 0.825 to 0.885 for MID, from 0.886 to 0.893 for SST, from 0.846 to 0.901 for EN-Back and from 0.791 to 0.846 for RS when multiple dimensions are directly incorporated. This result suggests that incorporating multiple dimensions of psychopathology is superior to modeling internalizing sum scores.
Large-scale disruptions of multiple functional systems are consistently found with internalizing psychopathology across cognitive conditions
We report the number of significant regions identified in each of the ten functional systems^44,45^; we show the corresponding 95% credible intervals, the uncertainty quantification obtained from Markov Chain Monte Carlo (MCMC) under Bayesian inference of covariance estimates for each of the 268 brain regions (Fig. 5a,b). Across three task conditions, we found consistent involvement of 131 of the 268 brain regions and seven of ten functional systems in internalizing psychopathology, supporting internalizing psychopathology as a complex and involving large-scale affective interference of multiple coordinating functional systems. While existing psychopathology literature indicates the involvement of functional systems such as the default mode network, the prefrontal cortex, the amygdala and other structures^46^, rarely do the studies have large enough power to test the disruption across the whole brain and support large-scale involvement. Using LatentSNA, we were able to identify and replicate this involvement with other task conditions.Fig. 5. Large-scale disruptions of multiple functional systems are consistently found with internalizing psychopathology across cognitive conditions.Sample size is outlined in Supplementary Table 1. a, Radar plot showing the number of identified brain regions associated with ten functional systems for EN-Back, SST, MID and RS. b, The 95% credible intervals of covariances between the connectome and internalizing behaviors for each brain region. The centra is the estimated posterior mean of the covariance. The connectivity edges that show substantial differences between task states and resting state are highlighted in red boxes. MF, medial–frontal; FP, fronto-parietal; DMN, default mode; MOT, motor; VI, visual I; VII, visual II; VAs, visual association; LIM, limbic; BG, basal ganglia; CBL, cerebellum.
LatentSNA reveals a shared set of functional architectures attributable to individual variations in internalizing psychopathology when participants are tasked to perform different emotional and cognitive tasks. This finding corresponds to a recent ABCD study showing similar predictive brain features for various cognitive, personality and mental health scores^47^. During the MID task, functional connectivity shows the strongest relationship with psychopathology with the highest average covariance estimates (Supplementary Fig. 6). We show consistent discrimination of the functional systems and their contributions to developing internalizing psychopathology across tasks (Fig. 5a). While the motor system, the medial–frontal system, the basal ganglia system, the limbic system, the default mode network and the visual I systems are consistently found to be implicated in internalizing psychopathology, there is also a consistent lack of implications of the fronto-parietal, visual II and visual association systems.
The functional architectures of internalizing psychopathology are different for an intrinsic brain versus an active brain. While current literature supports the existence of an intrinsic functional brain during rest with a set of small changes common across tasks^48^, little is known about differences in the functional architectures of internalizing psychopathology under different cognitive states. Our results show evidence for a difference in affective interference between RS and task states due to internalizing psychopathology. Different functional connectivity architectures are found to be implicated in internalizing psychopathology between rest and task states.
During RS, three functional systems emerge as the top risk ones to explain individual variations in internalizing psychopathology: the cerebellum, visual I and visual association systems. The cerebellum plays an important role in social and emotion processing^49^, and abnormalities are found in the cerebellum during rest for individuals with depression^50^ and schizophrenia^51^. Our results suggest that, during rest, the cerebellum is a major functional system contributing to internalizing psychopathology, and its relationship to internalizing is specific to an intrinsic brain, not when the brain is active. Individual differences in the spontaneous functional activities of the RS visual network, including visual I and visual associations, are also related to individual differences in internalizing psychopathology across individuals.
The core–periphery functional network feature is more pronounced with LatentSNA
LatentSNA differentiates signal from noise in functional connectivity networks via latent variables. Different from random noise, latent variables capture patterns of meaningful variations in functional signals across individuals. In LatentSNA, each brain region is allowed to exhibit different levels of variations in functional signals across individuals and different levels of association with internalizing psychopathology. We captured true signal variations in reduced dimensions that are much smaller than the dimensions of the network, and we projected these reduced dimensions back to the network dimensions. In this manner, we obtained the latent connectivity network capturing true variations of functional signals distinct from noise. We reported observed versus estimated latent connectivity for an average participant in the MID condition (Fig. 6b). Latent connectivity shows a different topological structure than the observed network.Fig. 6. LatentSNA identifies biologically meaningful imaging biomarkers with strong anatomical associations.a, Brain surface plots colored by the intensity of biomarker effects. We plot estimated covarying relationships between the connectivity and internalizing of each brain region. b, Heatmaps of observed connectivity (top) and latent connectivity (bottom) for an average participant during the MID condition. Default Mode (DMN), Medial Frontal (MF), Fronto-parietal (FP), Motor (MOT), Visual I (VI), Visual II (VII), Visual Association (VAs), Limbic (LIM), Basal Ganglia (BG), and Cerebellum (CBL). c, Densities of the centrality of brain regions measured by degrees and closeness based on the latent (L) network (left) and the observed network (right) for an average participant during the MID condition.
We show densities of the node strength and closeness based on the latent network and the observed network for an average participant in the MID condition (Fig. 6c). The distributions of node strength, for both the latent network and the observed network, are approximately symmetric based on the d’Agostino skewness test^52^. The observed network shows a platykurtic distribution with significantly negative kurtosis (P < 10^−5^, Anscombe–Glynn kurtosis test^53^), while the latent network fails to reject the null. Negative kurtosis suggests that the node strength has a flat distribution with thin tails. By comparison, the latent network has more node strength in the tails with more extremely active and extremely dormant regions. Closeness, for both the latent network and the observed network, is positively skewed with highly positive kurtosis. Compared with the observed network, the latent network shows larger skewness and kurtosis. Centrality measures show that the latent network more strongly discriminates core and peripheral regions, reflecting a more pronounced core–periphery differentiation optimal for communication, parallel to those of an efficient information distribution system^54^.
By preserving the topological structure of the brain, it is not surprising that our identified imaging biomarkers are biologically meaningful and show strong associations with anatomical structures of the brain (Fig. 6a). Based on LatentSNA, the strongest biomarker signals (with covariance estimates above 0.4 and top six biomarker regions) come from the cuneus, the middle frontal gyrus, the middle temporal gyrus, the superior temporal gyrus and Heschl’s gyrus. These regions play key roles in brain functions and are the central actors of the overall brain network connectivity.
Functional architectures of internalizing psychopathology are driven by the core actors of the connectivity network
We report node strength for regions of the motor system (Extended Data Fig. 1a) and other systems (Supplementary Fig. 7). We also show node strength, closeness and betweenness for all regions (Supplementary Fig. 7). The results show that malfunctions associated with internalizing psychopathology are driven by the core actors of the connectivity network. The location and connectivity edges of top imaging biomarker are compared against those of the null effect (Extended Data Fig. 1b,c). Core regions with high levels of connectivity across the whole brain contribute to individual differences in internalizing psychopathology. Compared to null effects, imaging biomarkers are the central actors of the functional network with high node strength and high closeness: they are able to transmit a large quantity of information effectively. Development of internalizing psychopathology relies on regions that transmit large quantities of information (high strength) efficiently (high closeness). Low-strength and high-closeness regions are not identified as biomarkers: they tend to be the peripheral actors of the network with only localized connectivity edges.
Internalizing psychopathology in children is attributable to starlike functional networks
As the brain is divisible into many coordinating functional systems with distinct connectivity architectures and topology, we report the latent internalizing networks with significant internalizing biomarkers and their connectivity edges in each functional system in Extended Data Fig. 2. Starlike structures emerge across functional systems. These starlike structures consist of a few core actors (stars) with many links and many peripheral actors with a few links. The star nodes are almost completely connected with each other, forming a central clique, and almost all peripheral nodes are connected with the star nodes. The starlike structure corresponds to the rich club structure often found with brain networks^54^. In both structures, there are preferential connectivities among core regions. Different from the rich club structure, in the starlike structure, peripheral regions and central regions are efficiently linked with short path distances, and the peripheral regions are rarely linked to each other with low probability of connections. The starlike configurations contribute to the core–peripheral structure in the latent functional network and the skewness of the centrality distributions.
The starlike structure is consistent with the current literature on our lack of efficiency when multitasking. The starlike structure is cheap to assemble with a small number of edges and efficient searchability^55^. In an ideal one-star network, all peripheral actors are linked with the star, and there is no peripheral-to-peripheral edge. The number of steps to reach an actor in the network is always two, regardless of the network size, making the one-star network the most optimal for communication when only one information search is performed at one time. However, search on the polarized starlike networks quickly becomes expensive when multiple searches occur simultaneously due to congestions at the star nodes.
Our star structure theory provides validity evidence for the current multitasking literature, which supports the idea that the brain is prone to congestions when multiple mental tasks are to be performed^56^. The highly connected star regions and their central cliques such as the fronto-parietal control and dorsal attention systems are crucial for completing goal-oriented tasks, but the capacities of these star regions are not limitless. When we multitask, the star regions are likely to be bombarded by competing streams of information with multiple sources of relevant and irrelevant signals, which could lead to congestion. On the other hand, with the star structure theory, the brain is efficiently organized and robust to transmission failure. The star topology reduces the impact of a transmission failure by independently connecting each peripheral region to the star clique. Peripheral regions communicate with each other via transmission to and from the star. Loss of links between peripheral regions has no impact on network communication. When there is a failure of transmission between a peripheral region and the star, the peripheral region is isolated, yet communication in the networked brain is unaffected, making it robust to failures.
Due to the efficiency of brain network communication and its general robustness to transmission failure, degeneration of the functional brain network is damaging when the star regions are compromised. Our results provide evidence for this hypothesis. Internalizing psychopathology in children is attributable to star regions and core cliques of the functional organization (Extended Data Fig. 2). Coherent starlike internalizing functional architectures are concentrated in the motor, limbic, medial–frontal, basal ganglia, default node network and visual I functional systems. By comparison, the internalizing functional networks identified through CPM do not exhibit a coherent pattern nor do they follow central–peripheral differentiation. Our results show that individual differences in the coordinating functional activities of a few star regions can explain substantial individual differences in psychopathology. Thus, the malefactions of star regions could have major impacts on the development of psychopathology and its further deterioration.
Discussion
In this study, we developed a network science-driven analytic method that addresses the lack of power and inflated type II errors in neuroimaging biomarker detection. The proposed method represents an effort to extend SNA^1^ to jointly model brain connectomes and outcome measurements, enhancing the ability to detect region-specific imaging biomarkers. While the current SNA methods mentioned above primarily focus on modeling single networks, brain connectivity networks can be viewed as multiplex networks with multiple layers of brain connections observed across a shared set of brain regions. To model these multiplex structures, a shared set of latent variables across layers can be used, assuming a joint relational structure across sets of connectivity^57^. Alternatively, we can distinguish between shared and individual components across layers^58^. In contrast to these approaches, LatentSNA captures individual differences in brain connectivity networks across layers and identifies specific brain regions where the covariation between layers of brain networks and outcome variables is substantial.
LatentSNA contributes to current neuroimaging connectivity methods by offering a high-power whole-brain approach for identifying brain–behavior links. A critical challenge of current neuroimaging connectivity methods is that connectivity edges are treated as independent observations, resulting in low statistical power and inflated type II errors. Univariate and marginal association analyses independently calculate associations between each connectivity edge and outcomes to identify significant links^4^. CPM^5^ identifies imaging biomarker detection by vectorizing unique pairwise edges from symmetric functional connectomes for behavior prediction.
LatentSNA makes a contribution to existing neuroimaging regression methods such as network response regression^59^ and scalar-on-network regression^60^. LatentSNA offers several advantages over network response and scalar-on-network regressions by positing a shared data generation process for connectivity and outcomes. First, unlike regression models that typically assume one-directional relationships between brain and behaviors or outcomes, estimating either the impact of brain on behavior or vice versa, LatentSNA acknowledges the mutual relationship between them. Changes in the brain often correlate with changes in behavior, but neuroplasticity suggests that disordered behaviors and dysfunctional environments can also influence brain function over time. Second, in scalar-on-network and network response regressions, using brain connectivity (or behavior outcomes) as predictors assumes that these variables are fully observed. This assumption becomes problematic when data include partially missing observations for brain connectivity and individual outcomes. Regression methods struggle to handle situations in which data are incomplete for both brain connectivity and outcomes. Lastly, traditional regression methods lack robustness in estimating parameters related to brain connectivity or behavior when they do not simultaneously model the reciprocal influence between them. By contrast, LatentSNA integrates both brain and behavior within a unified modeling framework, allowing mutual information exchange during model estimation.
The LatentSNA model has limitations that prompt important future extensions. First, with LatentSNA, researchers can obtain satisfactorily accurate predictions of both connectivity and behavioral variants in cross-section settings. Accurate prediction is achieved by incorporating latent variables to separate signal from noise, using joint modeling frameworks and allowing information communication between behavior and connectivity during model estimation. With the increased availability of longitudinal datasets such as ADNI and ABCD, it is of importance to extend current LatentSNA to longitudinal data. Longitudinal extensions would allow us to explore the temporal dynamics of fMRI across developmental or aging stages.
Second, LatentSNA offers substantially improved interpretability of neuroimaging studies, as it provides inferences about specific neuroimaging connectivity features that contribute to behavior outcomes. Future research is needed to investigate the clinical relevance of LatentSNA by exploring the specific contributions of different neuroimaging modalities in behavior predictions and investigating how these features can translate to clinical applications that ultimately improve the practical value of LatentSNA. In particular, a more clinically heterogeneous cohort is needed to understand functional substrates of psychopathologies. The ABCD study offers an opportunity to explore brain–behavior relationships in a large population of children. Yet, at these ages (9–10 at baseline and 11–12 at ABCD-2), relatively few children exhibit depression-related symptoms. Minimal participants in the ABCD study are diagnosed with depression, which limits the psychopathology findings. A more clinically heterogeneous child cohort is thus needed to explore psychopathologies in children.
Future work should also consider the group structure among the regions and how regions collectively contribute to internalizing psychopathology: past work has documented the importance of group structures of the functional brain via functional systems in cognition and disease. Beyond neuroscience, LatentSNA allows the detection of dependence between complex networks and nodal attributes, with potential applications in many other domains of science. Many complex systems such as social relationships, worldwide webs and transportation grids are impacted by higher-level attributes, and LatentSNA is a statistical technique that can open up many fields with rigorous and powerful analysis.
Methods
In Supplementary Table 1, we provide an overview of the study cohorts and datasets included in our analysis, consisting of the following studies: ADNI Grand Opportunities and ADNI-GO/2, ADNI Phase 3, A4, HCP-A, ABCD-B and its 2-year follow-up (ABCD-2) and the transdiagnostic data collected at Yale. We fitted the model to each combination of imaging modality and outcome measure. Our focus includes cognition outcomes, commonly used to assess the performance of new methods; and emotion outcomes, closely aligned with internalizing outcomes such as depression and anxiety; as well as disorder and focal tau PET SUVR outcomes, which directly reflect biological changes in the brain.
Adolescent Brain Cognitive Development Study
We used brain imaging data from both the first and second releases of the largest long-term study of brain development and child health in the US, gathered from 11,875 children aged between 9 and 10 years old^22^. Here, we describe the data processing of the first-release data, and the second release was processed using the same procedures.
Functional magnetic resonance imaging
To investigate links, blood oxygen-level-dependent (BOLD) functional activation was recorded for children during RS and while they performed three emotional and cognitive tasks. The fMRI data underwent initial preprocessing using BioImage Suite^62^. Standard preprocessing procedures, including slice time and motion correction, and registration to the MNI template, were described in detail by Greene et al.^63^ and Horien et al.^64^. Eligible scans exhibited no more than a mean frame-to-frame displacement of 0.10 mm. Brain images were parceled into 268 regions of interest (ROIs) or nodes using the Shen atlas, encompassing the cortex, subcortex and cerebellum^65^. Within each node, voxel-level time courses were aggregated. Functional connectivity was then constructed for each child in the study during both RS and each task state. Functional connectivity matrices were created, with each row and column representing all nodes, and each entry (i, j) in the matrix denoting the Pearson correlation coefficient between the ith and jth nodes, scaled to be normally distributed via Fisher’s z transformation.
To investigate whether a shared set of neural substrates exists for internalizing psychopathology across different emotional and cognitive tasks and to determine whether these substrates differ from those observed during rest, we separately applied LatentSNA to RS functional connectivity and functional connectivity during each task state. Our analysis included 7,606 adolescents with RS functional connectivity data, capturing intrinsic brain functional activity. Additionally, we investigated the functional connectivity of 4,871 adolescents performing the EN-back task, 5,096 adolescents performing the SST and 5,298 adolescents performing the MID task.
Internalizing psychopathology
In the ABCD study, internalizing psychopathology is assessed through self-reported surveys using the Child Behavior Checklist (Stavropoulos et al.^43^), which comprises 119 items aggregated into eight empirical subscales. Three subscales of the Child Behavior Checklist, namely anxious–depressed (13 items), withdrawn–depressed (eight items) and somatic complaints (11 items), contribute to the assessment of internalizing psychopathology. We applied the proposed LatentSNA to both multivariate and univariate representations, interpreting the results based on the model with superior fit, namely the multivariate internalizing measures.
Alzheimer’s Disease Neuroimaging Initiative
Data used in the preparation of this article were obtained from the ADNI database (http://adni.loni.usc.edu).
Structural magnetic resonance imaging and diffusion tensor imaging
We downloaded T1-weighted sMRI and DTI data from the ADNI-GO/2 database from 174 participants. We applied an overcomplete local principal-component analysis^66^ to process DTI data following standard steps including denoising, motion correction and distortion correction. We performed probabilistic white matter fiber tractography using fiber assignment by continuous tracking^67^. We registered sMRI scans to the lower-resolution b0 volume of the DTI data using the FLIRT toolbox in the FMRIB Software Library^68^, and we then defined cortical ROIs in FreeSurfer space using the Lausanne 2008 parcellation with 68 cortical ROIs^69^. We obtained the number of the fibers connecting each pair of ROIs as well as the surface area of the regions. Fiber density-based structural connectivity was calculated by dividing the number of fibers between two ROIs with their average surface areas^70^. Three types of structural brain networks were constructed as the number of fibers between a pair of brain regions, the length of the fibers as well as the fiber density of tracts connecting pairs of ROIs.
Functional magnetic resonance imaging
We used RS functional neuroimaging data from the third release of the ADNI study. We processed the images using the Connectome Mapper 3 pipeline^71^ built in Nipype^72^. RS fMRI images were processed with despiking and slice timing correction following the method of Cox^73^; the images were also motion corrected and distortion corrected using FSL. RS fMRI images were registered to the b0 sMRI using the FLIRT toolbox^74^. The BOLD time signals of each ROI were bandpass filtered and then detrended using a linear regression. We constructed functional brain networks for each participant as the Pearson correlation between the BOLD time signals for pairs of ROIs.
Disorder–cognition outcomes
For outcomes, we included the ADAS, Cognitive Subscale^75^, a rating of dysfunction by AD. We included the ADAS score as the sum of 13 diagnostic questions collected at baseline. In addition, we included the sum score of the Everyday Cognition Scale^76^, a questionnaire measuring the patient’s cognitive function. We applied the proposed LatentSNA to both the ADAS and the Everyday Cognition Scale to assess the model’s generalizability to alternative outcomes for the aging population.
Anti-Amyloid Treatment in Asymptomatic Alzheimer’s Disease
The A4 study is a secondary prevention trial targeted toward older people with amyloid accumulation and at high risk for AD dementia^20^.
Structural and functional magnetic resonance imaging
For the fMRI data, we used the same processing procedure as that for the ABCD. MPRAGE scans were skull stripped using optiBET^77^ and nonlinearly aligned to the MNI-152 template using BioImage Suite.
Focal tau PET SUVR metrics
We used PETSurfer within FreeSurfer for an integrated MRI–PET analysis^78^. We derived focal tau PET ([^18^F]flortaucipir) SUVR metrics from the A4 images using 90–110-min (4 × 5-min frames) post-injection images, preprocessed and analyzed using PETSurfer in FreeSurfer (version 6.0+). We summed and motion corrected the 5-min tau PET frames. We then aligned the composite PET images to corresponding MRI images, parcellated using the Desikan–Killiany Atlas^79^ and partial-volume corrected using FreeSurfer. We gathered the average tracer absorption values for each region defined by the atlas and computed SUVRs using the whole cerebellar cortex as the reference region.
Cognition outcome
To assess cognition changes, we included the Preclinical Alzheimer’s Cognitive Composite (PACC, Donohue et al.^80^) collected as part of the A4 project. PACC is a composite cognitive score combining tests that assess episodic memory, executive function and general cognition, and it is the primary outcome measure for A4 targeting the preclinical AD population. PACC is found to be sensitive to the earliest disease-related changes^81^.
Human Connectome Project in Aging
The Lifespan HCP-A aims to characterize how brain organization and connectivity change during typical aging, compared to an ’abnormal’ aging process^21^.
Functional magnetic resonance imaging
For the fMRI data, we used the same processing procedure as that for the ABCD.
Emotion–cognition outcomes
For the HCP-A project, we focused on cognition and emotion measures. To assess the cognitive capability of the healthy aging population, we included the composite scores for the Picture Sequence Memory Test as well as the Cognition Composite score including Fluid Composite and Crystallized Composite, derived from all National Institutes of Health (NIH) Toolbox Cognition tasks^82^. For the emotion outcomes, we chose Emotional Distress Depression and PROMIS Anxiety to maintain relative consistency with the internalizing outcome. Emotional Distress Depression is captured by the Sadness Survey from NIH Toolbox Emotion Battery^83^, which measures negative mood and perceptions. PROMIS Anxiety is captured by the Fear Affect Survey, a self-report measure assessing fear and anxious misery from NIH Toolbox Emotion Battery.
Transdiagnostic project
The Transdiagnostic project aims to recruit clinically naturalistic and demographically diverse participants to more effectively study the links between imaging and behaviors^23^; the project was conducted at Yale between February 2018 and March 2021. Participants in the Transdiagnostic project tended to show a wide range of symptom severity and commonly had multiple psychiatric diagnoses. All imaging information was collected at the Yale Magnetic Resonance Research Center.
Functional magnetic resonance imaging
Preprocessing of fMRI data from the Transdiagnostic project is the same as the processing of fMRI data from the ABCD study.
Disorder outcomes
We included the global severity index of the Brief Symptom Inventory^84,85^, a rating scale aiming to identify clinically relevant psychological symptoms in adolescents and adults. The global indices measure the level of symptomatology, its intensity and number of occurrences.
LatentSNA
Our method makes use of techniques of Bayesian statistical inference, in which we propose a generative network model to theorize how neuroimaging connectivity and individual behaviors and outcomes intertwine with each other under random statistical processes with noise. We fitted the neuroimaging connectivity data and accompanying outcome measures and estimated covariances between the connectivity of each brain region with outcome measures across participants.
LatentSNA is motivated by the need to improve the power for detecting meaningful biomarkers of individual behaviors and outcomes using noisy imaging connectivity networks. To achieve this aim, we propose LatentSNA with a few distinctive features. First, LatentSNA is a joint model integrating imaging connectivity and behavior variants. Consider a symmetric connectivity tensor, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\mathcal{X}}\in {{\mathbb{R}}}^{V\times V\times N}$$\end{document} , where V is the number of nodes for the brain atlas and N is the number of participants. Simultaneously, we have information about the behavior of the participants, denoted by the N × P matrix Y, where each row includes the response value for participant i with p outcome measurements. The proposed LatentSNA is distinct from a network response regression, where the network is the response and the effect of behavior on the network is estimated as the regression coefficient of covariates. Similarly, the model differs from a connectivity-based predictive model with behavior as the response and the network as the predictor^28^. Instead, we proposed a joint data generation process that allows connectivity alternations to inform behavior variations and vice versa: both brain connectivity and behavior are the targeted modeling interests.
Second, LatentSNA has roots in statistical network methods and preserves the topological structure of the network. When modeling brain connectivity (one of the three components of the model), we made use of the symmetric bilinear interaction effect to capture third-order dependence patterns (transitivity, balance and clusterability) often present in symmetric networks^38,86^. While additive effects only capture variations across the rows and the columns of the network (variation in node degrees), bilinear interaction effects capture triangular structures of the network and relatedness among multiple brain regions. This is important because these higher-order dependencies exist in brain connectivity. For example, functional systems capture the coactivation of three or more brain regions that creates behavior, cognition and psychopathology. Bilinear effects capture how the distributed patterns of interactions create function and account for the complexity of integrated multimodal brain systems not possible with additive effects. For each participant, we introduced unidimensional region-specific latent variables zu,i to represent connectivity information for participant i and region u and use zu,izv,i as the driver of connection between brain regions u and v for participant i. Each node u is part of a dependent network with strength of connection to node v via the bilinear effect of the two nodes. Specifically, the connectivity between nodes u and v, u < v, u, v = 1, 2,…, V is modeled by
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${x}_{u,v,i}={{\it{w}}}_{i}^{T}{\it{\beta }}+{a}_{i}+{z}_{u,i}{z}_{v,i}+{e}_{u,v,i},\qquad {e}_{u,v,i} \stackrel{\mathrm{iid}}\sim {\mathrm{N}}(0,{\sigma }^{2}),$$\end{document}where ai is the fixed connectivity intercept for participant i, eu,v,i is the error term, σ^2^ is the error variance and iid stands for independent and identically distributed. We adjusted for Q covariates, for example, age and gender, denoted by wi with the first element to be 1 corresponding to the intercept with their effects on the connectivity matrix characterized by β. Given that each connectivity value is standardized across persons, node-level additive effects are not necessary. The mean of the connectivity values for each node across persons is zero. In matrix form, we used Z to denote the N × V matrix of latent variable values, zi to denote the V × 1 vector of latent variable values for participant i and Ei to denote the V × V matrix of errors. The approximation of the posterior distributions of the unknown quantities is facilitated by setting an MVN(μβ, Σβ), μβ = (0, 0,…, 0, 0)^T^, Σβ = IQ prior distribution for β, a gamma(½, ½) prior distribution for \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\sigma }_{e}^{-2}$$\end{document} and an N(0, 1) prior distribution for ai (where MVN stands for multivariate normal and N for normal). The prior for the covariance of the latent network dimensions is described in the joint component.
The third distinguishing feature of LatentSNA is that it focuses on the inference of relationships between connectivity and behaviors. For each participant i, the probability of pairwise brain connectivity also depends on the participant’s behavior yi, and this influence is achieved via joint multivariate normal distribution of the connectivity and behavior parameters. Suppose that we have θi, the unidimensional random latent variable representing the behavior information for participant i. The connectivity and individual behaviors and outcomes are integrated in the following way:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${({z}_{1,i},{z}_{2,i},\ldots,{z}_{V,i},{\theta }_{i})}^{T} \stackrel{\mathrm{iid}}{\sim} \,\mathrm{MVN}\,\left(\left(\begin{array}{l}{{\it{0}}}_{V}\\ {{\it{0}}}_{D}\end{array}\right),{{{\Sigma }}}_{V+D}\right),\qquad {{\Sigma }}=\left(\begin{array}{ll}{{\it{\Lambda }}}_{z}&{{\it{\Lambda }}}_{z\theta }^{T}\\ {{\it{\Lambda }}}_{z\theta }&{{\it{\Lambda }}}_{\theta }\end{array}\right),$$\end{document}where Λz**θ is the V × D matrix modeling the relationship between connectivity and behaviors, D = 1. When there are nonzero elements in the Λz**θ matrix, the connectivity and the attributes regulate and inform each other, which leads to better estimation for both connectivity and behaviors. Approximation of the posterior distribution of Σ^−1^ is facilitated by setting a prior distribution of Wishart(IV+D, V + D + 2). To infer whether the connectivity of a brain region is related to behaviors, we tested whether the corresponding covariance parameter equals zero, controlling for reflection indeterminacy. We delved deeper into the issue of reflection indeterminacy when discussing estimation. Via the joint distribution, we assume that there is a latent dependence structure between the network and the behavior, ΣV+D. This dependence structure is region specific, with behavior having significant links with some brain regions and not others. This dependence structure captures the true (in a statistical sense) covariation between connectivity and behaviors across individuals, separate from variations due to random noise. If a covariance parameter is significantly different from zero, we can conclude that the associated brain region is significantly linked with behaviors, and its differences across individuals can explain individual differences in behaviors.
Last but not least, using latent behavior variables, LatentSNA allows multivariate modeling of individual behaviors and outcomes with more information to improve its estimation precision than univariate modeling. In this manner, observed individual outcomes are generated following a modified version of a psychometric Rasch model. The original Rasch model^61^ proposes a data generation process for random test responses in which each test question has a unique difficulty parameter and each person is ranked based on the number of correct responses. We modified this model in a few ways. The original Rasch model does poorly at accommodating data types that are not binary. We included a more flexible linking mechanism for the latent responses and the observed data, allowing for both discrete and continuous data distributions. The original Rasch model also does not account for covariate effects such gender and race, and, to improve, we included a covariate term that allows the probability of responses to vary depending on participant demographics. Most importantly, we introduced a dependence between the latent behavior variables and connectivity, which allows the latent space of behaviors to be informed by brain connectivity. The degree of dependence is learned via data, and it organically influences how much the behavior information is integrated. As the behavior component of the joint model, participant i’s response on variable p is modeled by
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${y}_{i,p}={{\it{h}}}_{i}^{T}{\it{\gamma }}+{b}_{p}+{\theta }_{i}+{\epsilon }_{i,p},\qquad {\epsilon }_{i,p} \stackrel{\mathrm{iid}}{\sim}{\mathrm{N}}(0,{\tau }^{2}),$$\end{document}where bp is the fixed intercept for variable p. We adjusted for Q′ covariates, for example, age and gender, denoted by hi with the first element to be 1 corresponding to the intercept with their effects on the connectivity matrix characterized by γ. In matrix notation, we used b to denote the P × 1 vector of the intercepts, θ to denote the N × D matrix of latent variables and Ψ to denote the N × P matrix of psychopathology errors. As is common in Rasch models, the parameters for the question items are fixed and the person variables are random. Approximation of the posterior distribution of the intercept parameters is facilitated by setting a standard normal prior distribution. We set a prior distribution of gamma(½, ½) for τ^−2^.
Estimation
Fitting the model involves iterative samples of the full conditional distributions of each parameter defined in the model until we find stable and converged Markov chains to approximate various quantities of the targeted posterior distributions via the Gibbs sampler. To achieve the global optimum for parameter estimation, we start with ten random initializations for parameter values and choose the most optimal results based on out-of-sample prediction accuracy. We iterated the following steps:
- simulate β, a from their full conditional distributions,
- simulate σ^2^ given β, a, τ^2^, γ, b, Z, θ, Σ, X, Y,
- simulate γ, b from their full conditional distributions,
- simulate τ^2^ given β, a, σ^2^, γ, b, Z, θ, Σ, X, Y,
- simulate {Z and θ} from their full conditional distributions and
- simulate Σ from its full conditional distribution. To allow the information in connectivity and individual behaviors and outcomes to flow between each other and mutually inform parameter estimation, we sampled {Z and θ} from their joint full conditional distribution given both the connectivity and behaviors. For participant i, the joint full conditional distribution of zi and θi is the product of the three parts (connectivity, behaviors and joint):
where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\it{T}}={\it{Y}}-{1}{{\it{b}}}^{T}-{\it{H}}{\it{\gamma }}{{1}}_{P}^{T}$$\end{document} and Fi is \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\it{X}}}_{i}-{a}_{i}-{{\it{w}}}_{i}{\it{\beta }}={{\it{z}}}_{i}{{\it{z}}}_{i}^{T}+{{\it{E}}}_{i}.$$\end{document} We can transform Fi in such a way that the transformed error term is a standard normal distribution using \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\tilde{{\it{F}}}}_{i}=c{{\it{F}}}_{i}$$\end{document} , where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$c={\sigma }_{e}^{-1}$$\end{document} . Therefore, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\tilde{{\it{F}}}}_{i}=c{{\it{z}}}_{i}{{\it{z}}}_{i}^{T}+{\tilde{{\it{E}}}}_{i},$$\end{document} where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\tilde{e}}_{u,v,i}$$\end{document} follows a standard normal distribution. The joint part of the distribution \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p((\begin{array}{l}{z}_{u,i}\\ {\theta }_{i}\end{array})| {\it{\Sigma }}{\prime} )$$\end{document} can be written as \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\exp (-\frac{1}{2}({z}_{u,i}{Q}_{z}^{{\prime} }{z}_{u,i}+{z}_{u,i}{Q}_{\theta z}^{{\prime} }{\theta }_{i}+{\theta }_{i}{Q}_{z\theta }^{{\prime} }{z}_{u,i}+{\theta }_{i}^{T}{Q}_{\theta }^{{\prime} }{\theta }_{i}))$$\end{document} , where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\it{\Sigma }}}^{-1}=(\begin{array}{rc}{Q}_{z}&{Q}_{\theta z}\\ {Q}_{z\theta }&{Q}_{\theta }\end{array})$$\end{document} (each component is a function of Λs) and Σ′ is part of Σ only involving the specific brain region. Extracting relevant terms from \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p((\begin{array}{l}{{\it{z}}}_{i}\\ {\theta }_{i}\end{array})| {{\it{t}}}_{i},{\tilde{{\it{f}}}}_{u,i},{\it{\Sigma }},{\sigma }_{\epsilon }^{2})$$\end{document} , we can see that the full conditional distribution of zu,i is
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{array}{l}p\left({z}_{u,i}| {\tilde{{\it{f}}}}_{u,i},{\it{\Sigma }},{\theta }_{i}\right)\\\propto \exp \left(-\displaystyle\frac{1}{2}{z}_{u,i}\left(\mathop{\sum }\limits_{v = 1,v\ne u}^{V}{c}^{2}{z}_{v,i}{z}_{v,i}+Q{\prime}\right){z}_{u,i}\right.\\+{z}_{u,i}^{T}\left(\mathop{\sum }\limits_{v = 1,v\ne u}^{V}c{\tilde{f}}_{u,v,i}{z}_{v,i}-\displaystyle\frac{1}{2}{Q}_{\theta z}^{{\prime} }{\theta }_{i}-\frac{1}{2}{Q}_{zy}^{{\prime} T}\left.{\theta }_{i}\right)\right),\end{array}$$\end{document}a multivariate normal distribution, with variance \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\left(\mathop{\sum }\nolimits_{v = 1,v\ne u}^{V}{c}^{2}{z}_{v,i}{z}_{v,i}\right.}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\left.+Q{{\prime}}_{z}\right)}^{-1}$$\end{document} and mean \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${(\mathop{\sum }\nolimits_{v = 1,v\ne u}^{V}{c}^{2}{z}_{v,i}{z}_{v,i}+Q{{\prime} }_{z})}^{-1}(\mathop{\sum }\nolimits_{v = 1,v\ne u}^{V}c{\tilde{f}}_{u,v,i}{z}_{v,i}-\frac{1}{2}{Q}_{\theta z}^{{\prime} }{\theta }_{i}-\frac{1}{2}{Q}_{zy}^{{\prime} T}{\theta }_{i})$$\end{document} . The latent variable value for psychopathology is informed by brain connectivity and should be sampled from
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{array}{l}p\left({\theta }_{i}| {{\it{t}}}_{i},{\it{\Sigma }},{z}_{u,i},{\it{A}},{\sigma }_{\epsilon }^{2}\right)\\\propto \exp \left(-\displaystyle\frac{1}{2}{\theta }_{i}^{T}({\sigma }_{\epsilon }^{-2}\mathop{\sum }\limits_{p = 1}^{P}{{\it{\alpha }}}_{p}{{\it{\alpha }}}_{p}^{T}+{Q}_{\theta }){\theta }_{i}\right.\\\left.+\,{\theta }_{i}^{T}\left(\mathop{\sum }\limits_{p = 1}^{P}{\sigma }_{\epsilon }^{-2}{t}_{i,p}{{\it{\alpha }}}_{p}-\frac{1}{2}{Q}_{\theta z}^{T}{{\it{z}}}_{i}-\displaystyle\frac{1}{2}{Q}_{z\theta }{{\it{z}}}_{i}\right)\right),\end{array}$$\end{document}a multivariate normal distribution, with variance \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${(\mathop{\sum}\nolimits_{p = 1}^{P}{\sigma }_{\epsilon }^{-2}{{\it{\alpha }}}_{p}{{\it{\alpha }}}_{p}^{T}+{Q}_{\theta })}^{-1}$$\end{document} and mean \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${(\mathop{\sum }\nolimits_{p = 1}^{P}{\sigma }_{\epsilon }^{-2}{{\it{\alpha }}}_{p}{{\it{\alpha }}}_{p}^{T}+{Q}_{\theta })}^{-1}(\mathop{\sum }\nolimits_{p = 1}^{P}{t}_{i,p}{\sigma }_{\epsilon }^{-2}{{\it{\alpha }}}_{p}-\frac{1}{2}{Q}_{\theta z}^{T}{{\it{z}}}_{i}-\frac{1}{2}{Q}_{z\theta }{{\it{z}}}_{i})$$\end{document} . Crucially, we sampled the covariance matrix Σ from an inverse Wishart (IW) (IV+D + F′^T^F′, N + V + D + 2) with F′ as an N × (V + 1) matrix with the ith row as \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$({{\it{z}}}_{i}^{T},{\theta }_{i}^{T})$$\end{document} .
The introduction of the bilinear effect zu,izv,i induces partial reflection indeterminacy. For each set of latent variable values, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{z}}_{u,i}$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{z}}_{v,i}$$\end{document} , the positions given by \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-{\hat{z}}_{u,i}$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-{\hat{z}}_{v,i}$$\end{document} give the same set of product and consequently the same likelihood. During the MCMC chain, the sign of zu,i, u = 1 can change while maintaining the same connectivity value. Crucially, the connectivity latent variables are also related to individual behaviors and outcomes, whether zu,i is estimated as \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{z}}_{u,i}$$\end{document} or \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-{\hat{z}}_{u,i}$$\end{document} has consequences on the correlation between zu,i and θi. Put in a different way, zu,i is softly identified, as the signs of zu,i need to satisfy the correlation between zu,i and θi. To estimate such a model, we assume that, after a sufficient burn-in period, the signs of zu,i have reached a sufficiently optimal point, where its correlation with θi has researched a stabilized estimate resembling the true correlation. After this burn-in period, we fix the signs of zu,i to the same as those of the target, that is, target = estimated zu,i from the first iteration after burn in. Therefore, there is no reflection indeterminacy issue after burn in.
The estimation algorithm for this paper was implemented in R. The code is available via the user-friendly GitHub page at https://github.com/selenashuowang/latentSNA with a tutorial. For each task condition, we performed posterior inference based on the MCMC algorithm under random initialization. No obvious nonconvergence issues were found via trace plots. For each task condition of the ABCD study, we compared the model fit of the multivariate behaviors with that of the univariate behavior outcome. The univariate outcome is the sum of the three internalizing variables as mentioned before.
Identification of imaging biomarkers is based on whether the estimated covariances between connectivity and behavior are significantly different from zero. Therefore, it is of interest to expand on the sensitivity of the prior specification of the covariance parameters.
In LatentSNA, the approximation of the posterior distribution of Σ is facilitated by setting a prior distribution of IW(IV+1, V + 1 + 2) with the identify scale matrix S0 = I and degree of freedom equal to m0 = V + 1 + 2. The use of an IW distribution as a prior for the variance–covariance parameter matrix is fairly common in Bayesian analysis; see discussions of Leonard and Hsu^87^. The IW prior is a conjugate prior for the covariance matrix of the normal data. In LatentSNA, we are interested in estimating the covariance matrix Σ of the joint distribution of the latent connectivity and behavior variables, D = (z1,i, z2,i,…, zV,i, θi). With the IW prior, the posterior distribution of Σ can be obtained through Bayes’ theorem:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{array}{r}p({\it{\Sigma }}| {\it{D}})=\displaystyle\frac{p({\it{D}}| {\it{\Sigma }}| )p({\it{\Sigma }})}{p({\it{D}})}.\end{array}$$\end{document}From it, we can obtain the posterior distribution of Σ with the specified prior distribution as
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\it{\Sigma }}| {\it{D}} \sim {\rm{IW}}({S}_{0}+{{\it{F}}}^{{\prime} T}{{\it{F}}}^{{\prime} },{m}_{0}+2),$$\end{document}where F′ is an N × (V + 1) matrix with the ith row as \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$({{\boldsymbol{z}}}_{i}^{T},{\theta }_{i}^{T})$$\end{document} . Therefore, the posterior mean of Σ is a weighted average of the sample covariance matrix F^′T^F′ and the prior mean S0. When the sample size N → ∞, the posterior mean approaches the sample mean.
In a sensitivity analysis conducted by Zhang^88^, the author set the scale matrix as identity and varied the degrees of freedom by increasing m0. With the increase in m0, the posterior means become smaller and the posterior variances also become smaller. Thus, given the large sample size in the data, we expect the posterior mean of Σ to approach the sample mean.
Simulation
The data generation process for the simulation was as follows. For simplicity and consistency, the number of behavior variables was assigned as one in all generated data. We first generated the connectivity latent variables as well as the latent behavior variables from the multivariate normal distribution with the mean zero and the predefined covariance matrix with unit variances. To conduct a comprehensive assessment of the model performance, we created a range of data situations with varying sample sizes, connectivity scale, signal-to-noise ratio and signal proportions. To assess the model’s ability to accurately identify true imaging biomarkers for outcomes that have both strong and weak biological signals, we varied the amount of true signals in the data by assigning the signal proportion to 0.1 and 0.3. When the signal proportion equaled 0.1 (0.3), we randomly assigned 10% (30%) of the covariance parameters between connectivity and behavior to be nonzero. To ensure the positive definiteness of Σ, we assigned both the covariances between connectivity and behavior and the corresponding dimensions in the latent connectivity covariance matrix as 0.9. We randomly sampled the errors for the connectivity from a normal distribution with mean 0 and variance defined by the signal-to-noise ratio. Errors for the behavior were sampled from the normal distribution with the mean 0 and variance 0.5.
We considered three sample sizes, N = 500, N = 1,000 and N = 2,000 and two conditions for the number of nodes V = 20 and V = 70, and we specified two levels of the signal-to-noise ratio, 0.5 and 1, controlled by the error variance while keeping the variance of the latent variables constant. The individual-specific intercepts for connectivity and behavior were set to 0. In total, we considered 24 different scenarios combining from different signal proportions, sample sizes, node numbers and signal-to-noise ratios. Under each scenario, we simulated the 100 data.
We compared LatentSNA with CPM, Lasso and CCA. For Lasso, we fitted the model to the training set using the glmnet package^89^. We selected significant edges based on minimizing mean squared error with tenfold cross-validation. For CCA, we fitted the model to the training set using the CCA package^90^, and regions with strong loadings were considered to be related to behavior. The cutoff thresholds are determined by the true signal proportions. For example, when the true signal proportion equals 0.1, we considered the top 10% of regions with highest absolute loadings to be significantly linked with behavior.
Predicting outcomes
For LatentSNA (θ), predicting the behavior outcome of a new participant amounts to additional draws for each new yi from a distribution with probability determined by the model. For LatentSNA (Z), on the other hand, predicting the behavior outcome of a new participant is based on the estimated latent connectivity variable Z from the training data. We evaluated the out-of-sample predictive performance for LatentSNA (Z) and LatentSNA (θ) as follows:
- We randomly sampled 100 participants and their behavior outcome as the test data and the other sets of data points as the training data.
- We fitted the training data to LatentSNA and obtained the posterior mean of the model parameters.
- For LatentSNA (θ),
- Predicting the behavior outcome of a new participant amounts to additional draws for each new yi from a distribution with probability determined by the model.
- The full conditional of the new observations \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\mathcal{Y}}}^{(\rm{test})}$$\end{document} is, for any \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${y}_{i}\in {{\mathcal{Y}}}^{(\rm{test})}$$\end{document} , determined by π(yi∣θ, bi, Ψi).
- For LatentSNA (Z),
- Predicting the behavior outcome of a new participant is based on the estimated latent connectivity variable \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{{\it{Z}}}$$\end{document} from the training data.
- We first selected significant imaging biomarkers based on 95% posterior credible intervals of the covariance parameters and used latent connectivity variables of significant imaging biomarkers as predictors.
- Second, we split the estimated latent connectivity variables into the test set \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{{\it{Z}}}}^{(\rm{test})}$$\end{document} and the training set \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{{\it{Z}}}}^{(\rm{train})}$$\end{document} following the split of the data.
- Third, we fitted the training model using \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{{\it{Z}}}}^{(\rm{train})}$$\end{document} as the predictors and the observed psychopathology outcomes for the training participants as the response.
- We obtained the estimated regression coefficients \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\hat{\beta}}$$\end{document} based on the training model.
- Lastly, we predicted the psychopathology outcome of a new participant, for any \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${y}_{i}\in {{\mathcal{Y}}}^{(\rm{test})}$$\end{document} , under LatentSNA (Z) following \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${y}_{i}={\hat{\beta}}\times {\hat{Z}}^{({\rm{test}})}$$\end{document} .
We repeated the process ten times. Figure 3b shows the out-of-sample correlations between the observed and predicted internalizing values on the test data using LatentSNA (Z) and LatentSNA (θ). Between LatentSNA (Z) and LatentSNA (θ), the former does not directly, but indirectly, incorporate training internalizing information for prediction, while the latter does. This shows that, by constructing joint learning mechanisms using LatentSNA, we can effectively predict internalizing profiles for new participants based on the available data.
Comparison methods
We have added model evaluation results against two network-based brain analysis methods, the penalized GC approach^24^ and the TNFA. Additionally, we have incorporated comparisons with three widely used machine learning techniques, SVM^7^, RF^8^ and CNNs^9^, to provide a comprehensive assessment of our methods’ performance. The GC approach uses brain connectivity as predictors and adopts both L1 penalty, the absolute value of coefficient magnitudes and a generic group Lasso penalty. We fitted the GC approach using the graphclass R package^91^. The tuning of the penalty factor pair (λ, ρ) was conducted on a 3 × 4 grid, with λ selected from the set {10^−6^, 10^−5^, 10^−4^} and ρ ∈ {1, 10, 20, 30}. It was observed that a λ value exceeding 10^−3^ and a ρ value surpassing 40 result in the penalization of all coefficients to zero.
For the TNFA approach, similar to the tensor network principal-component analysis method^25^, we embedded the V × V symmetric adjacency matrices into a low-dimensional matrix; each row contains participants’ principal-component scores, and each column contains the basis network; only significant basis networks were included as predictors. We then performed a network predictor regression with the embedded low-dimensional basis networks as predictors of the outcome variables. SVM predicted behavioral outcomes based on a low-dimensional matrix derived from the V × V symmetric adjacency matrices, akin to the TNFA approach. This process involves embedding the adjacency matrices into a reduced space, where only significant basis networks were retained as predictors. These features were then used to train an SVM model with a linear kernel using the e1071 R package^92^. The model undergoes parameter tuning using a grid search to optimize the cost parameter, and the best model is used to predict behavioral outcomes from test data. The RF method is implemented using the ranger package within the caret framework in R^93^, and, similarly to SVM, it uses features derived from a low-dimensional matrix of brain connectivity data. A grid search strategy optimizes key parameters: the number of variables per split (mtry), the node splitting criterion (splitrule) and the minimum node size (min.node.size). For CNN, we fitted the model with the torch package in R^94^. Our CNN architecture consists of sequential dense layers with ReLU activations, specifically designed to handle the features extracted from the low-dimensional connectivity data. The model undergoes training using an Adam optimizer and a cross-entropy loss function across multiple epochs, ensuring optimal learning from the training data. After training, the CNN is used to predict outcomes on the test dataset.
Predicting connectivity
We evaluated the out-of-sample performance for predicting connectivity of new participants as follows:
- We randomly sampled 100 participants and their connectivity values as the test data and the other part of data points as the training data.
- We fitted the training data to LatentSNA and obtained the posterior mean of the model parameters.
- Predicting the connectivity of a new participant amounts to additional draws for each missing \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\it{x}}}_{i}\in {{\mathcal{X}}}^{(\rm{test})}$$\end{document} from a distribution with probability determined by the model.
- The full conditional of the new observations \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${{\mathcal{X}}}^{(\rm{test})}$$\end{document} is, for any \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${x}_{i}\in {{\mathcal{X}}}^{(\rm{test})}$$\end{document} , determined by π(xi∣zi, ai, Ei).
We repeated the process ten times. Figure 3c shows the average out-of-sample correlations between the observed and predicted connectivity values in the test data for predicting the whole graph, the top ten internalizing regions and the top five internalizing regions. The results show that LatentSNA provides sufficiently accurate prediction of the connectivity measurements, posing a unique opportunity to uncover brain connectivity for new participants, incorporating their internalizing measures.
Comparison method
The Average method and its extensions represent one of the most common methods to capture group-level connectivity and to perform subsequent analysis^95^, often with satisfactory prediction accuracy^96^. We first randomly divided the connectivity data into ten equal sizes, using one set of data points as the test data and the other sets of data points as the training data. We then captured the group-level connectivity using the entry-wise sample mean of individual connectivity matrices in the training data. We performed predictions for connectivity in the test set using estimated connectivity from the training data. We show the average out-of-sample correlations between the observed and predicted connectivity values across 100 random samples (Supplementary Fig. 3a). Our results suggest that LatentSNA shows satisfactory prediction accuracy for brain connectivity using individual-level estimates, and it outperforms the Average method when the signal proportion is large. When predicting connectivity using group-level estimates, LatentSNA and the Average method both show satisfactory performance for the whole graph, and LatentSNA outperforms the Average method for regions with strong relational ties with behavior.
Network statistics
Node strength, an extension of degree in weighted networks, is the sum of the edge weights associated with each node^97^. Closeness reflects how quickly one node can reach others. We calculated closeness in the weighted graphs using the igraph R package^98^, and a uniform magnitude equaling the largest negative edge is added to all edges to ensure that all weights are positive. Among the shortest paths in a network that pass through intermediate nodes, betweenness reflects how many times a node is present in those paths and demonstrates the extent to which a node is part of connections among other nodes^99^. We calculated the betweenness of the connectivity networks with positive weights defined as before using the igraph R package^100^. High betweenness reflects power as it positions the region with an important bridging role allowing the neighboring regions to connect^101^, an investment into the communication between distant clusters.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41592-025-02896-9.
Supplementary information
Supplementary InformationSupplementary Figs. 1–8 and Tables 1–3. Reporting Summary Peer Review File
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Albawi, S., Mohammed, T. A. & Al-Zawi, S. Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET) 1–6 (IEEE, 2017).
- 2Friedman, N., Linial, M., Nachman, I. & Pe’er, D. Using Bayesian networks to analyze expression data. In Proc. 4th Annual International Conference on Computational Molecular Biology 127–135 (ACM, 2000).10.1089/10665270075005096111108481 · doi ↗ · pubmed ↗
- 3Sawai, H. Exploring a new small-world network for real-world applications. In Networked Digital Technologies: 4th International Conference, NDT 2012, Dubai, UAE, April 24–26, 2012. Proceedings, Part I 4 (ed. Benlamri, R.) 90–101 (Springer, 2012).
- 4Chen, S. et al. Identifying covariate-related subnetworks for whole-brain connectome analysis. Biostatistics 25, 541–558 (2024).10.1093/biostatistics/kxad 007PMC 1101712737037190 · doi ↗ · pubmed ↗
- 5Wasserman, S. et al. Social Network Analysis: Methods and Applications (Cambridge Univ. Press, 1994).
- 6Gelman, A., Carlin, J. B., Stern, H. S. & Rubin, D. B. Bayesian Data Analysis (Chapman and Hall/CRC, 1995).
- 7D’Agostino, R. B. Transformation to normality of the null distribution of g 1. Biometrika 57, 679–681 (1970).
- 8Madore, K. P. & Wagner, A. D. Multicosts of multitasking. Cerebrum 2019, cer-04-19 (2019).PMC 707549632206165 · pubmed ↗
