Brain functional connectivity, but not neuroanatomy, captures the interrelationship between sex and gender in preadolescents
Athanasia Metoki, Roselyne J. Chauvin, Evan M. Gordon, Timothy O. Laumann, Benjamin P. Kay, Babatunde Adeyemo, Samuel R. Krimmel, Scott Marek, Anxu Wang, Andrew N. Van, Noah J. Baden, Vahdeta Suljic, Kristen M. Scheidter, Julia Monk, Forrest I. Whiting, Nadeshka J. Ramirez-Perez

TL;DR
This study finds that brain connectivity patterns better predict biological sex in preadolescents than brain structure, but neither captures gender alignment.
Contribution
The study shows that functional connectivity, not neuroanatomy, is more effective for predicting sex and gender alignment in preadolescents.
Findings
rsFC predicted sex with 85% accuracy, outperforming cortical thickness and volume.
Brain regions in association and visual networks were most predictive of sex.
rsFC did not predict sex/gender alignment, highlighting a gap in understanding gender's neural basis.
Abstract
Understanding sex differences in the adolescent brain is crucial, as they relate to sex-specific neurological and psychiatric conditions. Predicting sex from adolescent brain data may reveal how these differences influence neurodevelopment. Recently, attention has shifted toward socially-identified gender (distinct from sex assigned at birth) recognizing its explanatory power. This study evaluates whether resting-state functional connectivity (rsFC), cortical thickness, or cortical volume better predicts sex and sex/gender alignment (congruence between sex and gender) in preadolescents. Using Adolescent Brain Cognitive Development (ABCD) Study data and machine learning, rsFC predicted sex more accurately (85 %) than cortical thickness (76 %) and cortical volume (70 %). Brain regions most predictive of sex belonged to association (default mode, dorsal attention, parietal memory) and…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHealth, Environment, Cognitive Aging · Stress Responses and Cortisol · Functional Brain Connectivity Studies
Introduction
1
The investigation of sex differences in the human brain has been a research interest for decades (Cosgrove et al., 2007, Giedd et al., 2012, Jäncke, 2018, Kaczkurkin et al., 2018, Lenroot and Giedd, 2010, Luders and Kurth, 2020, Luders and Toga, 2010, Sacher et al., 2013) spanning studies focusing on adults (Choleris et al., 2018, Jäncke, 2018, Kodiweera et al., 2016, Lotze et al., 2019, Ritchie et al., 2018), to those examining children and adolescents (Giedd et al., 2012; Gur and Gur, 2016; Herting and Sowell, 2017; Tunç et al., 2016; Vijayakumar et al., 2018), and even infants (Benavides et al., 2018, Fenske et al., 2023, Gilmore et al., 2007). Understanding brain sex differences is crucial, as many neurological and psychiatric conditions vary by sex in prevalence, onset, and symptomatology (Bao and Swaab, 2010, Baron-Cohen et al., 2011, Paus et al., 2008, Rutter et al., 2003), possibly due to differences in brain structure or function. Adolescence is a critical developmental period marked by increasing differences between males and females in physical traits, behavior, and mental health risks. Identifying sex-specific brain development patterns during this time could illuminate the roots of distinct mental health vulnerabilities, guiding tailored interventions that address the unique needs of each sex.
Diverse neuroimaging modalities have been used to study brain sex differences, including anatomical (e.g., volume, surface, and white matter integrity) and functional (e.g., resting-state functional connectivity; rsFC) measures. Structural variations have been observed between males and females (for a comprehensive review see Kaczkurkin et al., 2018), in brain size (Ball et al., 2012, Bramen et al., 2011, De Bellis et al., 2001, Luders et al., 2005, Ritchie et al., 2018, Ruigrok et al., 2014), cortical and subcortical volume (Allen et al., 2003, Goldstein et al., 2001, Gur et al., 1999, Luders et al., 2005), cortical surface area (Luders et al., 2006, Nopoulos et al., 2000), hemispheric white matter connectivity (Ingalhalikar et al., 2014), and cerebral blood flow (Satterthwaite et al., 2014). Despite these structural distinctions, the existence and extent of functional brain organization sex differences remains unclear. While some studies have pinpointed specific networks where rsFC appears to vary between males and females (Biswal et al., 2010, Ritchie et al., 2018; L. Tian et al., 2011; Zuo et al., 2010), others have reported no discernible impact of sex in resting-state functional magnetic resonance imaging (fMRI) data (Weissman-Fogel et al., 2010). Moreover, evidence shows that sex differences in rsFC networks are affected by the menstrual cycle (Avila-Varela et al., 2024, Weis et al., 2019), indicating that rsFC variation may depend not only on sex but also on underlying factors like hormonal fluctuations.
Most studies on brain sex differences focus on sex as assigned at birth. However, an individual’s socially-identified gender may also influence brain development, potentially in part through societal and cultural perceptions of and reactions to their gender. This, in turn, can contribute to gender behavioral differences independent of sex (Bale and Epperson, 2016, Kim and Nafziger, 2000). To explore whether brain factors play a role in gender identity development, researchers have compared cisgender and transgender adult individuals, with mixed findings. Some studies suggest that the brains of transgender individuals resemble those of their sex assigned at birth (Emory et al., 1991, Luders et al., 2009, Mueller et al., 2017, Yokota et al., 2005), others align them with their gender identity (Mueller et al., 2017, Zhou et al., 1995), while additional research shows an intermediate(Flint et al., 2020; Kranz et al., 2014; Kurth et al., 2022; Mueller, De Cuypere, et al., 2017; Mueller, Landré, et al., 2017; Rametti, Carrillo, Gómez-Gil, Junque, Segovia, et al., 2011; Rametti, Carrillo, Gómez-Gil, Junque, Zubiarre-Elorza, et al., 2011) or unique phenotype (Mueller et al., 2021). Overall, evidence on brain differences related to sex and gender identity remains inconclusive.
Previous research on brain sex differences has often relied on traditional regression methods that treat networks or regions-of-interest (ROI) independently. In contrast, multivariate pattern analysis can reveal more complex patterns without assuming independence, making advanced computational methods better suited for identifying brain sex differences. Machine learning approaches, such as classification, can assess how accurately sex can be predicted from neuroimaging data (Bzdok, 2017), yet few studies have applied these methods to examine sex differences in youth (Adeli et al., 2020, Brennan et al., 2021, Kurth et al., 2020, Sepehrband et al., 2018, Shanmugan et al., 2022). Existing research has typically focused on either brain structure (Adeli et al., 2020, Brennan et al., 2021, Kurth et al., 2020, Sepehrband et al., 2018) or function (Shanmugan et al., 2022), with no study to date comparing the predictive power of neuroanatomy versus functional connectivity in children or adolescents using a single dataset. Additionally, the potential benefits of combining both structural and functional data have not been explored in young populations, and most studies have used small samples spanning broad age ranges (Kurth et al., 2020, Sepehrband et al., 2018, Shanmugan et al., 2022) rather than focusing on larger, age-specific samples. Only one study has examined the neurobiological underpinnings of sex and gender in children (Dhamala et al., 2024) using linear ridge regression, finding that while sex and gender are uniquely represented in brain patterns, gender is less distinctly captured in functional connectivity than sex. However, this study did not explore a key cross-comparison evaluating the extent of shared versus independent predictive power between sex and gender in the same functional connections or networks, leaving unresolved questions about whether these unique representations share predictive power or are entirely independent.
In the present study, we moved beyond previous work by evaluating whether rsFC, cortical thickness or cortical volume is more effective in predicting sex and gender, while also exploring the extent of their interrelationship within the brain of preadolescents. We leveraged neuroimaging data from the Adolescent Brain Cognitive Development® (ABCD; Garavan et al., 2018) Study (n = 3129, 9–10 years old) at baseline, and self- and parent-reported gender data at the one-year follow-up time point. We use the following terminology: “sex” to describe an individual’s sex assigned at birth (i.e., male or female), usually based on physical anatomy and/or chromosomes at birth, and “gender” as the internal sense of oneself as boy, girl, or something else (Potter et al., 2021). We also use the term “sex/gender alignment” to refer to the congruence between an individual's sex and their gender. Our approach involved unimodal (i.e., rsFC, cortical thickness and cortical volume separately) and multimodal (i.e., rsFC/cortical thickness and rsFC/cortical volume combined) support vector machine (SVM) learning neuroimaging analyses to determine the predictability of sex and sex/gender alignment. Additionally, we tested whether a sex classifier trained on individuals with sex/gender alignment would perform comparably when predicting the sex of similarly aligned versus unaligned individuals. This was done to assess whether the concept of sex/gender alignment is distinguishable from sex in the brain. Lastly, we examined the relationship between the rsFC, cortical thickness and cortical volume sex classification scores (brain profile predictions) and sex/gender alignment scores to assess the consistency between the adolescents' brain profiles and their sex/gender alignment.
Methods
2
ABCD study
2.1
The ABCD is a ten-year, longitudinal study of 11,875 youth enrolled at ages 9–10 from 21 sites in the United States. Participants (youth and parents) were recruited through schools, with minimal exclusion criteria (Garavan et al., 2018). The ABCD study obtained centralized institutional review board (IRB) approval from the University of California, San Diego. Each of the 21 sites also obtained local IRB approval. Ethical regulations were followed during data collection and analysis. Parents or caregivers provided written informed consent, and children gave written assent. The racial demographics of the participants roughly match the racial composition of the 2023 American Community Survey (U.S. Census Bureau, 2023). Although the ABCD study utilizes a longitudinal design, the analyses conducted for the current study were cross-sectional.
Participants
2.2
This project used rsFC, cortical thickness, and cortical volume data from n = 10,259 available participants from the ABCD BIDS (Brain Imaging Data Structure) Community Collection (Feczko et al., 2021) (ABCD collection 3165; https://github.com/ABCD-STUDY/nda-abcd-collection-3165), baseline visit (9–10 y). Demographic and behavioral data were obtained from the ABCD 4.0 release, which included data from the one-year follow-up visit, marking the beginning of gender measures’ collection. Following the ABCD consortium's recommendations, we excluded participants scanned with Philips scanners due to incorrect preprocessing (source: https://github.com/ABCD-STUDY/fMRI-cleanup).
Head motion can systematically bias developmental studies (Power et al., 2012, Siegel et al., 2017), as well as those relating rsFC to behavior (Siegel et al., 2017). However, these systematic biases can be addressed through rigorous head motion correction (Power et al., 2014). The inclusion criteria for the current project required participants to have at least 600 frames (equivalent to 8 min) of low-motion (Fair et al., 2020) rsFC data (low-motion defined as having a filtered framewise displacement (FD) of less than 0.08 mm). See Casey et al. (2018) for broader ABCD inclusion criteria. Based on these criteria, and after excluding subjects with missing values in demographic, gender, or behavioral questions, the final ABCD sample consisted of rsFC, cortical thickness, cortical volume, demographic, gender, and behavioral data from a total of n = 3129 youth (Table 1).Table 1. Demographics (n = 3129) and descriptive statistics. SD, standard deviation; PDS, pubertal development scale; FD framewise displacement.Table 1TOTAL (n = 3129)Malesn = 1542; 49.3 %Femalesn = 1587; 50.7 %Mean Age in Months (SD)Range132.4 (7.8)116–148132.3 (7.7)117–149t = 0.53, p = 0.59Sex/Gender AlignmentAligned1153 (74.8 %)860 (54.2 %)χ^2^= 144.40, p < 0.001Unaligned389 (25.2 %)727 (45.8 %)PDSPrepuberty468 (30.3 %)225 (14.2 %)χ^2^= 118.63, p < 0.001Early puberty712 (46.2 %)331 (20.9 %)χ^2^= 225.57, p < 0.001Mid puberty342 (22.2 %)854 (53.8 %)χ^2^= 331.43, p < 0.001Late puberty20 (1.3 %)177 (11.1 %)χ^2^= 128.78, p < 0.001Residual In-Scanner Motion (Mean FD), mm0.13 (0.08)0.03–1.070.12 (0.07)0.03–0.70t = 3.75, p < 0.001
MRI acquisition and image processing methods
2.3
MRI acquisition
2.3.1
Imaging for each ABCD youth was performed across 21 sites within the United States, and was harmonized across Siemens Prisma, Philips, and GE 3 T scanners. Details on image acquisition can be found elsewhere (Casey et al., 2018). Twenty minutes (4 × 5 min runs) of eyes-open (passive crosshair viewing) resting-state fMRI data were collected to ensure at least 5 min of low-motion data. All resting-state fMRI scans were acquired using a gradient-echo EPI sequence (TR = 800 ms, TE = 30 ms, flip angle = 90°, voxel size = 2.4 mm^3^, 60 slices). Head motion was monitored online using the Framewise Integrated Real-time MRI Monitor (FIRMM) software at Siemens sites (Dosenbach et al., 2017).
Image processing overview
2.3.2
The image processing stream has been detailed elsewhere (Feczko et al., 2021). Briefly, the ABCD pipeline comprises six stages: (1) PreFreeSurfer, which normalizes anatomical data; (2) FreeSurfer, which constructs cortical surfaces from the normalized anatomical data; (3) PostFreeSurfer, which converts outputs from FreeSurfer to CIFTIs and transforms the volumes to a standard volume space using ANTs nonlinear registration; (4) “Vol” stage, which performs the atlas transformation, mean field distortion correction, and resampling to 2 mm isotropic voxels in a single step using FSL’s applywarp tool; (5) “Surf” stage, which projects the volumetric functional data onto the surface; and (6) “DCANBOLDproc”, which performs functional connectivity processing.
DCANBOLDproc includes a respiratory filter to improve FD estimates calculated in the “vol” stage. Temporal masks were created to flag motion-contaminated frames using the improved FD estimates (Power et al., 2012). Frames with FD > 0.30 mm were flagged as motion contaminated. After computing the temporal masks for high motion frame censoring, the data were processed with the following steps: (i) demeaning and detrending, (ii) interpolation across censored frames using least squares spectral estimation of the values at censored frames (Power et al., 2014) so that continuous data can be (iii) denoised via a general linear model (GLM) including whole brain, ventricular, and white matter signals, as well as their derivatives. Denoised data were then passed through (iv) a band-pass filter (0.008 Hz < f < 0.10 Hz) without re-introducing nuisance signals (Hallquist et al., 2013) or contaminating frames near high motion frames (Carp, 2013).
Functional connectivity, cortical thickness, and cortical volume metrics
2.3.3
An ROI method was employed, utilizing a total of 394 parcels, which included the 333 cortical parcels initially defined by Gordon et al. (2016) and an additional 61 subcortical parcels described by Seitzman et al. (2020). For each participant, the rsFC timecourse was extracted from these 394 ROIs after discarding frames with a filtered FD > 0.08 mm. The 333 cortical ROIs can be organized into separable brain networks (e.g., default-mode, fronto-parietal, salience, etc.). A correlation matrix was created through the computation of the correlation between the timecourse of each ROI and that of every other ROI, yielding a 394 × 394 correlation matrix for each subject. The correlations were normalized using Fisher's r-to-z transform (Fisher, 1915). The correlation matrices were then combined across all subjects to form a 394x394x3129 matrix, which was employed for subsequent analyses. rsFC analyses were run at the edge level (ROI-ROI pair; n = 77,421). Furthermore, cortical thickness and cortical volume was extracted from the 333 cortical ROIs for each participant. Cortical volume for each ROI was normalized by dividing each ROI’s volume by the subject's intracranial volume to account for individual differences in brain size. Additional cortical thickness analyses at the vertex-level (59,412 cortical vertices for each participant) were also conducted in an effort to capture regional effects that may not align with ROI boundaries (Supplementary Results).
Sex and gender measures
2.4
We used the ABCD's parent-reported data on youths' sex, which reflect their sex assigned at birth. In the ABCD dataset, gender identity is assessed as part of the background items in the Kiddie Schedule of Affective Disorders and Schizophrenia (K-SADS; Kaufman et al., 1997). However, this item has been identified as developmentally inappropriate for young adolescents, as evidenced by two out of five youths aged 9–10 years not understanding what the question is asking (Dube et al., 2021; A. Potter et al., 2021). In addition, the ABCD study assesses gender with the Youth Self-Report and Parent-Report Gender Questionnaires. All participants included in analyses completed the Youth Self-Report Gender Questionnaire which comprised four questions assessing felt-gender, contentedness with assigned sex at birth, and gender expression (e.g., “How much do you feel like a <boy/girl>?” for the felt-gender construct. For the complete list of questions and constructs see Supplementary Table 1). Their parents/caregivers completed an adaptation of the Gender Identity Questionnaire (Elizabeth and Green, 1984, Johnson et al., 2004) which included 12 questions that measure sex-typed behavior during play and gender dysphoria (Potter et al., 2022). All items use a five-point scale with higher scores reflecting more congruence with sex assigned at birth. The current study computed and used the mean scores of both constructs separately, capturing gender dimensionally (Potter et al., 2022). Sex/gender alignment status was either “aligned”, i.e., sex/gender alignment defined as scoring the highest possible achievable score in the Youth Self-Report Gender Questionnaire, or “unaligned”, i.e., sex/gender unalignment defined as scoring anything other than the highest possible score. Participants who provided a “Decline to answer” response to any question were not included in the analyses (see Supplementary Fig. 1 for response frequencies on the Youth Self-Report and Parent-Report Gender Questionnaires).
Support vector machine learning
2.5
Support vector machine classification
2.5.1
Support vector machine (SVM) learning was used to examine the predictability of sex in adolescents based on neuroimaging measures. Five binary linear SVM classifiers were constructed, each using a different set of predictors: rsFC, cortical thickness, cortical volume, rsFC combined with cortical thickness, and rsFC combined with cortical volume. The target labels for all models were the sex categories of male and female. The classifiers were trained exclusively using youth with sex/gender alignment (highest possible attainable score in the Youth Self-Report Gender Questionnaire). The objective of training the models exclusively on participants with the most extreme sex/gender alignment was to determine whether the SVM models, which aimed to predict sex, could effectively capture the concept of sex/gender alignment. If the models show no significant difference in accuracy between predicting sex in aligned and unaligned participants, it would suggest that sex and sex/gender alignment are indistinguishable in rsFC and/or cortical thickness/cortical volume. As a form of validation, we conducted the opposite analyses too, of training our SVM models on unaligned participants and testing on the aligned ones. The aligned group data was split to retain 20 % of subjects held out entirely for testing and 80 % for training (for the validation analyses, this split was applied in the unaligned group). Depending on the analysis, independent testing sets were composed either entirely of youth with sex/gender alignment (aligned group) or entirely of youth with sex/gender unalignment (unaligned group) (Fig. 1). To account for statistically significant sex differences in pubertal development stage and residual in-scanner motion (mean FD) (see 2.6 Other Statistical Analyses, 3. Results, & Table 1), these variables were regressed out separately from each feature in the training, cross-validated, and independent test datasets to prevent data leakage. To ensure robust model evaluation and prevent overfitting, the classification was evaluated by using a nested five-fold cross-validation (5F-CV) (Poldrack et al., 2020) in the training set, with the inner 5F-CV determining the optimal regularization coefficient C via grid search for the SVM binary classifier and the outer 5F-CV estimating the generalizability of the model (Fig. 1). Each feature was linearly scaled between zero and one across the outer 5F-CV training sets; these scaling parameters were then applied to scale the outer 5F-CV and independent testing sets (Cui and Gong, 2018, Erus et al., 2015). For additional SVM parameters see Supplementary Methods.Fig. 1. Flowchart of the support vector machine (SVM) model construction. The dataset was split in two groups: Aligned (participants with sex/gender alignment) and unaligned (participants with sex/gender unalignment). The aligned group was split in an aligned training group (80 %) and an aligned hold-out testing group (20 %). A nested five-fold cross-validation (5F-CV) was employed, with the inner 5F-CV determining the optimal parameter C and the outer 5F-CV estimating the generalizability of the model. The final, optimal model was subsequently applied in the held-out aligned testing group and the unaligned group, and model performance was evaluated. The forward slash in the groupings denotes the different possible group splits (e.g. The performance estimation loop training group was split in three train/test group pairs with 1030/258 subjects and two train/test group pairs with 1031/257 subjects respectively).Fig. 1
Assigning weights to observations (single data points or instances within a dataset) is a valuable approach for addressing the issue of treating all observations equally. The weights indicate the importance or influence of each observation during the SVM model training. Due to an imbalanced distribution of males and females in the aligned group (see 3. Results & Table 1), which would lead to an imbalance in the SVMs’ aligned training groups, inverse proportion instance weights were calculated and assigned to help balance the contribution of each class to the models.
Support vector regression
2.5.2
Linear support vector regression (SVR) was used to evaluate whether sex/gender alignment (a discrete variable derived from the Youth Self-Report and Parent-Report Gender Questionnaires) can be predicted by brain neuroanatomy or functional connectivity. Using the same parameters/choices as in the SVM classification models (Methods and Supplementary Methods), we ran a total of twelve models, utilizing different combinations of features and questionnaire types. For the rsFC analyses, two models were run: one exclusive to females and one exclusive to males, and we employed a data split of 20 % for hold-out testing and 80 % for training, in both the female and male datasets. Each of these models utilized the Youth Self-Report Gender Questionnaire as the target variable in one iteration and the Parent-Report Gender Questionnaire in another. Similarly, for the cortical thickness, cortical volume, and the combined rsFC/cortical thickness analyses, another set of models was executed, following the same structure as the rsFC analyses. The same grid search optimization of the regularization coefficient C using 5F-CV was performed as for the binary SVM classifiers (Supplementary Methods). Each feature was linearly scaled between zero and one across the outer 5F-CV training sets and these scaling parameters were then applied to scale the outer 5F-CV and the independent testing sets (Cui and Gong, 2018, Erus et al., 2015). The imbalance in the distribution of test scores in the Youth Self-Report and the Parent-Report Gender Questionnaires (Supplementary Fig. 1) led to assigning inverse proportion instance weights in the SVR models.
As a robustness check measure for the rsFC SVR analyses, we opted to run linear ridge regressions as performed by Dhamala et al. (2024) using the code they provided online (https://zenodo.org/doi/10.5281/zenodo.10779163). The only adjustment in the script was to use the updated “StandardScaler” to normalize the data, as the “normalize” parameter has been removed from the Scikit-learn (Pedregosa et al., 2011) “Ridge” regression command starting from version 1.0, recommending users handle normalization separately instead.
Assessment and significance of prediction performance
2.5.3
The performance evaluation of the SVM binary classification models encompassed a comprehensive set of metrics (Poldrack et al., 2020), including accuracy, sensitivity, specificity, the area under the receiver operating characteristic curve (AUC), and the Matthews correlation coefficient (MCC). As opposed to accuracy, which can produce overoptimistic inflated results (Akosa, 2017, Flach and Kull, 2015, Gu et al., 2009, Hand and Christen, 2018, Sokolova et al., 2006, Xi and Beer, 2018), the MCC treats both classes symmetrically, penalizing models that perform well on one class but poorly on another, thus providing a balanced evaluation (Chicco et al., 2021, Chicco et al., 2021, Chicco and Jurman, 2020, Chicco and Jurman, 2022, Chicco and Jurman, 2023). It ranges from −1 to + 1, with + 1 indicating a perfect prediction, 0 a prediction no better than random chance, and −1 total disagreement between prediction and observation. This comprehensive suite of metrics ensured a thorough and nuanced evaluation of the binary classification models, capturing various aspects of their performance and facilitating informed decision-making.
The McNemar’s test (McNemar, 1947, Pembury Smith and Ruxton, 2020) was used to evaluate the differences in classification accuracies between the rsFC, cortical thickness, cortical volume, and combined rsFC/cortical thickness SVM binary classification models. A two-proportions z-test was conducted to compare the proportions of correct predictions between the aligned and the unaligned SVM models.
Permutation testing was used to evaluate whether the prediction performance of the SVM binary classification models was significantly better than expected by chance (Mourão-Miranda et al., 2005). The predictive framework was repeated 10,000 times. The coefficient of determination (R^2^) (Poldrack et al., 2020) was used as a method of evaluation of the SVR models’ performance.
Interpreting model feature weights
2.5.4
The transformation proposed by Haufe et al. (2014) was utilized to transform feature weights derived from the linear SVM binary classification models with the aim of enhancing their interpretability and reliability (Chen et al., 2022; Y. Tian and Zalesky, 2021). To enhance interpretability for visualization purposes, the parcel-wise feature weights were consolidated at the network level by calculating their root mean square value and averaging across previously defined canonical functional networks (Gordon et al., 2016, Gordon et al., 2023).
Other statistical analyses
2.6
To ensure that there were no confounders when running the SVM classification, we conducted a series of statistical tests to examine potential differences between males and females, allowing for a more accurate and unbiased analysis. Independent t-tests were performed to compare age and residual in-scanner motion (mean FD) between the two groups, using Student’s t-test when variances were equal and Welch’s t-test when they were not. Additionally, chi-squared tests were applied to assess the distribution of males and females across various categorical variables, including sex/gender alignment, pubertal development stage, and family income.
The National Academies of Science, Engineering, and Medicine (NASEM) issued a report providing recommendations on using race, ethnicity, ancestry, and other descriptors of population stratification (National Academies of Sciences et al., 2023). The NASEM report asserts that race does not have any biological basis and emphasizes that researchers should not “control for race” in brain-based association or prediction studies with non-brain variables. The current study follows these recommendations and therefore does not evaluate race and ethnicity in brain association and prediction models (Bird and Carlson, 2024, Schraiber and Edge, 2024).
To further confirm that the performance differences in our SVM models were driven by sex/gender alignment status, we conducted additional comparisons (Independent Student’s or Welch’s t-tests, depending on variance equality) between the pubertal development stage and the income-to-needs ratio (calculated by dividing participant-reported annual family income by the 2017 federal poverty level for a corresponding family size (U.S. Department of Health and Human Services, 2017)) in correctly classified subjects (true positives and true negatives) versus incorrectly classified subjects (false positives and false negatives). By examining pubertal development and the income-to-needs ratio, we aimed to determine whether the observed differences in classification performance could be attributed to factors beyond sex/gender alignment status.
Spearman's correlations were used to examine the relationship between the rsFC, cortical thickness, and cortical volume sex classification scores (brain profile predictions) and sex/gender alignment scores in order to evaluate the consistency between the adolescents’ brain profiles and sex/gender alignment. We opted for a non-parametric approach due to the non-normal distribution of the sex/gender alignment scores (Supplementary Fig. 1).
Global signal regression (GSR) is a commonly used method for mitigating motion-related artifacts in functional connectivity analyses (Chang and Glover, 2009, Murphy et al., 2013, Power et al., 2014, Zhu et al., 2015). While some have raised concerns that GSR may complicate group comparisons due to differential re-referencing (Gotts et al., 2013, Saad et al., 2012), this interpretation remains debated in the field. To ensure that our findings were not dependent on this preprocessing choice, we repeated SVM classification group-comparison analyses without GSR.
Results
3
Youths’ demographic characteristics
3.1
In this sample, n = 3129, males (n = 1542 [49.3 %]) and females (n = 1587 [50.7 %]) did not differ statistically in age (Table 1) or family income (Supplementary Table 2). Statistically significant sex differences were found in pubertal development scale scores for each puberty stage and residual in-scanner motion (mean FD of retained fMRI frames) (Table 1). Thus, pubertal development stage and mean FD, were used as covariates in all SVM analyses. Demographic details of the aligned and unaligned groups are provided in Supplementary Table 3.
Functional connectivity and neuroanatomical classification of sex
3.2
This study’s first goal was to understand the way in which high dimensional patterns of brain functional connectivity and neuroanatomy reflect sex. The rsFC sex classifier trained on the aligned participants was able to classify unseen aligned participants from the independent testing set as male or female with 85 % accuracy (p < 0.001; Fig. 2 A, 2B). Sensitivity, specificity, AUC, and MCC of the model were 0.83, 0.87, 0.92, and 0.70 respectively. Variations in the functional organization of the visual (visual and medial visual), default mode, dorsal attention, and parietal memory networks, in that order, contributed the most to the model and were therefore relatively more important in predicting participant sex (Fig. 2 C, 2D).Fig. 2. Brain pattern analysis using support vector machine (SVM) learning predicts participant sex based on functional connectivity. (A) SVMs with nested five-fold cross-validation (5F-CV) were used to construct univariate and multivariate models that classified participants as male or female (sex assigned at birth). These models used resting-state functional connectivity (rsFC), cortical thickness, cortical volume, rsFC/cortical thickness, and rsFC/cortical volume combined as predictive features. The receiver operating characteristic (ROC) curve of the rsFC resulting model is depicted. The model classified participants as male or female with 85 % accuracy. Inset histogram shows distribution of permuted accuracies. The accuracy from real (nonpermuted) data is represented by the dashed red line. (B) McNemar’s tests comparing the rsFC, cortical thickness, cortical volume, and combined rsFC/cortical thickness and rsFC/cortical volume sex classifiers revealed a statistically significant difference in performance between rsFC and cortical thickness (χ^2^ = 7.36, p < 0.01) and between rsFC and cortical volume (χ^2^ = 19.67, p < 0.001). However, the performance differences between cortical thickness and cortical volume, rsFC and rsFC/cortical thickness, and rsFC and rsFC/cortical volume sex classifiers were not significant. (C) To understand which networks contributed the most to the prediction, the parcel-wise feature weights’ root mean square was calculated and then averaged for each network. The most important features in the rsFC model were found in the visual (visual and medial visual), default mode, dorsal attention, and parietal memory networks. (D) The top 10 % of cortical parcels in terms of feature importance in the rsFC SVM model. The parcel-wise feature weights’ root mean square was calculated for each parcel. rsFC, resting-state functional connectivity; CT, cortical thickness; NS, not significant; DMN, default mode network; VIS, visual network; MEDVIS, medial visual network; FPN, frontoparietal network; DAN, dorsal attention network; LANG, language network; SAL, salience network; PMN, parietal memory network; AMN, action-mode network; PREMOT, premotor network, SMH, somatomotor hand network; SMM, somatomotor mouth network; SMF, somatomotor foot network; SCAN, somato-cognitive action network; AUD, auditory network; CAN, contextual association network; HC, hippocampus; AMG, amygdala; BG, basal ganglia; THAL, thalamus; CERB, cerebellum.Fig. 2
The cortical thickness sex classifier was trained on the same aligned participants as the rsFC classifier. It correctly separated aligned males from females with an accuracy of 76 % (p < 0.001; Fig. 2B). Sensitivity, specificity, AUC, and MCC of the model were 0.77, 0.74, 0.83, and 0.52 respectively. Variation in the anatomical organization of cortical parcels belonging to the frontoparietal, dorsal attention, visual (medial visual and visual), and action-mode networks, in that order, contributed the most to the model and were therefore relatively more important in predicting participant sex.
Similarly, the cortical volume sex classifier was trained on the same aligned participants as the rsFC and cortical thickness classifiers. It correctly separated aligned males from females with an accuracy of 70 % (p < 0.001; Fig. 2B). Sensitivity, specificity, AUC, and MCC of the model were 0.68, 0.72, 0.79, and 0.39 respectively. Variation in the anatomical organization of cortical parcels belonging to the dorsal attention, frontoparietal, visual, premotor, and somatomotor mouth networks, in that order, contributed the most to the model and were therefore relatively more important in predicting participant sex.
The combined rsFC/cortical thickness sex classifier was similarly trained on the same set of aligned participants. It correctly separated aligned males from females with an accuracy of 85 % (p < 0.001; Fig. 2B). Sensitivity, specificity, AUC, and MCC of the model were 0.83, 0.87, 0.92, and 0.69 respectively. Variations in the organization of the medial visual, default mode, visual, dorsal attention, and frontoparietal networks, in that order, contributed the most to the model and were therefore relatively more important in predicting participant sex.
The combined rsFC/cortical volume, similarly, trained on the same set of aligned participants, correctly separated aligned males from females with an accuracy of 85 % (p < 0.001; Fig. 2B). Sensitivity, specificity, AUC, and MCC of the model were 0.84, 0.87, 0.92, and 0.70 respectively. Variations in the organization of the visual, dorsal attention, frontoparietal, action-mode, and default mode networks, in that order, contributed the most to the model and were therefore relatively more important in predicting participant sex.
The rsFC sex classifier exhibited significantly superior performance in predicting aligned individuals compared to the cortical thickness classifier (McNemar’s test: χ^2^ = 7.36, p < 0.01; Fig. 2B) and the cortical volume classifier (McNemar’s test: χ^2^ = 19.67, p < 0.001; Fig. 2B). While the cortical thickness sex classifier achieved numerically higher accuracy compared to the cortical volume classifier, that difference did not reach statistical significance (McNemar’s test: χ^2^ = 2.87, p = 0.09; Fig. 2B). The combined rsFC/cortical thickness and rsFC cortical volume sex classifiers performed equally well as the rsFC classifier (both McNemar’s tests: χ^2^ < 0.001, p = 0.99; Fig. 2B) and both sex classifiers performed significantly better than the cortical thickness (McNemar’s test: χ^2^ = 6.89, p < 0.001; Fig. 2B) and cortical volume (McNemar’s test: χ^2^ = 20.45, p < 0.001; Fig. 2B) sex classifiers respectively. Lastly, the combined rsFC/cortical thickness sex classifier did not perform significantly better than the combined rsFC/cortical volume sex classifier (McNemar’s test: χ^2^ < 0.01, p = 0.93; Fig. 2B).
Sex classifiers’ efficacy in predicting aligned and unaligned individuals
3.3
Although the rsFC sex classifier demonstrated superior predictive performance for sex within the aligned test group, surpassing both the cortical thickness and cortical volume classifiers, we examined all three – as well as the combined rsFC/cortical thickness and rsFC/cortical volume classifiers – to assess their efficacy in predicting the sex of individuals with unaligned sex/gender. The rsFC sex classifier trained on the aligned participants was able to classify unseen unaligned participants as male or female with 78 % accuracy (p < 0.001). Sensitivity, specificity, AUC, and MCC of the model were 0.84, 0.74, 0.88, and 0.56 respectively. The rsFC SVM model predicting the independent aligned group achieved statistically significantly higher prediction accuracy (85 %) than predicting the unaligned group (78 %; z = 2.99, p < 0.01). In both the aligned testing set and the unaligned group, we found no significant differences in pubertal development stage (aligned: z = 0.60, p = 0.55, unaligned: z = 0.57, p = 0.57) or income-to-needs (aligned: t = -1.30, p = 0.19, unaligned: t = -0.49, p = 0.63) between correctly and incorrectly classified subjects.
The cortical thickness sex classifier trained on the aligned participants was able to classify unseen unaligned participants as male or female with 72 % accuracy (p < 0.001; Sensitivity, specificity, AUC, MCC = 0.80, 0.67, 0.81, and 0.45). Although the cortical thickness SVM model achieved higher numerical accuracy for the aligned independent testing set (76 %) compared to the unaligned group (72 %), this difference was not statistically significant (z = 1.78, p = 0.08).
Similarly, the cortical volume sex classifier trained on the aligned participants was able to classify unseen unaligned participants as male or female with 70 % accuracy (p < 0.001; Sensitivity, specificity, AUC, MCC = 0.82, 0.64, 0.80, and 0.44). The cortical volume SVM model had achieved the same numerical accuracy for the aligned independent testing set (70 %) and the unaligned group (70 %), with no statistically significant effect (z = -0.20, p = 0.85).
The combined rsFC/cortical thickness sex classifier trained on the aligned participants was able to classify unseen unaligned participants as male or female with 77 % accuracy (p < 0.001; Sensitivity, specificity, AUC, MCC = 0.85, 0.73, 0.88, and 0.56). Like the rsFC sex classifier, the rsFC/cortical thickness SVM model achieved a statistically significantly higher prediction accuracy for the aligned independent testing set (85 %) compared to the unaligned group (77 %; z = 3.06, p < 0.01).
Lastly, the combined rsFC/cortical volume sex classifier trained on the aligned participants was able to classify unseen unaligned participants as male or female with 78 % accuracy (p < 0.001; Sensitivity, specificity, AUC, MCC = 0.84, 0.74, 0.88, and 0.56). Similar to the rsFC sex classifier, the rsFC/cortical thickness SVM model achieved a statistically significantly higher prediction accuracy for the aligned independent testing set (85 %) compared to the unaligned group (78 %; z = 3.10, p < 0.01).
Just as for aligned individuals, the rsFC sex classifier exhibited significantly superior performance in predicting unaligned individuals compared to the cortical thickness (McNemar’s test: χ^2^ = 8.44, p < 0.01) and cortical volume (McNemar’s test: χ^2^ = 12.19, p < 0.001) sex classifiers. While the cortical thickness sex classifier achieved numerically higher accuracy compared to the cortical volume classifier in predicting unaligned individuals, that difference did not reach statistical significance (McNemar’s test: χ^2^ = 0.30, p = 0.58). The combined rsFC/cortical thickness and rsFC cortical volume sex classifiers performed equally well as the rsFC classifier (McNemar’s tests: χ^2^ = 0.03, p = 0.86 and χ^2^ < 0.01, p = 0.96) and both sex classifiers performed significantly better than the cortical thickness (McNemar’s test: χ^2^ = 7.20, p < 0.01) and cortical volume (McNemar’s test: χ^2^ = 12.19, p < 0.001) sex classifiers respectively. Lastly, the combined rsFC/cortical thickness sex classifier did not perform significantly better than the combined rsFC/cortical volume sex classifier (McNemar’s test: χ^2^ = 0.03, p = 0.86).
Since combining rsFC and cortical thickness did not significantly improve the rsFC classification model, the separate rsFC, cortical thickness, and cortical volume models were used for the remainder of the classification-related analyses (see 3.4 Sex classification and sex/gender alignment correlation).
The validation analyses, in which rsFC, cortical thickness, cortical volume, and combined rsFC/cortical thickness and rsFC/cortical volume SVM models were trained on unaligned participants and tested on unaligned and aligned participants, revealed a pattern of results consistent with the main analyses (see Supplementary Results). The enhanced performance accuracy of the rsFC SVM models trained exclusively on aligned participants likely stems from the use of more precisely defined labels, which minimize label noise and enable the classifier to more effectively distinguish between sexes. In contrast, the smaller sample size in the unaligned training group (n = 893 vs n = 1610 aligned training group; Supplementary Fig. 2) likely reduced statistical power, potentially limiting the ability to detect significant differences in performance across models.
The rsFC sex classifier trained on the aligned participants using the data preprocessed without GSR was able to classify unseen aligned participants from the independent testing set as male or female with 83.4 % accuracy (p < 0.001). Sensitivity, specificity, AUC, and MCC of the model were 0.82, 0.85, 0.92, and 0.67 respectively. Variations in the functional organization of the premotor, somato-cognitive action, auditory, and visual (visual and medial visual) networks, in that order, contributed the most to the model and were therefore relatively more important in predicting participant sex. The same classifier was able to classify unseen unaligned participants as male or female with 78.3 % accuracy (p < 0.001). Sensitivity, specificity, AUC, and MCC of the model were 0.85, 0.75, 0.88, and 0.57 respectively. The rsFC SVM model predicting the independent aligned group achieved statistically significantly higher prediction accuracy (83.4 %) than predicting the unaligned group (78.3 %; z = 2.16, p = 0.03).
Sex classification and sex/gender alignment correlation
3.4
The above findings indicate robust differences in brain functional connectivity (rsFC) and neuroanatomy (cortical thickness and cortical volume) between males and females. The relationship between the rsFC, cortical thickness, and cortical volume sex classification scores (brain profile predictions) and sex/gender alignment scores was then investigated to evaluate the consistency between the adolescents’ brain profiles and sex/gender alignment. The classification scores derived from the rsFC, cortical thickness, and cortical volume sex classifiers serve as quantitative indicators of the degree to which a brain matches either female or male patterns. Higher scores suggest a stronger correspondence to the neural connectivity patterns associated with the sex commonly labeled as either female or male. Α significant positive correlation was revealed in females between the rsFC classification testing set scores and the sex/gender alignment scores obtained from the Youth Self-Report Gender Questionnaire (ρ = 0.18, p < 0.001, Fig. 3 A). This suggests that the females with functional connectivity patterns more similar to the typical female pattern exhibit greater sex/gender alignment as reflected in higher scores on the Youth Self-Report Gender Questionnaire. In males, the inverse association was observed, with individuals with functional connectivity patterns more similar to the typical male pattern exhibiting lower sex/gender alignment (ρ = −0.11, p < 0.01; Fig. 3B). Nearly identical correlations between the non-GSR rsFC classification testing set scores and the sex/gender alignment scores were observed in both females (ρ = 0.17, p < 0.001) and males (ρ = −0.11, p < 0.01).Fig. 3. Correlations between resting-state functional connectivity (rsFC) support vector machine (SVM) classification scores and sex/gender alignment scores. (A) A significant positive correlation was observed in females between the rsFC sex classification scores and the sex/gender alignment scores. (B) A significant negative correlation between the rsFC classification and sex/gender alignment scores was observed in males. Higher sex/gender alignment scores (x-axis) indicate greater sex/gender alignment. Higher SVM scores (y-axis) indicate a stronger correspondence to the neural connectivity patterns associated with the sex labeled as either female or male. Results in bold indicate statistical significance.Fig. 3
The correlation between cortical thickness classification and sex/gender alignment scores was not significant for females (ρ = 0.04, p = 0.24) but was significant for males (ρ = −0.09, p = 0.03) exhibiting the same inverse relationship as with the rsFC brain patterns. Similarly, the correlation between cortical volume classification and sex/gender alignment scores was not significant for females (ρ = 0.04, p = 0.24) but was significant for males (ρ = −0.09, p = 0.03).
Functional connectivity and neuroanatomical prediction of sex/gender alignment
3.5
Next, we evaluated whether an adolescent's degree of sex/gender alignment can be predicted by brain neuroanatomy or functional connectivity using linear SVR models. Neither the functional connectivity (rsFC), neuroanatomy (cortical thickness and cortical volume), nor the combined rsFC/cortical thickness and rsFC/cortical volume sex/gender alignment SVR models successfully predicted the sex/gender alignment Youth Self-Report or Parent-Report Gender Questionnaire scores. Correlations between the original sex/gender alignment scores and predicted sex/gender alignment scores were not significant (all p-values > 0.05), except for the correlation between original parent-reported sex/gender alignment scores and predicted scores in males (p = 0.03). Importantly, all coefficients of determination were negative (R² < 0), including the exception (R² = −0.41), suggesting that the models performed worse than a null model predicting the mean of the dependent variable (sex/gender alignment scores) for all observations.
As a robustness check measure, linear ridge regressions, as performed by Dhamala et al. (2024) to explore functional brain correlates of sex and gender, were also tested. The findings were the same as in our rsFC SVR model. Correlations between the original Youth Self-Report and Parent-Report Gender Questionnaire gender scores and predicted gender scores were not significant (all p-values > 0.05) and all coefficients of determination were negative (R^2^ < 0).
Discussion
4
In this study, we leveraged machine learning and a large sample of youths to determine whether variation in functional connectivity and neuroanatomy can predict sex (assigned at birth) and/or sex/gender alignment, and whether this variability is reflected in the degree of sex/gender alignment. Using complex univariate and multivariate patterns of rsFC, cortical thickness, and cortical volume we were able to predict an unseen participant’s sex with a high degree of accuracy. We first demonstrated that rsFC is a better predictor of sex compared to cortical thickness and cortical volume. Combining rsFC with cortical thickness or rsFC with cortical volume did not enhance the rsFC model’s performance. We then demonstrated that while sex differences in brain functional connectivity and neuroanatomy were most pronounced in visual networks, the majority occurred within association networks. Moreover, SVM analyses revealed that the rsFC of sex/gender aligned youth is more effective at predicting an unseen aligned participant’s sex compared to predicting the sex of an unaligned participant. In contrast, the cortical thickness and cortical volume SVM classifiers failed to distinguish between aligned and unaligned participants. Furthermore, we found a positive relationship between the degree to which an individual's rsFC profile reflects female traits and sex/gender alignment in females, but a negative relationship in males. Lastly, we found that neither functional connectivity nor neuroanatomy predict sex/gender alignment.
rsFC outperforms cortical thickness and cortical volume as a predictor of sex
4.1
One of our principal findings is that rsFC demonstrates superior predictive power compared to cortical thickness and cortical volume in accurately classifying sex, although all modalities showed a good degree of prediction. While the cortical thickness sex classifier achieved numerically higher accuracy compared to the cortical volume classifier, that difference was not statistically significant. Prior studies have demonstrated similar accuracies using network-level resting-state fMRI data to discriminate between male and female youth (Shanmugan et al., 2022), achieving slightly higher accuracies when combined with anatomical data (Adeli et al., 2020, Brennan et al., 2021, Kurth et al., 2020, Sepehrband et al., 2018). In our analysis, however, combining rsFC and cortical thickness was better than cortical thickness alone, but did not show a significant improvement in prediction accuracy compared to the rsFC-only model. The same was observed with cortical volume, where combining rsFC with cortical volume did not significantly enhance prediction accuracy over the rsFC-only model. The superior predictive power of rsFC suggests that, even in a young cohort, neural activity patterns during rest capture subtle, sex-specific functional dynamics that are not evident in structural measures like cortical thickness. This highlights rsFC's potential as a more sensitive biomarker in neurodevelopmental studies. Additionally, the fact that rsFC outperforms cortical thickness and cortical volume as a predictor could imply that the information provided by cortical thickness and cortical volume overlaps with what is already captured by rsFC. As a result, cortical thickness and cortical volume may not offer additional or unique information to enhance prediction accuracy.
Sex differences in brain functional connectivity and neuroanatomy are greatest in association networks
4.2
Our findings reveal significant sex-related variability in brain functional connectivity and neuroanatomy, particularly within association networks, an observation consistent with prior research (Bijsterbosch et al., 2018, Cui et al., 2020, Gordon et al., 2017, Kong et al., 2019, Li et al., 2017, Shanmugan et al., 2022). The most pronounced differences in rsFC between sexes were found in visual (visual and medial visual), default mode, dorsal attention, and parietal memory networks (Filippi et al., 2013, Mathew et al., 2020, Murray et al., 2018, Shaqiri et al., 2018, Vanston and Strother, 2017, Weis et al., 2019). Studies suggest that men and women may process visual information differently (Mathew et al., 2020, Murray et al., 2018, Shaqiri et al., 2018, Vanston and Strother, 2017), though neuroimaging research in adults presents mixed results – some identify sex differences in sensory networks like the visual network (Biswal et al., 2010, Filippi et al., 2013), while others do not (Allen et al., 2011, Tomasi and Volkow, 2012). The functional organization of association networks, which are linked to emotional, social, and executive functions (Cui et al., 2020, Kong et al., 2019), also demonstrates variability tied to sex differences (Alarcón et al., 2020, Andreano and Cahill, 2009, Cahill, 2010). For instance, variations in default mode network connectivity have been associated with sex hormones such as estrogen and progesterone (Engman et al., 2016, Hidalgo-Lopez et al., 2020, Petersen et al., 2014, Weis et al., 2019). Our findings further support the role of the dorsal attention and parietal memory networks in predicting sex differences, which also corresponds with prior research (Dumais et al., 2018, Filippi et al., 2013, Hill et al., 2014, Hjelmervik et al., 2014, Stumme et al., 2020, Zhang et al., 2018). These sex differences in rsFC could relate to variations in cognitive processes and behavioral outcomes observed between males and females, influencing domains such as decision-making, emotion regulation, and attention allocation (Stoica et al., 2021). Our results partially align with the work of Dhamala et al. (2024), who also utilized the ABCD dataset to explore functional brain correlates of sex and gender, finding that sex is strongly associated with connectivity patterns within and between the somatomotor, visual, control, and limbic networks (Yeo et al., 2011). However, it is also possible that societal mores that influence engagement with certain cognitive activities also shape brain functional connectivity, leading to patterns that differ between sexes that are not biologically determined.
The discrepancy in networks contributing most to classification between the GSR and non-GSR models (visual, medial visual, default mode, dorsal attention, and parietal memory networks with GSR; premotor, somato-cognitive action, auditory, visual, and medial visual without GSR) likely reflects the impact of GSR on the covariance structure of large-scale networks. By attenuating widespread global variance, GSR can enhance network-specific contrasts and increase the apparent contribution of association networks (e.g., default mode, dorsal attention, and parietal memory), whereas non-GSR data retain more global and sensorimotor-driven variance, increasing the apparent importance of sensory and motor networks (e.g., premotor, auditory, somato-cognitive action).
rsFC of sex/gender aligned youth is less accurate in predicting unaligned youths’ sex
4.3
We found that that observed differences in rsFC between males and females may not entirely be driven by biology (sex) but also from the influence of social constructs (gender) (Eliot, 2011, Eliot et al., 2021). Using a machine learning neuroimaging approach, we showed that an SVM classifier trained to learn sex from rsFC data using only youth with sex/gender alignment, was significantly more accurate in predicting sex in unseen participants with sex/gender alignment than in those with sex/gender unalignment. We further demonstrated that the performance differences in our rsFC SVM model were driven by sex/gender alignment status, as there were no significant differences in pubertal development stage or income-to-needs ratio between correctly classified (true positives and true negatives) and incorrectly classified (false positives and false negatives) subjects. These findings suggest subtle yet detectable differences in rsFC patterns between individuals with aligned and unaligned sex and gender. Notably, cortical thickness and cortical volume SVM classifiers trained to learn sex using only youth with sex/gender alignment failed to differentiate effectively, as they were equally accurate in predicting sex in unseen participants with sex/gender alignment and those with unalignment. This might imply that gender has greater influence on functional connectivity brain patterns than on brain neuroanatomy (i.e., cortical thickness and cortical volume). These results could be considered distinct from research suggesting that the structural brain profile of hormonally untreated transgender individuals resembles that of their gender identity or is intermediate between sexes (Flint et al., 2020, Kranz et al., 2014, Kurth et al., 2022, Mueller et al., 2017, Mueller et al., 2017, Rametti et al., 2011, Rametti et al., 2011). However, those studies reported on white matter microstructure and gray matter volume of subcortical structures, and not cortical thickness or cortical volume.
The reduced accuracy of the rsFC classifier when applied to gender-diverse youth highlights that the neurobiological patterns linked to sex in youth with sex/gender alignment do not generalize well to those with sex/gender unalignment. This underscores the need to consider gender diversity in neuroscientific research and cautions against relying on brain-based classifiers trained exclusively on sex/gender aligned individuals.
Female youth with more sex-specific brain patterns exhibit greater sex/gender alignment
4.4
We found that females with higher rsFC sex SVM scores (greater brain “femaleness” on the “maleness-femaleness” continuum) also had higher sex/gender alignment scores on the Youth Self-Report Gender Questionnaire (Fig. 3 A). In males, however, there was a surprising negative correlation: males with higher rsFC sex SVM scores (greater brain “maleness”) showed lower sex/gender alignment (Fig. 3B). One possible explanation for this discrepancy may be linked to lower social tolerance for gender nonconformity in young males compared to females (Potter et al., 2021, Potter et al., 2022). The sex/gender alignment scores from the Youth Self-Report Gender Questionnaire rely on self-reported responses. Males, even from preadolescence, might feel pressured to answer gender-related questions less truthfully. Consequently, males with rsFC brain patterns less similar to the typical male pattern might still report high sex/gender alignment scores. Thus, the groups of males with higher sex/gender alignment scores may include participants whose sex and gender are not actually aligned. Two observations reinforce this possibility: 1) There were significantly more aligned males than females in the sample (1172 vs. 873; Table 1), and Supplementary Table 2) unaligned females had overall lower Youth Self-Report Gender Questionnaire scores compared to unaligned males (Supplementary Fig. 1), both patterns also observed in the entire ABCD dataset (Potter et al., 2022). A question is thus raised of whether the Youth Self-Report Gender Questionnaire is measuring gender equally effectively for both males and females.
While rsFC successfully captured the relationship between brain patterns and sex/gender alignment in both males and females, cortical thickness and cortical volume only showed this relationship in males, reflecting the same negative correlation. The lack of a significant correlation between the cortical thickness and cortical volume sex SVM scores and sex/gender alignment scores in females underscores the cortical thickness’s and cortical volume’s inability to effectively capture the relationship between adolescents’ brain profiles and sex/gender alignment.
Functional connectivity and neuroanatomy do not predict sex/gender alignment
4.5
Neither the rsFC, cortical thickness, cortical volume, nor combined rsFC/cortical thickness and rsFC/cortical volume SVR models were able to capture the variance in sex/gender alignment. A recent study by Dhamala et al. (2024) demonstrated that sex/gender alignment, as measured by the Parent-Report Gender Questionnaire, could be predicted using rsFC and linear ridge regression. Although both our study and theirs used the ABCD dataset, our findings do not replicate theirs. To assess the robustness of our findings, we also implemented linear ridge regression using online scripts provided by Dhamala et al. (2024). Similar to their approach, separate models were constructed for males and females. rsFC was used as the predictor and sex/gender alignment scores were used as labels. Our robustness check results further confirmed that rsFC is unable to predict sex/gender alignment, as measured by the Youth Self-Report and Parent-Report Gender Questionnaires. The divergence in findings could be attributed to several differences in our analyses including the use of different cortical (400 (Schaefer et al., 2018) vs. 333 (Gordon et al., 2016)) and noncortical (19 (Fischl et al., 2002) vs. 61 (Seitzman et al., 2020)) parcels, and choices that led to different final ABCD samples (e.g., Dhamala et al. 2024 excluded individuals with less than 4 min of data – final sample n = 4757, while we excluded subjects with less than 8 min of data – final sample n = 3129). Additionally, they bandpass filtered at 0.009 Hz ≤ f ≤ 0.08 Hz while we did at 0.008 Hz < f < 0.10 Hz.
Although rsFC effectively captures the relationship between sex and gender (significant accuracy difference when predicting aligned vs. unaligned individuals), the inability to straightforwardly predict sex/gender alignment from brain biomarkers suggests that gender may be a more complex construct that is not as clearly reflected in functional connectivity, cortical thickness or cortical volume patterns. Another possibility would be that linear regression models may not be able to capture potential non-linear gender effects. Additionally, the challenge of capturing variance in gender could be attributed to the overall scarcity of population-level variability in gender scores. It is possible that developmental factors play a role in this phenomenon, and the results might differ if we were examining a population at a more advanced pubertal stage. The inability to replicate findings from a study that reported sex/gender alignment prediction from rsFC using the same dataset suggests that more research is necessary to confirm the robustness and reliability of neural predictors of gender versus sex.
Limitations and future directions
4.6
This study has certain limitations that should be acknowledged. First, sex was assessed using a binary parent-report question. It is important to acknowledge that existing evidence and theory suggest that binary sex classification may be suboptimal, since human brains are largely composed of unique mosaics of female-typical and male-typical features (Fine, 2014, Joel, 2020). Second, the self-reported gender data displayed minimal variability, meaning that the self-reported gender data showed limited diversity in responses (Supplementary Fig. 1), with most participants reporting gender identities aligning with their sex assigned at birth. This lack of variability likely restricted the range of sex/gender alignment values in the dataset, which in turn may have reduced the ability to detect potential relationships between functional connectivity, neuroanatomy, and the degree of sex/gender alignment. Future work should include individuals with varying levels of gender nonconformity to enhance the ability to capture these relationships. Additionally, a deeper investigation into the relative contributions of puberty stage and rsFC to sex/gender alignment, as well as potential interactions with gender roles, would be an important and valuable direction for future research. This would help clarify how these factors interact and contribute to the broader understanding of sex/gender alignment and its neural correlates. Third, we chose to dichotomize sex/gender alignment and variable dichotomization has recognized pitfalls (Maccallum et al., 2002). Dichotomizing can obscure subtle gradations in sex/gender alignment and potentially misrepresent differences between groups. However, we only dichotomized sex/gender alignment for use in our SVM models, which aimed to predict sex. From a machine learning perspective, training SVM classifiers only on aligned individuals–who represent a more homogeneous subpopulation in terms of sex/gender alignment–reduces label noise and enables the classifier to more accurately distinguish between sexes. Establishing this baseline performance with optimized label quality provides a benchmark for evaluating classifier efficacy under more heterogeneous conditions. This approach was intended to explore whether brain structure or function could, indirectly, capture sex/gender alignment. Importantly, for our SVR analyses, where the objective was to directly predict sex/gender alignment, we utilized the continuous sex/gender alignment measure. Fourth, there is a possibility that the Youth Self-Report Gender Questionnaire may not be measuring gender equally effectively for both males and females. This is based on the different score distribution between males and females which could be due to greater social tolerance to nonconformity in young females compared to males (Potter et al., 2021, Potter et al., 2022). Future studies need to consider a measuring tool that captures gender identity, dysphoria, and expression more effectively for both sexes. Fifth, gender is shaped significantly by social and cultural factors, which can influence how gender identity is expressed and understood across different societies. The ABCD dataset, being collected entirely in the United States, reflects only the experiences and environments of individuals within this specific cultural and societal context, which may not be representative of the world population (Ricard et al., 2022). Additionally, gender identity is often intertwined with a variety of societal factors such as socioeconomic status, exposure to adversity, and family dynamics (Barboza-Salerno and Meshelemiah, 2024, Carpenter et al., 2020, Price et al., 2023), all of which can contribute to differences in brain development. Because of these complexities, future research should explore whether the relationship between brain development and gender is consistent across different cultures, taking into account the multitude of factors (e.g., adversity, societal expectations, exposure to different cultural gender norms) that may mediate or moderate these associations. This will be crucial for determining whether observed differences in model performance across sexes are truly gender-related or whether they are driven by other social, environmental, and cultural factors. Lastly, this study provided a limited evaluation of subcortical and cerebellar networks. Subsequent analyses should focus on assessing functional and neuroanatomical sex differences in these networks, as they are as crucial as the cortex for understanding human behavior and function.
Conclusions
5
In summary, our study demonstrates that brain functional connectivity (i.e., rsFC) captures the relationship between sex and gender more effectively compared to cortical thickness and cortical volume, as well as the extent to which sex/gender alignment correlates with a brain sex profile. While we observed normative sex differences in functional connectivity, cortical thickness, and cortical volume in youth, functional connectivity proved to be a more sensitive predictor of sex, with association networks playing a central role in these differences. Additionally, we found limitations in using functional connectivity and neuroanatomy to predict sex/gender alignment, highlighting the complexity of these concepts. Our findings challenge binary models of brain-sex differences, advocating for more inclusive and nuanced approaches to studying the relationship between gender and neurobiology.
CRediT authorship contribution statement
Julia Monk: Writing – review & editing. Roselyne J. Chauvin: Writing – review & editing. Kristen M. Scheidter: Writing – review & editing. Athanasia Metoki: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Vahdeta Suljic: Writing – review & editing. Aristeidis Sotiras: Writing – review & editing, Supervision, Methodology. Samuel R. Krimmel: Writing – review & editing. Deanna M. Barch: Writing – review & editing, Supervision, Methodology. Benjamin P. Kay: Writing – review & editing. Nadeshka J. Ramirez-Perez: Writing – review & editing. Timothy O. Laumann: Writing – review & editing. Forrest I. Whiting: Writing – review & editing. Evan M. Gordon: Writing – review & editing. Noah J. Baden: Writing – review & editing. Andrew N. Van: Writing – review & editing. Babatunde Adeyemo: Validation. Anxu Wang: Writing – review & editing. Nico U.F. Dosenbach: Writing – review & editing, Supervision, Methodology. Scott Marek: Writing – review & editing.
Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests. E.M.G. may receive royalty income based on technology developed at Washington University School of Medicine and licensed to Turing Medical Inc. N.U.F.D. has a financial interest in Turing Medical Inc. and may benefit financially if the company is successful in marketing Framewise Integrated Real-Time Motion Monitoring (FIRMM) software products. N.U.F.D. may receive royalty income based on FIRMM technology developed at Washington University School of Medicine and Oregon Health and Sciences University and licensed to Turing Medical Inc. N.U.F.D. is a co-founder of Turing Medical Inc. TOL is a consultant for Turing Medical Inc. TOL holds a patent for taskless mapping of brain activity licensed to Sora Neurosciences and a patent for optimizing targets for neuromodulation, implant localization, and ablation is pending. These potential conflicts of interest have been reviewed and are managed by Washington University School of Medicine. The other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Adeli E.Zhao Q.Zahr N.M.Goldstone A.Pfefferbaum A.Sullivan E.V.Pohl M.Deep learning identifies morphological determinants of sex differences in the pre-adolescent brain Neuro Image 223202011729310.1016/j.neuroimage.2020.117293 PMC 778084632841716 · doi ↗ · pubmed ↗
- 2Akosa J.Predictive accuracy: a misleading performance measure for highly imbalanced data Proc. SAS Glob. Forum 12201714
- 3Alarcón G.Morgan J.K.Allen N.B.Sheeber L.Silk J.S.Forbes E.E.Adolescent gender differences in neural reactivity to a friend’s positive affect and real-world positive experiences in social contexts Dev. Cogn. Neurosci.43202010.1016/j.dcn.2020.100779 PMC 718315832510342 · doi ↗ · pubmed ↗
- 4Allen E.A.Erhardt E.B.Damaraju E.Gruner W.Segall J.M.Silva R.F.Havlicek M.Rachakonda S.Fries J.Kalyanam R.Michael A.M.Caprihan A.Turner J.A.Eichele T.Adelsheim S.Bryan A.D.Bustillo J.Clark V.P.Ewing S.W.F.…Calhoun V.D.A baseline for the multivariate comparison of resting-state networks Front. Syst. Neurosci.52011209310.3389/FNSYS.2011.00002/BIBTEXPMC 305117821442040 · doi ↗ · pubmed ↗
- 5Allen J.S.Damasio H.Grabowski T.J.Bruss J.Zhang W.Sexual dimorphism and asymmetries in the gray-White composition of the human cerebrum Neuroimage 18200388089410.1016/S 1053-8119(03)00034-X 12725764 · doi ↗ · pubmed ↗
- 6Andreano J.M.Cahill L.Sex influences on the neurobiology of learning and memory Learn. Mem.164200924826610.1101/LM.91830919318467 · doi ↗ · pubmed ↗
- 7Avila-Varela D.S.Hidalgo-Lopez E.Dagnino P.C.Acero-Pousa I.del Agua E.Deco G.Pletzer B.Escrichs A.Whole-brain dynamics across the menstrual cycle: the role of hormonal fluctuations and age in healthy women Npj Women’S. Health 2120241910.1038/s 44294-024-00012-4 · doi ↗
- 8Bale T.L.Epperson C.N.Sex as a biological variable: who, what, when, why, and how Neuropsychopharmacology 422201638639610.1038/npp.2016.21527658485 PMC 5399243 · doi ↗ · pubmed ↗
