Neurocircuitry-Inspired Hierarchical Graph Causal Attention Networks for Explainable Depression Identification
Weidao Chen, Yuxiao Yang, Yueming Wang

TL;DR
This paper introduces NH-GCAT, a neurocircuitry-inspired hierarchical graph causal attention network that models depression-specific brain mechanisms at multiple spatial scales, reporting 73.3% accuracy and 76.4% AUROC on depression classification together with circuit-level explanations.
Contribution
It presents a novel hierarchical framework integrating neuroscience knowledge with deep learning for explainable depression classification from neuroimaging data.
Findings
Achieved 73.3% accuracy and 76.4% AUROC in depression classification.
Provided neurobiologically meaningful explanations of brain circuit interactions.
Demonstrated state-of-the-art performance on the REST-meta-MDD dataset.
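For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and AUROC can be computed from binary labels and predicted scores. It is not the authors' evaluation code. The function names, the 0.5 threshold, and the toy labels and scores are assumptions made for illustration only.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of subjects whose thresholded score matches the label.

    The 0.5 threshold is an assumption for this sketch.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney U formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = MDD, 0 = healthy control.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUROC    = {auroc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the two numbers can differ.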
Abstract
Major Depressive Disorder (MDD), affecting millions worldwide, exhibits complex pathophysiology manifested through disrupted brain network dynamics. Although graph neural networks that leverage neuroimaging data have shown promise in depression diagnosis, existing approaches are predominantly data-driven and operate largely as black-box models, lacking neurobiological interpretability. Here, we present NH-GCAT (Neurocircuitry-Inspired Hierarchical Graph Causal Attention Networks), a novel framework that bridges neuroscience domain knowledge with deep learning by explicitly and hierarchically modeling depression-specific mechanisms at different spatial scales. Our approach introduces three key technical contributions: (1) at the local brain regional level, we design a residual gated fusion module that integrates temporal blood oxygenation level dependent (BOLD) dynamics with functional…
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
This paper introduces a multi-level modeling framework that includes three hierarchical layers: the region level, the circuit level, and the network level, which together help capture brain functional dynamics from local to global scales. It is also validated on the large-scale REST-meta-MDD dataset, which contains more than 1,600 subjects from 16 research centers.
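The region, circuit, and network levels described above can be pictured with a minimal sketch of hierarchical aggregation. This is only an illustration of the local-to-global idea under simplifying assumptions: plain mean pooling stands in for the paper's learned pooling, and the region names, circuit names, and membership map are hypothetical placeholders rather than the circuits used by the authors.

```python
from statistics import fmean


def pool_hierarchy(region_features, circuit_membership):
    """Aggregate region-level features to circuit level, then network level.

    Mean pooling is an assumption of this sketch, not the authors' method.
    """
    circuit_features = {}
    for circuit, regions in circuit_membership.items():
        vectors = [region_features[r] for r in regions]
        # Average each feature dimension over the regions in this circuit.
        circuit_features[circuit] = [fmean(dim) for dim in zip(*vectors)]
    # Average each feature dimension over all circuits.
    network_feature = [fmean(dim) for dim in zip(*circuit_features.values())]
    return circuit_features, network_feature


# Hypothetical region-level feature vectors (two features per region).
region_features = {
    "region_a": [1.0, 2.0],
    "region_b": [3.0, 4.0],
    "region_c": [5.0, 6.0],
}
# Hypothetical prior mapping of circuits to their member regions.
circuit_membership = {
    "circuit_1": ["region_a", "region_b"],
    "circuit_2": ["region_c"],
}

circuit_features, network_feature = pool_hierarchy(region_features, circuit_membership)
print(circuit_features)
print(network_feature)
```

A fixed membership map makes the prior knowledge explicit. A learned assignment, as the reviews suggest the paper uses, would replace the map with trainable weights.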
This paper contains many critical methodological and conceptual flaws, as well as unclear details.
1. The overall framework of the paper is outdated. Many existing studies have already proposed similar approaches. Please refer to related works in IEEE TMI, IEEE JBHI, and MICCAI.
2. Several fundamental assumptions in the paper are problematic, particularly regarding the causal inference in the VLCA module. The variational conditional probability assumptions are incorrectly formulated, and the p
The integration of three levels of information makes sense to me, and I also appreciate the general idea of leveraging neural circuits as prior information. However, this prior knowledge does not seem to be fully utilized or to effectively reflect existing neuroscience evidence.
The experimental evaluation is too weak. First, only a single dataset is used. Why not evaluate on other MDD datasets such as SRPBS, OpenNeuro, or even the UK Biobank? It would also be more convincing to train on one dataset (for example, REST-meta-MDD) and test on another (for example, SRPBS) to assess the generalization ability of the proposed approach. In addition, the comparisons with prior work are neither rigorous nor fair. The results of several state-of-the-art methods appear to be dir
- Figure 4 includes the ROC and PR curves for better performance evaluation.
- Table 2 includes weighted average values, which makes the performance difference clearer.
- The analysis is comprehensive. While the datasets are somewhat limited, the authors discuss this in the future work section.
- A complete ablation is done in Table 3 that details the contribution of each component.
- This seems to be a resubmission of a previously reviewed work, where the authors promised to discuss how the work differentiates itself from related approaches such as https://arxiv.org/pdf/2410.18103 and https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10230606 in the related works section. As of the current draft, I don’t see this being done. The current related works section is largely the same as in the previous draft. The authors seem to briefly touch upon this in Section A.3. However, there
1. The paper tackles an important topic, enhancing both the accuracy and the interpretability of GNNs for MDD classification, and makes a solid attempt to integrate biological priors (depression-related circuits) with deep learning.
2. The interpretability analyses (frequency-specific validation, hierarchical circuit visualization, causal inter-circuit analysis) are thorough and align well with known MDD mechanisms.
3. The paper is clearly written and provides extensive quantitative results, includi
1. Unclear module motivation and mapping between equations and architecture. It is difficult to align the mathematical formulations in Section 3 (Equations 1–21) with the modules illustrated in Figure 2. The description of RG-Fusion, HC-Pooling, and VLCA lacks explicit motivation for each design component: for example, why certain fusion mechanisms, Gumbel-Softmax hierarchical assignments, or causal attention structures were chosen. The rationale for these designs should be better explained or
Taxonomy
Topics: Functional Brain Connectivity Studies · Mental Health Research Topics · Machine Learning in Healthcare
