# Mapping Heterogeneity in Psychological Risk Among University Students Using Explainable Machine Learning

**Authors:** Penglin Liu, Ji Tang, Hongxiao Wang, Dingsen Zhang

PMC · DOI: 10.3390/e28020224 · 2026-02-14

## TL;DR

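This paper introduces a "predict-explain-discover" machine learning framework that pairs TreeSHAP explanations with Gaussian Mixture Models to identify three distinct psychological risk subtypes among university students, with the aim of supporting more targeted mental health interventions.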

## Contribution

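The 'predict-explain-discover' pipeline combines explainable artificial intelligence (XAI, here TreeSHAP) with unsupervised learning (Gaussian Mixture Models) to uncover heterogeneous risk mechanisms in student mental health, rather than treating at-risk students as a single group.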

## Key findings

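- Three distinct psychological risk subtypes were identified: academically-driven (28.46%), socio-emotional (43.85%), and internal regulatory (27.69%) risks.
- A sensitivity analysis restricted to the top-20 core features supported the structural stability of these subtypes, indicating that they are anchored in the model's primary decision drivers rather than in high-dimensional noise.
- The framework aligns with the Research Domain Criteria (RDoC) and supports precision interventions by bridging predictive accuracy with mechanistic understanding.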
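
One way to picture the top-20 sensitivity check is sketched below. This is an illustrative reading, not the authors' code: the summary does not say how stability was quantified, so the adjusted Rand index, the diagonal-covariance GMM settings, the function name `top_k_stability`, and the synthetic attribution matrix are all assumptions made for the example. In practice the matrix would hold per-student TreeSHAP values from the trained classifier.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture


def top_k_stability(attributions, n_subtypes=3, k=20, seed=0):
    """Compare subtypes found on all features with subtypes found on the top-k features.

    `attributions` is a (samples x features) matrix of per-sample feature
    attributions. Agreement is measured with the adjusted Rand index (ARI),
    which is an assumed metric, not one named in the source.
    """
    # Global importance: mean absolute attribution per feature.
    importance = np.abs(attributions).mean(axis=0)
    top_idx = np.argsort(importance)[::-1][:k]

    def cluster(x):
        gmm = GaussianMixture(n_components=n_subtypes, covariance_type="diag",
                              n_init=5, random_state=seed)
        return gmm.fit_predict(x)

    full_labels = cluster(attributions)
    core_labels = cluster(attributions[:, top_idx])
    return top_idx, adjusted_rand_score(full_labels, core_labels)


# Synthetic stand-in for an attribution matrix (hypothetical data):
# three groups, each driven by one strong feature, plus 97 near-zero features.
rng = np.random.default_rng(0)
n_per_group, n_features = 100, 100
attributions = np.zeros((3 * n_per_group, n_features))
for s in range(3):
    attributions[s * n_per_group:(s + 1) * n_per_group, s] = 2.0
attributions[:, :3] += 0.2 * rng.standard_normal((3 * n_per_group, 3))
attributions[:, 3:] += 0.02 * rng.standard_normal((3 * n_per_group, n_features - 3))

top_idx, ari = top_k_stability(attributions, k=20)
print("top features:", sorted(top_idx.tolist())[:5], "ARI:", round(ari, 3))
```

An ARI close to 1 would mean the partition barely changes when the low-importance features are dropped, which is the kind of evidence the stability claim relies on.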

## Abstract

In the post-pandemic era, student mental health challenges have emerged as a critical issue in higher education. However, conventional assessment approaches often treat at-risk populations as a monolithic entity, thereby limiting intervention effectiveness. This study proposes a novel computational framework that integrates explainable artificial intelligence (XAI) with unsupervised learning to decode the latent heterogeneity of psychological risk mechanisms. We developed a “predict-explain-discover” pipeline leveraging TreeSHAP and Gaussian Mixture Models to identify distinct risk subtypes based on a 2556-dimensional feature space encompassing lexical, linguistic, and affective indicators. Our approach identified three theoretically-grounded subtypes: academically-driven (28.46%), socio-emotional (43.85%), and internal regulatory (27.69%) risks. Sensitivity analysis using top-20 core features further validated the structural stability of these mechanisms, proving that the subtypes are anchored in the model’s primary decision drivers rather than high-dimensional noise. The framework demonstrates how black-box classifiers can be transformed into diagnostic tools, bridging the gap between predictive accuracy and mechanistic understanding. Our findings align with the Research Domain Criteria (RDoC) and establish a foundation for precision interventions targeting specific risk drivers. This work advances computational mental health research through methodological innovations in mechanism-based subtyping and practical strategies for personalized student support.
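
The "discover" stage of the pipeline can be sketched as fitting Gaussian Mixture Models to a matrix of per-student attributions and reading off the subtype shares. The example below is a minimal sketch under stated assumptions: choosing the number of components by BIC, the diagonal covariance, the helper name `discover_subtypes`, and the synthetic data are illustrative choices that the abstract does not specify. A real run would replace the synthetic matrix with TreeSHAP values computed from the trained tree-based classifier.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def discover_subtypes(attributions, max_components=6, seed=0):
    """Fit GMMs with 1..max_components components and keep the lowest-BIC model.

    BIC-based selection is an assumption for this sketch; the source only
    states that Gaussian Mixture Models were used.
    """
    best_model, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        gmm = GaussianMixture(n_components=k, covariance_type="diag",
                              n_init=3, random_state=seed)
        gmm.fit(attributions)
        bic = gmm.bic(attributions)
        if bic < best_bic:
            best_model, best_bic = gmm, bic
    labels = best_model.predict(attributions)
    # Share of samples assigned to each subtype.
    shares = np.bincount(labels, minlength=best_model.n_components) / len(labels)
    return best_model, labels, shares


# Synthetic stand-in for a (students x features) attribution matrix
# with three well-separated groups of unequal size (hypothetical data).
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0, 0.0, 0.0],
                    [0.0, 2.0, 0.0, 0.0],
                    [0.0, 0.0, 2.0, 0.0]])
sizes = [80, 120, 60]
attributions = np.vstack([
    c + 0.2 * rng.standard_normal((n, centers.shape[1]))
    for c, n in zip(centers, sizes)
])

model, labels, shares = discover_subtypes(attributions)
print("components:", model.n_components, "shares:", np.round(shares, 3))
```

Clustering in attribution space rather than raw feature space groups students by why the classifier flags them, which is what lets the subtypes be read as distinct risk mechanisms.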

## Full-text entities

- **Diseases:** burnout (MESH:D002055), COVID-19 (MESH:D000086382), psychological dysfunction (MESH:D020018), Depression (MESH:D003866), insomnia (MESH:D007319), Anxiety (MESH:D001007), fatigue (MESH:D005221), distress (MESH:D012128), emotional dysregulation (MESH:D021081), anxiety-related symptoms (MESH:D001008)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12939750/full.md

---
Source: https://tomesphere.com/paper/PMC12939750