# Artificial intelligence in mental health care: a scoping review of reviews

**Authors:** Mohammad S. Abu-Mahfouz, Sarah AlFehaid, Hala M. Burqan, Rabie Adel El Arab

PMC · DOI: 10.3389/fpsyt.2026.1688043 · Frontiers in Psychiatry · 2026-03-03

## TL;DR

AI is entering mental health care, but most models remain proof-of-concept: external validation is limited, and governance for accountability, post-deployment monitoring, and crisis escalation is incomplete.

## Contribution

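This scoping review of reviews synthesises 31 reviews on AI in mental health care and identifies recurring gaps in external validation, accountability, post-deployment monitoring, crisis escalation, equity reporting, workforce readiness, and life-cycle economics.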

## Key findings

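- Models frequently report high accuracy under internal validation (typical internal AUCs ≈0.80–0.88), but externally or prospectively validated performance is scarce and typically attenuated.
- Conversational agents produce small-to-moderate short-term improvements in depressive symptoms (SMD ≈0.2–0.6); effects for anxiety and stress are smaller or inconsistent, and data on durability beyond 12 weeks are limited.
- Adoption is constrained by usability, EHR integration, and incomplete governance, particularly around accountability, post-deployment monitoring, and crisis escalation.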

## Abstract

Artificial intelligence (AI) is rapidly entering mental health care, but most models remain proof-of-concept, with limited external validation and substantial risk of overfitting.

This scoping review of reviews adhered to the PRISMA-ScR checklist and Joanna Briggs Institute guidance. We searched MEDLINE, Embase, PsycINFO, and IEEE Xplore. Eligible publications encompassed systematic, scoping, narrative, integrative, meta-analytic, and patent reviews. Findings were synthesised thematically.

Thirty-one reviews were included. Evidence concentrated on depression and anxiety; schizophrenia, bipolar disorder, perinatal mental health, autism spectrum conditions, older adults, nurses, and allied professionals were under-represented. Across screening, diagnosis/classification, and risk prediction, high accuracy was frequently reported under internal validation; in prior syntheses, typical internal AUCs clustered around ≈0.80–0.88 whereas externally or prospectively validated performance was scarce and typically attenuated. Signals were strongest for narrow, feedback-rich tasks, with greater decay for general-purpose models and longer prediction horizons. Conversational agents produced small-to-moderate short-term improvements in depressive symptoms (SMD ≈0.2–0.6); effects for anxiety and stress were smaller or inconsistent and varied with comparator stringency, follow-up (≤8–12 weeks vs longer), and the degree of human guidance. Most chatbot evaluations were short and small-scale, with few randomized or pragmatic trials and limited data on durability beyond 12 weeks. Real-world implementation was limited; several reviews identified usability and electronic health-record integration as prerequisites for adoption, and explainability alone rarely conferred actionability without clinician training. Ethical readiness was incomplete: privacy and bias were commonly discussed, but accountability, post-deployment monitoring, and crisis-escalation protocols were inconsistently specified. Economic evaluations were uncommon and rarely accounted for integration, maintenance, or re-training costs. Workforce outcomes (literacy, confidence, readiness) were infrequently measured. Internal and external metrics were not pooled.
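For readers unfamiliar with the SMD figures quoted above, the following is a minimal sketch of how a standardized mean difference is commonly computed from two trial arms (Cohen's d with a pooled standard deviation, plus the Hedges' g small-sample correction). The function name and the symptom scores are hypothetical and purely illustrative; they are not data from the review, and the included reviews may have used other estimators.

```python
import math
from statistics import mean, stdev


def standardized_mean_difference(treatment, control):
    """Return (Cohen's d, Hedges' g) for two lists of post-treatment symptom scores.

    Sign convention (an assumption of this sketch): a positive value means
    LOWER symptom scores in the treatment arm than in the control arm.
    """
    n_t, n_c = len(treatment), len(control)
    # Pooled standard deviation across both arms (sample SDs, n - 1 denominators).
    pooled_var = ((n_t - 1) * stdev(treatment) ** 2
                  + (n_c - 1) * stdev(control) ** 2) / (n_t + n_c - 2)
    d = (mean(control) - mean(treatment)) / math.sqrt(pooled_var)
    # Hedges' small-sample correction factor J = 1 - 3 / (4 * df - 1), df = n_t + n_c - 2.
    j = 1 - 3 / (4 * (n_t + n_c) - 9)
    return d, d * j


# Hypothetical depression-scale scores, for illustration only.
chatbot_arm = [10, 12, 14]
control_arm = [12, 14, 16]
d, g = standardized_mean_difference(chatbot_arm, control_arm)
print(f"Cohen's d = {d:.2f}, Hedges' g = {g:.2f}")
```

On this scale, the reported range of roughly 0.2 to 0.6 corresponds to a between-arm difference of about one fifth to three fifths of a pooled standard deviation.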

AI applications span the mental-health care continuum but remain early in translation. Performance that appears strong under internal validation often attenuates on external or prospective testing; symptomatic gains are concentrated in depression/anxiety and may diminish over longer follow-up; and adoption is constrained by usability, EHR integration, and incomplete governance. The cross-review signal highlights consistent gaps in accountability, post-deployment monitoring and crisis escalation, equity reporting, workforce readiness, and life-cycle economics (including integration, monitoring, and re-training). Addressing these gaps through externally validated and monitored deployments, routine content/guardrail audits for chatbots with human escalation, predefined subgroup performance and bias auditing, and implementation strategies that pair explainability with clinician training and measure workforce endpoints would better align the evidence base with safe, effective, and sustainable clinical use.
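As an illustration of what a predefined subgroup performance audit might involve, the sketch below computes AUC overall and per subgroup using the rank-based (Mann-Whitney) definition. Everything here is an assumption for illustration: the function names, the subgroup labels, and the toy labels and scores are hypothetical, and the review does not prescribe a specific auditing procedure.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random negative case.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(labels, scores, groups):
    """Return {subgroup: AUC} so that performance gaps between subgroups are visible."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Hypothetical outcomes (1 = condition present), model scores, and subgroup labels.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.6, 0.7, 0.4, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print("overall:", auc(labels, scores))
print("by subgroup:", auc_by_subgroup(labels, scores, groups))
```

In this toy data the pooled AUC looks acceptable while subgroup B is at chance level, which is the kind of gap an aggregate metric can hide. The same `auc` helper could be applied separately to an internal test split and an external cohort to quantify attenuation.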

## Linked entities

- **Diseases:** depression (MONDO:0002050), anxiety (MONDO:0005618), schizophrenia (MONDO:0005090), bipolar disorder (MONDO:0004985)

## Full-text entities

- **Diseases:** anxiety (MESH:D001007), schizophrenia (MESH:D012559), autism spectrum conditions (MESH:D000067877), bipolar disorder (MESH:D001714), depression (MESH:D003866)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12993279/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12993279/full.md

## References

67 references, with the full list in the complete paper: https://tomesphere.com/paper/PMC12993279/full.md

---
Source: https://tomesphere.com/paper/PMC12993279