Large Language Model–Based Chatbots and Agentic AI for Mental Health Counseling: Systematic Review of Methodologies, Evaluation Frameworks, and Ethical Safeguards

Ha Na Cho; Jiayuan Wang; Di Hu; Kai Zheng

PMC · DOI: 10.2196/80348 · March 13, 2026

Open Access

TL;DR

This review examines how large language models are being used for mental health chatbots, highlighting gaps in evaluation and ethical practices.

Contribution

The study systematically reviews methodologies, evaluation practices, and ethical frameworks for LLM-based mental health chatbots.

Findings

01. Most studies used GPT or fine-tuned models such as LLaMA and Qwen for mental health chatbots.

02. Evaluation methods were mixed, with limited external validation and inconsistent ethical reporting.

03. No study reported registered clinical trials or real-world validation.

Abstract

Large language model (LLM)–based chatbots have rapidly emerged as tools for digital mental health (MH) counseling. However, evidence on their methodological quality, evaluation rigor, and ethical safeguards remains fragmented, limiting interpretation of clinical readiness and deployment safety. This systematic review aimed to synthesize the methodologies, evaluation practices, and ethical or governance frameworks of LLM-based chatbots developed for MH counseling and to identify gaps affecting validity, reproducibility, and translation. We searched Google Scholar, PubMed, IEEE Xplore, and ACM Digital Library for studies published between January 2020 and May 2025. Eligible studies reported original development or empirical evaluation of LLM-driven MH counseling chatbots. We excluded studies that did not involve LLM-based conversational agents, were not focused on counseling or…

Figures: 4

Taxonomy

Topics: Digital Mental Health Interventions · Artificial Intelligence in Healthcare and Education · Mental Health via Writing