Do Large Language Models Align with Core Mental Health Counseling Competencies?

Viet Cuong Nguyen; Mohammad Taher; Dongwan Hong; Vinicius Konkolics Possobom; Vibha Thirunellayi Gopalakrishnan; Ekta Raj; Zihang Li; Heather J. Soled; Michael L. Birnbaum; Srijan Kumar; Munmun De Choudhury

arXiv:2410.22446 · cs.CL · February 28, 2025 · 3 citations

TL;DR

This paper uses a new benchmark, CounselingBench, to evaluate whether large language models meet core mental health counseling competencies. The models perform well in some areas but fall short in empathy and ethical reasoning, which points to the need for specialized models.

Contribution

Introduces CounselingBench, a novel benchmark for assessing LLMs on mental health counseling competencies, and provides a comprehensive evaluation of various models' strengths and weaknesses.

Findings

01. Frontier models surpass minimum aptitude thresholds but fall short of expert-level performance.

02. Models perform well in Intake, Assessment & Diagnosis but struggle with Core Counseling Attributes (such as empathy) and Professional Practice & Ethics.

03. Medical LLMs do not outperform generalist models in accuracy; they offer slightly better justifications but make more context-related errors.

Abstract

The rapid evolution of Large Language Models (LLMs) presents a promising solution to the global shortage of mental health professionals. However, their alignment with essential counseling competencies remains underexplored. We introduce CounselingBench, a novel NCMHCE-based benchmark evaluating 22 general-purpose and medical-finetuned LLMs across five key competencies. While frontier models surpass minimum aptitude thresholds, they fall short of expert-level performance, excelling in Intake, Assessment & Diagnosis but struggling with Core Counseling Attributes and Professional Practice & Ethics. Surprisingly, medical LLMs do not outperform generalist models in accuracy, though they provide slightly better justifications while making more context-related errors. These findings highlight the challenges of developing AI for mental health counseling, particularly in competencies requiring…
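
The per-competency comparison described in the abstract can be illustrated with a minimal sketch. It assumes each benchmark item is a multiple-choice question tagged with one competency; the record fields (`competency`, `answer`, `predicted`) and the sample rows are hypothetical and do not reflect the actual CounselingBench schema or any reported results.

```python
from collections import defaultdict


def accuracy_by_competency(records):
    """Return the fraction of correctly answered items per competency.

    Each record is a dict with hypothetical keys:
      "competency": competency label for the question
      "answer":     gold answer choice
      "predicted":  answer choice selected by the model
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        totals[record["competency"]] += 1
        if record["predicted"] == record["answer"]:
            correct[record["competency"]] += 1
    return {name: correct[name] / totals[name] for name in totals}


# Made-up rows for illustration only; not data from the paper.
records = [
    {"competency": "Intake, Assessment & Diagnosis", "answer": "A", "predicted": "A"},
    {"competency": "Intake, Assessment & Diagnosis", "answer": "C", "predicted": "C"},
    {"competency": "Core Counseling Attributes", "answer": "B", "predicted": "B"},
    {"competency": "Core Counseling Attributes", "answer": "D", "predicted": "A"},
    {"competency": "Professional Practice & Ethics", "answer": "B", "predicted": "C"},
]

scores = accuracy_by_competency(records)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Reporting accuracy separately for each competency, rather than as one aggregate number, is what makes it possible to see gaps such as strong diagnostic performance alongside weaker ethics performance.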

Code & Models

Datasets

ckmjx/CounselingBench
dataset · 41 downloads

Videos

Do Large Language Models Align with Core Mental Health Counseling Competencies? · underline

Taxonomy

Topics: Counseling Practices and Supervision · Interpreting and Communication in Healthcare · Cultural Competency in Health Care