Blending Human and LLM Expertise to Detect Hallucinations and Omissions in Mental Health Chatbot Responses

Khizar Hussain; Bradley A. Malin; Zhijun Yin; Susannah Leigh Rose; Murat Kantarcioglu

arXiv:2604.06216·cs.CL·April 9, 2026

TL;DR

This paper presents a framework combining human expertise and LLMs to improve detection of hallucinations and omissions in mental health chatbot responses, enhancing safety and transparency.

Contribution

The paper introduces a framework that integrates human expertise with LLMs to extract interpretable, domain-informed features for detecting hallucinations and omissions, improving on LLM-as-a-judge baselines.

Findings

01

Leading LLM judges achieve only 52% accuracy on mental health counseling data, and some hallucination detection approaches show near-zero recall.

02

The proposed framework achieves an F1 score of up to 0.849 on hallucination detection.

03

Combining human expertise with automated features improves reliability in high-stakes settings.
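
The findings above are stated in terms of accuracy, recall, and F1. As a reminder of how these metrics relate, here is a minimal Python sketch. The confusion counts in the usage example are invented for illustration only and are not taken from the paper; the function names are likewise hypothetical.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all examples that were labeled correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 for the positive (hallucination) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts (NOT from the paper): a judge that flags almost nothing
# on a balanced set of 50 hallucinated and 50 clean responses.
acc = accuracy(tp=1, fp=0, fn=49, tn=50)
precision, recall, f1 = precision_recall_f1(tp=1, fp=0, fn=49)
print(f"accuracy={acc:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.3f}")
```

With these made-up counts, a judge that rarely flags anything sits near chance accuracy on a balanced set while its recall and F1 are close to zero, which is why the two kinds of number are reported side by side.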

Abstract

As LLM-powered chatbots are increasingly deployed in mental health services, detecting hallucinations and omissions has become critical for user safety. However, state-of-the-art LLM-as-a-judge methods often fail in high-risk healthcare contexts, where subtle errors can have serious consequences. We show that leading LLM judges achieve only 52% accuracy on mental health counseling data, with some hallucination detection approaches exhibiting near-zero recall. We identify the root cause as LLMs' inability to capture nuanced linguistic and therapeutic patterns recognized by domain experts. To address this, we propose a framework that integrates human expertise with LLMs to extract interpretable, domain-informed features across five analytical dimensions: logical consistency, entity verification, factual accuracy, linguistic uncertainty, and professional appropriateness. Experiments…
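
To make the five analytical dimensions concrete, the following Python sketch shows one way a per-response feature record could be laid out. This is an assumption-laden illustration, not the authors' implementation: the abstract does not specify how each feature is computed, so `ResponseFeatures`, `hedge_ratio`, and the `HEDGE_WORDS` lexicon are all hypothetical, and the hedge-word ratio is only a toy proxy for the linguistic uncertainty dimension.

```python
from dataclasses import astuple, dataclass

# Hypothetical lexicon; the paper's actual uncertainty features are not given here.
HEDGE_WORDS = frozenset({"may", "might", "could", "possibly", "perhaps", "seems"})


def hedge_ratio(text: str) -> float:
    """Toy proxy for linguistic uncertainty: share of tokens that are hedge words."""
    tokens = [t.strip(".,;:!?\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(t in HEDGE_WORDS for t in tokens) / len(tokens)


@dataclass(frozen=True)
class ResponseFeatures:
    """One score per analytical dimension named in the abstract (scales assumed)."""
    logical_consistency: float
    entity_verification: float
    factual_accuracy: float
    linguistic_uncertainty: float
    professional_appropriateness: float

    def as_vector(self) -> list[float]:
        """Scores in declaration order, ready for a downstream classifier."""
        return list(astuple(self))


reply = "This might help, but it could take time."
features = ResponseFeatures(
    logical_consistency=0.9,           # placeholder score
    entity_verification=0.8,           # placeholder score
    factual_accuracy=0.7,              # placeholder score
    linguistic_uncertainty=hedge_ratio(reply),
    professional_appropriateness=1.0,  # placeholder score
)
print(features.as_vector())
```

Keeping each dimension as a named field means a downstream detector's inputs stay inspectable, in line with the interpretability goal the abstract describes.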
