A Benchmark for Understanding Dialogue Safety in Mental Health Support
Huachuan Qiu, Tong Zhao, Anqi Li, Shuai Zhang, Hongliang He, Zhenzhong Lan

TL;DR
This paper introduces a new benchmark dataset and taxonomy for assessing dialogue safety specifically in mental health support interactions, highlighting the limitations of existing safety approaches and evaluating language models' performance.
Contribution
The paper develops a theoretically grounded safety taxonomy and builds a benchmark dataset with fine-grained labels for analyzing dialogue safety in mental health support.
Findings
ChatGPT struggles with safety detection in zero- and few-shot settings.
Fine-tuned models outperform zero-shot ChatGPT in safety detection.
The dataset provides a valuable resource for future research on dialogue safety.
Abstract
Dialogue safety remains a pervasive challenge in open-domain human-machine interaction. Existing approaches propose distinctive dialogue safety taxonomies and datasets for detecting explicitly harmful responses. However, these taxonomies may not be suitable for analyzing response safety in mental health support. In real-world interactions, a model response deemed acceptable in casual conversations might have a negligible positive impact on users seeking mental health support. To address these limitations, this paper aims to develop a theoretically and factually grounded taxonomy that prioritizes the positive impact on help-seekers. Additionally, we create a benchmark corpus with fine-grained labels for each dialogue session to facilitate further research. We analyze the dataset using popular language models, including BERT-base, RoBERTa-large, and ChatGPT, to detect and understand…
Taxonomy
Topics: Cardiac Arrest and Resuscitation · Topic Modeling · Mental Health via Writing
