KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors
Zhiyang Qi, Takumasa Kaneko, Keiko Takamizo, Mariko Ukiyo, Michimasa Inaba

TL;DR
KokoroChat is a high-quality Japanese psychological counseling dialogue dataset created through role-playing by trained counselors, which enhances LLM response quality and authenticity while addressing privacy concerns.
Contribution
This paper introduces KokoroChat, a novel dataset built via role-playing that improves the authenticity and diversity of counseling dialogues for training language models.
Findings
Fine-tuning LLMs with KokoroChat enhances response quality.
Models fine-tuned on KokoroChat score higher on automatic evaluation metrics.
Role-playing by trained counselors yields high-quality dialogues while mitigating privacy risks.
Abstract
Generating psychological counseling responses with language models relies heavily on high-quality datasets. Crowdsourced data collection methods require strict worker training, and data from real-world counseling environments may raise privacy and ethical concerns. While recent studies have explored using large language models (LLMs) to augment psychological counseling dialogue datasets, the resulting data often suffers from limited diversity and authenticity. To address these limitations, this study adopts a role-playing approach where trained counselors simulate counselor-client interactions, ensuring high-quality dialogues while mitigating privacy risks. Using this method, we construct KokoroChat, a Japanese psychological counseling dialogue dataset comprising 6,589 long-form dialogues, each accompanied by comprehensive client feedback. Experimental results demonstrate that…
Taxonomy
Topics: Educational Tools and Methods
