JMedEthicBench: A Multi-Turn Conversational Benchmark for Evaluating Medical Safety in Japanese Large Language Models

Junyu Liu; Zirui Li; Qian Niu; Zequn Zhang; Yue Xun; Wenlong Hou; Shujun Wang; Yusuke Iwasawa; Yutaka Matsuo; Kan Hatakeyama-Sato

arXiv:2601.01627·cs.CL·March 31, 2026

TL;DR

JMedEthicBench is a multi-turn conversational benchmark for evaluating the medical safety of Japanese Large Language Models. It reveals vulnerabilities, especially in medical-specialized models and across languages.

Contribution

The paper introduces the first multi-turn Japanese healthcare safety benchmark built on adversarial conversations, and it highlights safety challenges in medical LLMs as well as cross-lingual vulnerabilities.

Findings

01 Commercial models show robust safety in tests.

02 Medical-specialized models are more vulnerable.

03 Safety scores decline over conversation turns.

Abstract

As Large Language Models (LLMs) are increasingly deployed in the healthcare field, it becomes essential to carefully evaluate their medical safety before clinical use. However, existing safety benchmarks remain predominantly English-centric and test only single-turn prompts, even though clinical consultations are multi-turn. To address these gaps, we introduce JMedEthicBench, the first multi-turn conversational benchmark for evaluating the medical safety of LLMs in Japanese healthcare. Our benchmark is based on 67 guidelines from the Japan Medical Association and contains over 50,000 adversarial conversations generated using seven automatically discovered jailbreak strategies. Using a dual-LLM scoring protocol, we evaluate 27 models and find that commercial models maintain robust safety, while medical-specialized models exhibit increased vulnerability. Furthermore, safety scores decline…
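To make the dual-LLM scoring idea and the per-turn decline concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the score scale or how the two judges are combined, so the averaging rule, the function names `combine_judges` and `mean_score_by_turn`, and the toy scores are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def combine_judges(score_a: float, score_b: float) -> float:
    """Hypothetical aggregation rule: average the two judge scores.

    The paper's actual combination rule is not given in the abstract.
    """
    return (score_a + score_b) / 2


def mean_score_by_turn(conversations):
    """Average the combined safety score at each turn position.

    conversations: list of conversations; each conversation is a list of
    (judge_a_score, judge_b_score) tuples, one tuple per turn.
    Returns a dict {turn_index: mean_combined_score}, turns counted from 1.
    """
    by_turn = defaultdict(list)
    for conversation in conversations:
        for turn, (a, b) in enumerate(conversation, start=1):
            by_turn[turn].append(combine_judges(a, b))
    return {turn: mean(scores) for turn, scores in sorted(by_turn.items())}


# Made-up scores for illustration only; these are NOT results from the paper.
toy_conversations = [
    [(9, 8), (7, 7), (5, 6)],
    [(10, 9), (8, 6), (6, 4)],
]
per_turn = mean_score_by_turn(toy_conversations)
print(per_turn)
```

Grouping scores by turn position, rather than averaging over whole conversations, is what would let a decline across turns show up in the summary.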
