Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese

Masataka Kawai; Singo Sakashita; Shumpei Ishikawa; Shogo Watanabe; Anna Matsuoka; Mikio Sakurai; Yasuto Fujimoto; Yoshiyuki Takahara; Atsushi Ohara; Hirohiko Miyake; and Genichiro Ishii

arXiv:2603.11597·cs.CL·March 13, 2026

Open Access

TL;DR

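This study evaluates seven open-source large language models on three Japanese pathology report writing tasks: format-constrained diagnosis generation and information extraction, typo correction, and explanatory text rated by pathologists and clinicians. The results suggest the models can be useful in limited but clinically relevant scenarios.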

Contribution

First comprehensive assessment of open-source LLMs for Japanese pathology report support, highlighting their strengths and limitations across different tasks.

Findings

01

Thinking models and medical-specialized models showed advantages in structured reporting tasks that required reasoning and in typo correction.

02

Preferences for model-generated explanatory text varied substantially across raters (pathologists and clinicians).

03

Open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.

Abstract

The performance of large language models (LLMs) for supporting pathology report writing in Japanese remains unexplored. We evaluated seven open-source LLMs from three perspectives: (A) generation and information extraction of pathology diagnosis text following predefined formats, (B) correction of typographical errors in Japanese pathology reports, and (C) subjective evaluation of model-generated explanatory text by pathologists and clinicians. Thinking models and medical-specialized models showed advantages in structured reporting tasks that required reasoning and in typo correction. In contrast, preferences for explanatory outputs varied substantially across raters. Although the utility of LLMs differed by task, our findings suggest that open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.
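The abstract does not say how the typo-correction task (B) was scored. As a minimal sketch only, and not the authors' method, one way to score a single correction attempt is to compare the model output with the clean reference by exact match and by character-level similarity. The function name `score_correction`, the chosen metrics, and the example sentence are all assumptions introduced for illustration.

```python
from difflib import SequenceMatcher


def score_correction(corrupted: str, reference: str, model_output: str) -> dict:
    """Score one typo-correction attempt against the clean reference text.

    Hypothetical helper: the metrics here are illustrative assumptions,
    not the scoring scheme used in the paper.
    """
    return {
        # True only if the model reproduced the reference exactly.
        "exact_match": model_output == reference,
        # Character-level similarity of the corrupted input to the reference.
        "similarity_before": SequenceMatcher(None, corrupted, reference).ratio(),
        # Character-level similarity of the model output to the reference.
        "similarity_after": SequenceMatcher(None, model_output, reference).ratio(),
    }


# Hypothetical example: one wrong kanji (線 instead of 腺) in a report line.
reference = "胃生検: 管状腺癌、高分化型。"
corrupted = "胃生検: 管状線癌、高分化型。"

# A model that fixes the typo versus one that returns the input unchanged.
perfect = score_correction(corrupted, reference, reference)
unchanged = score_correction(corrupted, reference, corrupted)
```

Comparing `similarity_after` with `similarity_before` shows whether an output moved the text closer to the reference, which also exposes a model that introduces new errors while fixing the original one.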

Taxonomy

Topics: AI in cancer detection · Artificial Intelligence in Healthcare and Education · Radiology practices and education