# Performance evaluation of generative pre-trained transformer on the National Veterinary Licensing Examination in Japan

**Authors:** Takahiro Kako, Daiki Kato, Takaaki Iguchi, Shiyu Qin, Miki Ando, Shoma Koseki, Hayato Shibahara, Haruka Motoi, Rin Isaka, Namiko Ikeda, Hiroto Toyoda, Takayuki Nakagawa

PMC · DOI: 10.1038/s41598-026-37300-9 · Scientific Reports · 2026-02-16

## TL;DR

This study evaluates GPT-4o, o1, and o3 on Japan's National Veterinary Licensing Examination (NVLE) and shows that o3 exceeds the minimum passing rate in every section using native Japanese input and a plain prompt, with no translation or elaborate prompt engineering.

## Contribution

The first evaluation of GPT models on Japan's National Veterinary Licensing Examination using native Japanese input and standard prompts.

## Key findings

- o3 with Japanese input and the normal prompt achieved the highest score on the 74th (2023) NVLE.
- Both o1 and o3 outperformed GPT-4o on the 74th NVLE.
- In validation on the 75th (2024) and 76th (2025) NVLE, o3 exceeded the minimum passing rate in every section and scored 92.9% overall.

## Abstract

Generative Pre-trained Transformer (GPT) models, which are large language models based on the transformer architecture, have enabled natural-language interaction with humans. GPT models have achieved high scores on National Medical Licensing Examinations in various countries after translation of the questions. However, their performance on the National Veterinary Licensing Examination (NVLE) in Japan has not yet been explored. In this study, we evaluated GPT-4o, o1, and o3 on the 74th (2023) NVLE in Japan to compare models, prompt designs (normal vs. optimized), and input languages (Japanese vs. English). We then validated the best-performing configuration, o3 with Japanese input and the normal prompt, on the 75th (2024) and 76th (2025) NVLE. On the 74th NVLE, o3 with Japanese input and the normal prompt achieved the highest performance, and both o1 and o3 outperformed GPT-4o. In the validation tests on the 75th and 76th NVLE, o3 exceeded the minimum passing score rate in all sections and achieved an overall score of 92.9%. These findings indicate that recent GPT models can reliably answer the Japanese NVLE without requiring translation or elaborate prompt engineering, highlighting their potential as supportive tools in veterinary education and knowledge assistance in Japan.
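
The comparison described in the abstract can be pictured as a grid of conditions scored against an answer key. The following is a minimal sketch, not the authors' code: it assumes the three factors were fully crossed (the abstract does not confirm this), and the function names, section names, thresholds, and toy answers are all hypothetical placeholders rather than values from the paper.

```python
from itertools import product

# Factors named in the abstract. Crossing them fully is an assumption.
MODELS = ["GPT-4o", "o1", "o3"]
PROMPTS = ["normal", "optimized"]
LANGUAGES = ["Japanese", "English"]


def evaluation_conditions():
    """Return every (model, prompt, language) combination."""
    return list(product(MODELS, PROMPTS, LANGUAGES))


def scoring_rate(answers, answer_key):
    """Fraction of questions where the model's choice matches the key."""
    if len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must have the same length")
    if not answer_key:
        raise ValueError("answer_key must not be empty")
    correct = sum(a == k for a, k in zip(answers, answer_key))
    return correct / len(answer_key)


def passes_all_sections(section_rates, section_thresholds):
    """True when every section's rate meets or exceeds its threshold.

    Section names and threshold values are hypothetical inputs; the real
    NVLE passing criteria are not given in this summary.
    """
    return all(
        section_rates[name] >= section_thresholds[name]
        for name in section_thresholds
    )


# Toy data for illustration only, not taken from the paper.
toy_key = [1, 3, 2, 5, 4]
toy_answers = [1, 3, 2, 5, 1]
toy_rate = scoring_rate(toy_answers, toy_key)
```

Under the full-crossing assumption, `evaluation_conditions()` yields 12 conditions for the 74th NVLE, and the validation step corresponds to re-running only the `("o3", "normal", "Japanese")` condition on the later exams.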

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12909958/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12909958/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC12909958/full.md

---
Source: https://tomesphere.com/paper/PMC12909958