Polish-English medical knowledge transfer: A new benchmark and results

Łukasz Grzybowski; Jakub Pokrywka; Michał Ciesiółka; Jeremi I. Kaczmarek; Marek Kubis

arXiv:2412.00559 · cs.CL · September 15, 2025

TL;DR

This paper introduces a new Polish medical exam dataset with English translations, benchmarking various LLMs and revealing their strengths and limitations in medical knowledge transfer across languages.

Contribution

The paper presents a new benchmark dataset of Polish medical exam questions, part of it with English translations, and systematically evaluates LLM performance on it.

Findings

01. GPT-4o approaches human performance

02. Models face challenges in cross-lingual translation

03. Performance varies across medical specialties

Abstract

Large Language Models (LLMs) have demonstrated significant potential in handling specialized tasks, including medical problem-solving. However, most studies predominantly focus on English-language contexts. This study introduces a novel benchmark dataset based on Polish medical licensing and specialization exams (LEK, LDEK, PES) taken by medical doctor candidates and practicing doctors pursuing specialization. The dataset was web-scraped from publicly available resources provided by the Medical Examination Center and the Chief Medical Chamber. It comprises over 24,000 exam questions, including a subset of parallel Polish-English corpora, where the English portion was professionally translated by the examination center for foreign candidates. By creating a structured benchmark from these existing exam questions, we systematically evaluate state-of-the-art LLMs, including general-purpose,…

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Text Readability and Simplification
