Multilingual Clinical NER: Translation or Cross-lingual Transfer?
Xavier Fontaine, Félix Gaschi, Parisa Rastin, Yannick Toussaint

TL;DR
This paper compares cross-lingual transfer with translation-based methods for clinical NER in French and German. Translation can match transfer performance but requires careful implementation, and monolingual models do not always outperform multilingual ones.
Contribution
It introduces MedNERF, a French medical NER test set extracted from drug prescriptions, and compares cross-lingual transfer with translation-based methods for multilingual clinical NER.
Findings
Translation-based methods can match cross-lingual transfer performance with careful design.
Monolingual clinical language models do not necessarily outperform multilingual models.
The comparison is evaluated on French and German data, with no training data in either target language.
Abstract
Natural language tasks like Named Entity Recognition (NER) in the clinical domain on non-English texts can be very time-consuming and expensive due to the lack of annotated data. Cross-lingual transfer (CLT) is a way to circumvent this issue thanks to the ability of multilingual large language models to be fine-tuned on a specific task in one language and to provide high accuracy for the same task in another language. However, other methods leveraging translation models can be used to perform NER without annotated data in the target language, by translating either the training set or the test set. This paper compares cross-lingual transfer with these two alternative methods, to perform clinical NER in French and in German without any training data in those languages. To this end, we release MedNERF, a medical NER test set extracted from French drug prescriptions and annotated with the same…
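The three strategies named in the abstract differ mainly in where translation enters the pipeline. The sketch below is a toy illustration of that data flow only, not the authors' implementation. Every name in it is hypothetical: the word-by-word dictionary "translator", the lexicon "model", the `DRUG` and `FREQUENCY` labels, and the position-based label projection are stand-ins for real translation and NER models. In particular, the projection step is an assumption: labels have to be carried across a translation somehow, and the excerpt above does not say how the paper does it.

```python
from typing import Callable, Dict, List, Tuple

Sentence = List[str]
Labels = List[str]
Dataset = List[Tuple[Sentence, Labels]]
Model = Callable[[Sentence], Labels]

# Toy bilingual dictionary (hypothetical stand-in for a translation model).
EN_FR: Dict[str, str] = {"take": "prendre", "aspirin": "aspirine", "daily": "quotidien"}
FR_EN: Dict[str, str] = {fr: en for en, fr in EN_FR.items()}


def word_by_word(table: Dict[str, str]) -> Callable[[Sentence], Sentence]:
    """Toy translator: replaces each token independently, keeping word order."""
    return lambda sent: [table.get(tok, tok) for tok in sent]


def same_position(src: Sentence, src_labels: Labels, tgt: Sentence) -> Labels:
    """Toy label projection: valid only because the toy translator is one-to-one."""
    return list(src_labels)


def lexicon_trainer(data: Dataset) -> Model:
    """Toy monolingual 'model': memorises the label of each training token."""
    lexicon = {tok: lab for sent, labs in data for tok, lab in zip(sent, labs)}
    return lambda sent: [lexicon.get(tok, "O") for tok in sent]


def toy_multilingual_trainer(data: Dataset) -> Model:
    """Toy 'multilingual' model: maps tokens to a shared concept before lookup."""
    concept = lambda tok: FR_EN.get(tok, tok)
    lexicon = {concept(t): l for sent, labs in data for t, l in zip(sent, labs)}
    return lambda sent: [lexicon.get(concept(t), "O") for t in sent]


def cross_lingual_transfer(train_en: Dataset, test_tgt: List[Sentence]) -> List[Labels]:
    # Train on English, predict directly on the target language.
    model = toy_multilingual_trainer(train_en)
    return [model(sent) for sent in test_tgt]


def translate_train(train_en: Dataset, test_tgt: List[Sentence]) -> List[Labels]:
    # Translate the training set, project its labels, train in the target language.
    to_tgt = word_by_word(EN_FR)
    translated = []
    for sent, labs in train_en:
        tgt = to_tgt(sent)
        translated.append((tgt, same_position(sent, labs, tgt)))
    model = lexicon_trainer(translated)
    return [model(sent) for sent in test_tgt]


def translate_test(train_en: Dataset, test_tgt: List[Sentence]) -> List[Labels]:
    # Train on English, translate each test sentence, project predictions back.
    to_en = word_by_word(FR_EN)
    model = lexicon_trainer(train_en)
    predictions = []
    for sent in test_tgt:
        en = to_en(sent)
        predictions.append(same_position(en, model(en), sent))
    return predictions


train_en: Dataset = [(["take", "aspirin", "daily"], ["O", "DRUG", "FREQUENCY"])]
test_fr: List[Sentence] = [["prendre", "aspirine", "quotidien"]]

clt_pred = cross_lingual_transfer(train_en, test_fr)
tt_train_pred = translate_train(train_en, test_fr)
tt_test_pred = translate_test(train_en, test_fr)
```

On this toy input the three functions agree, which is by construction: the dictionary is lossless and one-to-one. With real translation models, word order and token counts change, so the projection step becomes a likely source of error in both translation-based variants.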
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Interpreting and Communication in Healthcare
Methods: Test
