Neural Machine Translation of Clinical Text: An Empirical Investigation into Multilingual Pre-Trained Language Models and Transfer-Learning

Lifeng Han; Serge Gladkoff; Gleb Erofeev; Irina Sorokina; Betty Galiano; Goran Nenadic

arXiv:2312.07250·cs.CL·February 22, 2024·2 cites

Open Access · 1 Repo

TL;DR

This study investigates neural machine translation of clinical text using multilingual pre-trained language models and transfer learning. The models achieved top-level performance in the ClinSpEn-2022 shared task, and in expert human evaluation a small pre-trained model outperformed two extra-large ones after clinical-domain fine-tuning.

Contribution

The paper demonstrates that transfer learning with massive multilingual pre-trained language models (MMPLMs) is effective for clinical translation, and reports that a small pre-trained model can outperform much larger ones after clinical-domain fine-tuning.

Findings

01

A small pre-trained model outperformed two extra-large models in expert human evaluation of clinical translation.

02

Transfer learning effectively adapts models to new languages like Spanish.

03

Models achieved top-level performance in the ClinSpEn-2022 shared task.

Abstract

We investigate machine translation of clinical text by examining multilingual neural network models built on deep learning architectures such as the Transformer. To address the language resource imbalance issue, we also carry out experiments using a transfer learning methodology based on massive multilingual pre-trained language models (MMPLMs). The experimental results on three subtasks, 1) clinical case (CC), 2) clinical terminology (CT), and 3) ontological concept (OC), show that our models achieved top-level performance in the ClinSpEn-2022 shared task on English-Spanish clinical domain data. Furthermore, our expert-based human evaluations demonstrate that the small-sized pre-trained language model (PLM) outperformed the other two extra-large language models by a large margin in clinical domain fine-tuning, a finding that had not previously been reported in the field.…

Code & Models

Repositories

hecta-uom/clinicalnmt
none · Official

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques

Methods: Multi-Head Attention · Linear Layer · Residual Connection · Layer Normalization · Dropout · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding