A New NMT Model for Translating Clinical Texts from English to Spanish
Rumeng Li, Xun Wang, Hong Yu

TL;DR
This paper introduces NOOV, a neural machine translation model that improves English-Spanish clinical text translation by handling unknown words with a learned bilingual lexicon and biomedical resources, while requiring little in-domain parallel data.
Contribution
The paper presents NOOV, a novel NMT system that combines learned bilingual lexicons and biomedical phrase look-up tables to address unknown words and improve translation quality in clinical texts.
Findings
NOOV outperforms baseline models in accuracy and fluency.
It effectively handles unknown words with minimal in-domain data.
The system enhances phrase generation in clinical translation tasks.
Abstract
Translating electronic health record (EHR) narratives from English to Spanish is a clinically important yet challenging task, owing to the lack of a parallel-aligned corpus and the abundance of unknown words in these texts. To address these challenges, we propose NOOV (for No OOV), a new neural machine translation (NMT) system that requires little in-domain parallel-aligned corpus for training. NOOV integrates a bilingual lexicon automatically learned from parallel-aligned corpora and a phrase look-up table extracted from a large biomedical knowledge resource, to alleviate both the unknown word problem and the word-repeat challenge in NMT and to improve the phrase generation of NMT systems. Evaluation shows that NOOV generates better translations of EHR narratives, with improvements in both accuracy and fluency.
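The abstract does not say how the lexicon and phrase table are wired into the model, so the following is only a minimal sketch of one common way such resources are used, not the authors' implementation. It assumes a longest-match lookup in a phrase table for multi-word terms, then a word-level lexicon fallback for tokens outside the model vocabulary. The function name `translate_unknowns`, its parameters, and the toy dictionaries are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def translate_unknowns(
    tokens: List[str],
    vocab: Set[str],
    phrase_table: Dict[Tuple[str, ...], str],
    lexicon: Dict[str, str],
    max_phrase_len: int = 4,
) -> List[str]:
    """Replace known multi-word phrases and out-of-vocabulary (OOV) words.

    Hypothetical helper: in-vocabulary tokens that are not part of a
    phrase-table entry are passed through unchanged, on the assumption
    that the NMT model itself translates them.
    """
    out: List[str] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first (length >= 2) so that
        # multi-word clinical terms are translated as a unit.
        for n in range(min(max_phrase_len, len(tokens) - i), 1, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in phrase_table:
                out.append(phrase_table[key])
                i += n
                matched = True
                break
        if matched:
            continue
        tok = tokens[i]
        if tok.lower() not in vocab:
            # OOV word: use the lexicon entry if one exists,
            # otherwise copy the source token through unchanged.
            out.append(lexicon.get(tok.lower(), tok))
        else:
            out.append(tok)
        i += 1
    return out


# Toy data, invented for illustration only.
vocab = {"patient", "has", "and", "atrial"}
phrase_table = {("atrial", "fibrillation"): "fibrilación auricular"}
lexicon = {"dyspnea": "disnea"}

result = translate_unknowns(
    "patient has atrial fibrillation and dyspnea".split(),
    vocab,
    phrase_table,
    lexicon,
)
print(result)
```

Checking phrases before single words keeps a term such as "atrial fibrillation" from being split, even when one of its words is in the vocabulary. A word found in neither resource is copied through as-is, which is a design choice of this sketch rather than something the abstract specifies.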
