LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation
Bunyamin Keles, Murat Gunay, Serdar I. Caglar

TL;DR
This paper presents a novel LLMs-in-the-loop approach to develop small, specialized neural machine translation models for biomedical texts, outperforming larger general models and commercial translation services.
Contribution
It introduces a new methodology combining synthetic data, rigorous evaluation, and agent orchestration to enhance small medical translation models trained on high-quality in-domain data.
Findings
Small, specialized models trained on in-domain data outperform much larger LLMs and commercial translation services on biomedical text.
Synthetic, high-quality domain-specific data improves translation accuracy.
The authors present the approach as a new standard for specialized biomedical translation models.
Abstract
Machine translation is indispensable in healthcare for enabling the global dissemination of medical knowledge across languages. However, complex medical terminology poses unique challenges to achieving adequate translation quality and accuracy. This study introduces a novel "LLMs-in-the-loop" approach to develop supervised neural machine translation models optimized specifically for medical texts. While large language models (LLMs) have demonstrated powerful capabilities, this research shows that small, specialized models trained on high-quality in-domain (mostly synthetic) data can outperform even vastly larger LLMs. Custom parallel corpora in six languages were compiled from scientific articles, synthetically generated clinical documents, and medical texts. Our LLMs-in-the-loop methodology employs synthetic data generation, rigorous evaluation, and agent orchestration to enhance…
Code & Models
- 🤗 aimped/nlp-health-translation-base-en-tr (model)
- 🤗 aimped/nlp-health-translation-base-tr-en (model)
- 🤗 aimped/nlp-health-translation-base-en-es (model)
- 🤗 aimped/nlp-health-translation-base-es-en (model)
- 🤗 aimped/nlp-health-translation-base-de-en (model)
- 🤗 aimped/nlp-health-translation-base-en-de (model)
- 🤗 aimped/nlp-health-translation-base-en-pt (model)
- 🤗 aimped/nlp-health-translation-base-pt-en (model)
- 🤗 aimped/nlp-health-translation-base-en-fr (model)
- 🤗 aimped/nlp-health-translation-base-fr-en (model)
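The repositories listed here follow one naming pattern, so a checkpoint can be selected from a language pair. The sketch below is a minimal illustration, not the authors' code: `model_id` and `load_translator` are hypothetical helper names, and it assumes the checkpoints load through the generic `translation` pipeline of the Hugging Face `transformers` library, which this page does not confirm.

```python
# Languages paired with English in the repository list (assumed complete).
NON_ENGLISH = {"tr", "es", "de", "pt", "fr"}


def model_id(src: str, tgt: str) -> str:
    """Build a repository ID following the naming pattern of the listed models."""
    src, tgt = src.lower(), tgt.lower()
    # Every listed model translates either from or into English.
    valid = (src == "en" and tgt in NON_ENGLISH) or (tgt == "en" and src in NON_ENGLISH)
    if not valid:
        raise ValueError(f"no listed model for {src}->{tgt}")
    return f"aimped/nlp-health-translation-base-{src}-{tgt}"


def load_translator(src: str, tgt: str):
    """Load a translation pipeline for the pair (downloads the checkpoint).

    Assumption: the checkpoint is compatible with the generic "translation" task.
    """
    # Imported lazily so model_id() works without transformers installed.
    from transformers import pipeline

    return pipeline("translation", model=model_id(src, tgt))
```

With `transformers` installed and network access, `load_translator("en", "tr")("The patient was admitted with chest pain.")` should return a list of dictionaries, each with a `translation_text` key.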
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Residual Connection · Layer Normalization · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Adam · Balanced Selection · Dropout
