On Creating an English-Thai Code-switched Machine Translation in Medical Domain
Parinthapat Pengpun, Krittamate Tiankanon, Amrest Chinkamol, Jiramet Kinchagawat, Pitchaya Chairuengjitjaras, Pasit Supholkhan, Pubordee Aussavavirojekul, Chiraphat Boonnag, Kanyakorn Veerakanjana, Hirunkul Phimsiri, Boonthicha Sae-jia, Nattawach Sataudom

TL;DR
This paper introduces a code-switched translation approach for English-Thai medical texts, emphasizing the preservation of medical terminology, and demonstrates its effectiveness through automatic metrics and human evaluations.
Contribution
The study presents a novel method to generate code-switched medical translation data and fine-tune translation models to better preserve critical English medical terms in Thai translations.
Findings
The code-switched (CS) translation model outperforms baseline models at preserving English medical terms.
Medical professionals prefer CS translations that keep accurate English terminology.
The model achieves competitive scores on automatic evaluation metrics.
Abstract
Machine translation (MT) in the medical domain plays a pivotal role in enhancing healthcare quality and disseminating medical knowledge. Despite advancements in English-Thai MT technology, common MT approaches often underperform in the medical field due to their inability to precisely translate medical terminologies. Our research prioritizes not merely improving translation accuracy but also maintaining medical terminology in English within the translated text through code-switched (CS) translation. We developed a method to produce CS medical translation data, fine-tuned a CS translation model with this data, and evaluated its performance against strong baselines, such as Google Neural Machine Translation (NMT) and GPT-3.5/GPT-4. Our model demonstrated competitive performance in automatic metrics and was highly favored in human preference evaluations. Our evaluation result also shows…
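To make the idea of "maintaining medical terminology in English within the translated text" concrete, the sketch below computes a simple term-preservation rate: the fraction of English source terms that still appear verbatim in a Thai translation. This is an illustrative assumption, not the evaluation used in the paper, whose metrics are not specified in this summary. The function name `term_preservation_rate`, the term list, and the example sentence are all hypothetical.

```python
import re


def term_preservation_rate(source_terms, translation):
    """Return the fraction of English terms found verbatim in the translation.

    Hypothetical helper for illustration only. Matching is case-insensitive
    and requires that the term is not embedded in a longer Latin-script word.
    """
    if not source_terms:
        return 1.0  # nothing to preserve, so nothing was lost
    text = translation.lower()
    kept = 0
    for term in source_terms:
        # Reject matches glued to other ASCII letters or digits, so "ct"
        # does not match inside "factor". Thai characters may be adjacent.
        pattern = r"(?<![a-z0-9])" + re.escape(term.lower()) + r"(?![a-z0-9])"
        if re.search(pattern, text):
            kept += 1
    return kept / len(source_terms)


# Hypothetical example: the translation keeps two of the three English terms.
terms = ["myocardial infarction", "troponin", "ECG"]
translation = (
    "ผู้ป่วยมีภาวะ myocardial infarction และค่า troponin สูง "
    "ควรตรวจคลื่นไฟฟ้าหัวใจ"
)
rate = term_preservation_rate(terms, translation)
```

In this example "ECG" has been rendered in Thai rather than kept in English, so the rate is two out of three. A real evaluation would also need a way to decide which source words count as medical terms, which this sketch takes as given.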
Taxonomy
Topics: Natural Language Processing Techniques · Translation Studies and Practices
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Attention Dropout · Softmax · Multi-Head Attention · Linear Warmup With Cosine Annealing · Adam
