Facilitating Terminology Translation with Target Lemma Annotations

Toms Bergmanis; Mārcis Pinnis

arXiv:2101.10035·cs.CL·January 26, 2021

TL;DR

This paper introduces a source-side data augmentation method that annotates randomly selected source words with their target-language lemmas, so that machine translation systems can integrate terminology given in dictionary form when translating into morphologically complex languages. The authors report gains in both term translation accuracy and BLEU.

Contribution

The paper proposes a data augmentation approach that lets MT systems incorporate terminology without requiring pre-inflected target forms, which makes terminology integration more practical in real translation workflows.

Findings

01: Up to 7 BLEU points of improvement over baseline systems.

02: An average gain of 4 BLEU points over previous methods.

03: A 47.7% absolute improvement in term translation accuracy in human evaluation.

Abstract

Most of the recent work on terminology integration in machine translation has assumed that terminology translations are given already inflected in forms that are suitable for the target language sentence. In the day-to-day work of professional translators, however, this is seldom the case, as translators work with bilingual glossaries where terms are given in their dictionary forms; finding the right target language form is part of the translation process. We argue that the requirement for a priori specified target language forms is unrealistic and impedes the practical applicability of previous work. In this work, we propose to train machine translation systems using a source-side data augmentation method that annotates randomly selected source language words with their target language lemmas. We show that systems trained on such augmented data are readily usable for terminology integration in…
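The augmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `annotate_source`, the inline `<lemma>` / `</lemma>` markers, the annotation probability, and the assumption that a word alignment and target lemmas are already available are all hypothetical choices made for illustration. The annotation format actually used in the paper may differ.

```python
import random

def annotate_source(src_tokens, tgt_lemmas, alignment, prob=0.3, rng=None):
    """Append target lemmas to randomly selected source tokens.

    src_tokens: list of source-language words.
    tgt_lemmas: list of target-language lemmas, one per target token.
    alignment:  dict mapping a source index to its aligned target index.
    prob:       chance that an aligned source token is annotated (assumed value).
    rng:        optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    out = []
    for i, tok in enumerate(src_tokens):
        out.append(tok)
        j = alignment.get(i)
        # Only aligned tokens can be annotated; pick them at random.
        if j is not None and rng.random() < prob:
            # Hypothetical inline markers; not the paper's actual format.
            out.extend(["<lemma>", tgt_lemmas[j], "</lemma>"])
    return out

# Toy English-to-Latvian example: "Es redzu māju" has lemmas es, redzēt, māja.
src = ["I", "see", "a", "house"]
lemmas = ["es", "redzēt", "māja"]
align = {0: 0, 1: 1, 3: 2}  # "a" has no aligned target token

print(" ".join(annotate_source(src, lemmas, align, prob=0.5, rng=random.Random(0))))
```

Under these assumptions, a term from a glossary could be injected at translation time in the same way: the glossary lemma is attached to the matching source word, and the trained system is left to choose the inflected target form.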

Code & Models

Repositories

tilde-nlp/terminology_translation (official)
