End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages
Josef Jon, João Paulo Aires, Dušan Variš, and Ondřej Bojar

TL;DR
This paper presents a method for lexically constrained neural machine translation that ensures correct inflection of constrained words in morphologically rich languages, improving agreement accuracy without sacrificing overall translation quality.
Contribution
It introduces a training approach that incorporates lemmatized constraints to enable neural models to infer correct word inflections in morphologically complex languages.
Findings
Reduces agreement errors in constrained translations
Improves manual and automatic evaluation scores
Maintains overall translation quality
Abstract
Lexically constrained machine translation allows the user to manipulate the output sentence by enforcing the presence or absence of certain words and phrases. Although current approaches can enforce terms to appear in the translation, they often struggle to make the constraint word form agree with the rest of the generated output. Our manual analysis shows that 46% of the errors in the output of a baseline constrained model for English to Czech translation are related to agreement. We investigate mechanisms to allow neural machine translation to infer the correct word inflection given lemmatized constraints. In particular, we focus on methods based on training the model with constraints provided as part of the input sequence. Our experiments on the English-Czech language pair show that this approach improves the translation of constrained terms in both automatic and manual evaluation by…
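As a rough illustration of "constraints provided as part of the input sequence", the sketch below appends lemmatized target-side constraints to the source sentence before it would be passed to a translation model. It is a minimal sketch under stated assumptions, not the authors' implementation: the separator tokens `<sep>` and `<c>`, the function names, and the toy lemma lookup table are all hypothetical, and a real system would use a proper morphological lemmatizer for Czech.

```python
from typing import Dict, List

# Hypothetical special tokens; the abstract does not specify any markers.
SEP = "<sep>"  # separates the source sentence from the constraint list
CSEP = "<c>"   # separates individual constraints from each other

# Toy lemma lookup (assumption): stands in for a real Czech lemmatizer.
TOY_LEMMAS: Dict[str, str] = {
    "starém": "starý",
    "hradě": "hrad",
    "řekou": "řeka",
}


def lemmatize(phrase: str, lemma_table: Dict[str, str]) -> str:
    """Replace each token by its lemma; unknown tokens are kept unchanged."""
    return " ".join(lemma_table.get(tok, tok) for tok in phrase.split())


def build_constrained_input(
    source: str, constraints: List[str], lemma_table: Dict[str, str]
) -> str:
    """Append lemmatized constraints to the source sentence.

    The model is then expected to produce the surface form of each
    constraint that agrees with the rest of the generated sentence.
    """
    if not constraints:
        return source
    lemmas = [lemmatize(c, lemma_table) for c in constraints]
    return f"{source} {SEP} " + f" {CSEP} ".join(lemmas)


if __name__ == "__main__":
    print(
        build_constrained_input(
            "He lives in the old castle by the river .",
            ["starém hradě", "řekou"],
            TOY_LEMMAS,
        )
    )
```

In this setup the constraint reaches the model only in its base form, so choosing the inflection that fits the sentence is left to the model rather than to the decoding algorithm.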
