# Lingua Custodia at WMT'19: Attempts to Control Terminology

**Authors:** Franck Burlot

arXiv: 1907.04618 · 2019-07-11

## TL;DR

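This paper describes Lingua Custodia's German-to-French submission to the WMT'19 news shared task on the topic of the EU elections. It explores adapting a machine translation system's terminology to that topic, using backtranslation generated with constrained decoding so that specific entities such as political parties and person names are translated correctly.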

## Contribution

It reports experiments on incorporating terminology constraints into an MT system in order to adapt it to a restricted topic for which the shared task provided no in-domain parallel training data.

## Key findings

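- Targets more accurate translation of specific entities, such as political parties and person names, in German-to-French MT
- The primary submission uses backtranslation generated with constrained decoding to guarantee the correct translation of specific terms, including terms not necessarily observed in the data
- Experiments are carried out in a real shared-task setting (WMT'19 news task, EU elections topic)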

## Abstract

This paper describes Lingua Custodia's submission to the WMT'19 news shared task for German-to-French on the topic of the EU elections. We report experiments on the adaptation of the terminology of a machine translation system to a specific topic, aimed at providing more accurate translations of specific entities like political parties and person names, given that the shared task provided no in-domain training parallel data dealing with the restricted topic. Our primary submission to the shared task uses backtranslation generated with a type of decoding allowing the insertion of constraints in the output in order to guarantee the correct translation of specific terms that are not necessarily observed in the data.
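The data flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the glossary entry, the function names, and the `stub_decode` placeholder are all assumptions made for illustration. The sketch only shows how target-side (French) monolingual sentences might be paired with synthetic German sources that are required to contain given terms. The constrained decoding algorithm itself is not implemented here, and a real system would plug in an NMT decoder that supports output constraints.

```python
from typing import Callable, Dict, List, Tuple

# A decoder takes a French sentence plus the German terms that must appear
# in its output, and returns a German sentence. (Hypothetical interface.)
Decoder = Callable[[str, List[str]], str]


def required_terms(fr_sentence: str, glossary: Dict[str, str]) -> List[str]:
    """Return the German terms whose French counterpart occurs in the sentence."""
    return [de for fr, de in glossary.items() if fr in fr_sentence]


def backtranslate_with_constraints(
    fr_sentences: List[str],
    glossary: Dict[str, str],
    decode: Decoder,
) -> List[Tuple[str, str]]:
    """Build synthetic (German, French) pairs for German-to-French training.

    A pair is kept only if every required German term appears in the
    synthetic source, so each glossary term ends up aligned with its
    intended French translation.
    """
    pairs = []
    for fr in fr_sentences:
        constraints = required_terms(fr, glossary)
        de = decode(fr, constraints)
        if all(term in de for term in constraints):
            pairs.append((de, fr))
    return pairs


def stub_decode(fr_sentence: str, constraints: List[str]) -> str:
    """Placeholder decoder (assumption): emits a dummy string with the terms.

    A real decoder would translate the sentence while being forced to
    produce each constraint in its output.
    """
    return " ".join(["<synthetic German>"] + constraints)


# Hypothetical glossary: French term -> required German term.
GLOSSARY = {"Parlement européen": "Europäisches Parlament"}

pairs = backtranslate_with_constraints(
    ["Le Parlement européen a voté."], GLOSSARY, stub_decode
)
```

The resulting pairs would then be added to the training data of the German-to-French system; the filtering step is a design choice of this sketch and is not taken from the paper.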

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.04618/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1907.04618/full.md

---
Source: https://tomesphere.com/paper/1907.04618