# A Test Suite and Manual Evaluation of Document-Level NMT at WMT19

**Authors:** Kateřina Rysová, Magdaléna Rysová, Tomáš Musil, Lucie Poláková, Ondřej Bojar

arXiv: 1908.03043 · 2019-08-09

## TL;DR

This paper introduces a test suite for WMT19 that assesses how MT systems participating in the News Translation Task handle discourse phenomena. The authors manually checked the system outputs and identified types of translation errors relevant to document-level translation.

## Contribution

The paper provides a new WMT19 test suite, together with a manual evaluation of system outputs, designed to assess discourse-related translation errors at the document level.

## Key findings

- Manual checking of system outputs identified types of translation errors relevant to document-level NMT
- The test suite provides a benchmark for future evaluation of discourse phenomena in MT
- The manual analysis highlights common challenges in document-level translation

## Abstract

As the quality of machine translation rises and neural machine translation (NMT) is moving from sentence to document level translations, it is becoming increasingly difficult to evaluate the output of translation systems. We provide a test suite for WMT19 aimed at assessing discourse phenomena of MT systems participating in the News Translation Task. We have manually checked the outputs and identified types of translation errors that are relevant to document-level translation.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.03043/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1908.03043/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1908.03043/full.md

---
Source: https://tomesphere.com/paper/1908.03043