# English-Czech Systems in WMT19: Document-Level Transformer

**Authors:** Martin Popel, Dominik Macháček, Michal Auersperger, Ondřej Bojar, Pavel Pecina

arXiv: 1907.12750 · 2019-07-31

## TL;DR

This paper describes English-Czech neural machine translation systems submitted to the WMT19 news translation task. Document-level Transformer models, which translate multi-sentence segments instead of isolated sentences, gain +0.6 BLEU in the T2T implementation, but a semi-automatic analysis yields only weak evidence of improved lexical coherence.

## Contribution

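The paper applies Transformer models to possibly overlapping multi-sentence segments rather than isolated sentences for English-Czech news translation, reports a statistically significant +0.6 BLEU gain ($p<0.05$) for the T2T system, and examines the effect of document context on lexical coherence.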

## Key findings

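- +0.6 BLEU ($p<0.05$) for the document-level T2T system, relative to the same system applied to isolated sentences
- A semi-automatic analysis found only a few sentences with improved lexical coherence
- This evidence is too weak to support conclusions about the effect of document-level models on lexical coherence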

## Abstract

We describe our NMT systems submitted to the WMT19 shared task in English-Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or the Marian framework. We aimed at improving the adequacy and coherence of translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this "document-level"-trained system achieves a $+0.6$ BLEU improvement ($p<0.05$) relative to the same system applied to isolated sentences. To assess the potential effect document-level models might have on lexical coherence, we performed a semi-automatic analysis, which revealed only a few sentences improved in this aspect. Thus, we cannot draw any conclusions from this weak evidence.
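The splitting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_document`, the whitespace-token budget `max_tokens`, and the sentence-level `overlap` parameter are all hypothetical choices made for illustration, and the sketch covers only the segmentation, not how translations of overlapping segments would be merged.

```python
from typing import List


def split_document(
    sentences: List[str], max_tokens: int = 50, overlap: int = 1
) -> List[List[str]]:
    """Greedily group consecutive sentences into multi-sentence segments.

    Hypothetical parameters (not from the paper):
      max_tokens: budget of whitespace-separated tokens per segment.
      overlap:    number of trailing sentences of one segment that are
                  repeated at the start of the next segment.
    """
    if overlap < 0:
        raise ValueError("overlap must be non-negative")

    segments: List[List[str]] = []
    start, n = 0, len(sentences)
    while start < n:
        end, length = start, 0
        # Always take at least one sentence, even if it exceeds the budget.
        while end < n and (
            end == start or length + len(sentences[end].split()) <= max_tokens
        ):
            length += len(sentences[end].split())
            end += 1
        segments.append(sentences[start:end])
        if end >= n:
            break
        # Step back by `overlap` sentences, but always advance by at least one
        # sentence so the loop terminates.
        start = max(end - overlap, start + 1)
    return segments


if __name__ == "__main__":
    doc = ["a b c", "d e", "f g h i", "j"]
    for segment in split_document(doc, max_tokens=6, overlap=1):
        # Each segment would be passed to the translation model as one input.
        print(" ".join(segment))
```

With `overlap=0` the segments are disjoint; a positive `overlap` repeats sentences across neighbouring segments so that each sentence can be seen with preceding context.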

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.12750/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/1907.12750/full.md

---
Source: https://tomesphere.com/paper/1907.12750