# Evaluating the Supervised and Zero-shot Performance of Multi-lingual Translation Models

**Authors:** Chris Hokamp, John Glover, Demian Gholipour

arXiv: 1906.09675 · 2019-06-25

## TL;DR

This paper evaluates multilingual translation models' performance in both supervised and zero-shot settings across diverse language pairs, highlighting the impact of different decoder sharing strategies on translation quality.

## Contribution

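The paper compares several methods of fully or partially sharing decoder parameters in multilingual NMT models, in both supervised and zero-shot settings. To the authors' knowledge, it is the largest evaluation of multilingual translation to date, measured by total training data size and by the diversity of zero-shot translation pairs.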

## Key findings

- Task-specific decoder parameters outperform fully shared decoders.
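- The methods of sharing decoder parameters involve trade-offs in translation performance, which the paper examines for both supervised and zero-shot directions.
- The evaluation covers 110 unique translation directions using only the WMT 2019 shared task parallel datasets for training; the authors describe it as the largest to date by training data size and diversity of zero-shot pairs.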

## Abstract

We study several methods for full or partial sharing of the decoder parameters of multilingual NMT models. We evaluate both fully supervised and zero-shot translation performance in 110 unique translation directions using only the WMT 2019 shared task parallel datasets for training. We use additional test sets and re-purpose evaluation methods recently used for unsupervised MT in order to evaluate zero-shot translation performance for language pairs where no gold-standard parallel data is available. To our knowledge, this is the largest evaluation of multi-lingual translation yet conducted in terms of the total size of the training data we use, and in terms of the diversity of zero-shot translation pairs we evaluate. We conduct an in-depth evaluation of the translation performance of different models, highlighting the trade-offs between methods of sharing decoder parameters. We find that models which have task-specific decoder parameters outperform models where decoder parameters are fully shared across all tasks.
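
The direction count in the abstract can be illustrated with a short sketch. The figure of 110 is consistent with every ordered pair among 11 languages (11 × 10), but the abstract does not state the number of languages, so that figure, the placeholder language names, and the single-pivot layout of the supervised pairs are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import permutations


def translation_directions(languages):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


def split_directions(languages, supervised_pairs):
    """Split all directions into supervised and zero-shot lists.

    A direction counts as supervised when parallel training data exists
    for its unordered language pair; everything else is zero-shot.
    """
    supervised_set = {frozenset(pair) for pair in supervised_pairs}
    supervised, zero_shot = [], []
    for src, tgt in translation_directions(languages):
        if frozenset((src, tgt)) in supervised_set:
            supervised.append((src, tgt))
        else:
            zero_shot.append((src, tgt))
    return supervised, zero_shot


# Hypothetical placeholder names: the abstract does not list the languages.
languages = [f"lang{i}" for i in range(11)]

# Assumption: parallel data pairs one pivot language with each other language.
pivot = languages[0]
supervised_pairs = [(pivot, other) for other in languages[1:]]

supervised, zero_shot = split_directions(languages, supervised_pairs)
print(len(translation_directions(languages)))  # 110
print(len(supervised), len(zero_shot))         # 20 90
```

Under these assumptions most directions have no parallel training data, which is why evaluating zero-shot pairs needs additional test sets or alternative evaluation methods, as the abstract notes.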

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.09675/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1906.09675/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1906.09675/full.md

---
Source: https://tomesphere.com/paper/1906.09675