# What do Neural Machine Translation Models Learn about Morphology?

**Authors:** Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

arXiv: 1704.03471 · 2018-10-23

## TL;DR

This paper analyzes the internal representations of neural machine translation models and measures how well they encode morphology, using part-of-speech and morphological tagging as extrinsic evaluation tasks.

## Contribution

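It analyzes neural MT representations along four dimensions (word-based vs. character-based representations, depth of the encoding layer, identity of the target language, and encoder vs. decoder) and evaluates how well each encodes morphological information.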

## Key findings

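- Neural MT representations carry morphological information that part-of-speech and morphological taggers can exploit.
- Representation quality depends on whether the model is word-based or character-based and on the depth of the encoding layer.
- The identity of the target language influences the learned morphological representations.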

## Abstract

Neural machine translation (MT) models obtain state-of-the-art performance while maintaining a simple, end-to-end architecture. However, little is known about what these models learn about source and target languages during the training process. In this work, we analyze the representations learned by neural MT models at various levels of granularity and empirically evaluate the quality of the representations for learning morphology through extrinsic part-of-speech and morphological tagging tasks. We conduct a thorough investigation along several parameters: word-based vs. character-based representations, depth of the encoding layer, the identity of the target language, and encoder vs. decoder representations. Our data-driven, quantitative evaluation sheds light on important aspects in the neural MT system and its ability to capture word structure.
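The extrinsic evaluation described in the abstract can be pictured as a probing setup: representations from a trained translation model are held fixed, and a separate classifier is trained to predict tags from them. The following is a minimal sketch under stated assumptions, not the authors' implementation. It uses a linear softmax probe (this summary does not say which classifier the paper uses), and the feature vectors come from a hypothetical `make_toy_data` helper that generates synthetic clusters in place of real encoder states. All function names here are illustrative.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score(weights, bias, x):
    """Linear scores, one per tag, for a single feature vector."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, bias)]


def train_probe(features, tags, num_tags, epochs=200, lr=0.5):
    """Fit a linear softmax classifier with SGD on frozen feature vectors.

    Only the probe's weights are updated. The feature vectors stand in for
    hidden states extracted from an already-trained translation model.
    """
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_tags)]
    bias = [0.0] * num_tags
    for _ in range(epochs):
        for x, y in zip(features, tags):
            probs = softmax(score(weights, bias, x))
            for k in range(num_tags):
                # Gradient of cross-entropy loss with respect to score k.
                grad = probs[k] - (1.0 if k == y else 0.0)
                bias[k] -= lr * grad
                for j in range(dim):
                    weights[k][j] -= lr * grad * x[j]
    return weights, bias


def predict(weights, bias, x):
    """Return the index of the highest-scoring tag."""
    scores = score(weights, bias, x)
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(weights, bias, features, tags):
    """Fraction of examples whose predicted tag matches the gold tag."""
    correct = sum(predict(weights, bias, x) == y
                  for x, y in zip(features, tags))
    return correct / len(tags)


def make_toy_data(n_per_tag, seed=0):
    """Synthetic stand-in for (hidden state, tag) pairs; NOT real NMT output."""
    rng = random.Random(seed)
    centers = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0)]  # one cluster per tag
    features, tags = [], []
    for tag, (cx, cy) in enumerate(centers):
        for _ in range(n_per_tag):
            features.append([cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)])
            tags.append(tag)
    return features, tags


train_x, train_y = make_toy_data(30, seed=0)
test_x, test_y = make_toy_data(10, seed=1)
weights, bias = train_probe(train_x, train_y, num_tags=3)
probe_accuracy = accuracy(weights, bias, test_x, test_y)
print(f"probe accuracy: {probe_accuracy:.2f}")
```

In a real experiment, `train_x` and `test_x` would be replaced by vectors extracted from a chosen layer of the encoder or decoder, and the tags by gold part-of-speech or morphological labels. Comparing probe accuracy across those choices is what lets the representations themselves be compared, since the probe is the same in every condition.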

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.03471/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1704.03471/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/1704.03471/full.md

---
Source: https://tomesphere.com/paper/1704.03471