On the Linguistic Representational Power of Neural Machine Translation Models
Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass

TL;DR
This paper investigates how neural machine translation (NMT) models encode morphology, syntax, and semantics, and how that encoding varies across layers, translation units, and multilingual settings.
Contribution
It provides a comprehensive, data-driven analysis of the linguistic information captured by NMT models, highlighting differences across layers, units, and multilingual versus bilingual models.
Findings
Lower layers capture morphology and POS information.
Higher layers encode semantics and long-range dependencies.
Character-based representations better capture morphology.
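Layer-wise findings like those above are typically obtained by training a small classifier (a "probe") on frozen activations from each layer and comparing accuracies. The following is a minimal sketch of that idea, not the authors' implementation: it uses a multiclass perceptron as a stand-in linear probe, and the activations are synthetic. The names train_probe, toy_layer, and layer_signal, and the signal strengths assigned to each toy layer, are hypothetical and chosen only for illustration; the resulting numbers say nothing about real NMT models.

```python
import random


def predict(weights, x):
    """Return the class whose linear score (weights . x + bias) is highest."""
    scores = [sum(w_i * v for w_i, v in zip(w, x)) + w[-1] for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_probe(features, labels, num_classes, epochs=20):
    """Train a multiclass perceptron (a simple linear probe) on frozen features."""
    dim = len(features[0])
    # One weight vector per class; the last entry of each vector is the bias.
    weights = [[0.0] * (dim + 1) for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = predict(weights, x)
            if pred != y:
                # Standard perceptron update: reward the gold class,
                # penalise the wrongly predicted class.
                for i, v in enumerate(x):
                    weights[y][i] += v
                    weights[pred][i] -= v
                weights[y][dim] += 1.0
                weights[pred][dim] -= 1.0
    return weights


def probe_accuracy(weights, features, labels):
    """Fraction of examples the probe labels correctly."""
    correct = sum(predict(weights, x) == y for x, y in zip(features, labels))
    return correct / len(labels)


def toy_layer(rng, labels, num_classes, signal):
    """Synthetic stand-in for one layer's activations (NOT real NMT output).

    Each vector is Gaussian noise, with `signal` added to the dimension
    indexed by the label, so a larger `signal` means the label is easier
    to decode linearly.
    """
    feats = []
    for y in labels:
        vec = [rng.gauss(0.0, 1.0) for _ in range(num_classes + 2)]
        vec[y] += signal
        feats.append(vec)
    return feats


rng = random.Random(0)
num_classes = 3  # e.g. three made-up tag classes
train_labels = [rng.randrange(num_classes) for _ in range(300)]
test_labels = [rng.randrange(num_classes) for _ in range(200)]

# Hypothetical signal strengths: one toy layer encodes the tag, one does not.
layer_signal = {"layer_1": 4.0, "layer_2": 0.0}

accuracy = {}
for name, signal in layer_signal.items():
    train_x = toy_layer(rng, train_labels, num_classes, signal)
    test_x = toy_layer(rng, test_labels, num_classes, signal)
    probe = train_probe(train_x, train_labels, num_classes)
    accuracy[name] = probe_accuracy(probe, test_x, test_labels)

for name, acc in accuracy.items():
    print(f"{name}: probe accuracy = {acc:.2f}")
```

In a real analysis, toy_layer would be replaced by activations extracted from a trained translation model, and the labels by gold annotations such as POS or morphological tags. The probe is kept deliberately simple so that high accuracy reflects information present in the representation rather than the capacity of the classifier.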
Abstract
Despite the recent success of deep neural networks in natural language processing (NLP), their interpretability remains a challenge. We analyze the representations learned by neural machine translation models at various levels of granularity and evaluate their quality through relevant extrinsic properties. In particular, we seek answers to the following questions: (i) How accurately is word-structure captured within the learned representations, an important aspect in translating morphologically-rich languages? (ii) Do the representations capture long-range dependencies, and effectively handle syntactically divergent languages? (iii) Do the representations capture lexical semantics? We conduct a thorough investigation along several parameters: (i) Which layers in the architecture capture each of these linguistic phenomena; (ii) How does the choice of translation unit (word, character, or…
Taxonomy
Methods: Interpretability
