An Intrinsic Nearest Neighbor Analysis of Neural Machine Translation Architectures

Hamidreza Ghader; Christof Monz

arXiv:1907.03885 · cs.CL · July 10, 2019 · 1 citation

TL;DR

This paper investigates the hidden states of transformer and recurrent neural machine translation models using a nearest neighbors approach to analyze their ability to capture lexical semantics and syntactic structures.

Contribution

It introduces an intrinsic analysis method comparing transformer and recurrent models based on nearest neighbor relationships, revealing differences in semantic and syntactic encoding.

Findings

01 Transformers capture lexical semantics better than recurrent models.

02 In recurrent models, the backward layer encodes more of the semantics of words.

03 In recurrent models, the forward layer encodes more context.

Abstract

Earlier approaches indirectly studied the information captured by the hidden states of recurrent and non-recurrent neural machine translation models by feeding them into different classifiers. In this paper, we look at the encoder hidden states of both transformer and recurrent machine translation models from the nearest neighbors perspective. We investigate to what extent the nearest neighbors share information with the underlying word embeddings as well as related WordNet entries. Additionally, we study the underlying syntactic structure of the nearest neighbors to shed light on the role of syntactic similarities in bringing the neighbors together. We compare transformer and recurrent models in a more intrinsic way in terms of capturing lexical semantics and syntactic structures, in contrast to extrinsic approaches used by previous works. In agreement with the extrinsic evaluations in…
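The nearest-neighbor comparison described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the helper names (`cosine`, `nearest_neighbors`, `neighbor_overlap`), the choice of cosine similarity, and the toy two-dimensional vectors are all assumptions made for illustration, and the sketch keeps one vector per word, whereas real encoder hidden states are computed per token occurrence in context.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(vectors, query, k):
    """Return the k keys most cosine-similar to vectors[query], best first."""
    scores = [
        (cosine(vectors[query], vec), key)
        for key, vec in vectors.items()
        if key != query
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scores[:k]]

def neighbor_overlap(hidden, embeddings, query, k):
    """Fraction of the query's k hidden-state neighbors that are also
    among its k embedding neighbors (hypothetical overlap measure)."""
    hidden_nn = set(nearest_neighbors(hidden, query, k))
    embedding_nn = set(nearest_neighbors(embeddings, query, k))
    return len(hidden_nn & embedding_nn) / k

# Toy data (made up): word embeddings and encoder hidden states.
embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "bus": [0.1, 0.9],
}
hidden = {
    "cat": [1.0, 0.2],
    "dog": [0.8, 0.3],
    "car": [0.9, 0.4],
    "bus": [0.0, 1.0],
}

print(nearest_neighbors(embeddings, "cat", 2))
print(nearest_neighbors(hidden, "cat", 2))
print(neighbor_overlap(hidden, embeddings, "cat", 2))
```

A high overlap would suggest that a hidden state stays close to the lexical information already present in the word embedding, while a low overlap would suggest that context has moved it elsewhere in the space.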

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning in Bioinformatics

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax