Encoders Help You Disambiguate Word Senses in Neural Machine Translation

Gongbo Tang; Rico Sennrich; Joakim Nivre

arXiv:1908.11771·cs.CL·May 7, 2020

TL;DR

This paper investigates how the encoder and decoder of a neural machine translation (NMT) model contribute to disambiguating ambiguous nouns. It finds that encoder hidden states carry the information needed for disambiguation and that self-attention distributes more attention to the surrounding context.

Contribution

The study analyzes the roles of the encoder and decoder in word sense disambiguation in NMT by evaluating hidden states with a classifier and by examining the distributions of self-attention.

Findings

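01. Encoder hidden states significantly outperform word embeddings when used to predict whether the translation of an ambiguous noun is correct.

02. Self-attention weights and attention entropy show that self-attention can detect ambiguous nouns and distribute more attention to the context.

03. Decoders can provide further relevant information for disambiguation.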
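The second finding rests on attention weights and attention entropy. As a rough illustration of the entropy quantity only, the following Python sketch computes the Shannon entropy of a single attention distribution. The function name `attention_entropy` and the example weights are hypothetical and do not come from the paper; how the authors aggregate entropy across heads and layers is not stated in this summary.

```python
import math


def attention_entropy(weights, eps=1e-12):
    """Shannon entropy (in nats) of one attention distribution.

    `weights` is a list of non-negative attention weights from one query
    position to all source positions. They are renormalized defensively,
    so they do not have to sum exactly to 1.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    probs = [w / total for w in weights]
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > eps)


# Hypothetical weights, made up for illustration only.
focused = [0.97, 0.01, 0.01, 0.01]  # almost all mass on one position
spread = [0.25, 0.25, 0.25, 0.25]   # mass spread evenly over the context

focused_entropy = attention_entropy(focused)
spread_entropy = attention_entropy(spread)
```

Higher entropy means the attention mass is spread more evenly over the source positions, and lower entropy means it is concentrated on a few of them. A uniform distribution over n positions gives the maximum value, log(n).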

Abstract

Neural machine translation (NMT) has achieved new state-of-the-art performance in translating ambiguous words. However, it is still unclear which component dominates the process of disambiguation. In this paper, we explore the ability of NMT encoders and decoders to disambiguate word senses by evaluating hidden states and investigating the distributions of self-attention. We train a classifier to predict whether a translation is correct given the representation of an ambiguous noun. We find that encoder hidden states significantly outperform word embeddings, which indicates that encoders adequately encode relevant information for disambiguation into hidden states. Decoders could provide further relevant information for disambiguation. Moreover, the attention weights and attention entropy show that self-attention can detect ambiguous nouns and distribute more attention to the context.…
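The classifier setup described in the abstract can be sketched on toy data. The example below is a minimal illustration under stated assumptions, not the authors' implementation: the logistic-regression probe, the elementwise-product features, and every vector in it are hypothetical, chosen only to show why a context-independent embedding of an ambiguous noun cannot tell a correct translation from an incorrect one, while a context-dependent representation can.

```python
import math


def features(noun_vec, cand_vec):
    # Assumption: an elementwise product as a simple interaction feature
    # between the noun representation and a candidate-translation vector.
    return [n * c for n, c in zip(noun_vec, cand_vec)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(model, x):
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_probe(examples, epochs=500, lr=0.5):
    """Logistic regression trained with plain SGD on (features, label) pairs."""
    w, b = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in examples:
            g = score((w, b), x) - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def accuracy(model, examples):
    hits = sum(int((score(model, x) >= 0.5) == bool(y)) for x, y in examples)
    return hits / len(examples)


# Toy vectors, made up for illustration only.
CTX_A, CTX_B = [1.0, 0.0], [0.0, 1.0]    # context-dependent noun states per sense
STATIC = [0.5, 0.5]                      # one context-independent embedding
CAND_A, CAND_B = [1.0, 0.0], [0.0, 1.0]  # candidate translations per sense

# (gold sense of the noun, candidate translation, 1 if the translation is correct)
pairs = [("A", CAND_A, 1), ("A", CAND_B, 0), ("B", CAND_A, 0), ("B", CAND_B, 1)]

contextual = [(features(CTX_A if s == "A" else CTX_B, c), y) for s, c, y in pairs]
static = [(features(STATIC, c), y) for s, c, y in pairs]

contextual_acc = accuracy(train_probe(contextual), contextual)
static_acc = accuracy(train_probe(static), static)
```

In this toy construction the static embedding produces identical features for a correct and an incorrect translation, so no probe can score above chance on it. The numbers say nothing about the paper's actual results, which come from trained NMT models and real test data.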
