From Fully Trained to Fully Random Embeddings: Improving Neural Machine Translation with Compact Word Embedding Tables

Krtin Kumar; Peyman Passban; Mehdi Rezagholizadeh; Yiu Sing Lau; Qun Liu

arXiv:2104.08677 · cs.CL · April 19, 2022

TL;DR

This paper shows that neural machine translation models can operate effectively with partially random embeddings, significantly reducing memory requirements while maintaining or improving translation quality.

Contribution

It introduces a method to use partially random embeddings in NMT, reducing memory usage with minimal performance loss and sometimes surpassing state-of-the-art results.

Findings

01. Random embeddings cause limited performance degradation.

02. Partial task-specific knowledge boosts NMT performance.

03. Achieved 5.3x compression with competitive translation quality.
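
As a rough illustration of what a compression figure like the one above means, the sketch below computes the ratio between a full embedding table and a smaller stored table. It assumes the ratio is measured on stored embedding parameters only; the function name and the vocabulary and dimension values are hypothetical and are not taken from the paper.

```python
def embedding_compression_ratio(vocab_size, full_dim, stored_dim):
    """Ratio of parameters in a full table to parameters actually stored.

    Assumption: only `stored_dim` values per token are kept in memory;
    the remaining dimensions are not stored (e.g. regenerated from a seed).
    """
    full_params = vocab_size * full_dim
    stored_params = vocab_size * stored_dim
    return full_params / stored_params


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
ratio = embedding_compression_ratio(vocab_size=32000, full_dim=512, stored_dim=128)
print(f"{ratio:.1f}x")  # prints "4.0x"
```

With the same vocabulary on both sides, the ratio reduces to `full_dim / stored_dim`, so the saving comes entirely from how many values per token have to be stored.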

Abstract

Embedding matrices are key components of neural natural language processing (NLP) models; they are responsible for providing numerical representations of input tokens. (In this paper, words and subwords are referred to as tokens, and the term embedding refers only to embeddings of inputs.) In this paper, we analyze the impact and utility of such matrices in the context of neural machine translation (NMT). We show that removing syntactic and semantic information from word embeddings and running NMT systems with random embeddings is not as damaging as it initially sounds. We also show how incorporating only a limited amount of task-specific knowledge from fully trained embeddings can boost the performance of NMT systems. Our findings demonstrate that in exchange for negligible deterioration in performance, any NMT model can be run with partially random embeddings.…
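
One way to picture a "partially random" embedding is a vector whose larger part is fixed random noise and whose smaller part is a trainable slice. The sketch below is a minimal illustration under that assumption, not the authors' method: the class name, the concatenation scheme, the per-token seeding, and all sizes are hypothetical. The random part is regenerated from a seed on every lookup, so only the small table needs to be stored.

```python
import numpy as np


class PartiallyRandomEmbedding:
    """Hypothetical sketch: vector = [fixed random part | small stored part]."""

    def __init__(self, vocab_size, random_dim, trained_dim, seed=0):
        self.vocab_size = vocab_size
        self.random_dim = random_dim
        self.seed = seed
        # Only this table would be stored and updated during training.
        rng = np.random.default_rng(seed + 1)
        self.trained = rng.normal(0.0, 0.02, size=(vocab_size, trained_dim))

    def _random_part(self, token_ids):
        # Regenerate each row from (seed, token id) instead of storing it,
        # so the same token always maps to the same random vector.
        rows = [
            np.random.default_rng([self.seed, int(t)]).standard_normal(self.random_dim)
            for t in token_ids
        ]
        return np.stack(rows)

    def lookup(self, token_ids):
        token_ids = np.asarray(token_ids)
        fixed = self._random_part(token_ids)
        learned = self.trained[token_ids]
        return np.concatenate([fixed, learned], axis=1)


emb = PartiallyRandomEmbedding(vocab_size=100, random_dim=24, trained_dim=8)
vectors = emb.lookup([3, 7, 3])
print(vectors.shape)  # (3, 32)
```

In this sketch the stored table holds 100 × 8 values, while each looked-up vector has 32 dimensions; regenerating rows per token trades extra computation for memory.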

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis