Reconstruction of Word Embeddings from Sub-Word Parameters

Karl Stratos

arXiv:1707.06957 · cs.CL · July 24, 2017

TL;DR

This paper reconstructs pre-trained word embeddings from sub-word parameters alone, so that a model can benefit from the embeddings while operating strictly at the sub-lexical level and without the added model size of word-level embeddings.

Contribution

A simple pre-training step: before task-specific training, sub-word parameters are optimized to reconstruct pre-trained word embeddings under various distance measures, which avoids the model-size cost of word-level embeddings.

Findings

01 Effective reconstruction of word embeddings from sub-word parameters

02 Improved performance on word similarity and analogy tasks

03 Comparable results in part-of-speech tagging

Abstract

Pre-trained word embeddings improve the performance of a neural model at the cost of increasing the model size. We propose to benefit from this resource without paying the cost by operating strictly at the sub-lexical level. Our approach is quite simple: before task-specific training, we first optimize sub-word parameters to reconstruct pre-trained word embeddings using various distance measures. We report interesting results on a variety of tasks: word similarity, word analogy, and part-of-speech tagging.
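The reconstruction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which sub-word architecture or distance measures are used, so everything specific here is an assumption made for illustration: a word is represented as the sum of its character trigram vectors, the distance is squared Euclidean, and the optimizer is plain per-word gradient descent. The names `char_ngrams`, `fit_subword_params`, and `reconstruct` are hypothetical and do not come from the paper.

```python
import numpy as np

def char_ngrams(word, n=3):
    """Return the character n-grams of a word padded with boundary markers."""
    padded = "<" + word + ">"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def fit_subword_params(pretrained, dim, n=3, lr=0.5, epochs=200, seed=0):
    """Fit n-gram vectors whose sum approximates each pre-trained embedding.

    Assumption: the loss is 0.5 * ||sum of n-gram vectors - target||^2.
    `pretrained` maps each word to a NumPy vector of length `dim`.
    """
    rng = np.random.default_rng(seed)
    vocab = sorted({g for w in pretrained for g in char_ngrams(w, n)})
    index = {g: i for i, g in enumerate(vocab)}
    params = rng.normal(scale=0.01, size=(len(vocab), dim))
    for _ in range(epochs):
        for word, target in pretrained.items():
            ids = [index[g] for g in char_ngrams(word, n)]
            # The gradient with respect to each n-gram vector is the residual.
            residual = params[ids].sum(axis=0) - target
            # Scale the step by the n-gram count so long words stay stable;
            # subtract.at accumulates correctly when an n-gram repeats.
            np.subtract.at(params, ids, (lr / len(ids)) * residual)
    return params, index

def reconstruct(word, params, index, n=3):
    """Build a word vector from known n-grams; unseen n-grams are skipped."""
    ids = [index[g] for g in char_ngrams(word, n) if g in index]
    if not ids:
        return np.zeros(params.shape[1])
    return params[ids].sum(axis=0)
```

After fitting, only `params` and `index` need to be kept. Because `reconstruct` skips unseen n-grams, it also returns a vector for a word that was not in the pre-trained vocabulary, provided the word shares at least one n-gram with the training words. In a task-specific model the fitted sub-word parameters would serve as the initialization, in line with the "before task-specific training" ordering in the abstract.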
