Reconstruction of Word Embeddings from Sub-Word Parameters
Karl Stratos

TL;DR
This paper introduces a method to reconstruct pre-trained word embeddings solely from sub-word parameters, enabling the use of rich embeddings without increasing model size.
Contribution
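The paper presents a simple approach: before task-specific training, sub-word parameters are optimized to reconstruct pre-trained word embeddings under various distance measures, so a model can benefit from those embeddings while operating strictly at the sub-lexical level.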
Findings
Effective reconstruction of word embeddings from sub-word parameters
Improved performance on word similarity and analogy tasks
Comparable results in part-of-speech tagging
Abstract
Pre-trained word embeddings improve the performance of a neural model at the cost of increasing the model size. We propose to benefit from this resource without paying the cost by operating strictly at the sub-lexical level. Our approach is quite simple: before task-specific training, we first optimize sub-word parameters to reconstruct pre-trained word embeddings using various distance measures. We report interesting results on a variety of tasks: word similarity, word analogy, and part-of-speech tagging.
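The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a bag of character trigrams as the sub-word model and squared Euclidean distance as the reconstruction loss, and every name in it (char_ngrams, build_index, compose, reconstruct) is hypothetical. The abstract does not specify the sub-word architecture or which distance measures were used.

```python
import numpy as np

def char_ngrams(word, n=3):
    """Character n-grams of a word padded with boundary markers < and >."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def build_index(words, n=3):
    """Map every n-gram seen in the vocabulary to a row id."""
    grams = sorted({g for w in words for g in char_ngrams(w, n)})
    return {g: i for i, g in enumerate(grams)}

def compose(word, index, params, n=3):
    """Word vector = sum of its known n-gram vectors (assumed composition)."""
    ids = [index[g] for g in char_ngrams(word, n) if g in index]
    if not ids:
        return np.zeros(params.shape[1])
    return params[ids].sum(axis=0)

def reconstruct(targets, dim, n=3, lr=0.25, epochs=200, seed=0):
    """Fit n-gram parameters so composed vectors match pre-trained ones.

    targets: dict mapping word -> pre-trained embedding of shape (dim,).
    Loss per word (an assumption): squared Euclidean distance.
    """
    rng = np.random.default_rng(seed)
    words = list(targets)
    index = build_index(words, n)
    params = rng.normal(scale=0.01, size=(len(index), dim))
    for _ in range(epochs):
        for w in words:
            ids = [index[g] for g in char_ngrams(w, n)]
            pred = params[ids].sum(axis=0)
            grad = 2.0 * (pred - targets[w])  # gradient of ||pred - target||^2
            # Scale the step by the n-gram count so long words stay stable;
            # subtract.at accumulates correctly when an n-gram repeats.
            np.subtract.at(params, ids, (lr / len(ids)) * grad)
    return index, params
```

After fitting, only the n-gram table is kept. A vector for any word, including one absent from the pre-trained vocabulary, is obtained with compose(word, index, params); under this sketch's assumption, unseen n-grams are simply skipped.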
