All-but-the-Top: Simple and Effective Postprocessing for Word Representations
Jiaqi Mu, Suma Bhat, Pramod Viswanath

TL;DR
This paper introduces a simple postprocessing method that improves the quality of word representations by removing the mean vector and dominant directions, leading to better performance across various NLP tasks.
Contribution
It proposes a straightforward yet effective postprocessing technique for word embeddings that enhances their utility across multiple linguistic tasks.
Findings
Processed embeddings outperform original ones on lexical tasks.
Processed embeddings also improve results on sentence-level tasks (semantic textual similarity and text classification).
The method works across different languages and embedding algorithms.
Abstract
Real-valued word representations have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities. In this paper, we demonstrate a very simple, and yet counter-intuitive, postprocessing technique (eliminating the common mean vector and a few top dominating directions from the word vectors) that renders off-the-shelf representations even stronger. The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification) on multiple datasets and with a variety of representation methods and hyperparameter choices in multiple languages; in each case, the processed representations are consistently better than the original ones.
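The postprocessing described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code: the function name `all_but_the_top` and the parameter `n_components` (how many dominant directions to remove) are assumptions made for this example, and the abstract does not say how many directions to drop. The sketch treats the dominant directions as the top principal components of the mean-centered vectors.

```python
import numpy as np

def all_but_the_top(vectors, n_components):
    """Subtract the mean vector, then remove the top principal directions.

    vectors: array of shape (vocab_size, dim), one word vector per row.
    n_components: number of dominant directions to remove (hypothetical
    parameter name; choose it for your own embeddings).
    """
    X = np.asarray(vectors, dtype=float)

    # Step 1: remove the common mean vector.
    centered = X - X.mean(axis=0)

    # Step 2: find the dominant directions. Rows of vt are the principal
    # directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]

    # Step 3: subtract each vector's projection onto those directions.
    return centered - centered @ top.T @ top

# Usage on random stand-in data (real use would load word2vec or GloVe vectors).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 50)) + 3.0
processed = all_but_the_top(embeddings, n_components=2)
print(processed.shape)
```

The output keeps the original shape, so the processed vectors can be dropped into downstream tasks in place of the originals.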
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: GloVe Embeddings
