TL;DR
This paper introduces SPINE, a denoising k-sparse autoencoder method that starts from existing word embeddings such as GloVe and word2vec and produces embeddings that are more interpretable and perform better on downstream tasks.
Contribution
The paper presents a novel autoencoder variant that improves interpretability of word embeddings while maintaining or surpassing existing performance levels.
Findings
In a large-scale human evaluation, SPINE embeddings are rated much more interpretable than the original GloVe and word2vec embeddings.
SPINE embeddings outperform GloVe and word2vec on multiple downstream tasks.
The method effectively enhances interpretability without sacrificing accuracy.
Abstract
Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations. While these representations capture the underlying complexity and latent trends in the data, they are far from being interpretable. We propose a novel variant of denoising k-sparse autoencoders that generates highly efficient and interpretable distributed word representations (word embeddings), beginning with existing word representations from state-of-the-art methods like GloVe and word2vec. Through large-scale human evaluation, we report that our resulting word embeddings are much more interpretable than the original GloVe and word2vec embeddings. Moreover, our embeddings outperform existing popular word embeddings on a diverse suite of benchmark downstream tasks.
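To make the idea of a denoising k-sparse autoencoder more concrete, here is a minimal NumPy sketch of one forward pass of a generic version: corrupt the input embedding with noise, encode it into a wider hidden layer, keep only the k largest activations, and decode back to the clean input. This is an illustration under stated assumptions, not the paper's method. The abstract describes SPINE as a novel variant, and its exact losses and activation are not given in this summary. The function names, the Gaussian noise, the ReLU activation, the hard top-k step, the dimensions, and the random stand-in vectors are all hypothetical choices made for the example.

```python
import numpy as np


def top_k_sparsify(h, k):
    """Keep the k largest activations in each row of h and zero the rest."""
    # Column indices of the k largest entries in every row (unordered).
    idx = np.argpartition(h, -k, axis=1)[:, -k:]
    mask = np.zeros(h.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return np.where(mask, h, 0.0)


def forward(x, W_enc, b_enc, W_dec, b_dec, k, noise_std, rng):
    """One forward pass of a generic denoising k-sparse autoencoder."""
    # Denoising: corrupt the input, but reconstruct the clean version.
    x_noisy = x + rng.normal(0.0, noise_std, size=x.shape)
    # Encode into a wider hidden layer with a ReLU (assumed activation).
    h = np.maximum(x_noisy @ W_enc + b_enc, 0.0)
    # k-sparse step: at most k hidden units stay active per word.
    z = top_k_sparsify(h, k)
    # Decode and measure mean squared reconstruction error.
    x_hat = z @ W_dec + b_dec
    loss = np.mean((x_hat - x) ** 2)
    return z, x_hat, loss


# Hypothetical sizes: 8 words, 300-d input vectors, 1000-d sparse code.
rng = np.random.default_rng(0)
n_words, d_in, d_hidden, k = 8, 300, 1000, 50

# Random stand-in for pretrained vectors (not real GloVe or word2vec data).
x = rng.normal(size=(n_words, d_in))
W_enc = rng.normal(scale=0.05, size=(d_in, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(scale=0.05, size=(d_hidden, d_in))
b_dec = np.zeros(d_in)

z, x_hat, loss = forward(x, W_enc, b_enc, W_dec, b_dec, k, 0.2, rng)
print(z.shape, int(np.count_nonzero(z, axis=1).max()), float(loss))
```

In a sketch like this, the rows of `z` would play the role of the sparse word embeddings once the weights have been trained to minimize the reconstruction loss. Training is omitted here.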
Taxonomy
Methods: GloVe Embeddings
