Word Embedding Dimension Reduction via Weakly-Supervised Feature Selection
Jintang Xue, Yun-Cheng Wang, Chengwei Wei, C.-C. Jay Kuo

TL;DR
This paper introduces WordFS, a weakly-supervised feature selection method for reducing word embedding dimensions. It outperforms other dimension reduction methods on NLP tasks at lower computational cost.
Contribution
The paper presents a novel weakly-supervised feature selection approach for word embedding dimension reduction, with two variants and improved efficiency over existing methods.
Findings
WordFS outperforms other dimension reduction methods on word similarity, sentence similarity, and binary and multi-class classification tasks.
It achieves comparable or better performance with lower computational costs.
The code for WordFS is publicly released for reproducibility.
Abstract
As a fundamental task in natural language processing, word embedding converts each word into a representation in a vector space. A challenge with word embedding is that as the vocabulary grows, the vector space's dimension increases, which can lead to a vast model size. Storing and processing word vectors is resource-demanding, especially for mobile edge-device applications. This paper explores word embedding dimension reduction. To balance computational costs and performance, we propose an efficient and effective weakly-supervised feature selection method named WordFS. It has two variants, each utilizing novel criteria for feature selection. Experiments on various tasks (e.g., word and sentence similarity and binary and multi-class classification) indicate that the proposed WordFS model outperforms other dimension reduction methods at lower computational costs. We have released the…
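To make the idea of feature selection on embeddings concrete, here is a minimal Python sketch. It is not the WordFS algorithm: the abstract does not state the selection criteria, so the scoring rule below (correlating each dimension's contribution to cosine similarity with weak word-pair similarity labels) is an assumption chosen for illustration. The function name `select_dimensions` and the toy data are hypothetical as well.

```python
import numpy as np

def select_dimensions(emb, pairs, scores, k):
    """Return the indices of k embedding dimensions to keep.

    Assumed criterion (illustrative only, not taken from the paper):
    rank each dimension by the Pearson correlation between its
    contribution to cosine similarity and the weak similarity labels.
    """
    # L2-normalize rows so per-dimension products sum to cosine similarity.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    contrib = unit[pairs[:, 0]] * unit[pairs[:, 1]]  # shape: (n_pairs, dim)

    # Pearson correlation of every column of `contrib` with `scores`.
    c = contrib - contrib.mean(axis=0)
    s = scores - scores.mean()
    denom = np.linalg.norm(c, axis=0) * np.linalg.norm(s)
    corr = (c.T @ s) / np.where(denom == 0, 1.0, denom)

    # Keep the k highest-scoring dimensions, in their original order.
    return np.sort(np.argsort(-corr)[:k])

# Toy data: 50 "words" with 16-dimensional embeddings (random, hypothetical).
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 16))
pairs = rng.integers(0, 50, size=(200, 2))
# Toy weak labels: pair similarity driven by the first four dimensions only.
scores = (emb[pairs[:, 0], :4] * emb[pairs[:, 1], :4]).sum(axis=1)

keep = select_dimensions(emb, pairs, scores, k=4)
reduced = emb[:, keep]  # reduced embedding matrix, shape (50, 4)
```

Because selection keeps a subset of the original columns rather than learning a projection, reducing a new vector is a single index operation (`vector[keep]`), which is one reason selection-based reduction can be cheap at inference time.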
Taxonomy
Topics: Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Feature Selection
