Incremental Sense Weight Training for the Interpretation of Contextualized Word Embeddings
Xinyi Jiang, Zhengzhe Yang, Jinho D. Choi

TL;DR
This paper introduces an online algorithm that identifies and discards unimportant dimensions in contextualized word embeddings, improving interpretability without sacrificing task performance, demonstrated through a word sense disambiguation task.
Contribution
The paper presents a novel online method that learns which embedding dimensions are unimportant and masks them, making contextualized embeddings from models such as BERT, ELMo, and Flair easier to interpret.
Findings
Masked embeddings maintain or improve WSD performance
Algorithm effectively detects unessential embedding dimensions
3% performance improvement observed with masked embeddings
Abstract
We present a novel online algorithm that learns the essence of each dimension in word embeddings by minimizing the within-group distance of contextualized embedding groups. Three state-of-the-art neural-based language models are used, Flair, ELMo, and BERT, to generate contextualized word embeddings such that different embeddings are generated for the same word type, which are grouped by their senses manually annotated in the SemCor dataset. We hypothesize that not all dimensions are equally important for downstream tasks so that our algorithm can detect unessential dimensions and discard them without hurting the performance. To verify this hypothesis, we first mask dimensions determined unessential by our algorithm, apply the masked word embeddings to a word sense disambiguation task (WSD), and compare its performance against the one achieved by the original embeddings. Several KNN…
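The masking-and-evaluation idea in the abstract can be illustrated with a small sketch. This is not the authors' online algorithm: as a stand-in, it scores each dimension in a single pass by its average within-sense variance, keeps the dimensions with the lowest scores, and compares a plain KNN classifier on original versus masked vectors. The data is synthetic (not SemCor), and every name here (`within_group_spread`, `build_mask`, `knn_predict`, `make_toy_data`, `keep_ratio`) is a hypothetical helper introduced only for illustration.

```python
import numpy as np


def within_group_spread(X, senses):
    """Per-dimension variance around each sense centroid, averaged over senses."""
    labels = np.unique(senses)
    spread = np.zeros(X.shape[1])
    for s in labels:
        spread += X[senses == s].var(axis=0)
    return spread / len(labels)


def build_mask(X, senses, keep_ratio=0.5):
    """Binary mask keeping the dimensions with the smallest within-group spread.

    Assumption: a simple variance heuristic, standing in for the learned
    sense weights described in the paper.
    """
    spread = within_group_spread(X, senses)
    n_keep = max(1, int(round(keep_ratio * X.shape[1])))
    mask = np.zeros(X.shape[1])
    mask[np.argsort(spread)[:n_keep]] = 1.0
    return mask


def knn_predict(train_X, train_y, test_X, k=1):
    """k-nearest-neighbour majority vote using Euclidean distance."""
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    preds = []
    for row in nearest:
        values, counts = np.unique(train_y[row], return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)


def make_toy_data(n_per_sense=50, seed=0):
    """Two synthetic 'senses': 2 informative dimensions plus 6 noisy ones."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [3.0, 3.0]])
    X_parts, y_parts = [], []
    for sense_id, center in enumerate(centers):
        informative = center + rng.normal(0.0, 0.3, size=(n_per_sense, 2))
        noise = rng.normal(0.0, 5.0, size=(n_per_sense, 6))
        X_parts.append(np.hstack([informative, noise]))
        y_parts.append(np.full(n_per_sense, sense_id))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Usage: alternate rows into train/test, then compare original vs masked vectors.
X, y = make_toy_data()
train_X, train_y, test_X, test_y = X[::2], y[::2], X[1::2], y[1::2]
mask = build_mask(train_X, train_y, keep_ratio=0.25)

acc_original = (knn_predict(train_X, train_y, test_X) == test_y).mean()
acc_masked = (knn_predict(train_X * mask, train_y, test_X * mask) == test_y).mean()
print("kept dimensions:", np.flatnonzero(mask))
print("accuracy (original):", acc_original)
print("accuracy (masked):  ", acc_masked)
```

Multiplying by a 0/1 mask, rather than deleting columns, keeps the embedding shape unchanged, so the masked vectors can be passed to the same downstream classifier as the originals.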
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
Methods: Linear Layer · Interpretability · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Weight Decay · Residual Connection · Adam · Bidirectional LSTM · Layer Normalization
