Domain Lexical Knowledge-based Word Embedding Learning for Text Classification under Small Data
Zixiao Zhu, Kezhi Mao

TL;DR
This paper introduces a method that improves text classification by enhancing BERT embeddings with domain-specific lexical knowledge. It is especially effective in small-data scenarios and in keyword-driven tasks such as sentiment analysis and emotion recognition.
Contribution
The paper proposes a novel knowledge-based embedding enhancement model that improves BERT representations using automatically acquired lexical knowledge for better classification performance.
Findings
Enhanced embeddings lead to improved classification accuracy
Effective in small-data and keyword-driven tasks
Outperforms baseline models on three classification tasks
Abstract
Pre-trained language models such as BERT have proven powerful in many natural language processing tasks. However, in some text classification applications such as emotion recognition and sentiment analysis, BERT may not lead to satisfactory performance. This often happens in applications where keywords play critical roles in the prediction of class labels. Our investigation found that the root cause of the problem is that the context-based BERT embeddings of the keywords may not be discriminative enough to produce a discriminative text representation for classification. Motivated by this finding, we develop a method to enhance word embeddings using domain-specific lexical knowledge. The knowledge-based embedding enhancement model projects the BERT embedding into a new space where within-class similarity and between-class difference are maximized. To implement the knowledge-based word…
Taxonomy
Topics: Text and Document Classification Technologies
