Domain Lexical Knowledge-based Word Embedding Learning for Text Classification under Small Data

Zixiao Zhu; Kezhi Mao

arXiv:2506.01621 · cs.CL · June 3, 2025


Open Access

TL;DR

This paper introduces a method that improves text classification by enhancing BERT embeddings with domain-specific lexical knowledge. The method is especially effective in small-data scenarios and in keyword-driven tasks such as sentiment analysis and emotion recognition.

Contribution

The paper proposes a novel knowledge-based embedding enhancement model that improves BERT representations using automatically acquired lexical knowledge for better classification performance.

Findings

01. Enhanced embeddings lead to improved classification accuracy

02. Effective in small data and keyword-driven tasks

03. Outperforms baseline models on three classification tasks

Abstract

Pre-trained language models such as BERT have proven powerful in many natural language processing tasks. However, in some text classification applications, such as emotion recognition and sentiment analysis, BERT may not deliver satisfactory performance. This often happens in applications where keywords play a critical role in predicting class labels. Our investigation found that the root cause is that the context-based BERT embeddings of the keywords may not be discriminative enough to produce a discriminative text representation for classification. Motivated by this finding, we develop a method to enhance word embeddings using domain-specific lexical knowledge. The knowledge-based embedding enhancement model projects the BERT embedding into a new space where within-class similarity and between-class difference are maximized. To implement the knowledge-based word…
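The projection idea in the abstract can be illustrated with a small sketch. The code below is not the paper's model: it swaps in Fisher's linear discriminant, a classical technique that finds a linear projection where between-class scatter is large relative to within-class scatter. The toy vectors stand in for keyword embeddings, and all names (`scatter_matrices`, `separation_ratio`, `fisher_projection`, `reg`) are hypothetical and chosen only for this illustration.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return (within-class, between-class) scatter matrices for the rows of X."""
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_w += centered.T @ centered              # spread inside one class
        gap = (mean_c - overall_mean)[:, None]
        S_b += len(X_c) * (gap @ gap.T)           # spread between class means
    return S_w, S_b


def separation_ratio(X, y):
    """Between-class scatter divided by within-class scatter (trace ratio)."""
    S_w, S_b = scatter_matrices(X, y)
    return np.trace(S_b) / np.trace(S_w)


def fisher_projection(X, y, n_components=1, reg=1e-3):
    """Projection matrix from Fisher's linear discriminant (assumed stand-in)."""
    S_w, S_b = scatter_matrices(X, y)
    S_w = S_w + reg * np.eye(X.shape[1])          # small ridge term for stability
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(-eigvals.real)             # largest eigenvalue first
    return eigvecs[:, order[:n_components]].real


# Toy stand-in for keyword embeddings: the class signal sits in dimension 0,
# while the remaining dimensions carry large class-independent noise.
rng = np.random.default_rng(0)
n_per_class, dim = 20, 8
noise_scale = np.array([0.2] + [3.0] * (dim - 1))
y = np.repeat([0, 1], n_per_class)
X = rng.normal(size=(2 * n_per_class, dim)) * noise_scale
X[:, 0] += np.where(y == 0, -1.0, 1.0)

W = fisher_projection(X, y)
before = separation_ratio(X, y)
after = separation_ratio(X @ W, y)
print(f"separation before: {before:.3f}, after: {after:.3f}")
```

In this toy setup the projected vectors should separate the two classes far better than the raw ones, which mirrors the motivation stated in the abstract. A learned, knowledge-driven projection of real BERT embeddings would differ from this linear closed-form stand-in.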


Taxonomy

Topics: Text and Document Classification Technologies