Weakly-supervised Text Classification Based on Keyword Graph
Lu Zhang, Jiandong Ding, Yi Xu, Yingyao Liu, Shuigeng Zhou

TL;DR
This paper introduces ClassKG, a weakly-supervised text classification framework that leverages keyword correlations via GNNs, improving pseudo-labeling and classification accuracy.
Contribution
It proposes a novel keyword graph approach with GNNs and self-supervised pretraining to exploit keyword correlations for better weakly-supervised text classification.
Findings
Outperforms existing methods on various datasets
Effectively models keyword correlations with GNNs
Improves pseudo-label quality through self-supervised pretraining
Abstract
Weakly-supervised text classification has received much attention in recent years because it can alleviate the heavy burden of annotating massive data. Among these approaches, keyword-driven methods are the mainstream: user-provided keywords are exploited to generate pseudo-labels for unlabeled texts. However, existing methods treat keywords independently and thus ignore the correlations among them, which should be useful if properly exploited. In this paper, we propose a novel framework called ClassKG to explore keyword-keyword correlations on a keyword graph with a GNN. Our framework is an iterative process. In each iteration, we first construct a keyword graph, so the task of assigning pseudo-labels is transformed into annotating keyword subgraphs. To improve the annotation quality, we introduce a self-supervised task to pretrain a subgraph annotator, and then finetune it. With the pseudo-labels generated…
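The first step of each iteration, building a keyword graph and mapping every text to a keyword subgraph, can be illustrated with a minimal sketch. The abstract does not say how edges are defined, so this example assumes that edges are document-level co-occurrence counts and that texts are tokenized by whitespace. The function names `build_keyword_graph` and `keyword_subgraph` and the toy data are hypothetical and are not taken from the paper's code. The GNN subgraph annotator itself is not shown.

```python
from collections import Counter
from itertools import combinations


def build_keyword_graph(docs, keywords):
    """Count how often each pair of keywords appears in the same document.

    Assumption: edge weight = number of documents containing both keywords.
    Tokenization is a plain lowercase whitespace split (no punctuation handling).
    """
    vocab = set(keywords)
    edges = Counter()
    for doc in docs:
        present = sorted(vocab.intersection(doc.lower().split()))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def keyword_subgraph(doc, edges, keywords):
    """Return the keywords found in `doc` and the graph edges among them."""
    present = sorted(set(keywords).intersection(doc.lower().split()))
    sub_edges = {
        (a, b): edges[(a, b)]
        for a, b in combinations(present, 2)
        if (a, b) in edges
    }
    return present, sub_edges


# Toy data (hypothetical): user-provided keywords per class and unlabeled texts.
class_keywords = {
    "sports": ["team", "game", "score"],
    "business": ["stocks", "market"],
}
all_keywords = [kw for kws in class_keywords.values() for kw in kws]
docs = [
    "the team won the game",
    "the game score was high",
    "stocks and market fell",
]

edges = build_keyword_graph(docs, all_keywords)
nodes, sub_edges = keyword_subgraph(docs[0], edges, all_keywords)
```

In this sketch, each unlabeled text is reduced to the subgraph induced by the keywords it contains. That subgraph is the object a subgraph annotator would then label, in place of counting keywords independently.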
Taxonomy
Topics: Text and Document Classification Technologies · Advanced Text Analysis Techniques · Topic Modeling
