Topics and Label Propagation: Best of Both Worlds for Weakly Supervised Text Classification
Sachin Pawar, Nitin Ramrakhiyani, Swapnil Hingmire, Girish K. Palshikar

TL;DR
This paper introduces a weakly supervised text classification method that combines label propagation with topic modeling. A document similarity graph is enriched with LDA topic nodes, and only those topic nodes need manual labels, so supervision stays minimal while accuracy remains high.
Contribution
It presents a novel approach that integrates label propagation with topic modeling to reduce supervision in text classification tasks.
Findings
Effective on various datasets
Outperforms state-of-the-art weak supervision methods
Requires manual labels only for a few topic nodes, not for individual documents
Abstract
We propose a Label Propagation based algorithm for weakly supervised text classification. We construct a graph where each document is represented by a node and edge weights represent similarities among the documents. Additionally, we discover underlying topics using Latent Dirichlet Allocation (LDA) and enrich the document graph by including the topics in the form of additional nodes. The edge weights between a topic and a text document represent the level of "affinity" between them. Our approach does not require document-level labelling; instead, it expects manual labels only for topic nodes. This significantly minimizes the level of supervision needed, as only a few topics are observed to be enough for achieving sufficiently high accuracy. The Label Propagation Algorithm is employed on this enriched graph to propagate labels among the nodes. Our approach combines the advantages of Label…
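The enriched-graph idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses a generic iterative label propagation with clamped seed nodes, and every number in it is invented for illustration. The function name `propagate_labels`, the toy similarity values, and the document-topic "affinity" weights (which in the paper would come from LDA) are all assumptions.

```python
def propagate_labels(weights, seed_labels, n_classes, n_iter=100):
    """Generic clamped label propagation (an illustrative stand-in,
    not necessarily the exact variant used in the paper).

    weights     -- symmetric n x n edge-weight matrix as a list of lists
    seed_labels -- dict mapping labeled node index -> class index
    Returns one class distribution per node.
    """
    n = len(weights)
    # Unlabeled nodes start uniform; seed nodes start one-hot.
    dist = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for node, cls in seed_labels.items():
        dist[node] = [1.0 if c == cls else 0.0 for c in range(n_classes)]

    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seed_labels or total == 0:
                updated.append(dist[i])  # clamp seeds; skip isolated nodes
                continue
            # Weighted average of the neighbours' current distributions.
            updated.append([
                sum(weights[i][j] * dist[j][c] for j in range(n)) / total
                for c in range(n_classes)
            ])
        dist = updated
    return dist


# Hypothetical toy graph: nodes 0-3 are documents, nodes 4-5 are topics.
# Document-document weights are made-up similarities; document-topic
# weights are made-up affinities standing in for LDA topic proportions.
W = [
    # d0   d1   d2   d3   t0   t1
    [0.0, 0.8, 0.1, 0.1, 0.9, 0.1],  # d0
    [0.8, 0.0, 0.1, 0.1, 0.6, 0.4],  # d1
    [0.1, 0.1, 0.0, 0.7, 0.2, 0.8],  # d2
    [0.1, 0.1, 0.7, 0.0, 0.1, 0.9],  # d3
    [0.9, 0.6, 0.2, 0.1, 0.0, 0.0],  # t0
    [0.1, 0.4, 0.8, 0.9, 0.0, 0.0],  # t1
]

# Only the topic nodes carry manual labels: topic 0 -> class 0, topic 1 -> class 1.
seeds = {4: 0, 5: 1}

dist = propagate_labels(W, seeds, n_classes=2)
predicted = [max(range(2), key=lambda c: dist[d][c]) for d in range(4)]
print(predicted)
```

On this toy graph the first two documents lean toward class 0 and the last two toward class 1, even though no document was labeled directly. The labels reach the documents both through their topic edges and through their similarity to other documents.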
