FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification
Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang

TL;DR
FastClass is a novel weakly-supervised text classification method that efficiently retrieves relevant documents using dense representations, reducing reliance on detailed class descriptions and significantly speeding up training while maintaining or improving accuracy.
Contribution
The paper introduces FastClass, a new approach that leverages dense text representations for efficient, less description-dependent weakly-supervised classification.
Findings
Outperforms keyword-driven models in accuracy
Achieves orders-of-magnitude faster training
Requires less detailed class descriptions
Abstract
Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these methods not only rely on carefully crafted class descriptions to obtain class-specific keywords but also require a substantial amount of unlabeled data and take a long time to train. This paper proposes FastClass, an efficient weakly-supervised classification approach. It uses dense text representation to retrieve class-relevant documents from an external unlabeled corpus and selects an optimal subset to train a classifier. Compared to keyword-driven methods, our approach is less reliant on initial class descriptions as it no longer needs to expand each class description into a set of class-specific keywords. Experiments on a wide range of classification…
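The retrieve-then-train idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_embed` (a letter-frequency vector) stands in for a real dense text encoder, keeping the top-k documents per class stands in for the paper's subset selection, and a nearest-centroid rule stands in for the trained classifier. All function names here are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def toy_embed(text):
    """ASSUMPTION: toy stand-in for a dense encoder (26-dim letter frequencies)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return normalize(vec)


def retrieve(class_descriptions, corpus, top_k, embed=toy_embed):
    """Pseudo-label the top_k corpus documents closest to each class description.

    `embed` is expected to return unit-length vectors.
    """
    doc_vecs = [embed(doc) for doc in corpus]
    pseudo_labeled = []
    for label, description in class_descriptions.items():
        query = embed(description)
        ranked = sorted(
            range(len(corpus)),
            key=lambda i: dot(query, doc_vecs[i]),
            reverse=True,
        )
        pseudo_labeled.extend((corpus[i], label) for i in ranked[:top_k])
    return pseudo_labeled


def train_centroids(pseudo_labeled, embed=toy_embed):
    """ASSUMPTION: a nearest-centroid model stands in for the real classifier."""
    sums = {}
    for text, label in pseudo_labeled:
        vec = embed(text)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
    return {label: normalize(total) for label, total in sums.items()}


def predict(centroids, text, embed=toy_embed):
    """Return the label whose centroid is most similar to the text."""
    vec = embed(text)
    return max(centroids, key=lambda label: dot(centroids[label], vec))


if __name__ == "__main__":
    classes = {"sports": "sports game team", "finance": "finance money market"}
    corpus = [
        "the team won the match",
        "stocks fell sharply",
        "a striker scored twice",
        "bond yields rose",
    ]
    model = train_centroids(retrieve(classes, corpus, top_k=2))
    print(predict(model, "the keeper saved a penalty"))
```

Because `embed` is a parameter, the letter-frequency stand-in could be swapped for a pretrained sentence encoder without touching the retrieval or training steps; with the toy encoder the predictions are not meaningful.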
Taxonomy
Topics: Text and Document Classification Technologies · Web Data Mining and Analysis
