TL;DR
This paper introduces a novel active learning approach for text classification that synthesizes useful membership queries from a small labeled set, reducing the need for extensive human labeling.
Contribution
It presents the first framework for generating textual membership queries using modification operators and search algorithms, enhancing classifier performance with minimal labeled data.
Findings
Improved classifier accuracy with synthesized queries
Effective use of modification operators in text domain
First application of membership queries in text classification
Abstract
Human labeling of data can be very time-consuming and expensive, yet in many cases it is critical for the success of the learning process. To minimize human labeling effort, we propose a novel active learning solution that does not rely on existing sources of unlabeled data. It uses a small amount of labeled data as the core set for the synthesis of useful membership queries (MQs): unlabeled instances generated by an algorithm for human labeling. Our solution uses modification operators, functions that modify instances to some extent. We apply the operators to a small set of instances (the core set), creating a set of new membership queries. Using this framework, we treat the instance space as a search space and apply search algorithms to generate new examples that are highly relevant to the learner. We implement this framework in the textual domain and test it on several text…
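To make the idea in the abstract concrete, here is a minimal sketch of operator-based query synthesis. It is not the paper's implementation. The toy lexicon classifier, the replacement table, the two operators, the uncertainty-based utility, and the greedy hill-climbing search are all assumptions chosen for illustration; the abstract says only that modification operators and search algorithms are used.

```python
from typing import Callable, Dict, List

# Hypothetical toy lexicon (word -> polarity weight), for illustration only.
LEXICON: Dict[str, float] = {"great": 1.0, "good": 0.5, "bad": -0.5, "awful": -1.0}

# Hypothetical replacement table used by the replace operator.
REPLACEMENTS: Dict[str, List[str]] = {
    "great": ["good", "bad"],
    "good": ["great", "bad"],
    "bad": ["good", "awful"],
    "awful": ["bad", "good"],
}


def score(sentence: str) -> float:
    """Toy stand-in for a learner: sum of lexicon weights, sign = class."""
    return sum(LEXICON.get(word, 0.0) for word in sentence.split())


def uncertainty(sentence: str) -> float:
    """Assumed utility: larger when the score is nearer the boundary at 0."""
    return -abs(score(sentence))


def replace_operator(sentence: str) -> List[str]:
    """Modification operator: swap one word for a listed replacement."""
    words = sentence.split()
    return [
        " ".join(words[:i] + [new] + words[i + 1:])
        for i, word in enumerate(words)
        for new in REPLACEMENTS.get(word, [])
    ]


def delete_operator(sentence: str) -> List[str]:
    """Modification operator: drop one word (never empties the sentence)."""
    words = sentence.split()
    if len(words) <= 1:
        return []
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]


OPERATORS: List[Callable[[str], List[str]]] = [replace_operator, delete_operator]


def neighbors(sentence: str) -> List[str]:
    """All distinct instances reachable by applying one operator once."""
    found: List[str] = []
    for operator in OPERATORS:
        for candidate in operator(sentence):
            if candidate != sentence and candidate not in found:
                found.append(candidate)
    return found


def hill_climb(seed: str, utility: Callable[[str], float], max_steps: int = 10) -> str:
    """Greedy search: move to the best neighbor while utility strictly improves."""
    current = seed
    for _ in range(max_steps):
        candidates = neighbors(current)
        if not candidates:
            break
        best = max(candidates, key=utility)
        if utility(best) <= utility(current):
            break
        current = best
    return current


def synthesize_queries(core_set: List[str]) -> List[str]:
    """Produce one candidate membership query per core-set instance."""
    return [hill_climb(seed, uncertainty) for seed in core_set]
```

Calling `synthesize_queries(["the movie was great and great"])` returns one modified sentence whose toy score lies closer to the decision boundary than the seed's. In an actual active learning loop, each synthesized query would then go to a human for labeling, and the learner would be retrained on the enlarged labeled set.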
