Cold Start Active Learning Strategies in the Context of Imbalanced Classification
Etienne Brangbour, Pierrick Bruneau, Thomas Tamisier, and Stéphane Marchand-Maillet

TL;DR
This paper introduces new active learning strategies tailored for the cold start phase in imbalanced classification tasks, effectively improving minority class recall by leveraging clustering and label propagation.
Contribution
The paper proposes novel active learning methods that address cold start and class imbalance simultaneously using clustering and label propagation techniques.
Findings
Improved recall for minority classes in imbalanced datasets.
Effective handling of cold start in active learning scenarios.
Successful application to Twitter flood event data.
Abstract
We present novel active learning strategies dedicated to providing a solution to the cold start stage, i.e. initializing the classification of a large set of data with no attached labels. Moreover, the proposed strategies are designed to handle an imbalanced context in which random selection is highly inefficient. Specifically, our active learning iterations address label scarcity and imbalance using element scores that combine information extracted from a clustering structure with a label propagation model. The strategy is illustrated by a case study on annotating Twitter content with respect to testimonies of a real flood event. We show that our method effectively copes with class imbalance by boosting the recall of samples from the minority class.
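The abstract does not spell out how the element scores are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a greedy farthest-point clustering, a Gaussian-kernel label propagation, and a hypothetical score that mixes a per-cluster "novelty" term with the propagated minority-class probability. The function names, the weight `alpha`, the kernel width `sigma`, and the toy data are all illustrative assumptions.

```python
import math

def k_center(points, k):
    """Greedy farthest-point clustering (assumed here); returns a cluster index per point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]

def propagate(points, labels, sigma=1.0, iters=50):
    """Spread known 0/1 labels over a Gaussian-kernel graph.

    Returns an estimate of P(minority) per point. With no labels at all
    (the cold start), every point keeps the uninformative value 0.5.
    """
    n = len(points)
    f = [float(labels.get(i, 0.5)) for i in range(n)]
    if not labels:
        return f
    w = [[0.0 if i == j
          else math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    for _ in range(iters):
        new = []
        for i in range(n):
            total = sum(w[i])
            if i in labels or total == 0.0:
                new.append(f[i])  # clamp labeled points; leave isolated ones unchanged
            else:
                new.append(sum(w[i][j] * f[j] for j in range(n)) / total)
        f = new
    return f

def select(points, clusters, labels, alpha=0.5):
    """Return the unlabeled index with the highest (hypothetical) combined score."""
    f = propagate(points, labels)
    labeled_per_cluster = {}
    for i in labels:
        labeled_per_cluster[clusters[i]] = labeled_per_cluster.get(clusters[i], 0) + 1

    def score(i):
        # Clusters with few labels score high; so do likely minority points.
        novelty = 1.0 / (1 + labeled_per_cluster.get(clusters[i], 0))
        return alpha * novelty + (1 - alpha) * f[i]

    return max((i for i in range(len(points)) if i not in labels), key=score)

# Toy data: two majority blobs (16 points each) and one minority blob (4 points).
grid = [(0.5 * x, 0.5 * y) for x in range(4) for y in range(4)]
points = (grid
          + [(10 + x, y) for x, y in grid]
          + [(0.5 * x, 10 + 0.5 * y) for x in range(2) for y in range(2)])
truth = [0] * 32 + [1] * 4

clusters = k_center(points, k=3)
labels = {}
for _ in range(6):            # annotation budget of six queries
    i = select(points, clusters, labels)
    labels[i] = truth[i]      # the oracle (annotator) reveals the label

found = sum(labels.values())
print(f"minority samples found: {found} of {sum(truth)}")
```

On this toy data the six queries reach all four minority points, whereas six uniform random draws from the 36 points would contain 6 × 4/36 ≈ 0.67 minority points in expectation. The well-separated blobs make this an easy case; it illustrates the mechanism only and says nothing about the results reported in the paper.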
Taxonomy
Topics: Machine Learning and Algorithms · Text and Document Classification Technologies · Spam and Phishing Detection
