Combining Self-labeling with Selective Sampling
Jędrzej Kozal, Michał Woźniak

TL;DR
This paper introduces a novel semi-supervised learning approach that combines self-labeling with selective sampling using an ensemble classifier, improving performance while addressing bias issues in label distribution.
Contribution
It proposes a new ensemble-based method that integrates self-labeling with active learning, including mechanisms to mitigate bias in class distribution.
Findings
The proposed method matches current selective sampling techniques in performance.
Proposed mechanisms effectively reduce bias in self-labeling.
Experimental results show improved or comparable accuracy.
Abstract
Since data is the fuel that drives machine learning models, and access to labeled data is generally expensive, semi-supervised methods remain popular. They enable the acquisition of large datasets without requiring many expert labels. This work combines self-labeling techniques with active learning in a selective sampling scenario. We propose a new method that builds an ensemble classifier. Based on an evaluation of the inconsistency among the decisions of the individual base classifiers for a given observation, the method decides whether to request a new label or to apply self-labeling. In preliminary studies, we show that a naive application of self-labeling can harm performance by introducing bias towards selected classes and consequently lead to a skewed class distribution. Hence, we also propose mechanisms to reduce this phenomenon. Experimental evaluation shows that the…
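The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the disagreement measure (share of votes that differ from the majority), the function names `disagreement` and `decide`, the thresholds `query_threshold` and `max_class_share`, and the class-share guard used to limit bias are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def disagreement(votes):
    """Return the majority label and the fraction of votes that differ from it."""
    counts = Counter(votes)
    majority_label, majority_count = counts.most_common(1)[0]
    return majority_label, 1.0 - majority_count / len(votes)


def decide(votes, self_label_counts, query_threshold=0.3, max_class_share=0.5):
    """Choose an action for one incoming observation.

    votes: labels predicted by the base classifiers of the ensemble.
    self_label_counts: mapping from class to the number of self-labeled samples so far.
    Returns ("query", None), ("self_label", label), or ("skip", None).
    """
    label, score = disagreement(votes)

    # High inconsistency among base classifiers: ask the expert for a label.
    if score >= query_threshold:
        return ("query", None)

    # Hypothetical bias guard (an assumption, not the paper's mechanism):
    # do not self-label a class that already dominates the self-labeled pool.
    total = sum(self_label_counts.values())
    share = self_label_counts.get(label, 0) / total if total else 0.0
    if share > max_class_share:
        return ("skip", None)

    # Low inconsistency and no dominance: accept the ensemble's own label.
    return ("self_label", label)
```

In this sketch, an observation on which the ensemble agrees is self-labeled, one on which it disagrees is sent to the oracle, and one that would further skew the self-labeled pool is dropped.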
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Imbalanced Data Classification Techniques
Methods: Balanced Selection
