Active learning for binary classification with variable selection
Zhanfeng Wang, Yumi Kwon, Yuan-chin Ivan Chang

TL;DR
This paper introduces an active learning framework combined with variable selection for binary classification, enabling efficient model building when label information is initially unavailable and data labeling is costly.
Contribution
It proposes a novel model-based active learning method integrated with sequential variable selection for binary classification with limited initial label data.
Findings
The method effectively reduces labeling costs in large datasets.
Theoretical analysis supports the convergence and efficiency of the proposed procedure.
Numerical experiments demonstrate improved classification performance with fewer labeled samples.
Abstract
Modern computing and communication technologies make data collection very efficient. However, our ability to analyze large data sets and to extract information from them is hard-pressed to keep up with our capacity for data collection. Some of these huge data sets are not collected for any particular research purpose. For a classification problem, this means that the essential label information may not be readily available in the data set at hand, and an extra labeling procedure is required to obtain enough label information to construct a classification model. When a data set is huge, labeling every subject in it is costly in both money and time. It is therefore important to decide which subjects should be labeled first in order to reduce the training cost and time efficiently. Active learning method…
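To make the idea of choosing which subjects to label first more concrete, here is a minimal sketch of a pool-based active learning loop with a sparsity-inducing penalty. It is not the procedure proposed in the paper: it assumes plain uncertainty sampling (query the unlabeled subject whose predicted probability is closest to 0.5) as the selection rule and an L1-penalized logistic regression, fitted by proximal gradient descent, as a stand-in for sequential variable selection. All names (`active_learn`, `oracle`, `budget`, `lam`, and so on) and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Estimated P(y = 1 | x) under a no-intercept logistic model."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def soft_threshold(v, t):
    """Proximal operator of t * |v|; sets small coefficients exactly to zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=200):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            r = predict(w, xi) - yi  # residual of the current fit
            for j in range(p):
                grad[j] += r * xi[j] / n
        # Gradient step on the log-loss, then shrink toward zero.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad)]
    return w


def active_learn(X, oracle, n_init=10, budget=40, seed=0):
    """Label up to `budget` subjects, querying the most uncertain one each round.

    `oracle(i)` is a hypothetical labeling function returning 0 or 1 for row i.
    Assumes n_init <= len(X).
    """
    rng = random.Random(seed)
    pool = list(range(len(X)))
    rng.shuffle(pool)
    labeled = [pool.pop() for _ in range(n_init)]  # random initial batch
    labels = [oracle(i) for i in labeled]
    while len(labeled) < budget and pool:
        w = fit_l1_logistic([X[i] for i in labeled], labels)
        # Uncertainty sampling: predicted probability closest to 0.5.
        pick = min(pool, key=lambda i: abs(predict(w, X[i]) - 0.5))
        pool.remove(pick)
        labeled.append(pick)
        labels.append(oracle(pick))
    w = fit_l1_logistic([X[i] for i in labeled], labels)
    return w, labeled


# Synthetic example (assumed data): only the first two of six features matter.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
truth = lambda i: 1 if 2 * X[i][0] - 2 * X[i][1] > 0 else 0
w, labeled = active_learn(X, truth)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(len(labeled), selected)
```

Here `labeled` holds the indices of the subjects that were sent for labeling, and `selected` lists the variables whose coefficients the penalty did not shrink to zero. The penalty weight `lam`, the step size, and the labeling budget are arbitrary illustrative values, not settings taken from the paper.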
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Machine Learning and Data Classification
