Improved Training for Self-Training by Confidence Assessments
Gal Hyams, Daniel Greenfeld, Dor Bank

TL;DR
This paper proposes an improved self-training method that uses confidence assessments to leverage unlabeled data for training, applicable to classification and semantic segmentation tasks, reducing the need for extensive labeled datasets.
Contribution
It introduces a confidence-based approach for self-training that can be applied online to unlabeled data, enhancing learning efficiency in limited labeled data scenarios.
Findings
Effective on MNIST for classification
Successful semi-supervised segmentation on ADE20K
Improves training with minimal labeled data
Abstract
It is well known that for some tasks, labeled data sets may be hard to gather. Therefore, we wished to tackle here the problem of having insufficient training data. We examined methods for learning from unlabeled data after an initial training on a limited labeled data set. The suggested approach can be used as an online learning method on the unlabeled test set. In the general classification task, whenever we predict a label with high enough confidence, we treat it as a true label and train on the data accordingly. For the semantic segmentation task, a classic example of an expensive data labeling process, we do so pixel-wise. Our suggested approaches were applied to the MNIST data set as a proof of concept for a vision classification task and to the ADE20K data set in order to tackle the semi-supervised semantic segmentation problem.
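The confidence rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.95 threshold, and the use of the maximum softmax probability as the confidence score are all assumptions made for illustration. Because the class axis is taken to be the last one, the same code covers both a batch of classification outputs of shape (N, C) and a pixel-wise segmentation output of shape (H, W, C).

```python
import numpy as np

def confident_pseudo_labels(probs, threshold=0.95):
    """Pick pseudo-labels whose predicted confidence clears a threshold.

    probs: class probabilities with classes on the last axis, e.g.
           (N, C) for classification or (H, W, C) for segmentation.
    threshold: hypothetical cutoff; the summary does not state a value.
    Returns (labels, mask): argmax labels and a boolean mask marking
    the samples or pixels that are confident enough to train on.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=-1)      # assumed confidence score
    labels = probs.argmax(axis=-1)       # predicted class per item
    mask = confidence >= threshold       # keep only confident items
    return labels, mask

def masked_cross_entropy(probs, labels, mask, eps=1e-12):
    """Mean cross-entropy over the confident items only."""
    probs = np.asarray(probs, dtype=float)
    # Probability assigned to the pseudo-label at each position.
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    losses = -np.log(picked + eps)
    kept = mask.sum()
    # Unconfident items contribute nothing; return 0 if none are kept.
    return float((losses * mask).sum() / kept) if kept > 0 else 0.0

# Classification: two samples, three classes.
cls_probs = np.array([[0.98, 0.01, 0.01],
                      [0.40, 0.35, 0.25]])
cls_labels, cls_mask = confident_pseudo_labels(cls_probs)

# Segmentation: a 2x2 image, three classes, thresholded per pixel.
seg_probs = np.array([[[0.97, 0.02, 0.01], [0.50, 0.30, 0.20]],
                      [[0.01, 0.01, 0.98], [0.10, 0.96, 0.04]]])
seg_labels, seg_mask = confident_pseudo_labels(seg_probs)
seg_loss = masked_cross_entropy(seg_probs, seg_labels, seg_mask)
```

In an online setting, the mask would be recomputed for each incoming unlabeled batch, so that only predictions currently above the threshold feed back into training. How the threshold is chosen, and whether a different confidence measure is used, is left open here.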
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Anomaly Detection Techniques and Applications
