Towards Realistic Semi-Supervised Learning
Mamshad Nayeem Rizve, Navid Kardan, Mubarak Shah

TL;DR
This paper introduces a pseudo-label based semi-supervised learning method for open-world scenarios, effectively handling unlabeled data with known and unknown classes, and demonstrating superior performance on multiple benchmarks.
Contribution
It proposes a novel class-distribution-aware pseudo-labeling approach that improves open-world semi-supervised learning and addresses novel class discovery and data imbalance.
Findings
Outperforms state-of-the-art methods on seven datasets, including CIFAR-100, ImageNet-100, and Tiny ImageNet.
Effectively handles unknown classes and imbalanced data.
Provides a technique to estimate the number of novel classes.
Abstract
Deep learning is pushing the state-of-the-art in many computer vision applications. However, it relies on large annotated data repositories, and capturing the unconstrained nature of the real-world data is yet to be solved. Semi-supervised learning (SSL) complements the annotated training data with a large corpus of unlabeled data to reduce annotation cost. The standard SSL approach assumes unlabeled data are from the same distribution as annotated data. Recently, a more realistic SSL problem, called open-world SSL, has been introduced, where the unannotated data might contain samples from unknown classes. In this paper, we propose a novel pseudo-label based approach to tackle SSL in the open-world setting. At the core of our method, we utilize sample uncertainty and incorporate prior knowledge about class distribution to generate reliable class-distribution-aware pseudo-labels for unlabeled data…
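To make the idea of class-distribution-aware pseudo-labels more concrete, here is a minimal sketch of one way to impose a class prior on soft predictions, using Sinkhorn-Knopp-style alternating normalization. The abstract does not spell out the mechanism, so this is an illustration under assumptions and not the authors' implementation: the function names, the `threshold` parameter, and the use of the maximum soft assignment as a crude stand-in for sample uncertainty are all hypothetical.

```python
import numpy as np


def distribution_aware_pseudo_labels(probs, class_prior, n_iters=50):
    """Rescale soft predictions so per-class mass roughly follows class_prior.

    probs:       (n_samples, n_classes) softmax outputs on unlabeled data.
    class_prior: (n_classes,) assumed class proportions, summing to 1.
    Hypothetical helper: alternating column/row normalization (Sinkhorn-Knopp style).
    """
    n_samples = probs.shape[0]
    q = probs.astype(np.float64) + 1e-12  # avoid division by zero
    for _ in range(n_iters):
        # Column step: each class receives its prior share of the total mass.
        q *= (class_prior * n_samples) / q.sum(axis=0, keepdims=True)
        # Row step: each sample's soft assignment sums to 1 again.
        q /= q.sum(axis=1, keepdims=True)
    return q


def confident_mask(q, threshold=0.9):
    """Keep samples whose top soft assignment exceeds a threshold.

    Assumption: this is a simple proxy for the paper's uncertainty-based
    selection, whose exact form is not given in the abstract.
    """
    return q.max(axis=1) >= threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(200, 4))
    logits[:, 0] += 2.0  # model is biased toward class 0 (e.g. a known class)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    prior = np.full(4, 0.25)  # assumed uniform class distribution
    q = distribution_aware_pseudo_labels(probs, prior)
    print("mean prediction before:", probs.mean(axis=0).round(3))
    print("mean assignment after: ", q.mean(axis=0).round(3))
    print("samples kept:", int(confident_mask(q).sum()))
```

The point of the balancing step is that a model trained mostly on known classes tends to assign unlabeled samples to those classes; constraining the assignments to a prior keeps novel classes from being starved of pseudo-labels. A non-uniform `class_prior` would be the natural choice for imbalanced data.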
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
