Rethinking Open-World Semi-Supervised Learning: Distribution Mismatch and Inductive Inference

Seongheon Park; Hyuk Kwon; Kwanghoon Sohn; Kibok Lee

arXiv:2405.20829·cs.CV·June 3, 2024

TL;DR

This paper challenges existing assumptions in open-world semi-supervised learning, proposing new training and evaluation strategies to better handle real-world distribution mismatches and inductive inference.

Contribution

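It introduces a generalized formulation of OWSSL that relaxes two common assumptions: that labeled and unlabeled data share the same balanced class prior, and that evaluation is performed transductively on the unlabeled training data. It argues that practical OWSSL calls for different training settings, evaluation methods, and learning strategies.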

Findings

1. Existing methods assume that labeled and unlabeled data share the same balanced class prior, an assumption that often does not hold in practice.

2. Evaluation should be inductive (on held-out data), not transductive (on the unlabeled training data), for real-world applications.

3. Addressing distribution mismatch improves OWSSL robustness.

Abstract

Open-world semi-supervised learning (OWSSL) extends conventional semi-supervised learning to open-world scenarios by taking account of novel categories in unlabeled datasets. Despite the recent advancements in OWSSL, the success often relies on the assumptions that 1) labeled and unlabeled datasets share the same balanced class prior distribution, which does not generally hold in real-world applications, and 2) unlabeled training datasets are utilized for evaluation, where such transductive inference might not adequately address challenges in the wild. In this paper, we aim to generalize OWSSL by addressing them. Our work suggests that practical OWSSL may require different training settings, evaluation methods, and learning strategies compared to those prevalent in the existing literature.
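The two assumptions named in the abstract can be made concrete with a small data-splitting sketch. The code below is not the authors' protocol; it is an illustrative toy that assumes a list of integer class labels, a hypothetical helper `make_owssl_split`, and arbitrary split fractions. It shows (a) a held-out test set kept apart from the unlabeled training data, which is what inductive evaluation requires, and (b) how the empirical class priors of the labeled and unlabeled sets can differ.

```python
import random
from collections import Counter


def class_prior(labels):
    """Empirical class prior: fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in sorted(counts.items())}


def make_owssl_split(labels, seen_classes, labeled_fraction=0.5,
                     holdout_fraction=0.2, seed=0):
    """Hypothetical helper: split sample indices into three disjoint sets.

    - labeled_idx:   samples from seen classes whose labels are available
    - unlabeled_idx: remaining training samples (seen and novel classes)
    - test_idx:      held-out samples never used in training, reserved
                     for inductive evaluation
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    # Reserve the held-out set first so it cannot leak into training.
    n_holdout = int(len(indices) * holdout_fraction)
    test_idx = indices[:n_holdout]
    train_idx = indices[n_holdout:]

    labeled_idx, unlabeled_idx = [], []
    for i in train_idx:
        # Only seen-class samples can be labeled; novel classes are
        # always unlabeled.
        if labels[i] in seen_classes and rng.random() < labeled_fraction:
            labeled_idx.append(i)
        else:
            unlabeled_idx.append(i)
    return labeled_idx, unlabeled_idx, test_idx


# Toy, deliberately imbalanced label set (an assumption for illustration):
# classes 0 and 1 are "seen", classes 2 and 3 are "novel".
labels = [0] * 50 + [1] * 30 + [2] * 15 + [3] * 5
labeled_idx, unlabeled_idx, test_idx = make_owssl_split(
    labels, seen_classes={0, 1}
)

labeled_prior = class_prior([labels[i] for i in labeled_idx])
unlabeled_prior = class_prior([labels[i] for i in unlabeled_idx])
print("labeled prior:  ", labeled_prior)
print("unlabeled prior:", unlabeled_prior)
```

Under this setup, transductive evaluation would score a model on `unlabeled_idx` (data it already saw during training), whereas inductive evaluation would score it on `test_idx`. Comparing `labeled_prior` with `unlabeled_prior` gives a quick check on whether a shared balanced class prior is a reasonable assumption for a given dataset.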

Taxonomy

Topics: Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference

Methods: Transductive Inference