Proper Learnability and the Role of Unlabeled Data

Julian Asilis; Siddartha Devic; Shaddin Dughmi; Vatsal Sharan; Shang-Hua Teng

arXiv:2502.10359·cs.LG·December 10, 2025

Open Access·4 Reviews

TL;DR

This paper investigates the conditions under which proper learning is feasible, demonstrating that unlabeled data offers limited benefits in worst-case scenarios and revealing fundamental limitations and undecidability issues in proper PAC learning.

Contribution

The paper introduces the distribution-fixed PAC model, in which an optimal proper learner is governed by distributional regularization, and provides new impossibility results, including undecidability and non-monotonicity of proper learnability.

Findings

01

Unlabeled data only marginally reduces sample complexity in worst-case PAC learning.

02

Proper learnability can be undecidable and is not a monotone or local property.

03

Impossibility results hold even for multiclass classification, linking to EMX learning.

Abstract

Proper learning refers to the setting in which learners must emit predictors in the underlying hypothesis class $H$, and often leads to learners with simple algorithmic forms (e.g., empirical risk minimization (ERM), structural risk minimization (SRM)). The limitation of proper learning, however, is that there exist problems that can only be learned improperly, e.g., in multiclass classification. Thus, we ask: Under what assumptions on the hypothesis class or the information provided to the learner is a problem properly learnable? We first demonstrate that when the unlabeled data distribution is given, there always exists an optimal proper learner governed by distributional regularization, a randomized generalization of regularization. We refer to this setting as the distribution-fixed PAC model, and continue to evaluate the learner on its worst-case performance over all distributions.…
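
To make the notion of a proper learner concrete, here is a minimal sketch of ERM over a small finite hypothesis class. It is an illustration only and is not taken from the paper: the class of threshold functions, the toy sample, and the helper names (`make_threshold`, `empirical_risk`, `erm`) are all assumptions chosen for the example. The point it shows is that ERM is proper by construction, since it can only return a member of the class it searches over.

```python
from typing import Callable, List, Sequence, Tuple

# A hypothesis maps a real-valued input to a binary label.
Hypothesis = Callable[[float], int]


def make_threshold(t: float) -> Hypothesis:
    """Hypothetical class member h_t(x) = 1 if x >= t, else 0."""
    return lambda x: 1 if x >= t else 0


def empirical_risk(h: Hypothesis, sample: Sequence[Tuple[float, int]]) -> float:
    """Fraction of labeled examples that h misclassifies (0-1 loss)."""
    return sum(h(x) != y for x, y in sample) / len(sample)


def erm(hypotheses: List[Hypothesis], sample: Sequence[Tuple[float, int]]) -> int:
    """Return the index of a hypothesis with minimal empirical risk.

    The output always indexes into `hypotheses`, so the learner is proper:
    it never emits a predictor outside the class.
    """
    return min(
        range(len(hypotheses)),
        key=lambda i: empirical_risk(hypotheses[i], sample),
    )


# Toy finite class H (assumed for illustration) and a small labeled sample.
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
H = [make_threshold(t) for t in thresholds]
sample = [(0.1, 0), (0.3, 0), (0.6, 1), (0.9, 1)]

best = erm(H, sample)
print(thresholds[best], empirical_risk(H[best], sample))
```

An improper learner, by contrast, would be free to return any function of the input, for example a majority vote over several members of `H`, which need not itself be a threshold in the class.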

Peer Reviews

Decision·ALT 2025

Reviewer 01 · Rating: Accept · Confidence: 5

Reviewer 02 · Rating: 7 · Confidence: 4

Reviewer 03 · Rating: 7 · Confidence: 3

Reviewer 04 · Rating: 7 · Confidence: 2

Taxonomy

Topics: Educational Assessment and Pedagogy