Learning the hypotheses space from data through a U-curve algorithm

Diego Marcondes; Adilson Simonis; Junior Barrera

arXiv:2109.03866 · stat.ML · October 12, 2021 · 1 citation

TL;DR

This paper introduces a data-driven, systematic approach to model selection that searches a poset of hypothesis subspaces. The search regularizes implicitly and, given enough computational power, may allow a better estimate of the target hypothesis from a fixed sample size.

Contribution

It proposes a framework and algorithm for model selection over a poset of subspaces of the hypothesis space, extending the classical agnostic PAC learning model and using additional computation to improve estimation at a given sample size.

Findings

01. A general, data-driven learning algorithm that performs implicitly regularized model selection.

02. Conditions under which a non-exhaustive search still yields an optimal solution.

03. In theory, high computational power can compensate for a limited sample size.

Abstract

This paper proposes a data-driven, systematic, consistent and non-exhaustive approach to Model Selection that is an extension of the classical agnostic PAC learning model. In this approach, learning problems are modeled not only by a hypothesis space $H$, but also by a Learning Space $L(H)$: a poset of subspaces of $H$ that covers $H$, satisfies a property regarding the VC dimension of related subspaces, and is a suitable algebraic search space for Model Selection algorithms. Our main contributions are a data-driven general learning algorithm to perform implicitly regularized Model Selection on $L(H)$ and a framework under which one can, theoretically, better estimate a target hypothesis with a given sample size by properly modeling $L(H)$ and employing high computational power. A remarkable…
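To make the idea of a non-exhaustive search concrete, here is a minimal sketch of a U-curve style stopping rule on the simplest kind of poset, a single chain of nested candidate models. It is an illustration under stated assumptions, not the authors' algorithm: the function name `u_curve_chain_search`, the polynomial-degree chain, and the validation error values are all hypothetical, and the sketch assumes the estimated error is U-shaped along the chain (it decreases, then increases).

```python
from typing import Callable, Sequence, Tuple, TypeVar

M = TypeVar("M")  # type of a candidate model (hypothetical; here a polynomial degree)


def u_curve_chain_search(
    chain: Sequence[M],
    estimated_error: Callable[[M], float],
) -> Tuple[M, float, int]:
    """Walk a chain of nested candidates from simplest to most complex.

    Assumption: the estimated error is U-shaped along the chain, so the
    first increase means the minimum has already been passed.
    Returns (best candidate, its estimated error, number of evaluations).
    """
    if not chain:
        raise ValueError("chain must contain at least one candidate")

    best = chain[0]
    best_error = estimated_error(best)
    evaluated = 1

    for candidate in chain[1:]:
        error = estimated_error(candidate)
        evaluated += 1
        if error > best_error:
            # Error started rising: stop without visiting the rest of the chain.
            break
        if error < best_error:
            best, best_error = candidate, error

    return best, best_error, evaluated


# Hypothetical validation errors for polynomial degrees 0..6 (made-up numbers).
validation_error = {0: 0.40, 1: 0.25, 2: 0.18, 3: 0.21, 4: 0.30, 5: 0.45, 6: 0.60}
degrees = sorted(validation_error)

best_degree, best_error, n_evaluated = u_curve_chain_search(
    degrees, validation_error.__getitem__
)

# Exhaustive search over the same chain, for comparison.
exhaustive_best = min(degrees, key=validation_error.__getitem__)
```

On this made-up chain the search evaluates four of the seven candidates and selects the same degree as the exhaustive search. That agreement depends entirely on the U-shape assumption; without it, stopping at the first increase can miss a later minimum.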

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Machine Learning in Healthcare