Theory of Machine Learning with Limited Data

Marina Sapir

arXiv:2206.07586 · cs.AI · January 3, 2023

TL;DR

This paper formalizes the concept of abduction in machine learning for real-valued hypotheses and demonstrates that many popular ML algorithms implement this inference, offering an alternative to traditional statistical learning theory.

Contribution

It introduces a formal framework for abduction in ML and shows that common algorithms inherently perform this type of inference, challenging the reliance on statistical learning theory.

Findings

01. Fourteen popular ML algorithms implement abductive inference

02. The approach applies across classification, regression, and clustering

03. It offers an alternative to the justification provided by statistical learning theory

Abstract

The application of machine learning may be understood as deriving new knowledge for practical use by explaining accumulated observations (the training set). Peirce used the term abduction for this kind of inference. Here I formalize the concept of abduction for real-valued hypotheses and show that 14 of the most popular textbook ML learners (every learner I tested), covering classification, regression, and clustering, implement this concept of abductive inference. The approach is proposed as an alternative to statistical learning theory, which requires the impractical assumption of an indefinitely increasing training set for its justification.

Taxonomy

Topics: Neural Networks and Applications · Computational Physics and Python Applications · Statistics Education and Methodologies

Methods: k-Nearest Neighbors