Theory of Machine Learning with Limited Data
Marina Sapir

TL;DR
This paper formalizes abduction for real-valued hypotheses and shows that 14 popular textbook ML learners, covering classification, regression, and clustering, implement this form of inference, offering an alternative to statistical learning theory.
Contribution
It introduces a formal framework for abduction in ML and shows that common algorithms inherently perform this type of inference, challenging the reliance on statistical learning theory.
Findings
14 popular ML algorithms implement abduction inference
The approach applies across classification, regression, and clustering
Provides an alternative justification to statistical learning theory
Abstract
The application of machine learning may be understood as deriving new knowledge for practical use by explaining accumulated observations, that is, the training set. Peirce used the term abduction for this kind of inference. Here I formalize the concept of abduction for real-valued hypotheses and show that 14 of the most popular textbook ML learners (every learner I tested), covering classification, regression, and clustering, implement this concept of abductive inference. The approach is proposed as an alternative to statistical learning theory, which requires the impractical assumption of an indefinitely growing training set for its justification.
Taxonomy
Topics: Neural Networks and Applications · Computational Physics and Python Applications · Statistics Education and Methodologies
Methods: k-Nearest Neighbors
