Logic of Machine Learning

Marina Sapir

arXiv:2006.09500·cs.LG·January 28, 2022

TL;DR

This paper introduces a logical framework based on the concept of incongruity to explain how machine learning models predict from finite samples, emphasizing the role of beliefs about predictability and formalizing errors.

Contribution

It proposes the modal Logic of Observations and Hypotheses (LOH) to formalize prediction errors and demonstrates its application across various machine learning algorithms.

Findings

01

Popular learners minimize their version of incongruity.

02

Incongruity formalizes errors in prediction.

03

Framework extends to other data analysis problems.

Abstract

The main question is: why and how can we ever predict from a finite sample? Statistical learning theory does not answer it. Here, I suggest that prediction requires belief in the "predictability" of the underlying dependence, and that learning is a search for the hypothesis that violates these beliefs the least, given the observations. The measure of these violations ("errors") for given data, a hypothesis, and a particular type of predictability belief is formalized as the concept of incongruity in the modal Logic of Observations and Hypotheses (LOH). Using many popular textbook learners as examples (from hierarchical clustering to k-NN and SVM), I show that each minimizes its own version of incongruity. In addition, the concept of incongruity proves flexible enough to formalize some important data analysis problems that are not usually considered part of ML.
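The idea of learning as a search for the least-violated hypothesis can be illustrated with a toy sketch. It is not the paper's LOH formalism: the "incongruity" here is assumed to be a plain count of observations that contradict a hypothesis, and the threshold classifiers, the data, and all function names are hypothetical choices made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Observation = Tuple[float, int]       # (feature x, observed label y)
Hypothesis = Callable[[float], int]   # maps a feature to a predicted label


def violations(h: Hypothesis, data: Sequence[Observation]) -> int:
    """Count observations that contradict h (a toy stand-in for incongruity)."""
    return sum(1 for x, y in data if h(x) != y)


def make_threshold(t: float) -> Hypothesis:
    """Hypothetical hypothesis family: predict 1 when x >= t, else 0."""
    return lambda x: 1 if x >= t else 0


def least_violated(thresholds: Sequence[float],
                   data: Sequence[Observation]) -> Tuple[int, float]:
    """Return (violation count, threshold) for the least-violated hypothesis.

    Ties are broken in favour of the smaller threshold.
    """
    scored: List[Tuple[int, float]] = [
        (violations(make_threshold(t), data), t) for t in thresholds
    ]
    return min(scored)


# Made-up sample: one observation (0.7, 0) cannot be fit by a low threshold.
data = [(0.1, 0), (0.4, 0), (0.5, 1), (0.7, 0), (0.9, 1)]
best_count, best_t = least_violated([0.0, 0.3, 0.45, 0.6, 0.8], data)
print(best_count, best_t)
```

In this sketch no candidate fits the sample perfectly, so the search settles for the hypothesis with the fewest contradictions. The learners discussed in the paper each minimize their own version of incongruity; those definitions are given in the paper itself and are not reproduced here.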

Taxonomy

Topics: Rough Sets and Fuzzy Logic · Logic, Reasoning, and Knowledge · Advanced Database Systems and Queries

Methods: k-Nearest Neighbors · Support Vector Machine