A Simplicity Bubble Problem in Formal-Theoretic Learning Systems

Felipe S. Abrahão; Hector Zenil; Fabio Porto; Michael Winter; Klaus Wehmuth; Itala M. L. D'Ottaviano

arXiv:2112.12275 · cs.IT · April 26, 2023 · 1 citation

Open Access

TL;DR

This paper identifies a fundamental limitation of current machine learning methods: sufficiently large datasets can deceive learning algorithms into favoring simple, locally optimal solutions over more complex, globally optimal ones, a phenomenon the authors call a 'simplicity bubble'.

Contribution

The paper introduces the concept of a 'simplicity bubble' in formal-theoretic learning systems, demonstrates how sufficiently large datasets can mislead learning algorithms into suboptimal solutions, and suggests a shift toward learning based on algorithmic information theory.

Findings

01. Large datasets can deceive learning algorithms into favoring simple solutions.
02. A 'simplicity bubble' phenomenon causes divergence from globally optimal solutions.
03. Proposes moving from statistical to algorithmic information theory-based learning.

Abstract

When mining large datasets in order to predict new data, limitations of the principles behind statistical machine learning pose a serious challenge not only to the Big Data deluge, but also to the traditional assumptions that data generating processes are biased toward low algorithmic complexity. Even when one assumes an underlying algorithmic-informational bias toward simplicity in finite dataset generators, we show that current approaches to machine learning (including deep learning, or any formal-theoretic hybrid mix of top-down AI and statistical machine learning approaches) can always be deceived, naturally or artificially, by sufficiently large datasets. In particular, we demonstrate that, for every learning algorithm (with or without access to a formal theory), there is a sufficiently large dataset size above which the algorithmic probability of an unpredictable deceiver is an…
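As a loose illustration of the deception idea in the abstract, the following toy sketch shows a simplicity-biased predictor being misled by a dataset that looks simple up to its last observed point. It is not the paper's construction: the paper's result is stated in terms of algorithmic probability and holds for every learning algorithm, whereas this sketch assumes a single hypothetical learner that uses the shortest repeating period as a crude stand-in for low complexity. The names `shortest_period`, `predict_next`, and `make_deceiver` are invented for this example.

```python
def shortest_period(bits: str) -> int:
    """Return the smallest p such that repeating bits[:p] reproduces `bits`.

    Assumption: period length is used here as a crude proxy for simplicity.
    """
    for p in range(1, len(bits) + 1):
        if all(bits[i] == bits[i % p] for i in range(len(bits))):
            return p
    return len(bits)


def predict_next(bits: str) -> str:
    """Toy Occam-style learner: extend the shortest periodic explanation."""
    p = shortest_period(bits)
    return bits[len(bits) % p]


def make_deceiver(n: int) -> str:
    """Build n >= 1 bits of the pattern '01' repeated, then one bit that breaks it."""
    prefix = ("01" * n)[:n]
    expected = predict_next(prefix)
    return prefix + ("1" if expected == "0" else "0")


if __name__ == "__main__":
    for n in (4, 16, 64):
        data = make_deceiver(n)
        seen, actual = data[:-1], data[-1]
        # However long the simple-looking prefix is, the prediction misses.
        print(f"n={n} period={shortest_period(seen)} "
              f"predicted={predict_next(seen)} actual={actual}")
```

In this toy setting, lengthening the observed prefix never protects the learner: the simplest explanation stays the same while the continuation diverges from it. The paper's claim is considerably stronger and more general than this sketch can show.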

Taxonomy

Topics: Computability, Logic, AI Algorithms · Machine Learning and Algorithms · Machine Learning and Data Classification