Large Language Models as Computable Approximations to Solomonoff Induction

Jun Wan; Lingrui Mei

arXiv:2505.15784 · cs.LG · May 22, 2025

Open Access

TL;DR

This paper establishes a theoretical framework connecting large language models to Solomonoff induction via Algorithmic Information Theory, explaining their success and guiding improved few-shot learning strategies.

Contribution

It provides the first formal link between LLMs and Solomonoff induction, unifying explanations for emergent phenomena and proposing a new example selection method.

Findings

01. The training process approximates the Solomonoff prior through loss minimization.

02. Next-token prediction implements approximate Solomonoff induction.

03. The proposed few-shot example selection method improves performance, especially for smaller models.
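
For orientation, the two objects named in the findings can be written out using their standard textbook definitions. These come from the general Algorithmic Information Theory literature rather than from this page, and the paper's own notation may differ. Here U is assumed to be a universal monotone Turing machine, p ranges over programs whose output begins with the string x, and |p| is the program length in bits.

```latex
% Solomonoff prior: total weight of all programs whose output starts with x
M(x) \;=\; \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}

% Solomonoff induction: predict the next symbol by conditioning the prior
M(x_{t+1} \mid x_{1:t}) \;=\; \frac{M(x_{1:t}\, x_{t+1})}{M(x_{1:t})}
```

M is not computable (it is only lower semicomputable), which is why any practical system can at best approximate it.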

Abstract

The rapid advancement of large language models (LLMs) calls for a rigorous theoretical framework to explain their empirical success. While significant progress has been made in understanding LLM behaviors, existing theoretical frameworks remain fragmented in explaining emergent phenomena through a unified mathematical lens. We establish the first formal connection between LLM architectures and Algorithmic Information Theory (AIT) by proving two fundamental results: (1) the training process computationally approximates the Solomonoff prior through loss minimization interpreted as program length optimization, and (2) next-token prediction implements approximate Solomonoff induction. We leverage AIT to provide a unified theoretical explanation for in-context learning, few-shot learning, and scaling laws. Furthermore, our theoretical insights lead to a principled method for few-shot example…
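
The phrase "loss minimization interpreted as program length optimization" can be made concrete with the standard Shannon code-length identity: a token assigned probability p costs -log2(p) bits under an ideal code, so summed cross-entropy loss is a description length. The sketch below illustrates only that general identity, not the paper's proof; the function names and the probability values are hypothetical.

```python
import math


def code_length_bits(token_probs):
    """Ideal code length in bits for a sequence, given the probability
    the model assigned to each observed token."""
    return sum(-math.log2(p) for p in token_probs)


def mean_cross_entropy_nats(token_probs):
    """Average next-token loss in nats, the unit most training code reports."""
    return sum(-math.log(p) for p in token_probs) / len(token_probs)


# Hypothetical per-token probabilities, chosen for illustration only.
probs = [0.5, 0.25, 0.125]

bits = code_length_bits(probs)          # 1 + 2 + 3 bits
loss = mean_cross_entropy_nats(probs)   # same quantity, averaged, in nats

# Converting the averaged loss back to total bits recovers the code length.
bits_from_loss = loss * len(probs) / math.log(2)
```

Under this reading, lowering the training loss on a corpus is the same as shortening the code the model assigns to that corpus.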

Taxonomy

Topics: Computability, Logic, AI Algorithms