Statistical learning theory and Occam's razor: The core argument
Tom F. Sterkenburg

TL;DR
This paper explains how statistical learning theory supports Occam's razor: simpler hypothesis classes have better learning guarantees for empirical risk minimization. Because those guarantees are relative to the chosen class, the preference for simplicity is checked by prior knowledge.
Contribution
It clarifies the core argument linking statistical learning guarantees to the principle of simplicity, emphasizing the role of prior knowledge.
Findings
Simpler hypothesis classes often lead to better learning guarantees.
The learning guarantees are model-relative, so the theoretical preference for simplicity is checked by prior knowledge.
The core argument is based on the learning guarantees of empirical risk minimization.
Abstract
Statistical learning theory is often associated with the principle of Occam's razor, which recommends a simplicity preference in inductive inference. This paper distills the core argument for simplicity obtainable from statistical learning theory, built on the theory's central learning guarantee for the method of empirical risk minimization. This core "means-ends" argument is that a simpler hypothesis class or inductive model is better because it has better learning guarantees; however, these guarantees are model-relative and so the theoretical push towards simplicity is checked by our prior knowledge.
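To make the "better learning guarantees" claim concrete, here is a minimal sketch using the textbook bound for a finite hypothesis class (Hoeffding's inequality plus a union bound). It is an illustration under stated assumptions, not necessarily the formulation used in the paper: the class is finite, losses lie in [0, 1], and the sample is i.i.d. The function names are hypothetical.

```python
import math


def uniform_deviation_bound(num_hypotheses: int, sample_size: int, delta: float) -> float:
    """Bound on |empirical risk - true risk| holding for every hypothesis at once.

    Assumes a finite class, losses in [0, 1], and an i.i.d. sample.
    Holds with probability at least 1 - delta (Hoeffding + union bound).
    """
    return math.sqrt(math.log(2 * num_hypotheses / delta) / (2 * sample_size))


def erm_excess_risk_bound(num_hypotheses: int, sample_size: int, delta: float) -> float:
    """Bound on how much worse the ERM output can be than the best hypothesis
    IN THE CLASS. It says nothing about how good that best hypothesis is."""
    return 2 * uniform_deviation_bound(num_hypotheses, sample_size, delta)


if __name__ == "__main__":
    m, delta = 1000, 0.05
    for size in (10, 10**3, 10**6):
        bound = erm_excess_risk_bound(size, m, delta)
        print(f"|H| = {size:>7}: excess risk bound = {bound:.3f}")
```

With the sample size and confidence level held fixed, the bound grows with the size of the class, which is the sense in which a simpler class comes with a better guarantee. The bound is only relative to the best hypothesis in the class, so a small class that excludes every good hypothesis gains nothing from it; this is where prior knowledge about which classes are adequate comes in.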
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Computability, Logic, AI Algorithms
