Turing-Universal Learners with Optimal Scaling Laws

Preetum Nakkiran

arXiv:2111.05321 · cs.LG · November 10, 2021

TL;DR

This paper observes the existence of a theoretical universal learning algorithm that achieves the best possible distribution-dependent convergence rate among all learning algorithms within a specified runtime, extending Levin's universal search to learning theory.

Contribution

It presents a single, uniform learner that does not depend on the distribution, yet attains the best possible asymptotic rate for every distribution among all learning algorithms within a specified runtime.

Findings

01

Achieves the best possible distribution-dependent asymptotic convergence rate for every distribution.

02

Incurs only polylogarithmic slowdown over the specified runtime bound (e.g. $O(n^{2})$).

03

Is a theoretical construct extending Levin's universal search.

Abstract

For a given distribution, learning algorithm, and performance metric, the rate of convergence (or data-scaling law) is the asymptotic behavior of the algorithm's test performance as a function of number of train samples. Many learning methods in both theory and practice have power-law rates, i.e. performance scales as $n^{-\alpha}$ for some $\alpha > 0$. Moreover, both theoreticians and practitioners are concerned with improving the rates of their learning algorithms under settings of interest. We observe the existence of a "universal learner", which achieves the best possible distribution-dependent asymptotic rate among all learning algorithms within a specified runtime (e.g. $O(n^{2})$), while incurring only polylogarithmic slowdown over this runtime. This algorithm is uniform, and does not depend on the distribution, and yet achieves best-possible rates for all distributions. The…
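To make the notion of a power-law rate concrete, the following is a minimal sketch (not taken from the paper) of how one might estimate the exponent $\alpha$ from measured test errors at several training-set sizes. It assumes the error follows $c \cdot n^{-\alpha}$ exactly and fits a line in log-log space. The function name `fit_power_law` and the synthetic data are hypothetical and chosen only for illustration.

```python
import math


def fit_power_law(ns, errors):
    """Fit errors ~ c * n**(-alpha) by least squares on log-log data.

    Hypothetical helper for illustration; returns (alpha, c).
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    # Ordinary least-squares slope and intercept of log(err) vs. log(n).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(err) = log(c) - alpha * log(n), so alpha is the negated slope.
    return -slope, math.exp(intercept)


# Synthetic errors generated from an assumed law 2.0 * n**(-0.5).
ns = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** -0.5 for n in ns]
alpha, c = fit_power_law(ns, errors)
print(f"alpha = {alpha:.3f}, c = {c:.3f}")
```

On real measurements the fit is only meaningful over the range of $n$ where the error has entered its asymptotic regime, since the rate is defined as asymptotic behavior.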

Taxonomy

Topics: Computability, Logic, AI Algorithms · Machine Learning and Algorithms · Algorithms and Data Compression