Bayesian Neural Scaling Law Extrapolation with Prior-Data Fitted Networks
Dongwoo Lee, Dong Bok Lee, Steven Adriaensen, Juho Lee, Sung Ju Hwang, Frank Hutter, Seon Joo Kim, Hae Beom Lee

TL;DR
This paper introduces a Bayesian approach based on Prior-Data Fitted Networks (PFNs) for neural scaling law extrapolation, providing uncertainty quantification and stronger performance in data-limited scenarios.
Contribution
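The paper develops a Bayesian framework built on PFNs, with a prior designed to sample synthetic functions that resemble real-world neural scaling laws. A PFN meta-learned on these samples produces uncertainty-aware extrapolations instead of single point estimates.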
Findings
Outperforms existing point estimation methods in neural scaling law extrapolation.
Provides meaningful uncertainty estimates for predictions.
Excels in data-limited scenarios like Bayesian active learning.
Abstract
Scaling has been a major driver of recent advancements in deep learning. Numerous empirical studies have found that scaling behavior often follows a power law and have proposed several variants of power-law functions to predict performance at larger scales. However, existing methods mostly rely on point estimation and do not quantify uncertainty, which is crucial for real-world decision-making, such as determining the expected performance improvement achievable by investing additional computational resources. In this work, we explore a Bayesian framework based on Prior-data Fitted Networks (PFNs) for neural scaling law extrapolation. Specifically, we design a prior distribution that enables the sampling of infinitely many synthetic functions resembling real-world neural scaling laws, allowing our PFN to meta-learn the extrapolation. We validate the…
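To make the idea of sampling synthetic scaling curves from a prior concrete, here is a minimal sketch in Python. It is not the prior designed in the paper: the functional form y = a * x^(-b) + c, the parameter ranges, the noise level, and the names `sample_curve` and `make_task` are all assumptions chosen for illustration. Each sampled curve is split into an observed small-scale prefix (context) and a held-out large-scale suffix (target), which is the kind of extrapolation task a PFN could be meta-trained on.

```python
import random


def sample_curve(rng, n_points=20):
    """Draw one noisy synthetic curve y = a * x**(-b) + c.

    The parameter ranges below are hypothetical, not the paper's prior.
    """
    a = rng.lognormvariate(0.0, 0.5)  # scale of the reducible error
    b = rng.uniform(0.1, 1.0)         # power-law exponent
    c = rng.uniform(0.0, 0.5)         # irreducible error floor
    # Log-spaced "scales" (e.g. dataset or model size) from 1e1 to 1e5.
    xs = [10 ** (1 + 4 * i / (n_points - 1)) for i in range(n_points)]
    # Small Gaussian observation noise on every point.
    ys = [a * x ** (-b) + c + rng.gauss(0.0, 0.01) for x in xs]
    return xs, ys


def make_task(rng, n_points=20, n_context=12):
    """Split one sampled curve into observed context and held-out target."""
    xs, ys = sample_curve(rng, n_points)
    context = list(zip(xs[:n_context], ys[:n_context]))  # small scales, observed
    target = list(zip(xs[n_context:], ys[n_context:]))   # large scales, to predict
    return context, target


# A seeded generator makes the sampled tasks reproducible.
rng = random.Random(0)
tasks = [make_task(rng) for _ in range(4)]
```

Because fresh curves can be drawn indefinitely, a model trained on such tasks never sees the same curve twice. How closely the sampled curves resemble real scaling behavior depends entirely on the prior, which is why the paper treats prior design as its central contribution.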
Taxonomy
Topics: Neural Networks and Applications
