Statistical Foundations of Prior-Data Fitted Networks
Thomas Nagler

TL;DR
Prior-data fitted networks (PFNs) are a recently proposed machine learning paradigm in which a fixed model is pre-trained on simulated data sets and then applied in-context to new tasks. This paper explains their behavior through a statistical analysis of their bias and variance.
Contribution
This paper provides a theoretical foundation for PFNs, explaining their empirical success through statistical mechanisms and analyzing their bias and variance properties.
Findings
Empirically, PFNs achieve state-of-the-art performance on tasks of similar size to those used in pre-training.
Their accuracy improves further when they are passed larger data sets at inference time.
The variance vanishes when the predictor's sensitivity to individual training samples vanishes; the bias vanishes when the predictor is localized around the test feature.
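The bias and variance findings above can be illustrated with a small simulation. The sketch below is not a PFN and is not taken from the paper: it compares two hypothetical fixed (untrained) predictors on a made-up one-dimensional classification task, one that averages all labels equally and one that weights labels by a Gaussian kernel centered at the test feature. The task, the function names, and the bandwidth rule n^(-1/5) are all assumptions chosen for illustration. In both predictors the weight of any single sample shrinks as n grows, so the variance should shrink, but only the localized one should see its bias shrink.

```python
import numpy as np

def true_prob(x):
    # Hypothetical ground-truth class probability p(y=1 | x).
    return 1.0 / (1.0 + np.exp(-3.0 * x))

def sample_task(n, rng):
    # Draw a training set of size n from the assumed task.
    x = rng.uniform(-1.0, 1.0, size=n)
    y = (rng.uniform(size=n) < true_prob(x)).astype(float)
    return x, y

def global_average(x_train, y_train, x0):
    # Not localized: every sample gets weight 1/n, regardless of x0.
    return float(y_train.mean())

def local_average(x_train, y_train, x0):
    # Localized: Gaussian kernel weights around x0, with an assumed
    # bandwidth that shrinks as the training set grows.
    h = len(x_train) ** (-0.2)
    w = np.exp(-0.5 * ((x_train - x0) / h) ** 2)
    return float(np.sum(w * y_train) / np.sum(w))

def bias_and_variance(predictor, n, x0, reps=200, seed=0):
    # Monte Carlo estimate of bias and variance at x0 over repeated
    # training sets of size n.
    rng = np.random.default_rng(seed)
    preds = np.array([predictor(*sample_task(n, rng), x0) for _ in range(reps)])
    return preds.mean() - true_prob(x0), preds.var()

for name, predictor in [("global", global_average), ("local", local_average)]:
    for n in (50, 2000):
        bias, var = bias_and_variance(predictor, n, x0=0.5)
        print(f"{name:6s} n={n:5d}  bias={bias:+.3f}  variance={var:.5f}")
```

Under these assumptions, the global average keeps a bias that does not depend on n because it ignores where the test feature lies, while the kernel-weighted average trades a little extra variance for a bias that decreases as the bandwidth narrows.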
Abstract
Prior-data fitted networks (PFNs) were recently proposed as a new paradigm for machine learning. Instead of training the network to an observed training set, a fixed model is pre-trained offline on small, simulated training sets from a variety of tasks. The pre-trained model is then used to infer class probabilities in-context on fresh training sets with arbitrary size and distribution. Empirically, PFNs achieve state-of-the-art performance on tasks with similar size to the ones used in pre-training. Surprisingly, their accuracy further improves when passed larger data sets during inference. This article establishes a theoretical foundation for PFNs and illuminates the statistical mechanisms governing their behavior. While PFNs are motivated by Bayesian ideas, a purely frequentistic interpretation of PFNs as pre-tuned, but untrained predictors explains their behavior. A predictor's…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification
Methods: Test
