Joint Optimization of Neural Autoregressors via Scoring rules
Jonas Landsgesell

TL;DR
This paper introduces a novel method for optimizing neural autoregressors using scoring rules, addressing the challenge of extending grid-based non-parametric distributional regression to multivariate settings with improved scalability and reduced overfitting.
Contribution
The paper proposes a new joint optimization approach for neural autoregressors that leverages scoring rules to handle multivariate distributions more efficiently.
Findings
Enhanced scalability in multivariate distribution modeling
Reduced overfitting in low-data regimes
Improved performance on benchmark tasks
Abstract
Non-parametric distributional regression has achieved significant milestones in recent years. Among these, the Tabular Prior-Data Fitted Network (TabPFN) has demonstrated state-of-the-art performance on various benchmarks. However, a challenge remains in extending these grid-based approaches to a truly multivariate setting. In a naive non-parametric discretization with a fixed number of bins per dimension, the size of an explicit joint grid scales exponentially with the number of dimensions, and the parameter count of the neural network rises sharply with it. This scaling is particularly detrimental in low-data regimes, as the final projection layer would require many parameters, leading to severe overfitting and intractability.
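The scaling argument in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the hidden width of 512, and the 100 bins per dimension are assumptions and are not taken from the paper. It compares the size of a final linear layer that emits one logit per cell of an explicit joint grid with a layout that uses one linear head per dimension, as a factorized (for example autoregressive) model might. The second layout is a stand-in for comparison, not necessarily the paper's architecture.

```python
def joint_grid_head_params(hidden_dim: int, bins: int, dims: int) -> int:
    """Weights plus biases of a linear layer with one logit per joint-grid cell."""
    cells = bins ** dims  # the explicit joint grid has bins^dims cells
    return hidden_dim * cells + cells


def per_dimension_head_params(hidden_dim: int, bins: int, dims: int) -> int:
    """Weights plus biases when each dimension has its own head over `bins` logits."""
    return dims * (hidden_dim * bins + bins)


if __name__ == "__main__":
    hidden_dim, bins = 512, 100  # hypothetical values chosen for illustration
    for dims in (1, 2, 3, 4):
        joint = joint_grid_head_params(hidden_dim, bins, dims)
        factored = per_dimension_head_params(hidden_dim, bins, dims)
        print(f"dims={dims}: joint grid {joint:,} params, per-dimension heads {factored:,} params")
```

Under these assumed values the joint-grid head grows by a factor of 100 with every added dimension, while the per-dimension layout grows only linearly, which is the gap the abstract describes as causing overfitting in low-data regimes.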
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Statistical Methods and Inference
