Joint Optimization of Neural Autoregressors via Scoring rules

Jonas Landsgesell

arXiv:2601.05683 · cond-mat.soft · January 12, 2026

Open Access

TL;DR

This paper introduces a novel method for optimizing neural autoregressors using scoring rules, addressing the challenge of extending grid-based non-parametric distributional regression to multivariate settings with improved scalability and reduced overfitting.

Contribution

The paper proposes a new joint optimization approach for neural autoregressors that leverages scoring rules to handle multivariate distributions more efficiently.

Findings

01 Enhanced scalability in multivariate distribution modeling

02 Reduced overfitting in low-data regimes

03 Improved performance on benchmark tasks

Abstract

Non-parametric distributional regression has achieved significant milestones in recent years. Among these, the Tabular Prior-Data Fitted Network (TabPFN) has demonstrated state-of-the-art performance on various benchmarks. However, a challenge remains in extending these grid-based approaches to a truly multivariate setting. In a naive non-parametric discretization with $N$ bins per dimension, the complexity of an explicit joint grid scales exponentially with the number of dimensions, and the parameter count of the neural networks rises sharply. This scaling is particularly detrimental in low-data regimes, as the final projection layer would require many parameters, leading to severe overfitting and intractability.
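The scaling argument in the abstract can be made concrete with a small count of output-layer parameters. The sketch below is illustrative only and is not taken from the paper: it assumes a single dense projection (weights plus biases) from a hidden vector of width `hidden_dim` onto the grid logits, and it assumes that the autoregressive alternative uses one $N$-bin head per dimension. The function names and the example sizes are hypothetical.

```python
def joint_grid_head_params(hidden_dim: int, bins: int, dims: int) -> int:
    """Parameters of a dense layer that outputs one logit per cell of an
    explicit joint grid with `bins` bins in each of `dims` dimensions."""
    outputs = bins ** dims  # N^d grid cells
    return hidden_dim * outputs + outputs  # weights + biases


def factorized_head_params(hidden_dim: int, bins: int, dims: int) -> int:
    """Parameters when each dimension gets its own `bins`-way head
    (an assumed autoregressive-style factorization, not the paper's exact design)."""
    outputs = bins * dims  # d heads with N logits each
    return hidden_dim * outputs + outputs  # weights + biases


if __name__ == "__main__":
    hidden_dim, bins = 512, 100  # hypothetical sizes for illustration
    for dims in (1, 2, 3, 4):
        joint = joint_grid_head_params(hidden_dim, bins, dims)
        factorized = factorized_head_params(hidden_dim, bins, dims)
        print(f"d={dims}: joint={joint:,} factorized={factorized:,}")
```

With these assumed sizes, two dimensions already require 5,130,000 projection parameters for the joint grid versus 102,600 for the per-dimension heads, and every additional dimension multiplies the joint count by $N$.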

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Statistical Methods and Inference