Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression
Rafael Izbicki, Pedro L. C. Rodrigues

TL;DR
This paper systematically evaluates tabular foundation models such as TabPFN and TabICL for conditional density estimation (CDE). It reports strong performance across diverse datasets and training sizes, and highlights their potential as off-the-shelf CDE tools.
Contribution
The paper provides the first comprehensive benchmark of tabular foundation models for CDE, comparing them with parametric, tree-based, and neural baselines across multiple metrics and datasets.
Findings
Foundation models outperform most baselines in density accuracy and likelihood.
Calibration is competitive at small sample sizes but varies at larger sizes.
TabPFN outperforms all baselines in a photometric redshift case study.
Abstract
Conditional density estimation (CDE) - recovering the full conditional distribution of a response given tabular covariates - is essential in settings with heteroscedasticity, multimodality, or asymmetric uncertainty. Recent tabular foundation models, such as TabPFN and TabICL, naturally produce predictive distributions, but their effectiveness as general-purpose CDE methods has not been systematically evaluated, unlike their performance for point prediction, which is well studied. We benchmark three tabular foundation model variants against a diverse set of parametric, tree-based, and neural CDE baselines on 39 real-world datasets, across training sizes from 50 to 20,000, using six metrics covering density accuracy, calibration, and computation time. Across all sample sizes, foundation models achieve the best CDE loss, log-likelihood, and CRPS on the large majority of datasets tested.…
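The three accuracy metrics named in the abstract can be illustrated with a minimal sketch. It assumes each predicted conditional density is available as values on a shared response grid, which is a representation chosen here for illustration and not necessarily the one used in the paper. All function names (`cde_loss`, `mean_log_likelihood`, `crps`) are hypothetical helpers, and the formulas are the standard textbook definitions: the CDE loss is the mean of ∫ f̂(y|x)² dy minus twice the mean of f̂(y_i|x_i), and the CRPS is the integrated squared difference between the predicted CDF and the step function at the observed response.

```python
import numpy as np

def _trapz(values, grid):
    """Trapezoid rule along the last axis."""
    widths = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * widths, axis=-1)

def density_at(dens, y_grid, y_true):
    """Linearly interpolate each row of `dens` at its observed response."""
    return np.array([np.interp(y, y_grid, row) for y, row in zip(y_true, dens)])

def cde_loss(dens, y_grid, y_true):
    """Empirical CDE loss (lower is better), up to a model-independent constant."""
    squared_term = np.mean(_trapz(dens ** 2, y_grid))
    cross_term = np.mean(density_at(dens, y_grid, y_true))
    return squared_term - 2.0 * cross_term

def mean_log_likelihood(dens, y_grid, y_true, eps=1e-300):
    """Average log density at the observed responses (higher is better)."""
    return np.mean(np.log(np.maximum(density_at(dens, y_grid, y_true), eps)))

def crps(dens, y_grid, y_true):
    """Average CRPS (lower is better), computed from a grid-based CDF."""
    increments = 0.5 * (dens[:, 1:] + dens[:, :-1]) * np.diff(y_grid)
    cdf = np.concatenate(
        [np.zeros((dens.shape[0], 1)), np.cumsum(increments, axis=1)], axis=1
    )
    # Step function that jumps from 0 to 1 at the observed response.
    step = (y_grid[None, :] >= np.asarray(y_true)[:, None]).astype(float)
    return np.mean(_trapz((cdf - step) ** 2, y_grid))

# Toy check: standard-width Gaussian predictions centred on the observed values.
y_grid = np.linspace(-10.0, 10.0, 2001)
y_obs = np.array([-1.0, 0.0, 2.0])
dens = np.exp(-0.5 * (y_grid[None, :] - y_obs[:, None]) ** 2) / np.sqrt(2 * np.pi)
print(cde_loss(dens, y_grid, y_obs), mean_log_likelihood(dens, y_grid, y_obs), crps(dens, y_grid, y_obs))
```

Grid-based evaluation keeps the three metrics comparable across models that expose densities in different forms, at the cost of a discretization error that shrinks as the grid is refined.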
