Valid Feature-Level Inference for Tabular Foundation Models via the Conditional Randomization Test
Mohamed Salem

TL;DR
This paper introduces a practical method combining the Conditional Randomization Test with a probabilistic foundation model to perform valid feature-level hypothesis testing on tabular data, even in complex nonlinear and correlated scenarios.
Contribution
The paper presents an approach that yields finite-sample valid p-values for conditional feature relevance without retraining the model or making parametric assumptions.
Findings
Provides valid p-values for feature relevance in complex settings
Works with nonlinear and correlated data without retraining
No parametric assumptions required
Abstract
Modern machine learning models are highly expressive but notoriously difficult to analyze statistically. In particular, while black-box predictors can achieve strong empirical performance, they rarely provide valid hypothesis tests or p-values for assessing whether individual features contain information about a target variable. This article presents a practical approach to feature-level hypothesis testing that combines the Conditional Randomization Test (CRT) with TabPFN, a probabilistic foundation model for tabular data. The resulting procedure yields finite-sample valid p-values for conditional feature relevance, even in nonlinear and correlated settings, without requiring model retraining or parametric assumptions.
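For readers unfamiliar with the CRT, the general recipe can be sketched as follows. To test feature j, repeatedly redraw column j from its conditional distribution given the other features, recompute a test statistic on each redrawn dataset, and report p = (1 + number of redraws whose statistic is at least the observed one) / (1 + number of redraws). The sketch below is an illustration under stated assumptions, not the paper's implementation. It assumes Gaussian features with a known covariance in place of a learned conditional sampler, and it uses a least-squares predictor fitted once on a separate split in place of TabPFN. The names gaussian_conditional_sampler, crt_pvalue, and neg_mse are hypothetical, and this listing does not specify how TabPFN enters the procedure.

```python
import numpy as np

def gaussian_conditional_sampler(sigma):
    """Sampler for X_j | X_{-j}, ASSUMING rows are N(0, sigma) with sigma known."""
    def sample(X, j, rng):
        rest = [k for k in range(X.shape[1]) if k != j]
        # Regression weights of X_j on the remaining features.
        w = np.linalg.solve(sigma[np.ix_(rest, rest)], sigma[rest, j])
        mean = X[:, rest] @ w
        var = sigma[j, j] - sigma[j, rest] @ w
        return mean + np.sqrt(var) * rng.standard_normal(X.shape[0])
    return sample

def crt_pvalue(X, y, j, sample_conditional, statistic, n_draws=200, rng=None):
    """CRT p-value for feature j; larger statistic means stronger evidence."""
    rng = np.random.default_rng(rng)
    t_obs = statistic(X, y)
    exceed = 0
    for _ in range(n_draws):
        X_tilde = X.copy()
        X_tilde[:, j] = sample_conditional(X, j, rng)  # redraw only column j
        if statistic(X_tilde, y) >= t_obs:
            exceed += 1
    # The +1 terms count the observed statistic among the draws.
    return (1 + exceed) / (1 + n_draws)

# Synthetic data: feature 0 drives y, feature 1 is correlated with feature 0
# but conditionally irrelevant, feature 2 is independent noise.
rng = np.random.default_rng(0)
sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
X = rng.multivariate_normal(np.zeros(3), sigma, size=400)
y = 2.0 * X[:, 0] + rng.standard_normal(400)

# Fit the stand-in predictor ONCE on a training split; it is never refit.
X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]
coef, *_ = np.linalg.lstsq(np.c_[np.ones(len(X_tr)), X_tr], y_tr, rcond=None)

def neg_mse(X_eval, y_eval):
    """Negative mean squared error of the fixed predictor."""
    pred = np.c_[np.ones(len(X_eval)), X_eval] @ coef
    return -np.mean((y_eval - pred) ** 2)

sampler = gaussian_conditional_sampler(sigma)
p_values = [crt_pvalue(X_te, y_te, j, sampler, neg_mse, n_draws=200, rng=j)
            for j in range(3)]
print(p_values)
```

Evaluating the statistic on rows the predictor never saw is a design choice of this sketch: it keeps the statistic a fixed function of the data being tested, so observed and redrawn columns are treated symmetrically under the null. The validity of the p-value rests on the conditional sampler being correct, which is assumed here by construction.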
Taxonomy
Topics: Statistical Methods and Inference · Explainable Artificial Intelligence (XAI) · Gaussian Processes and Bayesian Inference
