TL;DR
This paper introduces a kernel-based, nearest-neighbor method for testing whether two conditional distributions differ. The resulting tests are consistent and computationally efficient, with applications to calibration, regression, and simulation-based inference.
Contribution
It proposes a novel kernel-based measure for comparing two conditional distributions, a nearly linear-time estimator of that measure, and a resampling test that controls Type I error and applies to a range of modern statistical problems.
Findings
The method accurately detects differences in conditional distributions in simulations.
It effectively assesses neural network calibration on CIFAR-10 data.
The approach successfully compares regression functions and validates emulator models.
Abstract
In this paper, we introduce a kernel-based measure for detecting differences between two conditional distributions. Using the "kernel trick" and nearest-neighbor graphs, we propose a consistent estimate of this measure which can be computed in nearly linear time (for a fixed number of nearest neighbors). Moreover, when the two conditional distributions are the same, the estimate has a Gaussian limit and its asymptotic variance has a simple form that can be easily estimated from the data. The resulting test attains precise asymptotic level and is universally consistent for detecting differences between two conditional distributions. We also provide a resampling-based test, built on our estimate, for the conditional goodness-of-fit problem; it controls Type I error in finite samples and is asymptotically consistent with only a finite number of resamples. A method to de-randomize…
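The abstract does not spell out the estimator, so the following is only a minimal sketch of how a kernel and a nearest-neighbor graph can be combined into a statistic of this kind. It is not the paper's exact estimator. It assumes paired data (x_i, y_i, z_i), where y_i and z_i are draws from the two conditional distributions at the same covariate x_i, and it assumes a Gaussian kernel. The function names, the bandwidth default, and the choice k=5 are hypothetical. The neighbor search here is brute force and quadratic in n; a tree-based search would be needed to approach the nearly linear running time the abstract mentions.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    # Gaussian kernel evaluated along the last axis (inputs broadcast).
    sq_dist = np.sum((a - b) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def knn_indices(x, k):
    # Brute-force k-nearest-neighbor search on the covariates (O(n^2)).
    # Assumption: used only for illustration, not for large n.
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def knn_kernel_statistic(x, y, z, k=5, bandwidth=1.0):
    # Average, over directed k-NN edges (i, j) in x-space, of an
    # MMD-style kernel contrast between the responses at i and j.
    # Hypothetical form; values near zero suggest similar conditionals.
    nbrs = knn_indices(x, k)              # shape (n, k)
    y_i, z_i = y[:, None, :], z[:, None, :]
    y_j, z_j = y[nbrs], z[nbrs]           # shape (n, k, d)
    h = (
        gaussian_kernel(y_i, y_j, bandwidth)
        - gaussian_kernel(y_i, z_j, bandwidth)
        - gaussian_kernel(z_i, y_j, bandwidth)
        + gaussian_kernel(z_i, z_j, bandwidth)
    )
    return float(h.mean())
```

All inputs are two-dimensional arrays of shape (n, d), with the same n for x, y, and z. Because neighbors in x-space have nearly equal covariates, the contrast compares responses at approximately the same x, which is what makes the statistic sensitive to conditional rather than marginal differences.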
