Data-Space Validation of High-Dimensional Models by Comparing Sample Quantiles
Stephen Thorp, Hiranya V. Peiris, Daniel J. Mortlock, Justin Alsing, Boris Leistedt, Sinan Deger

TL;DR
This paper introduces a straightforward method for evaluating high-dimensional models by comparing predicted and observed quantiles in data space, using projection techniques for high-dimensional observables, demonstrated on galaxy photometry data.
Contribution
The paper proposes a quantile-comparison approach for validating high-dimensional models directly in data space, designed for cases where the model's predictions are available only as samples.
Findings
Validated the method on galaxy photometry data
Demonstrated effectiveness with high-dimensional observables
Applicable to non-parametric population models
Abstract
We present a simple method for assessing the predictive performance of high-dimensional models directly in data space when only samples are available. Our approach is to compare the quantiles of observables predicted by a model to those of the observables themselves. In cases where the dimensionality of the observables is large (e.g. multiband galaxy photometry), we advocate that the comparison is made after projection onto a set of principal axes to reduce the dimensionality. We demonstrate our method on a series of two-dimensional examples. We then apply it to results from a state-of-the-art generative model for galaxy photometry (pop-cosmos; arXiv:2402.00935) that generates predictions of colors and magnitudes by forward simulating from a 16-dimensional distribution of physical parameters represented by a score-based diffusion model. We validate the predictive performance of this…
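The core idea in the abstract (project both data sets onto principal axes, then compare quantiles along each axis) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the choice to derive the principal axes from the observed sample, and the use of a plain SVD are all assumptions made for the example.

```python
import numpy as np

def principal_axes(data):
    """Return the mean and principal axes (as rows) of an (n, d) sample."""
    mean = data.mean(axis=0)
    # SVD of the centred data; the rows of vt are orthonormal principal axes.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt

def projected_quantiles(samples, mean, axes, probs):
    """Project samples onto the axes and return per-axis quantiles."""
    projected = (samples - mean) @ axes.T
    # Resulting shape: (len(probs), number of axes).
    return np.quantile(projected, probs, axis=0)

def quantile_comparison(observed, predicted, probs):
    """Quantiles of observed and model-predicted samples on shared axes.

    Assumption: the axes are computed from the observed sample and then
    applied unchanged to the predicted sample.
    """
    mean, axes = principal_axes(observed)
    q_obs = projected_quantiles(observed, mean, axes, probs)
    q_pred = projected_quantiles(predicted, mean, axes, probs)
    return q_obs, q_pred

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observed = rng.normal(size=(2000, 5))             # stand-in "data"
    predicted = rng.normal(size=(2000, 5)) * 1.2      # stand-in "model" samples
    probs = np.linspace(0.05, 0.95, 19)
    q_obs, q_pred = quantile_comparison(observed, predicted, probs)
    # Large differences along an axis flag a mismatch in that projection.
    print(np.abs(q_pred - q_obs).max(axis=0))
```

Plotting `q_pred` against `q_obs` for each axis gives a quantile-quantile diagnostic; points far from the one-to-one line indicate where the model's predictions depart from the data.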
Taxonomy
Topics: Statistical and numerical algorithms · Soil Geostatistics and Mapping
