Model-Free Assessment of Simulator Fidelity via Quantile Curves

Garud Iyengar; Yu-Shiou Willy Lin; Kaizheng Wang

arXiv:2512.05024·stat.ME·April 17, 2026

TL;DR

This paper introduces a model-free statistical approach to quantifying the discrepancy between real and simulated systems across scenarios: it constructs confidence sets for the latent population parameters of each system and estimates a distribution-level risk profile of the resulting discrepancy.

Contribution

The paper develops a robust, model-agnostic method that assesses simulator fidelity through quantile functions of a discrepancy proxy and is applicable to diverse output types.

Findings

1. Effectively evaluates four major LLMs against human data.

2. Provides a distribution-level risk profile for simulator assessment.

3. Supports statistical inference and comparison across different simulators.

Abstract

As generative AI models are increasingly used to simulate real-world systems, quantifying the "sim-to-real" gap is critical. For each input setting of interest (which we call a scenario, such as a survey question or operating condition), the real and simulated systems are associated with unobserved latent population parameters, and their discrepancy varies across scenarios. A fundamental challenge is that, for any given scenario, this discrepancy cannot be observed directly, since both systems are accessible only through finite samples, often of heterogeneous sizes across scenarios. Standard predictive inference methods are therefore ill-suited, as they quantify uncertainty in observable outputs rather than latent population parameters. To address this, we construct confidence sets for these latent parameters and use them to derive a robust proxy for the sim-to-real…
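The abstract is cut off before it specifies the construction, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes binary responses per scenario, uses Hoeffding intervals as a stand-in confidence set for each latent mean, takes the largest gap compatible with both intervals as a conservative discrepancy proxy, and then reads off empirical quantiles of that proxy across scenarios. All function names, the choice of interval, the proxy, and the synthetic data are assumptions made for illustration.

```python
import math
import random


def hoeffding_interval(samples, delta):
    """Two-sided Hoeffding confidence interval for the mean of values in [0, 1].

    Assumption: a stand-in confidence set, not the paper's construction.
    """
    n = len(samples)
    mean = sum(samples) / n
    half_width = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return max(0.0, mean - half_width), min(1.0, mean + half_width)


def discrepancy_upper_bound(real_samples, sim_samples, delta):
    """Largest |p - q| with p and q inside their respective intervals.

    Each interval gets delta / 2, so both cover jointly with probability
    at least 1 - delta by a union bound.
    """
    real_lo, real_hi = hoeffding_interval(real_samples, delta / 2)
    sim_lo, sim_hi = hoeffding_interval(sim_samples, delta / 2)
    return max(real_hi - sim_lo, sim_hi - real_lo)


def empirical_quantile(values, level):
    """Smallest value v such that at least a `level` fraction of values are <= v."""
    ordered = sorted(values)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]


# Synthetic scenarios (hypothetical data) with heterogeneous sample sizes.
rng = random.Random(0)
proxies = []
for _ in range(200):
    p_real = rng.random()
    p_sim = min(1.0, max(0.0, p_real + rng.uniform(-0.1, 0.1)))
    n_real = rng.randint(50, 500)
    n_sim = rng.randint(50, 500)
    real = [1.0 if rng.random() < p_real else 0.0 for _ in range(n_real)]
    sim = [1.0 if rng.random() < p_sim else 0.0 for _ in range(n_sim)]
    proxies.append(discrepancy_upper_bound(real, sim, delta=0.1))

# Quantile curve of the proxy across scenarios at a few levels.
quantile_curve = {
    level: empirical_quantile(proxies, level) for level in (0.5, 0.8, 0.95)
}
for level, value in quantile_curve.items():
    print(f"level={level:.2f}  proxy quantile={value:.3f}")
```

Scenarios with fewer samples get wider intervals and therefore larger proxy values in this sketch, which is one simple way heterogeneous sample sizes can enter a scenario-level summary.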
