A Kernel Test of Goodness of Fit

Kacper Chwialkowski; Heiko Strathmann; Arthur Gretton

arXiv:1602.02964 · stat.ML · September 28, 2016 · ICML · 102 cites

TL;DR

This paper introduces a nonparametric goodness-of-fit test based on a divergence built from Stein's method and reproducing kernels. The test applies to both i.i.d. and dependent samples, with applications to MCMC convergence assessment and model criticism.

Contribution

It develops a kernel-based Stein divergence for goodness-of-fit testing and estimates the null distribution quantiles with a wild bootstrap, giving a test for both i.i.d. and non-i.i.d. samples.

Findings

01 Effective in quantifying MCMC convergence

02 Applicable to model criticism and density estimation

03 Uses wild bootstrap for null distribution estimation

Abstract

We propose a nonparametric statistical test for goodness-of-fit: given a set of samples, the test determines how likely it is that these were generated from a target density function. The measure of goodness-of-fit is a divergence constructed via Stein's method using functions from a Reproducing Kernel Hilbert Space. Our test statistic is based on an empirical estimate of this divergence, taking the form of a V-statistic in terms of the log gradients of the target density and the kernel. We derive a statistical test, both for i.i.d. and non-i.i.d. samples, where we estimate the null distribution quantiles using a wild bootstrap procedure. We apply our test to quantifying convergence of approximate Markov Chain Monte Carlo methods, statistical model criticism, and evaluating quality of fit vs model complexity in nonparametric density estimation.
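
The statistic described in the abstract can be illustrated with a minimal sketch. The code below is not taken from the authors' repository; it assumes one-dimensional data, a Gaussian kernel with a fixed bandwidth `sigma`, and i.i.d. samples, so the wild bootstrap weights are plain Rademacher signs (dependent samples would need correlated weights instead). The function names `stein_kernel_matrix` and `ksd_test` are hypothetical. The only input from the target density is its log gradient (`score`), so no normalizing constant is needed.

```python
import numpy as np

def stein_kernel_matrix(x, score, sigma=1.0):
    """Stein kernel h(x_i, x_j) for 1-D samples and a Gaussian base kernel.

    score(x) must return d/dx log p(x) of the target density, elementwise.
    """
    x = np.asarray(x, dtype=float)
    s = score(x)
    diff = x[:, None] - x[None, :]            # x_i - x_j
    k = np.exp(-diff**2 / (2 * sigma**2))     # k(x_i, x_j)
    dk_dx = -diff / sigma**2 * k              # derivative in first argument
    dk_dy = diff / sigma**2 * k               # derivative in second argument
    d2k = (1 / sigma**2 - diff**2 / sigma**4) * k  # mixed second derivative
    return (s[:, None] * s[None, :] * k
            + s[:, None] * dk_dy
            + s[None, :] * dk_dx
            + d2k)

def ksd_test(x, score, sigma=1.0, n_boot=500, seed=0):
    """Return (n * V-statistic, wild-bootstrap p-value).

    Assumption: i.i.d. samples, so weights are independent Rademacher signs.
    """
    rng = np.random.default_rng(seed)
    h = stein_kernel_matrix(x, score, sigma)
    n = h.shape[0]
    stat = n * h.mean()                       # n * (1/n^2) * sum_ij h_ij
    w = rng.choice([-1.0, 1.0], size=(n_boot, n))
    boot = ((w @ h) * w).sum(axis=1) / n      # n * (1/n^2) * sum_ij w_i w_j h_ij
    p_value = float(np.mean(boot >= stat))
    return stat, p_value

# Usage: test samples against a standard normal target, whose score is -x.
rng = np.random.default_rng(1)
samples = rng.normal(loc=0.0, scale=1.0, size=200)
stat, p_value = ksd_test(samples, score=lambda x: -x)
print(f"statistic={stat:.3f}, p-value={p_value:.3f}")
```

A small p-value is evidence that the samples were not drawn from the target. The bandwidth and the number of bootstrap replicates here are arbitrary illustrative choices.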

Code & Models

Repositories

karlnapf/kernel_goodness_of_fit (official)

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Statistical Methods and Inference · Statistical Methods and Bayesian Inference