Testing Consistency of Two Histograms

Frank C. Porter

arXiv:0804.0380 · physics.data-an · April 3, 2008 · 30 cites

TL;DR

This paper investigates several statistical tests of whether two histograms are drawn from the same distribution. It notes that single-sample continuous-distribution tests can be adapted to this two-sample grouped-data setting and that care is required when estimating probabilities with toy Monte Carlo simulations.

Contribution

It compares the performance of several common tests of histogram consistency and discusses the difficulty of working without a fully specified null hypothesis, including its effect on probability estimation.

Findings

1. No single test performs best in all situations.

2. Single-sample continuous-distribution tests can be adapted to two-sample histogram data.

3. Because the null hypothesis is not fully specified in the general case, care is required when estimating probabilities with toy Monte Carlo simulations.

Abstract

Several approaches to testing the hypothesis that two histograms are drawn from the same distribution are investigated. We note that single-sample continuous distribution tests may be adapted to this two-sample grouped data situation. The difficulty of not having a fully specified null hypothesis is an important consideration in the general case, and care is required in estimating probabilities with "toy" Monte Carlo simulations. The performance of several common tests is compared; no single test performs best in all situations.
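As a rough illustration of the kind of procedure the abstract describes, the sketch below compares the shapes of two histograms with a standard two-sample chi-square statistic and estimates a p-value from toy Monte Carlo samples. It is not taken from the paper: the function names `chi2_two_hist` and `toy_mc_pvalue` are hypothetical, and the choice to generate toys from the pooled bin fractions is an assumption made here to stand in for the unspecified null hypothesis. That choice is exactly where the care mentioned in the abstract is needed, since the pooled fractions are only an estimate of the true bin probabilities.

```python
import numpy as np

def chi2_two_hist(u, v):
    """Two-sample chi-square statistic comparing histogram shapes.

    u, v: bin counts on identical binning; totals may differ.
    Bins that are empty in both histograms are skipped.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    nu, nv = u.sum(), v.sum()
    mask = (u + v) > 0
    # Scale factors make the statistic insensitive to the overall normalization.
    diff = np.sqrt(nv / nu) * u[mask] - np.sqrt(nu / nv) * v[mask]
    return float(np.sum(diff**2 / (u[mask] + v[mask])))

def toy_mc_pvalue(u, v, n_toys=2000, seed=0):
    """Estimate a p-value with toy Monte Carlo (hypothetical helper).

    Assumption: the unknown null distribution is approximated by the
    pooled bin fractions, and each toy keeps the observed totals fixed.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u)
    v = np.asarray(v)
    nu, nv = int(u.sum()), int(v.sum())
    pooled = (u + v) / (nu + nv)
    t_obs = chi2_two_hist(u, v)
    toys_u = rng.multinomial(nu, pooled, size=n_toys)
    toys_v = rng.multinomial(nv, pooled, size=n_toys)
    t_toys = np.array([chi2_two_hist(a, b) for a, b in zip(toys_u, toys_v)])
    # Fraction of toys at least as discrepant as the observed pair.
    return t_obs, float(np.mean(t_toys >= t_obs))
```

Calling `toy_mc_pvalue(u, v)` returns the observed statistic and the fraction of toys with a statistic at least as large. With low bin counts this estimate can differ noticeably from the asymptotic chi-square probability, and it also depends on how the null distribution is approximated when generating the toys.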

Taxonomy

Topics: Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference · Advanced Statistical Process Monitoring