Testing Consistency of Two Histograms
Frank C. Porter

TL;DR
This paper investigates several statistical tests of whether two histograms are drawn from the same distribution. It shows that single-sample tests can be adapted to this setting and that probabilities estimated from "toy" Monte Carlo simulations require care.
Contribution
The paper compares the performance of several common tests of histogram consistency and discusses how the lack of a fully specified null hypothesis complicates probability estimation.
Findings
No single test is universally best across all scenarios
Adapting single-sample tests to two-sample histogram data is feasible
Estimating probabilities with "toy" Monte Carlo simulations requires care because the null hypothesis is not fully specified
Abstract
Several approaches to testing the hypothesis that two histograms are drawn from the same distribution are investigated. We note that single-sample continuous distribution tests may be adapted to this two-sample grouped data situation. The difficulty of not having a fully-specified null hypothesis is an important consideration in the general case, and care is required in estimating probabilities with "toy" Monte Carlo simulations. The performance of several common tests is compared; no single test performs best in all situations.
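To make the abstract's setup concrete, the following is a minimal sketch of one way such a comparison could be run. It is not taken from the paper. It assumes the standard pooled chi-square homogeneity statistic for two binned samples, and it assumes a toy Monte Carlo in which the unknown null distribution is replaced by the pooled bin fractions of the two histograms, with each toy drawn as a multinomial of fixed total. That substitution is exactly the kind of choice the abstract warns about, since the true null is not fully specified. The function names `chi2_two_hist` and `toy_mc_pvalue` are hypothetical.

```python
import numpy as np


def chi2_two_hist(u, v):
    """Pooled chi-square statistic for two histograms with the same binning.

    Assumption: the standard homogeneity form
    sum_i (N_v*u_i - N_u*v_i)^2 / (u_i + v_i) / (N_u*N_v),
    which may differ from the statistic defined in the paper.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    n_u, n_v = u.sum(), v.sum()
    total = u + v
    mask = total > 0  # bins empty in both histograms carry no information
    num = (n_v * u[mask] - n_u * v[mask]) ** 2
    return float(np.sum(num / total[mask]) / (n_u * n_v))


def toy_mc_pvalue(u, v, n_toys=2000, seed=0):
    """Estimate a p-value for the observed statistic with toy experiments.

    Assumption: the null bin probabilities are estimated from the pooled
    histograms, and the totals N_u and N_v are held fixed in every toy.
    """
    rng = np.random.default_rng(seed)
    u = np.asarray(u)
    v = np.asarray(v)
    n_u, n_v = int(u.sum()), int(v.sum())
    p_null = (u + v) / (n_u + n_v)  # estimated, not the true null

    t_obs = chi2_two_hist(u, v)
    toys_u = rng.multinomial(n_u, p_null, size=n_toys)
    toys_v = rng.multinomial(n_v, p_null, size=n_toys)
    t_toys = np.array([chi2_two_hist(a, b) for a, b in zip(toys_u, toys_v)])

    # Fraction of toys at least as extreme as the observation
    return t_obs, float(np.mean(t_toys >= t_obs))


if __name__ == "__main__":
    t, p = toy_mc_pvalue([12, 30, 41, 17], [25, 55, 90, 30])
    print(f"statistic = {t:.3f}, toy MC p-value = {p:.3f}")
```

Because the toys are generated from estimated rather than true bin probabilities, the resulting p-value is only approximate, and its reliability depends on the bin contents.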
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference · Advanced Statistical Process Monitoring
