Chi-square and Poissonian Data: Biases Even in the High-Count Regime and How to Avoid Them
Philip J. Humphrey, Wenhao Liu, David A. Buote (UC Irvine)

TL;DR
This paper shows that two common approximations to the chi-square statistic used in astronomical data fitting can bias parameter estimates even at high counts, and it recommends Cash's C-statistic for comparatively unbiased estimation.
Contribution
It demonstrates the biases of chi-square approximations in high-count regimes and advocates for using the C-statistic to avoid these biases in Poisson data fitting.
Findings
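Unless the number of data bins is far smaller than sqrt(N_c), where N_c is the total number of counts, the bias from chi-square approximations can be comparable to, or even exceed, the statistical error.
Cash's C-statistic gives comparatively unbiased parameter estimates when the counts are high.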
Biases can impact satellite calibration and galaxy cluster measurements.
Abstract
We demonstrate that two approximations to the chi^2 statistic as popularly employed by observational astronomers for fitting Poisson-distributed data can give rise to intrinsically biased model parameter estimates, even in the high counts regime, unless care is taken over the parameterization of the problem. For a small number of problems, previous studies have shown that the fractional bias introduced by these approximations is often small when the counts are high. However, we show that for a broad class of problem, unless the number of data bins is far smaller than \sqrt{N_c}, where N_c is the total number of counts in the dataset, the bias will still likely be comparable to, or even exceed, the statistical error. Conversely, we find that fits using Cash's C-statistic give comparatively unbiased parameter estimates when the counts are high. Taking into account their well-known…
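The scaling with sqrt(N_c) described in the abstract can be illustrated with a toy problem. The following is a minimal sketch, not taken from the paper: it assumes the simplest possible model (a single constant fitted to Poisson counts in many bins) and assumes the two chi-square approximations are the data-variance (Neyman) and model-variance (Pearson) weightings, which this summary does not name. For a constant model each statistic has a closed-form minimum, so no optimizer is needed. All names (`poisson_draw`, `fit_constant`, `MU_TRUE`, `N_BINS`) are hypothetical.

```python
import math
import random


def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate with Knuth's multiplication method."""
    threshold = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def fit_constant(counts):
    """Best-fit constant m for three fit statistics (closed-form minima).

    Assumes every bin has at least one count; the data-variance
    chi-square is undefined for empty bins.
    """
    n = len(counts)
    # Data-variance chi-square, sum((d - m)^2 / d): minimum is the harmonic mean.
    neyman = n / sum(1.0 / d for d in counts)
    # Model-variance chi-square, sum((d - m)^2 / m): minimum is the RMS of the data.
    pearson = math.sqrt(sum(d * d for d in counts) / n)
    # Cash C-statistic, 2 * sum(m - d * ln m): minimum is the arithmetic mean.
    cash = sum(counts) / n
    return neyman, pearson, cash


MU_TRUE = 100.0   # illustrative true counts per bin ("high counts")
N_BINS = 10_000   # illustrative number of bins; N_c is about 1e6, sqrt(N_c) about 1e3

rng = random.Random(12345)
counts = [poisson_draw(rng, MU_TRUE) for _ in range(N_BINS)]

neyman, pearson, cash = fit_constant(counts)
stat_error = math.sqrt(MU_TRUE / N_BINS)  # approximate 1-sigma error on the fitted constant

for name, estimate in (("data-variance chi2", neyman),
                       ("model-variance chi2", pearson),
                       ("Cash C-statistic", cash)):
    bias = estimate - MU_TRUE
    print(f"{name:20s} m = {estimate:8.3f}  bias/sigma = {bias / stat_error:+6.1f}")
```

With these illustrative settings the number of bins (10^4) is well above sqrt(N_c) (about 10^3), which is the regime the abstract warns about. The closed forms suggest shifts of roughly -1 and +0.5 counts per bin for the two chi-square variants, against a statistical error of about 0.1, while the C-statistic estimate is the sample mean. Reducing `N_BINS` well below sqrt(N_c) at fixed `MU_TRUE` should shrink the bias relative to the error.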
