Estimation of Goodness-of-Fit in Multidimensional Analysis Using Distance to Nearest Neighbor
Ilya Narsky

TL;DR
This paper introduces a novel approach for assessing the goodness-of-fit in multidimensional data analysis by examining nearest neighbor clusters, offering insights into data modeling issues and outperforming traditional methods in certain scenarios.
Contribution
The paper presents a new nearest neighbor-based method for goodness-of-fit estimation in multidimensional data, with demonstrated advantages over likelihood and Kolmogorov-Smirnov tests.
Findings
The new method effectively identifies data-model discrepancies.
It outperforms traditional goodness-of-fit tests in toy Monte Carlo studies.
The method is applied to estimate the goodness-of-fit in a B->Kll analysis at BaBar.
Abstract
A new method for calculation of goodness of multidimensional fits in particle physics experiments is proposed. This method finds the smallest and largest clusters of nearest neighbors for observed data points. The cluster size is used to estimate the goodness-of-fit and the cluster location provides clues about possible problems with data modeling. The performance of the new method is compared to that of the likelihood method and Kolmogorov-Smirnov test using toy Monte Carlo studies. The new method is applied to estimate the goodness-of-fit in a B->Kll analysis at BaBar.
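The idea in the abstract can be illustrated with a minimal sketch. This is not the author's exact algorithm. It assumes the observed points have already been transformed so that they would be uniformly distributed in the unit hypercube if the fitted model were correct, and it measures "cluster size" as the distance from each point to its k-th nearest neighbor. The function names, the choice of `k`, and the toy-based p-values are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def kth_neighbor_distances(points, k):
    """Distance from each point to its k-th nearest neighbor (brute force)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    dist.sort(axis=1)  # column 0 is the point itself (distance 0)
    return dist[:, k]


def nn_gof(points, k=5, n_toys=200, seed=0):
    """Compare the smallest and largest k-neighbor clusters with uniform toys.

    Assumption: `points` lie in the unit hypercube and are uniform there
    if the model under test describes the data.
    """
    rng = np.random.default_rng(seed)
    n, d = points.shape
    obs = kth_neighbor_distances(points, k)

    # Toy Monte Carlo: the same statistics for truly uniform samples.
    # Edge effects of the hypercube are present in data and toys alike.
    toy_min = np.empty(n_toys)
    toy_max = np.empty(n_toys)
    for t in range(n_toys):
        toy = kth_neighbor_distances(rng.random((n, d)), k)
        toy_min[t], toy_max[t] = toy.min(), toy.max()

    return {
        # Fraction of toys with a cluster at least as tight as the observed one.
        "p_dense": float(np.mean(toy_min <= obs.min())),
        # Fraction of toys with a void at least as large as the observed one.
        "p_sparse": float(np.mean(toy_max >= obs.max())),
        # Cluster locations hint at where the model disagrees with the data.
        "densest_point": points[obs.argmin()],
        "sparsest_point": points[obs.argmax()],
    }


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    uniform = rng.random((200, 2))
    clump = 0.5 + 0.001 * rng.random((20, 2))  # unmodeled narrow excess
    result = nn_gof(np.vstack([uniform, clump]))
    print(result["p_dense"], result["densest_point"])
```

A small `p_dense` signals an excess of points that the model does not describe, a small `p_sparse` signals a deficit, and the returned locations indicate where in the transformed space to look.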
Taxonomy
Topics: Particle physics theoretical and experimental studies · High-Energy Particle Collisions Research · Cosmology and Gravitation Theories
