Active Measurement of Two-Point Correlations
Max Hamilton, Daniel Sheldon, Subhransu Maji

TL;DR
This paper introduces a human-in-the-loop framework for efficiently estimating two-point correlation functions of a target subset of points within a large dataset (for example, star clusters among the sources in a galaxy). It combines classifier-guided adaptive sampling, an unbiased estimator of pair counts, and confidence intervals.
Contribution
It presents a new scalable method that reduces annotation effort and variance in measuring two-point correlations, leveraging pre-trained classifiers and adaptive sampling strategies.
Findings
Achieves lower variance than simple Monte Carlo approaches.
Reduces human annotation effort significantly.
Provides statistically rigorous confidence intervals.
Abstract
Two-point correlation functions (2PCF) are widely used to characterize how points cluster in space. In this work, we study the problem of measuring the 2PCF over a large set of points, restricted to a subset satisfying a property of interest. An example comes from astronomy, where scientists measure the 2PCF of star clusters, which make up only a tiny subset of possible sources within a galaxy. This task typically requires careful labeling of sources to construct catalogs, which is time-consuming. We present a human-in-the-loop framework for efficient estimation of 2PCF of target sources. By leveraging a pre-trained classifier to guide sampling, our approach adaptively selects the most informative points for human annotation. After each annotation, it produces unbiased estimates of pair counts across multiple distance bins simultaneously. Compared to simple Monte Carlo approaches, our…
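To make the idea of classifier-guided, unbiased pair counting concrete, here is a minimal sketch in Python. It is not the paper's estimator: it uses plain (non-adaptive) importance sampling over pairs, with a proposal proportional to the product of classifier scores, as a simplified stand-in. The function name `estimate_pair_counts`, the `label_fn` callback that plays the role of the human annotator, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def estimate_pair_counts(points, scores, label_fn, bin_edges, n_samples, rng):
    """Importance-sampling estimate of target-target pair counts per distance bin.

    Hypothetical helper, not the paper's method. `scores` are classifier
    scores in [0, 1]; `label_fn(k)` returns the true label of point k
    (standing in for a human annotation). The estimate is unbiased as long
    as every pair of true targets has nonzero proposal probability.
    """
    n = len(points)
    i_idx, j_idx = np.triu_indices(n, k=1)          # all unordered pairs i < j
    dists = np.linalg.norm(points[i_idx] - points[j_idx], axis=1)
    bins = np.digitize(dists, bin_edges) - 1        # bin index per pair
    q = scores[i_idx] * scores[j_idx]               # proposal: likely target pairs
    q = q / q.sum()

    n_bins = len(bin_edges) - 1
    est = np.zeros(n_bins)
    cache = {}                                      # each point is annotated once

    def label(k):
        k = int(k)
        if k not in cache:
            cache[k] = bool(label_fn(k))
        return cache[k]

    for d in rng.choice(len(q), size=n_samples, p=q):
        b = bins[d]
        # Only pairs inside the binned range with both points labeled as
        # targets contribute, weighted by the inverse proposal probability.
        if 0 <= b < n_bins and label(i_idx[d]) and label(j_idx[d]):
            est[b] += 1.0 / q[d]
    return est / n_samples, len(cache)

# Synthetic demo (all values are made up for illustration).
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(200, 2))
labels = rng.random(200) < 0.1                      # rare targets
scores = np.clip(0.8 * labels + 0.1 + 0.05 * rng.standard_normal(200), 0.01, 1.0)
edges = np.linspace(0.0, 0.5, 6)
estimate, n_labeled = estimate_pair_counts(
    points, scores, lambda k: labels[k], edges, 500, rng
)
print("estimated pair counts per bin:", estimate)
print("points annotated:", n_labeled)
```

Because a single sampled pair falls into exactly one distance bin, one set of annotations updates the estimates for all bins at once, which mirrors the "multiple distance bins simultaneously" property described in the abstract. An adaptive scheme would additionally update the proposal after each annotation; that step is omitted here.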
