TL;DR
This paper derives formulas to determine the required measurement accuracy of selection functions using Monte-Carlo injections, ensuring unbiased population inference, with the number of injections scaling linearly with population size.
Contribution
It provides a mathematical framework linking injection measurement accuracy to unbiased population inference in selection functions.
Findings
Number of injections scales linearly with population size
Coefficient depends on injection and population distributions
Formulas enable planning of injection campaigns for unbiased results
Abstract
I give formulas for the accuracy to which a selection function must be measured via Monte-Carlo injections in order to have unbiased population inference. The number of found injections scales linearly with the number of objects in the population; the coefficient in front of the linear term depends on both the distribution of injections and the inferred population distribution.
Accuracy Requirements for Empirically-Measured Selection Functions
Department of Physics and Astronomy, Stony Brook University, Stony Brook NY 11794, United States
Center for Computational Astronomy, Flatiron Institute, New York NY 10010, United States
When conducting a population analysis on a catalog of objects, the effect of the selection function must be incorporated to avoid so-called “Malmquist bias” (Malmquist, 1922; Loredo, 2004; Mandel et al., 2018). Suppose we have a catalog consisting of data $d_i$, $i = 1, \ldots, N$, that constrain the parameters $\theta_i$ of a set of objects. We wish to infer the population distribution function
$$\frac{\mathrm{d}N}{\mathrm{d}\theta}(\lambda), \tag{1}$$
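which can depend on some population-level parameters $\lambda$. The joint posterior for the object-level parameters and population-level parameters is (Loredo, 2004; Mandel et al., 2018)
$$\pi\left(\{\theta_i\}, \lambda \mid \{d_i\}\right) \propto \left[\prod_{i=1}^{N} p(d_i \mid \theta_i)\, \frac{\mathrm{d}N}{\mathrm{d}\theta_i}(\lambda)\right] \exp\left[-\Lambda(\lambda)\right] p(\lambda). \tag{2}$$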
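Here $p(d \mid \theta)$ is the likelihood function that describes the measurement process for the catalog, $p(\lambda)$ is a prior, and $\Lambda$ is the expected number of detections:
$$\Lambda(\lambda) \equiv \int_{\{d \,:\, f(d) > 0\}} \mathrm{d}d\, \mathrm{d}\theta\; p(d \mid \theta)\, \frac{\mathrm{d}N}{\mathrm{d}\theta}(\lambda). \tag{3}$$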
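The function $f$ represents the selection function; an observation will be included in the catalog if and only if it generates data $d$ such that $f(d) > 0$. We factor an overall normalization $R$ out of the population distribution so that
$$\frac{\mathrm{d}N}{\mathrm{d}\theta}(\lambda) = R\, p(\theta \mid \lambda'), \tag{4}$$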
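with the amplitude of $p$ fixed in some way; $\lambda'$ is the set of parameters that remain once the amplitude of the population distribution is fixed. In this re-parameterization, $\Lambda = R x$, where $x$ is given by
$$x(\lambda') \equiv \int_{\{d \,:\, f(d) > 0\}} \mathrm{d}d\, \mathrm{d}\theta\; p(d \mid \theta)\, p(\theta \mid \lambda'). \tag{5}$$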
If $p(\theta \mid \lambda')$ integrates to one over all $\theta$, then $x$ is the fraction of sources from a population described by $\lambda'$ that are detectable.
In simple cases the integral in Eq. (5) can be evaluated analytically. For most realistic applications, however, $x$ cannot be evaluated analytically (see, e.g., Burke et al., 2015; Christiansen et al., 2015; Abbott et al., 2016a, b; Burke & Catanzarite, 2017). Instead, the detection efficiency must be estimated by drawing synthetic objects from a fiducial distribution, $p_\mathrm{draw}(\theta)$, drawing corresponding data from the likelihood function $p(d \mid \theta)$, and “injecting” these data into the pipeline used to produce the catalog, recording which observations are detected (Tiwari, 2018). This procedure introduces uncertainty in the estimate of the selection integral; we must have enough draws that this uncertainty does not appreciably alter the shape of the posterior.
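The injection procedure described above can be sketched on a toy problem. Everything specific here is an assumption made for illustration: a one-dimensional parameter, a uniform fiducial draw distribution on [0, 1], Gaussian measurement noise, and a hard threshold on the data standing in for the pipeline's selection function. The function name `run_injections` is hypothetical and does not come from the paper or its repository.

```python
import numpy as np

def run_injections(n_draw, threshold=0.5, noise=0.1, seed=0):
    """Toy injection campaign (illustrative assumptions throughout).

    Draws parameters from a uniform fiducial distribution, generates noisy
    data for each draw, and keeps the draws whose data pass the threshold.
    Returns the parameters of the found injections and the total draw count.
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 1.0, size=n_draw)           # draws from the fiducial distribution
    data = theta + rng.normal(0.0, noise, size=n_draw)   # draws from a Gaussian likelihood
    found = data > threshold                              # toy selection: data above threshold
    return theta[found], n_draw

theta_found, n_draw = run_injections(10_000)
found_fraction = len(theta_found) / n_draw
```

Only the parameters of the found injections and the total number of draws need to be stored; undetected draws enter later estimates with zero weight.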
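Given a set of detected objects with parameters $\theta_j$, $j = 1, \ldots, N_\mathrm{det}$, generated from a total number of draws $N_\mathrm{draw}$, the integral in Eq. (5) can be estimated via
$$x(\lambda') \simeq \frac{1}{N_\mathrm{draw}} \sum_{j=1}^{N_\mathrm{det}} \frac{p(\theta_j \mid \lambda')}{p_\mathrm{draw}(\theta_j)}. \tag{6}$$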
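Under repeated samplings $x$ will follow an approximately normal distribution
$$x \sim \mathcal{N}(\mu, \sigma), \tag{7}$$
with
$$\mu \equiv \frac{1}{N_\mathrm{draw}} \sum_{j=1}^{N_\mathrm{det}} \frac{p(\theta_j \mid \lambda')}{p_\mathrm{draw}(\theta_j)} \tag{8}$$
and
$$\sigma^2 \equiv \frac{\mu^2}{N_\mathrm{eff}} \simeq \frac{1}{N_\mathrm{draw}^2} \sum_{j=1}^{N_\mathrm{det}} \left[\frac{p(\theta_j \mid \lambda')}{p_\mathrm{draw}(\theta_j)}\right]^2 - \frac{\mu^2}{N_\mathrm{draw}}. \tag{9}$$
We have introduced the parameter $N_\mathrm{eff}$ that gives the effective number of independent draws that contribute to the estimate of $x$.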
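A minimal numerical sketch of these estimates, assuming the importance weights (population density divided by draw density, evaluated at each found injection) have already been computed, and taking the effective number of draws to be the squared mean divided by the variance of the mean. The function name `selection_estimate` and the toy numbers are illustrative and not taken from the paper or its repository.

```python
import numpy as np

def selection_estimate(weights, n_draw):
    """Monte-Carlo estimate of the detection efficiency from found injections.

    weights: population density / draw density at each found injection.
    n_draw:  total number of injections drawn, found or not.
    Returns (mu, sigma2, n_eff).
    """
    w = np.asarray(weights, dtype=float)
    mu = w.sum() / n_draw
    # Variance of the sample mean; undetected draws contribute zero weight.
    sigma2 = (w ** 2).sum() / n_draw ** 2 - mu ** 2 / n_draw
    # Effective number of independent draws (assumed definition: mu^2 / sigma^2).
    n_eff = mu ** 2 / sigma2
    return mu, sigma2, n_eff

# Four found injections out of ten draws, all with unit weight.
mu, sigma2, n_eff = selection_estimate([1.0, 1.0, 1.0, 1.0], n_draw=10)
```

With equal unit weights the variance reduces to the binomial result p(1 - p)/n_draw for p = 0.4, which is a convenient sanity check on the formula.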
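Given a particular sampling of the selection function, we should marginalize over the uncertainty in $x$. Eq. (2) becomes
$$\pi\left(\{\theta_i\}, \lambda', R, x \mid \{d_i\}\right) \propto \left[\prod_{i=1}^{N} p(d_i \mid \theta_i)\, R\, p(\theta_i \mid \lambda')\right] \exp\left[-R x\right] \mathcal{N}(x \mid \mu, \sigma)\, p(R, \lambda'). \tag{10}$$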
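Integrating over $x$, we obtain
$$\pi\left(\{\theta_i\}, \lambda', R \mid \{d_i\}\right) \propto \left[\prod_{i=1}^{N} p(d_i \mid \theta_i)\, R\, p(\theta_i \mid \lambda')\right] \exp\left[\frac{R\mu\left(R\mu - 2N_\mathrm{eff}\right)}{2N_\mathrm{eff}}\right] p(R, \lambda'). \tag{11}$$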
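The marginalization over the detection efficiency is a standard Gaussian integral. As a sketch, write $R$ for the overall rate, $x$ for the detection efficiency, and $\mu$, $\sigma$ for the mean and standard deviation of its normal approximation, and assume the integral runs over all real $x$ (which is what the normal approximation implies). The $x$-dependent factors then give

$$\int_{-\infty}^{\infty} \mathrm{d}x\; e^{-R x}\, \mathcal{N}(x \mid \mu, \sigma) = \exp\left[-R\mu + \frac{R^2\sigma^2}{2}\right] = \exp\left[\frac{R\mu\left(R\mu - 2N_\mathrm{eff}\right)}{2N_\mathrm{eff}}\right],$$

where the last form assumes $\sigma^2 = \mu^2 / N_\mathrm{eff}$, with $N_\mathrm{eff}$ the effective number of independent draws. The positive quadratic term in the exponent arises from the tail of the normal distribution at negative $x$.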
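The divergence of this expression as $R \to \infty$ reflects that the normal approximation permits non-zero probability of $x < 0$. Setting aside the prior, Eq. (11) has stationary points in $R$ at
$$R_\pm = \frac{N_\mathrm{eff}}{2\mu}\left[1 \pm \sqrt{1 - \frac{4N}{N_\mathrm{eff}}}\right]. \tag{12}$$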
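Provided $N_\mathrm{eff} > 4N$ these stationary points will occur for real, positive $R$. In this case, the stationary point at $R_-$ is a local maximum; at $R_+$ we have a minimum associated with the “unphysical” transition to the divergent behavior as $R \to \infty$. We have
$$R_- = \frac{N}{\mu}\left[1 + \frac{N}{N_\mathrm{eff}} + \mathcal{O}\left(N_\mathrm{eff}^{-2}\right)\right]. \tag{13}$$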
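Here $\mu$ is the point estimate for the detection efficiency in Eq. (6). Near $R_-$ a normal approximation holds for the posterior as a function of $R$ with $\mu_R = R_-$ and
$$\sigma_R = \frac{\sqrt{N}}{\mu}\left[1 + \frac{3N}{2N_\mathrm{eff}} + \mathcal{O}\left(N_\mathrm{eff}^{-2}\right)\right]. \tag{14}$$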
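The location of the stationary points can be checked numerically. The sketch below assumes the rate-dependent part of the marginalized posterior is $R^N \exp[-R\mu + R^2\mu^2/(2N_\mathrm{eff})]$, with $N$ the number of catalog objects, $\mu$ the point estimate of the detection efficiency, and $N_\mathrm{eff}$ the effective number of draws; the function name `rate_stationary_points` and the example numbers are hypothetical.

```python
import math

def rate_stationary_points(n_obs, mu, n_eff):
    """Stationary points in R of R**n_obs * exp(-R*mu + (R*mu)**2 / (2*n_eff)).

    Returns (r_minus, r_plus): the local maximum and the local minimum.
    Raises ValueError when n_eff <= 4 * n_obs, where no real roots exist.
    """
    if n_eff <= 4 * n_obs:
        raise ValueError("need n_eff > 4 * n_obs for real stationary points")
    root = math.sqrt(1.0 - 4.0 * n_obs / n_eff)
    r_minus = n_eff / (2.0 * mu) * (1.0 - root)
    r_plus = n_eff / (2.0 * mu) * (1.0 + root)
    return r_minus, r_plus

n_obs, mu, n_eff = 10, 0.1, 1000.0
r_minus, r_plus = rate_stationary_points(n_obs, mu, n_eff)
# Derivative of the log of the assumed expression, evaluated at the maximum.
slope_at_max = n_obs / r_minus - mu + r_minus * mu ** 2 / n_eff
# Leading-order expansion of the maximum for comparison.
r_expansion = n_obs / mu * (1.0 + n_obs / n_eff)
```

For these numbers the exact maximum and the first-order expansion agree to well under one part in a thousand, while the minimum sits at a rate roughly a hundred times larger.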
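Marginalizing the normal approximation over $R$, imposing a flat-in-log prior, gives
$$\pi\left(\{\theta_i\}, \lambda' \mid \{d_i\}\right) \propto \left[\prod_{i=1}^{N} \frac{p(d_i \mid \theta_i)\, p(\theta_i \mid \lambda')}{\mu}\right] \exp\left[\frac{3N + N^2}{2N_\mathrm{eff}} + \mathcal{O}\left(N_\mathrm{eff}^{-2}\right)\right] p(\lambda'). \tag{15}$$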
The term involving $\mu$ would appear in an analysis that ignores the rate and works entirely with population distributions (Mandel et al., 2018; Fishbach et al., 2018); the term involving $N_\mathrm{eff}$ is a correction to account for the uncertainty in our estimate of the selection integral.
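The uncertainty in the parameters $\lambda'$ is driven by differences in the log-posterior. The $\mu$- and $N_\mathrm{eff}$-dependent terms contribute to such differences through
$$\frac{\partial}{\partial \lambda'}\left[-N \ln \mu + \frac{3N + N^2}{2N_\mathrm{eff}}\right] = -N\left[\frac{\partial \ln \mu}{\partial \lambda'} + \frac{3 + N}{2N_\mathrm{eff}}\, \frac{\partial \ln N_\mathrm{eff}}{\partial \lambda'}\right]. \tag{16}$$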
Both derivatives are independent of $N$, so the relative contribution of the second term to the parameter estimates is $\mathcal{O}(N / N_\mathrm{eff})$.
If $N_\mathrm{eff}$ becomes close to $4N$ for any relevant set of population parameters then the posterior no longer peaks in $R$ and more injections must be obtained for an accurate analysis.
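A hedged sketch of how this criterion might be applied in practice. It assumes the selection-dependent part of the rate-marginalized log-posterior takes the form $-N \ln \mu + (3N + N^2)/(2N_\mathrm{eff})$ and that the analysis is considered reliable only when $N_\mathrm{eff} > 4N$; the function name `selection_log_terms` and the example numbers are hypothetical.

```python
import math

def selection_log_terms(n_obs, mu, n_eff):
    """Selection-dependent terms of the rate-marginalized log-posterior.

    Returns (leading, correction, converged):
      leading    = -n_obs * ln(mu), the usual selection term;
      correction = (3*n_obs + n_obs**2) / (2*n_eff), from injection uncertainty;
      converged  = True when n_eff > 4 * n_obs (assumed reliability criterion).
    """
    leading = -n_obs * math.log(mu)
    correction = (3 * n_obs + n_obs ** 2) / (2.0 * n_eff)
    converged = n_eff > 4 * n_obs
    return leading, correction, converged

leading, correction, converged = selection_log_terms(n_obs=10, mu=0.1, n_eff=1000.0)
_, _, too_few = selection_log_terms(n_obs=10, mu=0.1, n_eff=30.0)
```

Returning the convergence flag alongside the terms leaves the caller free to decide how to handle population parameters for which the injection set is too sparse, for example by rejecting them or by requesting more injections.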
A worked example, along with the LaTeX source for this document, can be found at https://github.com/farr/SelectionAccuracy.
References
- Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2016a, ApJ, 833, L1, doi: 10.3847/2041-8205/833/1/L1
- Abbott, B. P., Abbott, R., Abbott, T. D., et al. 2016b, ApJS, 227, 14, doi: 10.3847/0067-0049/227/2/14
- Burke, C. J., & Catanzarite, J. 2017, Planet Detection Metrics: Per-Target Detection Contours for Data Release 25, Technical Report KSCI-19111-002, NASA Ames Research Center
- Burke, C. J., Christiansen, J. L., Mullally, F., et al. 2015, ApJ, 809, 8, doi: 10.1088/0004-637X/809/1/8
- Christiansen, J. L., Clarke, B. D., Burke, C. J., et al. 2015, ApJ, 810, 95, doi: 10.1088/0004-637X/810/2/95
- Fishbach, M., Holz, D. E., & Farr, W. M. 2018, ApJ, 863, L41, doi: 10.3847/2041-8213/aad800
- Loredo, T. J. 2004, in American Institute of Physics Conference Series, ed. R. Fischer, R. Preuss, & U. V. Toussaint, Vol. 735, 195–206
- Malmquist, K. G. 1922, Meddelanden fran Lunds Astronomiska Observatorium Serie I, 100, 1
