Photometric Biases in Modern Surveys
Stephen K. N. Portillo, Joshua S. Speagle, and Douglas P. Finkbeiner

TL;DR
This paper demonstrates that maximum-likelihood photometric estimators systematically overestimate flux, especially for resolved sources and in multi-band fitting, with implications for survey data analysis and bias correction methods.
Contribution
It reveals the extent and behavior of ML flux biases in modern surveys and provides correction prescriptions for these biases.
Findings
ML estimators overestimate flux, especially for resolved sources.
Bias increases with lower signal-to-noise ratio and more model parameters.
Simulations and real survey data confirm the presence of these biases.
Abstract
Many surveys use maximum-likelihood (ML) methods to fit models when extracting photometry from images. We show these ML estimators systematically overestimate the flux as a function of the signal-to-noise ratio and the number of model parameters involved in the fit. This bias is substantially worse for resolved sources: while a 1% bias is expected for a 10 point source, a 10 resolved galaxy with a simplified Gaussian profile suffers a 2.5% bias. This bias also behaves differently depending how multiple bands are used in the fit: simultaneously fitting all bands leads the flux bias to become roughly evenly distributed between them, while fixing the position in "non-detection" bands (i.e. forced photometry) gives flux estimates in those bands that are biased low, compounding a bias in derived colors. We show that these effects are present in idealized simulations, outputs…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Photometric Biases in Modern Surveys
Equal contribution
DIRAC Institute, Department of Astronomy
University of Washington
3910 15th Ave NE
Seattle, WA 98195, USA
Equal contribution
Center for Astrophysics — Harvard & Smithsonian
60 Garden St, MS-10
Cambridge, MA 02138, USA
Center for Astrophysics — Harvard & Smithsonian
60 Garden St, MS-10
Cambridge, MA 02138, USA Stephen K. N. Portillo [email protected], [email protected], [email protected]
Abstract
surveys use maximum-likelihood (ML) methods to fit models when extracting photometry from images. We show these ML estimators systematically overestimate the flux as a function of the signal-to-noise ratio and the number of model parameters involved in the fit. This bias is substantially worse for : while a bias is expected for a point source, a galaxy with a simplified Gaussian profile suffers a bias. This bias also behaves differently depending how multiple bands are used in the fit: simultaneously fitting all bands leads the flux bias to become roughly evenly distributed between them, while fixing the position in “non-detection” bands (i.e. forced photometry) gives flux estimates in those bands that are biased low, compounding a bias in derived colors. We show that these effects are present in idealized simulations, outputs from the fake object pipeline (SynPipe), and observations from Stripe 82.
methods: statistical — methods: data analysis — catalogs — techniques: image processing
\turnoffedit
1 Introduction
For example, the weak-lensing studies, supernova cosmology, classifying potentially hazardous asteroids, and to separate out main sequence and giant stars to map the galaxy .
the gold standard for precision photometry was photoelectric measurement with a photomultiplier tube (e.g. Landolt, 1973, 1983). The photons entering an aperture of, e.g., an eight arcsecond radius were detected and counted. In this way, hundreds of stars could be measured with great precision, and these “standard stars” formed the basis of the commonly used photometric systems. measurements of these standards were impressively consistent, but even the “Landolt Faint Standards” (Landolt, 1992) are relatively bright compared to the saturation limit of modern wide-area surveys.
Modern surveys use arrays of pixelated sensors such as charge-coupled devices (CCDs) in the optical/UV bands and mercury-cadmium-telluride (HgCdTe) devices in the near infrared. Surveys then perform “aperture photometry” by adding up all (background-subtracted) counts inside of a circle in an image. This process works well for isolated stars well above the background noise, and forms the basis of, e.g., the Sloan Digital Sky Survey photometric calibration (Padmanabhan et al., 2008).
Aperture photometry, however, has several significant issues that hamper its usage.
Shrinking the size of the aperture can this problem but then results in some amount of light from most sources being excluded, necessitating an “aperture correction” that needs to be calibrated . Aperture photometry can also become quite sensitive to issues relating to background estimation: because all counts in the aperture are being added together, these systematic offsets contribute an increasing portion of the counts with increasing aperture size .
point sources the appropriate 2-D “filter” is the point-spread function. Put another way, PSF photometry involves a parametric generative model of the noiseless data as a function of some parameters . Combining this with an appropriate noise model then yields a likelihood function. Maximizing the likelihood then provides an estimate of the flux with the highest signal-to-noise ratio.
Even for point sources, this process is not always straightforward. While aperture photometry simply requires an appropriately-sized aperture, PSF photometry relies on having an accurate model of the underlying source. This requires knowing the PSF precisely across the image to avoid applying a mismatched filter. While for space-based telescopes the PSF can be quite stable, for ground-based surveys it varies as a function of time and position in the focal plane to such a degree that it can cause flux errors of , dominating the error budget of bright stars (e.g. Padmanabhan et al., 2008).
Furthermore, while PSF photometry can perform better than aperture photometry for fainter objects by avoiding “overweighting” the background, it still struggles in crowded regions where the measured parameters for objects become covariant with those of close neighbors (or even bright neighbors that are not so close). In the extremely crowded limit, the challenge of estimating the PSF and the background level may become severe. These questions are explored elsewhere (Brewer et al., 2013; Portillo et al., 2017) and are not the focus of this paper.
This maximum likelihood approach has been implemented in many software packages that are widely used in astronomy, including DAOPHOT/ALLFRAME (Stetson, 1987, 1994), DoPHOT (Schechter et al., 1993), HSTphot/DOLPHOT (Dolphin, 2000), Photo (Lupton et al., 2001), psphot (Magnier et al., 2016), The Tractor (Lang et al., 2016) in optimization mode, the Hyper Suprime-Cam (HSC) pipeline (Bosch et al., 2018), and the LSST Stack (Jurić et al., 2017). While these software packages differ on implementation details (e.g. estimating the sky level, measuring the PSF, and deblending sources), they are similar in how they measure the positions and fluxes of sources. (PSF/model fitting photometry for point/extended sources), possibly alongside aperture photometry at the same positions.
photometry (in theory) maximizes the SNR (i.e. is most precise), that does not mean that it is unbiased (i.e. is most accurate) or gives proper errors (i.e. has appropriate coverage). Indeed, that maximum-likelihood estimators are generically biased to order . While there are many biases/systematic errors that can appear in photometry (cf. Nyland et al. 2017), in this paper we focus on the biases that arise in photometry due to it being a . We show that this leads the flux density estimates to generally be biased high, and that it broadly arises from PSF photometry “over-fitting” the data in a way that breaks symmetries. While this bias is generally small (a effect at for a point source) and likely not of concern for an individual source, its impact may be magnified when using a large population of low signal-to-noise sources.
In this work, we present corrections for the bias and consider its implications for . We first introduce a simplified version of maximum-likelihood photometry with a single source and Gaussian noise in Section 2. We derive the bias in the maximum-likelihood flux for a point source and the corresponding in Section 3. In Section 4, we compute the bias for multi-band photometry in both the forced photometry and simultaneous fit case. In Section 5, we show that the bias is present in mock data pipelines and SDSS data.
All the data and code used to create the plots presented in the paper are available online at https://github.com/joshspeagle/phot_bias. We invite readers to (re-)create their own plots and investigate the nature of this bias for themselves.
2 Maximum-Likelihood Photometry
Consider an footprint of a pixelated image containing only one point source at some true position with some true flux density . Assume the pixel-convolved point spread function (PSF) is constant throughout the footprint and is exactly known. Let the value of the pixel-convolved PSF in pixel for a point source located at position be . For notational convenience, the values across the whole image will be modeled as the column vector .
Assume that an estimate of the sky background111We define the sky background to be any flux not coming from the sources that are being cataloged, eg. scattered light from the atmosphere and telescope, dark current, and . has been subtracted from the footprint and that (residual) sky background in the footprint is Normally distributed with mean column vector and covariance matrix , where the noise is known but the mean bias is not. The observed background noise in our pixels is then distributed as
[TABLE]
where is the multivariate Normal distribution with mean vector and covariance matrix .
In the case where the background noise is independently and identically distributed (iid) with mean and variance the noise in each pixel follows
[TABLE]
and the mean and covariance become and . Although we will often use vector/matrix notation for compactness, we will derive results assuming the iid case throughout the main text. Some corresponding results for the general case are included in Appendix A.
Excluding noise from the sky background, the value of the model image for a point source at location and with flux density is
[TABLE]
The observed flux densities within the footprint for our object with true flux density at position are then distributed as
[TABLE]
The log-likelihood for a model consisting of a single point source at location with flux and background is then
[TABLE]
When extracting photometry, most often a maximum-likelihood (ML) approach is used. While ML estimators have been widely studied in the statistics literature, we derive some basic results for their application to photometry in this section for completeness. These results are already known in the literature and appear in, for example, King (1983).
2.1 Flux Density
Denote the maximum-likelihood (i.e. best-fit) flux for a given position and background as . By definition, the partial derivative of the log-likelihood at with respect to is zero such that
[TABLE]
where . This yields
[TABLE]
which is equivalent to using the PSF centered at as a matched filter against the background-subtracted image.
We can construct a naive estimate of the error/uncertainty in the maximum-likelihood flux density estimate by calculating the (negative inverse of) the second derivative with respect to flux density
[TABLE]
where is the “effective area” of the PSF. This uncertainty is proportional to the background noise and effective PSF area (i.e. is larger when the PSF is broader). The estimate is naive because it ignores the possible covariances between flux density and other parameters.
As an example, a circular Gaussian PSF with a standard deviation of pixels has an effective area of
[TABLE]
in the oversampled limit where (i.e. the footprint is large compared to the size of the PSF which is also large compared to the size of a single pixel).
2.2 Position
We can define the maximum-likelihood positions the same way by setting the partial derivative of the log-likelihood with respect to to 0. Using as a representative case we get
[TABLE]
Similarly, the naive error/uncertainty in can be found by calculating the second derivative with respect to position:
[TABLE]
This expression has two components. The first involves the square of the first derivative of the pixel-convolved PSF , while the second involves the second derivative of the pixel-convolved PSF weighted by (fractional) model residuals. Assuming that the residuals are sufficiently small relative to and the PSF varies sufficiently slowly across the footprint, we can ignore this term and approximate the error as
[TABLE]
where is the “effective smoothness” of the PSF, analogous to the effective area . Note that the position error is directly proportional the effective PSF smoothness and inversely proportional to the signal-to-noise ratio (SNR) .
As above, for the case with a circular Gaussian PSF with a standard deviation of pixels, the effective smoothness is
[TABLE]
in the oversampled limit for . This gives a corresponding position error of
[TABLE]
where is the naive flux density error estimate from §2.1.
2.3 Background
For a given flux density and position , we can solve for the maximum-likelihood background by again taking the first derivative
[TABLE]
and setting it to [math]. The ML solution is
[TABLE]
The associated naive errors then are
[TABLE]
where is the area of the footprint.
This result shouldn’t be surprising, since it implies that the maximum-likelihood background is just the mean residual between the model and the data in our given footprint. Since we have assumed a fixed value across the footprint, this is summed over all the pixels. Results for the general case where the background is actually a function of a nuisance parameters across the footprint are outlined in Appendix C.
3 Bias in Maximum-Likelihood Estimates
While ML estimators are consistent estimators (i.e. as ) assuming no model mismatch, they are not guaranteed to be unbiased at any finite SNR. In this section, we derive estimates of the expected bias between the ML flux density estimate and the true flux density along with its associated variance in the ideal case where are known (§3.1) and the general case where it is not (§3.2). A schematic outline of our results is illustrated in Figure 1.
3.1 Ideal Case
It is helpful to rewrite the noisy observed flux densities in terms of random variables such that
[TABLE]
where each is an iid random variable drawn from the standard Normal distribution with mean and variance . This represents a re-framing of the underlying data generating process: start with our true underlying model and add on a particular value comprised of drawn from scaled by the standard deviation .
Given equation (18), the likelihood at the true position and background then reduces to
[TABLE]
Setting the partial derivative then allows us to write the ML flux density estimate as
[TABLE]
We can rewrite this by noticing that the latter term is actually normally distributed such that
[TABLE]
where the equality comes from equation (8). This implies that the ML flux density estimate is distributed as
[TABLE]
since the naive error estimate is equivalent to the true error in the single parameter case . In other words, given the true values of , the ML flux density estimate is an unbiased estimate of the true flux density with a variance of corresponding to the true variance.
3.2 General Case
Following our random variable notation above, at the true values of the position, flux density, and background, respectively, we can rewrite the likelihood of the noisy data as
[TABLE]
where the are again iid normal random variables. The sum of their squares represents the sum of the error-normalized residuals. We can rewrite this by recognizing that
[TABLE]
which follows a chi-square distribution with degrees of freedom.
In general, we expect our best-fit parameters to “absorb” some of the scatter present in the data since we allow them to vary when we are trying to maximize the likelihood. We can make this more rigorous using , which implies that the sum of error-normalized residuals for a fit with free parameters will follow
[TABLE]
with the sum of error-normalized residuals for the ML solution distributed as
[TABLE]
such that their combined sum leaves us with
[TABLE]
since .
3.2.1 Decoupled Background
There is an asymmetry between the parameters connected to modeling the object and those connected to modeling the background. When modeling an object, can all modify the model image and provide information on the scale of the PSF. Even as we increase the size of our footprint , there is a minimum variance achievable set by and .
In contrast, the background estimate can continually improve as the image becomes larger. This holds true for any finite-parameter background model (see Appendix C) as the size of the footprint becomes infinitely large. This implies that there is a fundamental difference between “object-related” parameters and “background-related” parameters.
In the case where the area of the footprint is substantially larger than the effective area of the PSF (i.e. ), we can consider the object parameters effectively decoupled from background parameters. In Appendix D, we show that the freedom in the object position parameters leads to a bias222Note that this effect is completely independent of (and unrelated to) selection effects such as Malmquist bias that can also cause sources to naturally be biased high close to detection limits/cutoffs. It is also independent of Eddington bias, which causes the number of bright objects to be overestimated when faint objects are more common than bright ones. in the generalized ML flux density relative to the ideal (unbiased) ML flux density in §3.1 that to leading order goes as
[TABLE]
where is a random variable drawn from the chi-square distribution with 2 degrees of freedom (which is determined by the 2 parameters used to fit for the position). This leads to a fractional bias of
[TABLE]
where is the expectation value of . This shows that is biased high relative to the true underlying flux density . See Appendix D for a more detailed derivation involving higher-order terms and E for an alternate derivation using bias tensors. This bias is the same as the “gradient bias” identified by Ivison et al. (2007) and the “noise bias” identified by Refregier et al. (2012). A related derivation was done by Guy et al. (2010), which neglected the correlation between noise and the maximum likelihood position and obtained a different result. The authors of that paper confirm that when they include the noise-position correlation, they agree with our result (private communication).
This result gives a straightforward procedure to approximately “de-bias” using equation (28). Doing so, however, increases the variance in the measurement to first-order by
[TABLE]
since the exact bias for any individual measurement is not known. This is an example of the bias-variance trade-off and leads to an increase in the total effective error following
[TABLE]
The magnitude of this bias for PSF photometry is generally small but not totally negligible: a nominally source (i.e. ) incurs a 1% bias and 0.5% error underestimate. We will return to this in §6.4 when we examine the behavior of when modeling extended objects.
We test our analytic predictions by creating a set of simulated point-source images and running maximum-likelihood photometry on them. Our simulated images have a point-source with a circular Gaussian PSF of in the center of a pixel image with iid Gaussian noise in each pixel. We simulate sources of nine fluxes ranging from to , evenly spaced in . For each flux, we create 100,000 different simulated images. We solve for the ML parameters using the Trust Region Reflective algorithm (as implemented in scipy), initialized with the true parameters to ensure that the optimizer finds the global maximum. Figure 2 shows that the mean flux bias and flux errors from these simulated images agree well with our predictions.
It is crucial to note this bias does not by itself arise from the fact that ML estimators tend to “overfit” the data when they use more parameters than are justified by the data. If that were the case, we should not have found that the ML estimate at the true position was actually unbiased because the fitted parameters are exactly those used in the data-generating process. Instead, this bias arises because of the way in which this overfitting occurs. At the true position, is allowed to adjust to the noise, but it does so in a symmetrical way: the noise fluctuations in the image are symmetric, and so is just as likely to fluctuate upwards relative to as it is to fluctuate downwards. Once the position is allowed to vary, however, the model can move the source to absorb nearby positive noise. This breaks the symmetry from earlier: the source will tend to stay in the correct position with an overestimated when the noise fluctuates upwards around the true position , but will try to move position and absorb nearby positive noise when the noise fluctuates downwards around . This position-dependent behavior of is broadly illustrated in Figure 1 and shown in more detail in Figures 3 and 4. Averaging over this behavior as a function of position then gives the results derived above. We note that for sources that can be confidently detected (ie. ), the ML positions only move from the true positions by fractions of a pixel (see Equation (14)). Thus the bias arises because the noise modifies the global maximum in the likelihood function, not because it creates a new local peak.
3.2.2 Coupled Background
The result we derived above holds if the image is sufficiently large that background estimation is effectively decoupled from modeling the actual object. In the case where this is not true (i.e. ), we instead need to consider how background estimation is covariant with our model parameters. The mixed 2nd-order derivatives of the log-likelihood give
[TABLE]
assuming that is oversampled and roughly symmetric in . As shown in , the contribution from the mixed partial with respect to and is expected to contribute to a fractional underestimate of the variance proportional to the ratio of the area of the footprint versus the effective area of the PSF. This gives a modified variance of
[TABLE]
since by construction . Note that we have dropped since is now the “true” uncertainty of our ML estimator that takes into account the relative coupling between the ML background estimate and the parameters used to model our object .
Substituting in for into our expressions from §3.2.1 then modifies the effective SNR to give
[TABLE]
When the effective size of the footprint is large compared to the PSF, then and the background is effectively decoupled from the modeling of the object, leading to our results in §3.2.1. When the effective size becomes more comparable to the PSF, then and the covariance between the background and flux density dominate the error budget. This has the effect of increasing the expected bias by decreasing the effective SNR.
While a coupled background worsens the bias that arises when the source position is unknown, no bias arises if the true source position is known and the background is uniform. In this case, the model image is linear in all the free parameters ( and ) and so no bias is incurred. The covariance between the background level and flux density of the source will still inflate the uncertainty in the flux density, as shown in equation (34).
We augment the set of simulation images used to generate Figure 2 by considering five different image sizes approximately evenly distributed in : 11, 13, 15, 23, and 101 pixels. The effective SNR of a given flux will decrease as image size decreases. In Figure 5, we show that our analytic predictions in terms of the effective SNR hold in these simulated images.
We want to note that while this effect is conceptually useful going forward, in practice the impact is extremely small. For instance, in SDSS the background is determined in patches of pixels.333https://www.sdss.org/dr14/algorithms/sky/ With a median seeing of in band444https://www.sdss.org/dr14/imaging/other_info/ and pixel size of (Gunn et al., 1998), . Similarly, in LSST the background will be determined in patches of or pixels.555https://confluence.lsstcorp.org/display/LSWUG/Measurement+in+the+LSST+Stack With a median seeing of in band and pixel size of (LSST Science Collaboration et al., 2009), if the background is determined using pixel patches. A significantly larger effect is present in unWISE (Schlafly et al., 2019), which estimates the sky background in much smaller () pixel regions: with a PSF FWHM of and a pixel scale of (Wright et al., 2010), .
4 Extension to Multi-Band Fitting
We now examine the case where an object is modeled in multiple bands. We will consider two cases. The first (§4.1) is where the object is “detected” in a single band, after which the position is fixed across all the bands. This is analogous to surveys such as The second (§4.2) is where the object is modeled simultaneously across all bands. This is somewhat equivalent to
4.1 Single-band Detection
Let’s assume that our object is detected in band , after which the ML position is fixed. As shown in §3, we expect that the ML flux density estimate in this band to be overestimated by an amount based on the PSF-normalized SNR. Following our previous assumption that the likelihood is multivariate normal around the ML parameters , we would expect the values for our ML flux density estimates in other band to be
[TABLE]
where is again an iid normally distributed random variable, is the noise in band , and we have again assumed that background is known and fixed to the true value . This result is analogous to equation (20), except that we have assumed a mismatch in the position between our model at and the source at .
The expectation value of assuming is fixed is
[TABLE]
Since we expect the mismatched PSF term to be smaller than the matched PSF term , this implies that so that our ML flux densities are underestimated. Note that this bias tends to zero as , again confirming that our ML estimator is consistent in the limit of infinite SNR.
Subsequently taking the expectation value over position then gives the general expression
[TABLE]
where is a 2-D multivariate normal distribution for (see Appendix B). While this integral does not have an analytic solution, since we expect the bias to increase as becomes progressively more offset from , performing an average over possible ML positions further away from the true position should not be able to change our overall bias from an underestimate to an overestimate. Therefore, we arrive at the general conclusion that
[TABLE]
In other words, while our ML flux density estimates tend to be overestimated in the detection band, they will tend to be underestimated in all other bands. The severity of this (reverse) bias depends on the exact properties of the PSF in each band relative to the detection band (which establishes the ML position and associated covariances).
In the particular case where we have a circular Gaussian PSF with a standard deviation of pixels, we can evaluate these biases explicitly. At a fixed offset , the expected value of is
[TABLE]
Assuming that the covariance between and is small such that we can approximate the errors as , the error-normalized offset will be distributed as
[TABLE]
This implies that
[TABLE]
where is the Rayleigh distribution with scale parameter . . Marginalizing over then yields
[TABLE]
Assuming a Gaussian PSF in the detection band with standard deviation and , we get
[TABLE]
This corresponds to a fractional bias of
[TABLE]
In practice, we find that to properly model the bias at lower SNR requires incorporating a slightly higher-order Taylor expansion of our results (i.e. going from 2nd to 4th-order). As above, this expansion in general is non-trivial. However, it can be evaluated explicitly for circular Gaussian PSFs, as shown in Appendix G. Including this additional term then gives
[TABLE]
We test our predictions for forced photometry by creating a set of simulated point-source images in two bands, running maximum-likelihood photometry on one band (the detection band) and forced photometry on the other (the forced band). Our simulated images have a point-source with a circular Gaussian PSF of in the center of a pixel image in each band with iid Gaussian noise in each pixel. We simulate sources with nine fluxes ranging from to in the detection band, evenly spaced in . For each detection band flux, we consider four different forced band fluxes: , , , and fainter than the flux in the detection band. For each flux combination, we create 100,000 different simulated images. Figure 6 shows that the flux in the detection band is overestimated and the flux in the forced band is underestimated, both by a fraction depending on the SNR in the detection band, as well as the bias on the measured color .
4.2 Unforced Photometry in All Bands
When the object is modeled simultaneously across bands (i.e. with a common position), using Cochran’s theorem reveals that the sum of fluxes across bands will be biased high but does not show how the flux in individual bands is biased. To calculate the bias in each band, we use the bias tensor formulation introduced by Cox & Snell (1968), which shows that the leading-order term in this bias for parameter is
[TABLE]
where
[TABLE]
is the bias tensor and is the expectation value with respect to the data for fixed.
The derivation of the bias in with respect to using bias tensors in the single-band case is outlined in Appendix E. There, we show that
[TABLE]
where
[TABLE]
When using multiple bands, the bias for band has the same terms as equation (49) for the single band case, but is smaller because all bands help constrain the position. However, in the bias tensor (equation 50), the is the uncertainty in position that would have been obtained using only band . If all bands have Gaussian PSFs with widths , then the flux bias for band is
[TABLE]
where the sum over is taken over all bands used in the fit. If all bands have the same PSF size , then all bands have the same fractional flux bias of .
We test our analytic predictions for simultaneous fitting by creating a set of two-band simulated point-source images with the same parameters as those used to make Figure 6 for forced photometry. Figure 7 shows that the fractional bias in both bands depends on the combined SNR, as predicted.
5 Application
While the results discussed in the previous sections are present in our simulations, we now turn our attention to real-world datasets to demonstrate that these effects are likely present in most datasets currently used in the astronomy community.
5.1 HSC SynPipe
We first investigate whether this effect is present in more realistic mock catalogs processed by real pipelines. In particular, we use simulated data from Hyper Suprime-Came Subaru Strategic Program (HSC-SSP; Aihara et al. 2018) Synthetic Object Pipeline (SynPipe; Huang et al., 2018). In brief, SynPipe injects fake objects into real images which are then processed by the HSC-SSP Pipeline (Bosch et al., 2018) to test the accuracy/precision of various aspects of the pipeline. These objects are drawn from a realistic color and magnitude distribution based in part on data from COSMOS, and so these mock tests represent fairly realistic realizations of the data seen by the HSC pipeline. See Huang et al. (2018) for additional details.
We analyze the PSF magnitudes for artificial star tests from two tracts (8764 and 9699) with good/poor seeing, respectively, processed using the same SynPipe configuration presented in Huang et al. (2018). The corresponding magnitude offsets and the predicted analytic relations are shown in Figures 8 and 9 for good seeing and poor seeing data, respectively. The magnitude offsets show good agreement with our model predictions, but include an additional systematic that is linear with magnitude. Investigating the source of this additional systematic is beyond the scope of this work666We have contacted many of the main authors of the pipeline so they are now aware of this systematic..
One crucial aspect of these results is that the SynPipe tests nominally represent forced photometric extractions, with detection done in the band. However, our results are almost entirely consistent with unforced photometry, where each band is derived separately. After some investigation, we find that this effect can be accounted for within the forced photometry algorithm used by the HSC pipeline, which effectively allows for limited “re-centering” in each band to improve the fit. Because the allowed range of positions is much larger than the relative positional uncertainties suggested by, e.g., (§2.2) in most cases, this process effectively undoes the forcing effect described in §4.1. This phenomenon – where “forced” photometry from a particular pipeline is not quite what its namesake suggests – highlights the importance of transparency when pipelines provide users with results for conducting detailed analysis.
5.2 Stripe 82
To show that this bias appears in real data, we also look at SDSS catalogs of Stripe 82 (Annis et al., 2014). Stars that are low signal-to-noise in individual “runs” should have magnitudes that are biased high relative to their true values. While we do not have access to those true values, we approximate them using measurements taken from the combined images constructed from all the runs, which give much higher SNR measurements (with negligible bias) relative to the individual runs. We expect stars to be brighter, on average, in the individual run catalogs than in the stacked image catalog. Furthermore, each run and band will have a different bias, due to differences in seeing and sky brightness.
Figure 10 shows the magnitude difference between the individual run catalogs and stacked image catalog for each band and over a range of seeing conditions. The faintest stars are biased brighter in the individual runs, in rough agreement with our predictions. It is interesting to note that the apparent systematic trends seen in these data mirror those in the HSC SynPipe tests, and that the photometry is also described as “forced” photometry from the SDSS pipeline. As the HSC pipeline is in part derived from the SDSS pipeline, these similarities bolster our suspicions that these results are most likely caused by the same algorithmic choices.
6 Discussion
6.1 Aperture Photometry
We have shown that ML methods exhibit a generic bias when estimating the flux density of any particular isolated object from a given footprint. This bias causes flux density to be overestimated, and arises because fitting for an unknown object position does not treat noise fluctuations symmetrically. When the noise at the true position fluctuates low, the fit will be drawn away from the true position by nearby high noise fluctuations; conversely, when the noise at the true position fluctuates high, the fit will tend to remain near the true position. As is expected for any ML estimation bias, it is most pronounced at low SNR, scaling as . This bias becomes worse as the models become more complex (as is the case for extended sources such as galaxies), and also biases colors when fitting across multiple bands with forced photometry.
Given these apparent drawbacks, some astronomers might wonder whether a return to aperture-based methods might present a compelling alternative. We want to offer a few arguments for why ML photometry should still be preferred and offer advice where aperture photometry might be more appropriate.
First, ML photometry performs better in the ideal case, where it is unbiased and follows the true error distribution, i.e. . By contrast, an aperture will “miss” part of an object’s flux, leading to an underestimate in all cases. This correction is not expected to be the same for all sources unless the aperture is adaptively adjusted to match (a few times) the size of the PSF-convolved object, which is rarely the case. The “aperture corrections” involved to capture the total flux subsequently almost always serves as a dominant systematic hindering precise analyses.
Aperture photometry might also not eliminate the “centering bias” described in this work. Since an aperture also requires a position to be centered on, determining a central position for the aperture will likely be subject to the same types of biases as the ML case (§3.2).777Indeed, apertures are often centered using positions determined from either model-fitting approaches or various simplistic heuristics (e.g., “peak hunting”). These expected centering offsets will result in variable amounts of flux being excluded from the aperture, likely biasing aperture photometry to a similar extent as ML photometry. Unlike in the ML case where these biases can be studied using statistical methods, however, apertures by nature make such studies much more difficult.
The derived errors from aperture photometry are also generally larger than those from ML photometry. In the ML case, we showed in §2 that the error for a point source is , where is the effective area of the PSF. For a Gaussian PSF with a standard deviation of pixels, this gives . Since aperture photometry just sums all pixels within a given aperture, the equivalent error for an aperture with radius of pixels is just . Any aperture larger than “2-sigma” then has errors that are strictly larger than those estimated from ML photometry, and even this aperture excludes roughly 5% of the flux, requiring a significant aperture correction.
Aperture photometry is also inherently unstable as the aperture increases. While increasing the size of the aperture ensures a greater amount of the total flux is captured, it also increases the variance proportional to the size. While the SNR from ML photometry strictly improves as more data is added (see §3.2.1), the SNR from aperture photometry strictly degrades (ignoring aperture corrections).
Finally, aperture photometry is unable to integrate information across multiple bands/images. As discussed in §4.2, simultaneously fitting a single model across multiple images strictly improves the SNR and reduces the effective bias. Because apertures assume no model, they are unable to improve their SNR across multiple bands. While this comparison appears to be irrelevant in the examples shown in §5 which all exhibit tendencies equivalent to single-band unforced photometry, it will likely become more relevant in future survey pipelines.
We illustrate the magnitude of the above effects in Figure 11, which shows the magnitude difference between PSF and aperture (aper6) measurements from the same sources in the individual run catalogs as a function of the PSF magnitude measured from the stacked image catalog for each band and over a range of seeing conditions. There are substantial differences between the two measurements even at bright magnitudes that also show complex dependencies as a function of seeing. These most likely arise from the increasing sensitivity of the aperture to issues like background mis-estimation and incorrect aperture corrections. Once these effects are removed, we see behavior suggestive of a SNR*-2* bias. Compared to PSF photometry (see Figure 10), this bias becomes significant at much brighter magnitudes due to substantially larger aperture flux errors.
Ultimately, aperture photometry is appealing because it is so simple: it assumes no model and is straightforward to apply to almost any isolated object. While this leads to many of the drawbacks mentioned above and shown in Figure 11, it can also be desirable in cases where modeling complex sources can be difficult and/or the systematics involved limit the effective SNR of an object below that achievable with ML photometry. It thus serves a valuable purpose in cases where a model for the PSF and/or source cannot be cleanly determined and the source is relatively isolated. Our results suggest that it should only be used judiciously in most other cases.
6.2 Stacked Catalogs
One direct corollary of our results is that users must be extremely careful what exactly “stacking” means when constructing catalogs and estimating photometry from sources. This is important because stacking can occur at multiple levels, ranging from the images themselves to the catalogs produced from them. In the former case, where images are combined before they are processed through a pipeline, modeling the results is somewhat equivalent to the simultaneous fitting approach discussed in §4.2. Assuming that each image has roughly the same error , the effective error from images is expected to decrease to and the bias to decrease accordingly. Stacking on the image level thus reduces both the error and the bias.
In the case where stacking is done on the catalog level, however, each observation will be biased, with a mean of and error of . Averaging the results over many identical catalogs then will reduce the error to , but will not decrease the bias in any meaningful way. This implies that any measurement constructed from a stacked catalog may have systematic biases that far exceed the quoted statistical uncertainties.
Put another way, making catalogs from a series of images and then averaging the measurements across catalogs will not remove the flux density bias, because each catalog is individually biased. If inverse variance weighting is used and all images have the same PSF size, the fractional bias of the average will then be the reciprocal of the average . In contrast, the fractional bias from image stacking or simultaneous fitting is the reciprocal of the total , allowing multiple images of comparable SNR to drive down the bias. If catalogs are to be averaged, each catalog should first be individually debiased so that the average flux across catalogs is also unbiased. This procedure increases the variance in each catalog’s flux since the exact bias for each is not known (the bias-variance trade-off; see §3); however, this increase in variance is similar to that incurred by debiasing the flux measured from a stacked image or simultaneous fit.
For example, Budavári et al. (2017) propose detecting faint sources by stacking at the catalog level rather than the image level. This approach requires making catalogs with low SNR thresholds from each image, which is precisely the regime where the ML flux bias is most important. The catalogs made from each individual image should be debiased before being stacked.
We illustrate this effect in Figure 12 by constructing “stacked catalogs”888We “stack” our observations in flux density (linear) space to compute the weighted arithmetic mean. Note that stacking in magnitude (logarithmic) space (which computes the weighted geometric mean) introduces additional biases. using the same SDSS Stripe 82 data used to generate Figure 10. As expected, the bias remains unchanged regardless of the number of runs used to generate the stack even as the estimated errors (and thus the bias we would predict from the stacked catalogs) decreases substantially.
6.3 Bayesian Inference
The maximum likelihood bias we described can be ameliorated by using Bayesian inference. We show in Appendix F that marginalizing over the possible positions of the source cancels out the ML bias. Bayesian methods using Markov chain Monte Carlo sampling are starting to be implemented in astronomy. The widely-used Tractor code (Lang et al., 2016) returns ML measurements but also allows users to sample the posterior distribution of source parameters. Probabilistic cataloging (Brewer et al., 2013; Portillo et al., 2017) allows users to sample from the posterior in catalog space, yields an ensemble of catalogs that may vary in the number of sources identified. This catalog ensemble is useful in crowded fields to capture uncertainties in source identification and deblending. The computational cost required to sample from a posterior rather than reporting ML parameters can be daunting. As such, it is likely the case that ML methods will continue to be applied in both the near and intermediate terms in modern astronomical image processing pipelines.
6.4 Galaxies
7 Conclusion
In this paper, we study a photometric bias that arises from the maximum-likelihood (ML) estimator in model fitting photometry. It arises (in part) because the ML estimate can “soak up” a small amount of noise such that the fitted position is drawn away from the true position. This leads to an overestimate of the flux density, with a bias scaling with the inverse signal-to-noise ratio () and the number of free parameters in the model. For example, it is 1% for a point-source and 2.5% for a 2-D Gaussian galaxy.
While this leads to an overestimate in the detection band, because the derived position is offset from the true position the flux density will be underestimated by the same in any other bands where the position is forced to the same value. This can double the effective bias in derived colors. By contrast, when all bands are modeled simultaneously, all bands are biased high, but less so than if they had been fit individually. In the case where all of the PSFs are the same size, this bias goes as . If an object’s position is already known to great precision (for example, from a deeper or higher-resolution dataset), then forced photometry using this fixed position also does not suffer this bias. Methods that consider the distribution of possible positions, like Bayesian inference, do not exhibit this bias (see Appendix F).
We provide formulae that can be used in a variety of situations to calculate this bias and thus correct for it:
- •
Single-band point source photometry (Equation 29)
- •
Multi-band forced point source photometry (Equation 45)
- •
Multi-band joint point source photometry (Equation 51)
- •
(Equation LABEL:eqn:bias_par_gal)
We then show that this bias is likely common in many astronomical datasets using both mock HSC-like data and real SDSS data. The results further illustrate the importance of pipelines being transparent about the exact algorithmic implementation, since both tests are consistent with unforced photometry even though the data have been extracted using “forced” methods.
Although maximum-likelihood estimators may be biased, we still strongly encourage using them over simpler aperture-based methods in most cases. While apertures are appealing because of their simplicity, they require difficult-to-model aperture corrections to account for missing flux and likely exhibit similar biases due to offsets in aperture centers relative to the true positions of objects. Apertures also cannot effectively incorporate information across multiple bands, which can substantially reduce any relevant biases by improving the effective signal-to-noise ratio.
Though we have shown derivations in a simplified case, with iid Gaussian noise, Gaussian PSFs, and Gaussian galaxy profiles, this bias is generic to maximum-likelihood photometry estimation and would still arise if these assumptions were relaxed. This analysis could be repeated under different assumptions, like Poissonian noise, real PSFs, and Sersic galaxy profiles. While calculating the relevant corrections will likely be more involved in these more realistic cases, they are likely still tractable through the bias tensor formalism (See Appendix E) or through numerical simulations.
The authors would also like to thank Jim Bosch, Charlie Conroy, Daniel Eisenstein, Song Huang, Željko Ivezić, Ben Johnson, Douglas Scott, Eddie Schlafly, Sandro Tacchella, and Catherine Zucker for discussion and feedback that helped to improve the quality of this work. JSS is eternally grateful to Rebecca Bleich for her patience and support. JSS is supported by the National Science Foundation Graduate Research Fellowship Program. SKNP acknowledges support from the DIRAC Institute in the Department of Astronomy at the University of Washington. The DIRAC Institute is supported through generous gifts from the Charles and Lisa Simonyi Fund for Arts and Sciences, and the Washington Research Foundation. The Hyper Suprime-Cam (HSC) collaboration includes the astronomical communities of Japan and Taiwan, and Princeton University. The HSC instrumentation and software were developed by National Astronomical Observatory of Japan (NAOJ), Kavli Institute for the Physics and Mathematics of the Universe (Kavli IPMU), University of Tokyo, High Energy Accelerator Research Organization (KEK) in Japan, Academia Sinica Institute for Astronomy and Astrophysics (ASIAA) in Taiwan, and Princeton University in the United States. Funding was contributed by the FIRST program from Japanese Cabinet Office; Ministry of Education, Culture, Sports, Science and Technology (MEXT); Japan Society for the Promotion of Science (JSPS); Japan Science and Technology Agency (JST); Toray Science Foundation; NAOJ; Kavli IPMU; KEK; ASIAA; and Princeton University. This paper makes use of data products derived from software developed for the Large Synoptic Survey Telescope (LSST). We thank the LSST Project for making their code available as free software at http://dm.lsstcorp.org. Funding for the Sloan Digital Sky Survey IV has been provided by the Alfred P. Sloan Foundation, the U.S. Department of Energy Office of Science, and the Participating Institutions. SDSS acknowledges support and resources from the Center for High-Performance Computing at the University of Utah. The SDSS web site is www.sdss.org. SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS Collaboration including the Brazilian Participation Group, the Carnegie Institution for Science, Carnegie Mellon University, the Chilean Participation Group, the French Participation Group, Harvard-Smithsonian Center for Astrophysics, Instituto de Astrofísica de Canarias, The Johns Hopkins University, Kavli Institute for the Physics and Mathematics of the Universe (IPMU) / University of Tokyo, the Korean Participation Group, Lawrence Berkeley National Laboratory, Leibniz Institut für Astrophysik Potsdam (AIP), Max-Planck-Institut für Astronomie (MPIA Heidelberg), Max-Planck-Institut für Astrophysik (MPA Garching), Max-Planck-Institut für Extraterrestrische Physik (MPE), National Astronomical Observatories of China, New Mexico State University, New York University, University of Notre Dame, Observatório Nacional / MCTI, The Ohio State University, Pennsylvania State University, Shanghai Astronomical Observatory, United Kingdom Participation Group, Universidad Nacional Autónoma de México, University of Arizona, University of Colorado Boulder, University of Oxford, University of Portsmouth, University of Utah, University of Virginia, University of Washington, University of Wisconsin, Vanderbilt University, and Yale University.
Appendix A Results with Correlated Noise
Assume the noise in our footprint to be Normally distributed such that
[TABLE]
with mean vector and covariance matrix . The corresponding log-likelihood for a point source within our footprint is
[TABLE]
where is the determinant (i.e. the dimension-independent analog of area/volume).
A.1 Flux Density
At the ML flux the derivative with respect to is zero such that
[TABLE]
which yields
[TABLE]
The naive estimate (see §2.1) of the error/uncertainty is then
[TABLE]
A.2 Position
The maximum-likelihood positions can likewise be defined via
[TABLE]
with a naive error/uncertainty of
[TABLE]
A.3 Background
The maximum-likelihood background can likewise be defined using
[TABLE]
which gives
[TABLE]
The associated naive errors then are
[TABLE]
This result shouldn’t be entirely surprising. In §2.3, we noted that the maximum-likelihood background is just the mean residual between the model and the data in our given footprint. In the iid case where we have assumed a fixed value across the footprint, we therefore take the average over all the pixels. In the case where every pixel has a separate possible value for the background, however, this leads to averaging done on a per-pixel basis for . Since fitting the background separately in every pixel always over-fits the data by construction, we also derive results for the more realistic case where is actually a function of a nuisance parameters across the footprint in Appendix C.
A.4 Bias
As in §3, we can rewrite our likelihood in terms of random variable notation such that
[TABLE]
where each is an iid random variable drawn from the standard Normal distribution and is the (symmetric) square root of the covariance matrix. The likelihood at the true position and background then is
[TABLE]
This gives a ML flux density estimate of
[TABLE]
Since
[TABLE]
we then recover
[TABLE]
following §3.1.
Appendix B Results with Correlated Parameter Uncertainties
B.1 Error Underestimation in Two-Parameter Models
B.2 Background Covariance
B.3 Position Covariance
Appendix C Errors with General Background Models
In §A.3, we showed that the maximum-likelihood (ML) solution for a background model across all pixels is
[TABLE]
which had an error estimate of
[TABLE]
This result is singularly uninformative, because it implies that the “best” background model is exactly equal to the model residuals across the entire image.
In most cases, we often seek to parameterize the background as a smooth function of nuisance parameters across the footprint. This gives us
[TABLE]
which we can use to solve for . Following Appendix B, the error estimates can be derived by inverting the Fisher Information Matrix (FIM) whose elements are
[TABLE]
As before, if we assume that our overall residuals are small and that our background model varies sufficiently slowly with respect to , we can approximate this as
[TABLE]
where is the Jacobian. This implies that we can use the Jacobian to linearly project from the -dimensional “data space” into the -dimensional parameter space.
Appendix D Maximum-Likelihood Biases using Cochran’s Theorem
As discussed in §3.2, given the true values of the position, flux density, and background, respectively, we can rewrite the likelihood of the noisy data as
[TABLE]
where are again iid normal random variables and
[TABLE]
follows a chi-square distribution with degrees of freedom. Assuming that the data are normally distributed and our ML parameters are also approximately Normally distributed, we can apply Cochran’s theorem to get
[TABLE]
Assuming the background is known (see §3.2.2), we note that we can relate the distribution of the sum of error normalized residuals around in the decoupled-background case to those around for a constant background model via
[TABLE]
where incorporates the noise “absorbed” by and we have exploited the fact that . Exploiting the fact that
[TABLE]
then allows us to rewrite the above result as
[TABLE]
Although the true position is not known, in the interest of deriving the impact on the flux we can take the approximation that , etc. for all terms that don’t explicitly involve the flux density . This leaves us with an equation of the form
[TABLE]
where and are roughly constant. This has a positive solution at
[TABLE]
Since and are known for a given and the distribution of is known exactly, this gives an expression for the distribution of the unbiased ML estimator . In general, assuming that the residuals are sufficiently small such that , this reduces to
[TABLE]
This can be immediately generalized to a model with model parameters (excluding the background ) to get
[TABLE]
We can write this in a slightly more intuitive form by Taylor expanding around small to get
[TABLE]
This gives a fractional bias of
[TABLE]
At lower SNR it is necessary to include the second-order term from the Taylor expansion to properly model behavior. Including this term gives a modified fractional bias of
[TABLE]
For a typical point source model with parameters , this becomes
[TABLE]
Appendix E Maximum-Likelihood Biases using Bias Tensors
ML estimators have a bias which tends to zero as the signal-to-noise ratio (SNR) increases. Cox & Snell (1968) found that the leading-order bias term for any parameter can be found with
[TABLE]
where
[TABLE]
is the bias tensor and is the expectation value with respect to the data for fixed.
With the background fixed, the non-zero terms in the flux density bias are
[TABLE]
since the off-diagonal elements of the FIM with respect to position are zero (§B.3) and we have substituted in for and . Under similar assumptions as §3.2, is it straightforward to show that
[TABLE]
and thus:
[TABLE]
Substituting in for the definition of allows us to rewrite this as
[TABLE]
which reproduces equation (28).
With the background free, the flux bias has the same terms as Equation E3 except that due to the covariance between the flux and background. Solving and rearranging as above then gives
[TABLE]
which reproduces equation (29).
Appendix F Unbiasedness of Marginalized Mean Flux
While the ML solution is the mode of the likelihood distribution, the mean flux density , marginalizing over position , need not be the ML flux density . Here, we show that the posterior mean flux is unbiased to order in a specific case (flat priors and Gaussian PSF) for illustrative purposes.
As discussed in §2, the maximum likelihood flux varies with position and is maximized at assuming the background is known. While first derivative of with position is zero (by definition), the second derivative is
[TABLE]
under the same oversampled/smoothness assumptions discussed in §2.2. For a circular Gaussian PSF with standard deviation of pixels, the second derivative of the PSF is
[TABLE]
where is the difference in the x-coordinate of the center of pixel and the source. Taking
[TABLE]
and approximating the counts around the source as then gives
[TABLE]
using the definition of from equation (7).
Near the ML position, the ML flux density is approximately
[TABLE]
The likelihood can then be approximated (up to a constant factor ) as
[TABLE]
where we have taken since our PSF is circular. The mean flux density after marginalizing over position is defined as
[TABLE]
Integrating over the flux density yields
[TABLE]
and then using equation (F5) for gives
[TABLE]
Substituting in for our circular Gaussian PSF then gives
[TABLE]
where and are defined as in §3.
Since we showed §3 that the ML estimator at the true position is unbiased, this explicitly demonstrates that the mean flux density, marginalized over position, is also unbiased.
Appendix G Second-Order Expansion in Flux Density
As discussed in Section §4.1, fixing the position of an object to the best-fit from a detection band leads to an underestimate in the flux density in all other bands. If the detected position is distributed as a 2-D Gaussian of width about the true position, the average best-fit flux is given by Equation 43:
[TABLE]
At low signal to noise, the position error obtained by inverting the second derivative with respect to position (Equation 14) needs to be corrected by higher-order terms. The maximum-likelihood is found by setting the partial derivatives of the log-likelihood to zero. We Taylor expand the partial derivatives to leading order about the true parameters :
[TABLE]
with . Evaluating the partial derivatives at the true parameters yields
[TABLE]
[TABLE]
with being independent, normally distributed variables with mean zero and unit variance. The sum over pixels of the second term of Equation G4 can be approximated in the oversampled limit with integrals over the PSF and its derivatives. Using Equations G3 and G4, Equation G2 can be written as a matrix equation:
[TABLE]
with , , and . Solving this matrix equation for and expanding to second order in yields:
[TABLE]
The sums are random variables with mean zero just as are. It can be shown that the expectation value of is zero (ie. the position is unbiased) by evaluating expectation values like
[TABLE]
with the second equality following from the independence of the and the third equality holding in the oversampled limit. Similarly,
[TABLE]
Since has an expectation value of zero, the variance in position can be found by squaring Equation G6 and taking the expectation value. To evaluate the variance in position, these expectation values are needed:
[TABLE]
which yield:
[TABLE]
Approximating the position errors as being distributed as a 2-D Gaussian with this variance and using Equation 43 gives:
[TABLE]
Appendix H Gaussian Galaxies
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Abazajian et al. (2004) Abazajian, K., Adelman-Mc Carthy, J. K., Agüeros, M. A., et al. 2004, AJ, 128, 502
- 2Aihara et al. (2018) Aihara, H., Arimoto, N., Armstrong, R., et al. 2018, PASJ, 70, S 4
- 3Annis et al. (2014) Annis, J., Soares-Santos, M., Strauss, M. A., et al. 2014, Ap J, 794, 120
- 4Bertin (2013) Bertin, E. 2013, PSF Ex: Point Spread Function Extractor, , , ascl:1301.001
- 5Bolzonella et al. (2000) Bolzonella, M., Miralles, J. M., & Pelló, R. 2000, A&A, 363, 476
- 6Bosch et al. (2018) Bosch, J., Armstrong, R., Bickerton, S., et al. 2018, PASJ, 70, S 5
- 7Brewer et al. (2013) Brewer, B. J., Foreman-Mackey, D., & Hogg, D. W. 2013, AJ, 146, 7
- 8Budavári et al. (2017) Budavári, T., Szalay, A. S., & Loredo, T. J. 2017, Ap J, 838, 52
