Semiparametric estimation for incoherent optical imaging

Mankei Tsang

arXiv:1906.04578·eess.IV·October 9, 2019

Semiparametric estimation for incoherent optical imaging

Mankei Tsang

PDF

Open Access

TL;DR

This paper applies semiparametric estimation theory to incoherent optical imaging, deriving bounds and estimators that demonstrate the advantages of SPADE over direct imaging under diffraction and noise.

Contribution

It introduces a semiparametric framework for optical imaging, deriving exact bounds and efficient estimators for both classical and quantum-inspired measurement methods.

Findings

01

SPADE outperforms direct imaging in estimating moments.

02

Exact semiparametric Cramér-Rao bounds are derived for the imaging problem.

03

SPADE remains superior even with limited prior information.

Abstract

The theory of semiparametric estimation offers an elegant way of computing the Cram\'er-Rao bound for a parameter of interest in the midst of infinitely many nuisance parameters. Here I apply the theory to the problem of moment estimation for incoherent imaging under the effects of diffraction and photon shot noise. Using a Hilbert-space formalism designed for Poisson processes, I derive exact semiparametric Cram\'er-Rao bounds and efficient estimators for both direct imaging and a quantum-inspired measurement method called spatial-mode demultiplexing (SPADE). The results establish the superiority of SPADE even when little prior information about the object is available.

Equations324

\overset{ˇ}{h} (n)

\overset{ˇ}{h} (n)

E (\overset{ˇ}{h})

E (\overset{ˇ}{h})

V (\overset{ˇ}{h})

\overset{ˇ}{S}_{j} (n ∣ θ)

\overset{ˇ}{S}_{j} (n ∣ θ)

J

J

S_{j} (x ∣ θ)

S_{j} (x ∣ θ)

V (\overset{ˇ}{β})

V (\overset{ˇ}{β})

⟨ h_{1}, h_{2} ⟩

⟨ h_{1}, h_{2} ⟩

∣∣ h ∣∣

∣∣ h ∣∣

H

H

T

T

ν (\tilde{β} S)

ν (\tilde{β} S)

ν (\tilde{β}^{2})

ν (\tilde{β}^{2})

\tilde{β}_{eff}

\tilde{β}_{eff}

CRB

\tilde{β}_{eff}

\tilde{β}_{eff}

T

T

\dot{β} (f, h)

\dot{β} (f, h)

= \int \tilde{β} (x) h (x) f (x) d μ (x) = ν (\tilde{β} h),

\frac{\partial β}{\partial θ _{j}}

\frac{\partial β}{\partial θ _{j}}

= \dot{β} (f, S_{j}) = ν (\tilde{β} S_{j}) = u_{j},

β (f)

β (f)

γ (f)

γ (f)

\overset{γ}{˙} (f, h)

\overset{γ}{˙} (f, h)

\frac{\partial γ ( f )}{\partial θ _{j}}

\frac{\partial γ ( f )}{\partial θ _{j}}

\tilde{β}_{eff}

\tilde{β}_{eff}

Π (\tilde{β} ∣ T^{⊥})

Π (\tilde{β} ∣ T^{⊥})

f (x)

f (x)

f (x)

f (x)

f (q)

f (q)

H (q ∣ y)

q = 0 \sum \infty H (q ∣ y)

q = 0 \sum \infty H (q ∣ y)

N

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsPhotoacoustic and Ultrasonic Imaging · Advanced Optical Sensing Technologies · Spectroscopy Techniques in Biomedical and Chemical Research

Full text

Semiparametric estimation for incoherent optical imaging

Mankei Tsang

[email protected] https://www.ece.nus.edu.sg/stfpage/tmk/ Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583

Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117551

Abstract

The theory of semiparametric estimation offers an elegant way of computing the Cramér-Rao bound for a parameter of interest in the midst of infinitely many nuisance parameters. Here I apply the theory to the problem of moment estimation for incoherent imaging under the effects of diffraction and photon shot noise. Using a Hilbert-space formalism designed for Poisson processes, I derive exact semiparametric Cramér-Rao bounds and efficient estimators for both direct imaging and a quantum-inspired measurement method called spatial-mode demultiplexing (SPADE). The results establish the superiority of SPADE even when little prior information about the object is available.

I Introduction

Two fundamental problems confront incoherent optical imaging: the diffraction limit Born and Wolf (1999); Goodman (2004) and the photon shot noise Mandel and Wolf (1995); Goodman (1985). To quantify their effects on the resolution rigorously, the Cramér-Rao bound (CRB) on the error of parameter estimation Lehmann and Casella (1998) has been widely used, especially in astronomy and fluorescence microscopy Farrell (1966); Tsai and Dunn (1979); Zmuidzinas (2003); Feigelson and Babu (2012); Ram et al. (2006); Small and Stahlheber (2014); Deschout et al. (2014); Chao et al. (2016); von Diezmann et al. (2017); Bettens et al. (1999); Van Aert et al. (2002); de Villiers and Pike (2016). Most previous studies, however, assume that the object has a simple specific shape, such as a point source or two, and only one or few parameters of the object are unknown. Such parametric models may not be justifiable when there is little prior information about the object. Without a parametric model, the CRB seems intractable—infinitely many parameters are needed to specify the object distribution, leading to a Fisher information matrix with infinitely many entries, and then the infinite-dimensional matrix has to be inverted to give the CRB. While there also exist many studies on superresolution that can deal with more general objects de Villiers and Pike (2016); Candès and Fernandez-Granda (2013, 2014); Schiebinger et al. (2017), they either ignore noise or use noise models that are too simplistic to capture the signal-dependent nature of photon shot noise.

To compute the CRB and to evaluate the efficiency of estimators for general objects, here I propose a theory of semiparametric estimation for incoherent optical imaging. Semiparametric estimation refers to the estimation of a parameter of interest in the presence of infinitely many other unknown “nuisance” parameters Bickel et al. (1993); Tsiatis (2006). The method has found many applications in econometrics, biostatistics, and astrostatistics Bickel et al. (1993). A typical example is the estimation of the mean of a random variable when its probability density is assumed to have finite variance but otherwise arbitrary. Thanks to a beautiful Hilbert-space formalism Bickel et al. (1993); Tsiatis (2006), the semiparametric theory is able to compute the CRB for such problems despite the infinite dimensionality and also evaluate the existence and efficiency of semiparametric estimators. Such problems are exactly the type that bedevil the study of imaging thus far, and here I show how the semiparametric theory can be used to yield similarly elegant results for optical imaging.

The optics problem of interest here is the far-field imaging of an object emitting spatially incoherent light Goodman (2004, 1985), with the most important applications being optical astronomy Farrell (1966); Tsai and Dunn (1979); Zmuidzinas (2003); Feigelson and Babu (2012) and fluoresence microscopy Ram et al. (2006); Small and Stahlheber (2014); Deschout et al. (2014); Chao et al. (2016); von Diezmann et al. (2017). With a finite numerical aperture, the imaging system introduces a spatial bandwidth limit to the waves, otherwise known as the diffraction limit Born and Wolf (1999); Goodman (2004). The standard measurement, called direct imaging, records the intensity of the light on the image plane. Recently, quantum information theory inspired the invention of an alternative measurement called spatial-mode demultiplexing (SPADE) Tsang et al. (2016a), which has been shown theoretically Tsang et al. (2016a); Tsang (2017, 2018a, 2019a); Zhou and Jiang (2019); Dutton et al. (2019); Nair and Tsang (2016a); Tsang et al. (2016b); Nair and Tsang (2016b); Lupo and Pirandola (2016); Tsang (2018b); Ang et al. (2017); Lu et al. (2018); Řeháček et al. (2017); Yang et al. (2017); Kerviche et al. (2017); Chrostowski et al. (2017); Řeháček et al. (2017, 2018); Backlund et al. (2018); Napoli et al. (2019); Yu and Prasad (2018); Prasad and Yu (2019); Larson and Saleh (2018); Tsang and Nair (2019); Larson and Saleh (2019); Grace and Guha (2019); Bonsma-Fisher et al. (2019); Tsang (2019b) and experimentally Tang et al. (2016); Tham et al. (2017); Paúr et al. (2016); Yang et al. (2016); Donohue et al. (2018); Hassett et al. (2018); Zhou et al. (2019) to be superior to direct imaging in resolving two sub-Rayleigh sources and estimating the size and moments of a subdiffraction object. Most of the aforementioned studies, however, assume parametric models for the object. Exceptions include Refs. Yang et al. (2016); Tsang (2017, 2018a, 2019a); Zhou and Jiang (2019); Bonsma-Fisher et al. (2019), which consider the estimation of object moments, but the results there are not conclusive—only the CRB for direct imaging was computed exactly Tsang (2018a), while the CRB for SPADE was evaluated only approximately Tsang (2017, 2018a); Zhou and Jiang (2019). Another problem is the existence and efficiency of unbiased moment estimators; again only approximate results have been obtained so far Tsang (2017, 2018a). Building on the established semiparametric theory Bickel et al. (1993); Tsiatis (2006), here I compute the exact semiparametric CRBs and also propose unbiased and efficient moment estimators for both direct imaging and SPADE. These results enable a fair and rigorous comparison of the two measurement methods, which proves the fundamental superiority of SPADE for moment estimation.

This paper is organized as follows. Section II introduces the Fisher information and the CRB for Poisson processes. Section III presents the semiparametric CRB in terms of a Hilbert-space formalism designed for such processes. Section IV introduces the models of direct imaging and SPADE. Section V computes the CRB for moment estimation with direct imaging and proposes an efficient estimator. Section VI shows how the CRB should be modified for a normalized object distribution. Section VII computes the CRB for SPADE and also proposes an efficient estimator. Section VIII uses the CRBs to compare the performances of direct imaging and SPADE, demonstrating the superiority of SPADE for subdiffraction objects. Section IX concludes the paper and points out open issues, while the Appendices detail the technical issues that arise in the main text.

II Cramér-Rao bound for Poisson processes

For optical astronomy Farrell (1966); Goodman (1985); Zmuidzinas (2003); Feigelson and Babu (2012), fluorescence microscopy Ram et al. (2006); Small and Stahlheber (2014); Deschout et al. (2014); Chao et al. (2016); von Diezmann et al. (2017), and even electron microscopy Bettens et al. (1999); Van Aert et al. (2002), Poisson noise can be safely assumed. Suppose that each detector in a photodetector array is labeled by $x\in\mathcal{X}$ , where $\mathcal{X}$ denotes the detector space. Assume that the observed process, such as the image recorded by a camera, is a Poisson random measure $n$ on $\mathcal{X}$ and its $\sigma$ -algebra $\Sigma$ , with a mean given by the intensity measure $\bar{n}$ on the same $(\mathcal{X},\Sigma)$ Çınlar (2011). $n(\mathcal{A})$ for any $\mathcal{A}\in\Sigma$ is then a Poisson variable with mean $\bar{n}(\mathcal{A})$ . For example, if $\mathcal{X}\subseteq\mathbb{R}^{2}$ is a two-dimensional surface, then $\Sigma$ is the set of all subareas that can be defined on the surface, and $n(\mathcal{A})=\int_{x\in\mathcal{A}}dn(x)$ is the detected photon number over the area $\mathcal{A}$ . For any vectoral function $h:\mathcal{X}\to\mathbb{R}^{q}$ on the detector space,

[TABLE]

a linear functional of $n$ , is a random variable with statistics

[TABLE]

where $\mathbb{E}$ denotes the statistical expectation, $\mathbb{V}$ denotes the covariance, $\nu$ denotes the average with respect to the intensity measure $\bar{n}$ , $\top$ denotes the matrix transpose, and all vectors in this paper are column vectors.

Suppose that $\bar{n}$ depends on an unknown vectoral parameter $\theta\in\Theta\subset\mathbb{R}^{p}$ with $p$ entries and has a density $f(x|\theta)$ with respect to a dominating measure $\mu$ such that $f(x|\theta)=d\bar{n}(x|\theta)/d\mu(x)$ . The log-likelihood derivatives are given by Snyder and Miller (1991)

[TABLE]

As $\check{S}$ is a linear functional of $n$ , its covariance, called the Fisher information matrix, can be simplified via Eq. (3) and is given by Snyder and Miller (1991)

[TABLE]

where $S$ is a vector of detector-space functions given by

[TABLE]

Here $\mathbb{V}$ , $\check{S}$ , $\bar{n}$ , $S$ , and $\nu$ are all evaluated at the same $\theta$ , and I assume hereafter that all functions of $\theta$ are evaluated implicitly at the same $\theta$ . Each $S_{j}$ is hereafter called a score function, borrowing the same terminology for $\check{S}$ in statistics Bickel et al. (1993); Tsiatis (2006). An important distinction is that, whereas $\mathbb{E}(\check{S}_{j})=0$ , $\nu(S_{j})$ does have to be zero, since $\bar{n}$ does not have to be normalized.

Let $\beta(\theta)$ be a scalar parameter of interest. If $\beta(\theta)=\theta_{k}$ for example, then all the other parameters in $\theta$ are called nuisance parameters. For any unbiased estimator $\check{\beta}(n)$ , the CRB on its variance is Lehmann and Casella (1998)

[TABLE]

$J^{-1}$ seems intractable if $\theta$ is infinite-dimensional. The next section introduces a cleverer method.

III Semiparametric Cramér-Rao bound

The key to the semiparametric theory is to treat random variables as elements in a Hilbert space Bickel et al. (1993); Tsiatis (2006). Here I introduce another Hilbert space for detector-space functions on top of the statistical one for the purpose of computing the CRB for Poisson processes. Define an inner product between two scalar functions $h_{1},h_{2}:\mathcal{X}\to\mathbb{R}$ as

[TABLE]

and the norm as

[TABLE]

With the inner product, a Hilbert space $\mathcal{H}$ can be defined as the set of all square-summable functions, viz.,

[TABLE]

Denote the set of score functions $\{S_{j}\}$ as $S$ in a slight abuse of notation. If the Fisher information $J_{jj}=\nu(S_{j}^{2})<\infty$ for all $j$ , $S\subset\mathcal{H}$ . Define the tangent space $\mathcal{T}\subseteq\mathcal{H}$ of a parametric model as the linear span of $S$ , or

[TABLE]

Define also an “influence” function as any $\tilde{\beta}\in\mathcal{H}$ that satisfies

[TABLE]

borrowing the name of a similar concept in statistics Bickel et al. (1993); Tsiatis (2006). The Cauchy-Schwartz inequality $\nu(\tilde{\beta}^{2})[w^{\top}\nu(SS^{\top})w]\geq(u^{\top}w)^{2}$ with $w=[\nu(SS^{\top})]^{-1}u$ then yields

[TABLE]

the right-hand side of which coincides with the CRB given by Eq. (7). Define the efficient influence as the influence function that saturates Eq. (13), viz.,

[TABLE]

Equation (14) can be interpreted as the orthogonal projection of any influence function $\tilde{\beta}\in\mathcal{H}$ that satisfies Eq. (12) into $\mathcal{T}$ , viz.,

[TABLE]

Figure 1 illustrates this concept.

Consider now the semiparametric scenario. For the purpose of this paper, it suffices to assume that the dimension of $\theta$ is infinite but countable ( $p=\infty$ ). The score functions are still defined in the same way, but now there are infinitely many of them. The tangent space should be modified to be the closed linear span

[TABLE]

so that projection into it is well defined Reed and Simon (1980), and the semiparametric CRB is still given by Eqs. (12), (15), and (16); see Appendix A for a proof. This Hilbert-space approach is tractable when finding a candidate influence function is straightforward and the tangent space is so large that the candidate is already in it or at least very close to it. If the dimension of $\theta$ is uncountable, the tangent space and the CRB can still be defined via the concept of parametric submodels Bickel et al. (1993); Tsiatis (2006), although it is not needed here.

If $\beta$ can be expressed as a functional $\beta(f)$ , a useful way of finding an influence function is to consider a functional derivative of $\beta(f)$ with respect to $h(x)$ defined as

[TABLE]

which leads to

[TABLE]

and the $\tilde{\beta}(x)$ function obtained from the functional derivative is an influence function that satisfies Eq. (12). The simplest example is the linear functional

[TABLE]

and $\tilde{\beta}(x)$ is an influence function.

If the tangent space is so large that $\mathcal{T}=\mathcal{H}$ , then a square-summable influence function is already in $\mathcal{H}=\mathcal{T}$ and therefore efficient. There are often some constraints that make $\mathcal{T}$ smaller, however, and the CRB is reduced as a result. For example, if the constraint can be expressed as

[TABLE]

and its functional derivative is

[TABLE]

then

[TABLE]

and it follows that $\tilde{\gamma}$ should be placed in the set that spans $\mathcal{T}^{\perp}$ , the orthocomplement of $\mathcal{T}$ in $\mathcal{H}$ . In terms of $\mathcal{T}^{\perp}$ , the efficient influence can be evaluated as

[TABLE]

If $\mathcal{T}^{\perp}=\operatorname{span}(R)$ , then

[TABLE]

which is still tractable if $R$ has a low dimension.

IV Incoherent optical imaging

Consider a distribution of spatially incoherent sources described by the measure $F$ on the object plane with coordinate $y$ , a far-field paraxial imaging system with point-spread function $\psi(z-y)$ for the field Goodman (2004), further processing of the field on the image plane with coordinate $z$ via passive linear optics with Green’s function $\kappa(x,z)$ , and Poisson noise at the output detectors labeled by $x\in\mathcal{X}$ , as depicted by Fig. 2. For simplicity, assume one-dimensional imaging such that $y,z\in\mathbb{R}$ ; generalization for two-dimensional imaging is possible Tsang (2017, 2018a) but not very interesting. The intensity can be described by the mixture model Tsang et al. (2016a, b); Tsang (2017, 2018a); Goodman (1985)

[TABLE]

where the image-plane coordinate $z$ is normalized with respect to the magnification factor, both $y$ and $z$ are normalized with respect to the width of the point-spread function such that they are dimensionless, and $\psi$ is normalized as $\int|\psi(x)|^{2}dx=1$ . This semiclassical Poisson model can be derived from standard quantum optics Tsang (2011); Tsang et al. (2016a); Tsang (2019b).

For direct imaging with infinitesimal pixels, $\kappa(x,z)=\sqrt{\tau}\delta(x-z)$ , where $\tau$ is a positive conversion factor, $x\in\mathcal{X}=\mathbb{R}$ denotes the position of each pixel, $d\mu(x)=dx$ , and the image intensity obeys the convolution model

[TABLE]

which will be studied in Sec. V.

The most remarkable physics of the problem lies in the possibility of improving the measurement via optics with a different Green’s function $\kappa$ . Quantum information theory has shown that substantial improvement is possible for subdiffraction objects, and SPADE has been found to be quantum-optimal in many special cases Tsang et al. (2016a); Tsang (2017, 2018a, 2019a); Zhou and Jiang (2019); Dutton et al. (2019); Nair and Tsang (2016a); Tsang et al. (2016b); Nair and Tsang (2016b); Lupo and Pirandola (2016); Tsang (2018b); Ang et al. (2017); Lu et al. (2018); Řeháček et al. (2017); Yang et al. (2017); Kerviche et al. (2017); Chrostowski et al. (2017); Řeháček et al. (2017, 2018); Backlund et al. (2018); Napoli et al. (2019); Yu and Prasad (2018); Prasad and Yu (2019); Larson and Saleh (2018); Tsang and Nair (2019); Larson and Saleh (2019); Grace and Guha (2019); Bonsma-Fisher et al. (2019); Tsang (2019b). In one version of SPADE, $\kappa^{*}(q,z)$ is the $q$ th mode function in the point-spread-function-adapted (PAD) basis Řeháček et al. (2017); Tsang (2018a), such that the output intensity is given by

[TABLE]

where $\mu$ is simply the counting measure and $\kappa$ and $H$ obey special properties, as further discussed in Sec. VII. For a fair comparison, the quantum efficiencies of direct imaging and SPADE are assumed to be the same, meaning that Tsang (2018a)

[TABLE]

where $\tau$ is the same factor as that for direct imaging. Then

[TABLE]

the expected photon number received in total, is also the same.

V Moment estimation

with direct imaging

Consider the direct-imaging model given by Eq. (29). Assume that $H$ can be expanded in a Taylor series as

[TABLE]

which leads to

[TABLE]

where the unknown parameters are the object moments defined by

[TABLE]

For the CRB to hold, the parameter space should correspond to the condition that $F$ contains an infinite number of point sources with different positions, as discussed in Appendix B. Appendix C shows a way of reconstructing $F$ from $\theta$ via an orthogonal-series expansion, following Ref. Tsang (2019b).

The score function for each $\theta_{j}$ is

[TABLE]

It turns out that the tangent space $\mathcal{T}$ for this problem is equal to the whole Hilbert space $\mathcal{H}$ under certain technical conditions, as shown in Appendix D.

Let the parameter of interest be

[TABLE]

where $u$ is independent of $\theta$ . To find a candidate influence function, a trick Meister (2006) is to consider the image moments $\phi$ given by

[TABLE]

where

[TABLE]

are the monomials. Assuming that all the moments of $F$ and $H$ are finite such that all the moments of $f$ are also finite, $\phi$ can be related to the object moments via

[TABLE]

where

[TABLE]

and

[TABLE]

$C$ is a lower-triangular matrix, and with $C_{jj}=\int H(x)dx=\tau>0$ , $C^{-1}$ is well defined and also lower-triangular even if the dimension of $C$ is infinite, as shown in Appendix E. The object moments can then be related to the image moments by

[TABLE]

and $\beta$ can be expressed as

[TABLE]

According to Eq. (22), an influence function is

[TABLE]

Since $\mathcal{T}=\mathcal{H}$ as shown in Appendix D, the $\tilde{\beta}$ given by Eq. (49) belongs to $\mathcal{H}=\mathcal{T}$ and is efficient according to Eq. (16) as long as it is square-summable. For example, if $u$ contains a finite number of nonzero entries, $\tilde{\beta}$ is a polynomial of $x$ and must be square-summable, since all the moments of $f$ are assumed to be finite. The CRB is hence

[TABLE]

where $C^{-\top}=(C^{-1})^{\top}$ . This result coincides with that derived in Ref. Tsang (2018a) via a more direct but less rigorous method, which is repeated in Appendix F for completeness.

An unbiased and efficient estimator $\check{\beta}(n)$ in terms of the observed process $n$ can be constructed from the efficient influence as

[TABLE]

where the object moment estimator is

[TABLE]

$\check{\beta}(n)$ is a linear filter of $n$ , so its variance is $\mathbb{V}(\check{\beta})=\nu(\tilde{\beta}^{2})$ , which coincides with the CRB. It is important to note that this estimator does not require any knowledge of the unknown parameters, as $\check{\phi}(n)$ is simply the empirical moments of the observed image, and $C^{-1}$ is a lower-triangular matrix that depends on the moments of the point-spread function $H$ . The estimator still works even if the object happens to consist of a finite number of point sources and $\theta$ is on the boundary of the parameter space, although its efficiency in that case is a more difficult question, as explained in Appendix B. Unlike some of the previous studies on superresolution de Villiers and Pike (2016); Candès and Fernandez-Granda (2013, 2014); Schiebinger et al. (2017), the results here place no restriction on the separations of the point sources and also account for Poisson noise explicitly.

VI Constrained Cramér-Rao bound

In imaging, the parameters of interest are often the moments with respect to a normalized object distribution with $\int dF(y)=1$ . A simple way of modeling this is to assume that $\theta_{0}=1$ is known. This constraint also makes the model comparable to those in Refs. Tsang (2017, 2019a, 2019b). Then

[TABLE]

is known as well, implying the constraint $\gamma(f)=\int f(x)dx-N=0$ . The constraint can be differentiated to yield $\dot{\gamma}(f,S_{j})=\nu(S_{j})=\langle S_{j},1\rangle=0$ , leading to $\mathcal{T}^{\perp}=\operatorname{span}(1)$ . The new efficient influence, according to Eqs. (26) and (27), should therefore be

[TABLE]

The constrained CRB is now

[TABLE]

where $\nu_{0}(h)=\nu(h)/\nu(1)$ is the normalized version of $\nu$ . The CRB is necessarily lowered by the constraint. Other approaches to the constrained CRB yield the same result, as discussed in Appendix G.

To construct a near-efficient estimator, suppose that $n(\mathcal{X})=\int dn(x)=L>0$ photons have been detected. Then $dn(x)=\sum_{l=1}^{L}1_{x=X_{l}}$ , and the photon positions $\{X_{1},X_{2},\dots,X_{L}\}$ are independent and identically distributed according to the probability measure $\bar{n}/N$ . Consider the estimator

[TABLE]

It is straightforward to show that

[TABLE]

which is close to the CRB given by Eq. (56) if $L$ is close to its expected value $N$ . The results are then consistent with standard results in semiparametric estimation concerning the moments of a normalized probability measure Bickel et al. (1993).

VII Even-moment estimation

with SPADE

Now consider the SPADE model given by Eqs. (30) and (31) and the Fourier transforms

[TABLE]

Suppose that $\Phi=\{\Phi_{q}(k)\}$ is the PAD basis Řeháček et al. (2017); Tsang (2018a) given by

[TABLE]

where $b=\{b_{q}(k):q\in\mathbb{N}_{0}\}$ is the set of orthonormal polynomials defined by

[TABLE]

The polynomials exist for all $q\in\mathbb{N}_{0}$ as long as the support of $|\Psi(k)|^{2}$ is infinite Dunkl and Xu (2001), and the orthonormality of $\Phi$ ensures that the measurement can be implemented by passive linear optics Tsang et al. (2016a, b); Řeháček et al. (2017). Equation (31) becomes

[TABLE]

As the $b$ polynomials are derived by applying the Gram-Schmidt procedure to the monomials $(1,k,k^{2},\dots)^{\top}$ , their basic properties include $\int|\Psi(k)|^{2}b_{q}(k)k^{r}dk=0$ if $r<q$ , $\int|\Psi(k)|^{2}b_{q}(k)k^{q}dk\neq 0$ , and $b_{q}(k)=(-1)^{q}b_{q}(-k)$ if $|\Psi(k)|^{2}$ is even, as is often the case in optics. These properties lead to

[TABLE]

where $C$ is an upper-triangular matrix ( $C_{qj}=0$ if $j<q$ ) with positive diagonal entries ( $C_{qq}>0$ ). Equation (30) becomes

[TABLE]

which depends on the even moments

[TABLE]

The score function with respect to each $\theta_{2j}$ becomes

[TABLE]

Appendix H proves that $\mathcal{T}=\operatorname{\overline{span}}(S)=\mathcal{H}$ .

To find a candidate influence function, suppose that Eq. (68) can be inverted to give

[TABLE]

An influence function for $\beta=u^{\top}\theta$ according to Eq. (22) is therefore

[TABLE]

Since $\mathcal{T}=\mathcal{H}$ , this $\tilde{\beta}$ belongs to $\mathcal{T}$ and is efficient as long as it is square-summable. The CRB is hence

[TABLE]

A more direct but heuristic way of deriving Eq. (73) is shown in Appendix I. An unbiased and efficient estimator in terms of the observed detector counts $n$ is

[TABLE]

This estimator has a variance $\mathbb{V}(\check{\beta})=\nu(\tilde{\beta}^{2})={\rm CRB}^{(\rm SPADE)}$ , requires no knowledge of any unknown parameter, and still works even if the object happens to consist of a finite number of point sources, with no restriction on their separations. If $\theta_{0}=1$ , the constrained CRB can be derived in ways similar to Sec. VI and Appendix G.

To estimate the odd moments of $F$ via SPADE, variations of the PAD basis are needed Tsang (2017, 2018a). The model is much more complicated and a derivation of the CRB and the efficient estimator is too tedious to work out here, but the upshot is the same: it can be shown that the tangent space for the problem encompasses the whole Hilbert space $\mathcal{H}$ , the efficient influence can be retrieved from the relation $\beta=\nu(\tilde{\beta})$ , and an unbiased and efficient estimator is $\check{\beta}(n)=\int\tilde{\beta}(x)dn(x)$ .

VII.1 Gaussian point-spread function

More explicit results can be obtained and the assumptions can be checked more carefully by assuming the Gaussian point-spread function

[TABLE]

The PAD basis becomes the Hermite-Gaussian basis, and it can be shown that Tsang et al. (2016a); Tsang (2017); Yang et al. (2016)

[TABLE]

The $C$ matrix in Eq. (67) can be determined by expanding $\exp(-y^{2}/4)=\sum_{j=0}^{\infty}(-y^{2}/4)^{j}/j!$ , giving

[TABLE]

It is not difficult to check that the matrix inverse of $C$ is

[TABLE]

which is a degree- $j$ polynomial of $q$ . $\sum_{q=0}^{\infty}\tilde{\theta}_{2j}(q)f(q)$ is the $j$ th factorial moment of $f$ and indeed equal to $\theta_{2j}$ , since $H(q|y)$ is Poisson and its factorial moment is $\sum_{q=0}^{\infty}\tilde{\theta}_{2j}(q)H(q|y)=y^{2j}$ Daley and Vere-Jones (2003). In general, each degree- $j$ moment of $H(q|y)$ is a degree- $j$ polynomial of $y^{2}$ , so each degree- $j$ moment of $f(q)$ is a linear combination of the moments of $F$ up to degree $2j$ . All the moments of $f$ are therefore finite as long as all the moments of $F$ are finite. If $u$ has a finite number of nonzero entries, the influence function given by Eqs. (72) is a polynomial of $q$ , so $\nu(\tilde{\beta}^{2})<\infty$ , and $\tilde{\beta}\in\mathcal{H}$ is ensured.

VII.2 Bandlimited point-spread function

Another important example is the bandlimited point-spread function given by

[TABLE]

$b$ is then the well known set of Legendre polynomials Olver et al. (2010). Appendix J shows the detailed calculations; here I list the results only. Equation (65) becomes

[TABLE]

where $j_{q}(y)$ is the spherical Bessel function of the first kind (Olver et al., 2010, Eq. (10.47.3)). An influence function for estimating $\theta_{2j}$ with $\theta_{2j}=\nu(\tilde{\theta}_{2j})$ is

[TABLE]

where $!!$ denotes the double factorial Olver et al. (2010). $\tilde{\theta}_{2j}(q)$ is a degree- $2j$ polynomial of $q$ , so $\tilde{\beta}(q)$ is also a polynomial of $q$ if $u$ contains a finite number of nonzero entries. As long as all the moments of $F$ are finite, all the moments of $f$ can also be shown to be finite, and $\nu(\tilde{\beta}^{2})<\infty$ is ensured.

Notice that the direct-imaging theory in Sec. V breaks down for this bandlimited point-spread function, as the second and higher even moments of $H(x)=\tau|\psi(x)|^{2}=(\tau/\pi)\operatorname{sinc}^{2}(x)$ are all infinite. The CRB in that case remains an open problem, although it is possible to apodize the point-spread function optically such that all its moments become finite and the semiparametric estimator given by Eq. (52) has a finite variance. For example, if

[TABLE]

then $\Psi(k)$ is infinitely differentiable despite the hard bandwidth limit Debnath and Shah (2015) and all the moments of $|\psi(x)|^{2}$ are finite Tsang (2018a).

VIII Comparison of imaging methods

The advantage of SPADE over direct imaging occurs in the subdiffraction regime, where the width $\Delta$ of the object distribution $F$ with respect to the origin is much smaller than the width of the point-spread function $\psi$ Tsang (2017, 2018a, 2019a, 2019b). As the width of $\psi$ is normalized as $1$ , the regime is defined as

[TABLE]

and the object moments scale as

[TABLE]

With the attainable CRBs given by Eqs. (51) and (73) at hand, I can now compare direct imaging and SPADE on the same semiparametric footing. Consider the estimation of a specific moment $\theta_{k}$ with

[TABLE]

For direct imaging in the subdiffraction regime, the image becomes close to the point-spread function, viz.,

[TABLE]

where $N$ , the expected photon number received in total, is given by Eq. (33). With $C_{jk}=\tau O(1)$ and $\nu(\tilde{\phi}\tilde{\phi}^{\top})=NO(1)$ , the CRB becomes

[TABLE]

For SPADE on the other hand, notice that the $C$ and $C^{-1}$ matrices are upper-triangular, meaning that

[TABLE]

and the CRB for estimating $\theta_{k}$ , where $k$ is even, becomes

[TABLE]

which is much lower than Eq. (88) when $\Delta\ll 1$ and $k\geq 2$ . This is consistent with earlier approximate results Tsang (2017, 2018a). An intuitive explanation of the enhancement, as well as a discussion of the limitations of SPADE, can be found in Ref. Tsang (2019b). The constrained CRB with $\theta_{0}=1$ becomes $[O(\Delta^{k})-\theta_{k}^{2}]/N=O(\Delta^{k})/N$ , which is on the same order of magnitude as the fundamental quantum limit Tsang (2019a).

More exact and pleasing results can be obtained if $\psi$ is Gaussian and given by Eq. (76). Consider for example the estimation of the second moment $\theta_{2}$ . For direct imaging, it can be shown that

[TABLE]

For SPADE on the other hand,

[TABLE]

which not only beats direct imaging by a significant amount in the subdiffraction regime but is in fact superior for all parameter values. To further illustrate the difference between the two methods, suppose that the object happens to be a flat top given by

[TABLE]

Do note that the semiparametric CRBs do not assume the knowledge of this object shape, which is specified here only for the purpose of plotting examples of the CRBs. With $\theta_{2}=\theta_{0}\Delta^{2}/12$ and $\theta_{4}=\theta_{0}\Delta^{4}/80$ , Fig. 3 plots Eqs. (91) and (92) against $\Delta$ in log-log scale. The gap between the two curves in the $\Delta\ll 1$ regime is striking.

With the constraint $\theta_{0}=1$ , the CRBs become

[TABLE]

It is noteworthy that Eq. (95) is exactly equal to the quantum limit given by (Tsang, 2019b, Eq. (E15)), meaning that SPADE is exactly quantum-optimal—at all parameter values—for estimating the second moment. This is consistent with previous results concerning the estimation of two-point separation Tsang et al. (2016a) and object size Tsang (2017); Dutton et al. (2019), but note that the previous results assume that the object shape is known, whereas the CRBs and the estimators here assume the opposite.

IX Conclusion

The semiparametric theory set forth solves an important and difficult problem in incoherent optical imaging: the evaluation of the CRB and the efficient estimation of object parameters when little prior information about the object is available. The theory gives exact and achievable semiparametric CRBs for both direct imaging and SPADE, establishing the superiority and versatility of SPADE beyond the special parametric scenarios considered by previous studies.

Despite the elegant results, the theory has a few shortcomings. On the mathematical side, the conditions for the theory to hold seem difficult to check in the case of direct imaging with a non-Gaussian point-spread function, especially when the point-spread function has infinite moments. It is an open question whether this is merely a technicality or a hint at a whole new regime of statistics. On the practical side, the theory may be accused of assuming ideal conditions for both measurements, such as infinitesimal pixels for direct imaging, the availability of infinitely many modes for SPADE, perfect specification and knowledge of the optical systems, and the absence of excess noise. Reality is necessarily uglier, but the results here remain relevant by serving as fundamental limits (via the data-processing inequality Ibragimov and Has’minskii (1981); Tsang (2019b)) and offering insights into the essential physics. The theoretical and experimental progress on SPADE and related methods so far Tsang et al. (2016a); Tsang (2017, 2018a); Dutton et al. (2019); Tsang (2019a); Zhou and Jiang (2019); Nair and Tsang (2016a); Tsang et al. (2016b); Nair and Tsang (2016b); Lupo and Pirandola (2016); Tsang (2018b); Ang et al. (2017); Lu et al. (2018); Řeháček et al. (2017); Yang et al. (2017); Kerviche et al. (2017); Chrostowski et al. (2017); Řeháček et al. (2017, 2018); Backlund et al. (2018); Napoli et al. (2019); Yu and Prasad (2018); Prasad and Yu (2019); Larson and Saleh (2018); Tsang and Nair (2019); Larson and Saleh (2019); Grace and Guha (2019); Bonsma-Fisher et al. (2019); Tsang (2019b); Tang et al. (2016); Tham et al. (2017); Paúr et al. (2016); Yang et al. (2016); Donohue et al. (2018); Parniak et al. (2018); Hassett et al. (2018); Zhou et al. (2019); Parniak et al. (2018); Paúr et al. (2018, 2019) has provided encouragement that the theory has realistic potential, and the general results here should motivate further research into the wider applications of quantum-inspired imaging methods.

An interesting future direction is to generalize the semiparametric formalism for quantum estimation Helstrom (1976); Holevo (2011). By treating the symmetric logarithmic derivatives of the quantum state $\rho$ as the scores in the $\mathcal{L}_{h}^{2}(\rho)$ space proposed by Holevo Holevo (2011) and adopting a geometric picture Fujiwara (2005), a quantum generalization of the semiparametric CRB can be envisioned, but whether it can solve any important problem, such as the quantum limit to incoherent imaging Tsang (2019a); Zhou and Jiang (2019), remains to be seen.

Acknowledgments

This work is supported by the Singapore National Research Foundation under Project No. QEP-P7.

Appendix A Proof of the semiparametric CRB

for Poisson processes

Define the inner product between two random variables $\check{r}_{1}$ and $\check{r}_{2}$ as

[TABLE]

and the norm as

[TABLE]

Let the Hilbert space of zero-mean random variables be

[TABLE]

and define

[TABLE]

where $\check{S}$ is defined by Eq. (4). Let $\check{\delta}\in\check{\mathcal{R}}$ be any random variable that satisfies

[TABLE]

The semiparametric CRB is Bickel et al. (1993); Tsiatis (2006)

[TABLE]

The proof can be done via a Pythagorean theorem Tsiatis (2006) without recourse to the Cauchy-Schwartz inequality or the existence of $J^{-1}$ . $\check{S}_{j}$ is called a score and $\check{\delta}$ an influence in statistics Bickel et al. (1993); Tsiatis (2006); this paper uses the same terminology for $S$ and $\tilde{\beta}$ in light of their resemblance to the statistical quantities.

The resemblance can be turned into a precise correspondence for a Poisson random measure by considering the subspace $\check{\mathcal{H}}\subset\check{\mathcal{R}}$ of random variables that are linear with respect to $n$ . Any element $\check{h}\in\check{\mathcal{H}}$ can be expressed as

[TABLE]

where $U$ is a surjective linear map from $\mathcal{H}$ to $\check{\mathcal{H}}$ . Since

[TABLE]

by virtue of Eq. (3), $\check{\mathcal{H}}$ is isomorphic to $\mathcal{H}$ and $U$ is unitary Reed and Simon (1980), and since $\check{\mathcal{T}}\subseteq\check{\mathcal{H}}$ and $\check{S}=US$ , $\check{\mathcal{T}}$ is isomorphic to $\mathcal{T}$ . Picking a $\check{\delta}\in\check{\mathcal{H}}$ with

[TABLE]

leads to

[TABLE]

where $\tilde{\beta}_{\rm eff}$ is given by Eq. (16) because of Eq. (102) and the isomorphisms. The CRB becomes

[TABLE]

which is Eq. (15).

Appendix B The moment parameter space

Define an $s\times s$ Hankel matrix with respect to a real-number sequence $\theta=(\theta_{0},\theta_{1},\dots)^{\top}$ as

[TABLE]

If $\theta$ is a moment sequence that arises from a nonnegative measure $F$ ,

[TABLE]

is nonnegative for any real vector $w$ , and all Hankel matrices are positive-semidefinite, viz.,

[TABLE]

Conversely, any sequence with Hankel matrices that obey Eq. (110) can be expressed in the form of Eq. (36) with a nonnegative $F$ by virtue of Hamburger’s theorem Schmüdgen (2017).

For the CRB to hold for a $p$ -dimensional $\theta$ , the parameter space $\Theta$ should be an open subset of $\mathbb{R}^{p}$ Ibragimov and Has’minskii (1981); Gorman and Hero (1990). Intuitively, the requirement makes sense because all the parameters in $\theta$ are unknown and $\theta$ should be allowed to vary in a neighborhood, otherwise the problem would be overparametrized. If $\Theta$ is not an open subset, the parameter space would be constrained and the CRB may be modified Gorman and Hero (1990). When all the moments are unknown parameters, consider the set

[TABLE]

Each $\theta\in\Theta$ corresponds to a measure with infinite support $r=\infty$ Schmüdgen (2017). The proof can be done by observing that the polynomial in Eq. (109) has at most $s-1$ zeros and the integral is strictly positive for any $w\neq 0$ if and only if $r\geq s$ , and therefore the constraint for $\Theta$ is satisfied if and only if $r=\infty$ . For $r<\infty$ , $F$ can be expressed in terms of its support $\{y_{l}:0\leq l\leq r-1,y_{l}<y_{l+1}\}$ as

[TABLE]

In the context of optics, $r$ is the minimum number of point sources that can describe the object distribution. The assumption of Eq. (111) as the parameter space is consistent with the infinite-support assumption for semiparametric estimation with mixture models (Bickel et al., 1993, Sec. 6.5), and it also makes intuitive sense, at least as a necessary condition—with $r$ point sources, there are only $2r$ unknown parameters, and the problem would be overparametrized if all the moments are assumed to be unknown. Further inequality constraints on $\theta$ may be needed to ensure the convergence of the Taylor series in Eqs. (35) and (68), although they should not affect the CRB as long as $\theta$ obeys and stays away from them Gorman and Hero (1990).

The boundary of $\Theta$ corresponds to measures with finite support $r<\infty$ . If $s\leq r$ , then $M^{(s)}>0$ and $M^{(s)}$ is full-rank (rank $=s$ ), but if $s>r$ , I can write

[TABLE]

$V$ is the Vandermonde matrix and invertible since $\{y_{l}\}$ are assumed to be distinct Horn and Johnson (1985). With $M^{(s)}\geq 0$ and $\operatorname{diag}(F)\geq 0$ , Sylvester’s law of inertia Horn and Johnson (1985) implies that the rank of $M^{(s)}$ is the same as the rank of $\operatorname{diag}(F)$ , which is $r$ . In other words, the rank of $M^{(s)}$ is $\min(r,s)$ , and any finite $r$ means that $M^{(s)}$ is rank-deficient and does not satisfy the strict $M^{(s)}>0$ for all $s>r$ . Whether the CRB still holds for $\theta$ on the boundary is a difficult question, considering that the parameter space here is infinite-dimensional and it is not obvious how existing finite-dimensional results regarding the CRB on a boundary Gorman and Hero (1990) can be applied.

Appendix C Series expansion of the object

distribution

Assume that the object measure $F$ can be expressed as the orthogonal series

[TABLE]

with respect to a reference measure $F^{(0)}$ , where $\{g_{j}=\sum_{k=0}^{\infty}G_{jk}y^{k}:j\in\mathbb{N}_{0}\}$ are the orthogonal polynomials defined by

[TABLE]

and $G$ is a lower-triangular matrix with nonzero diagonal entries that can be obtained by the Gram-Schmidt procedure. Thus each “Fourier” coefficient $\xi_{j}$ can be expressed in terms of the moment parameters as

[TABLE]

which can be written as

[TABLE]

Hence each $\theta$ corresponds to a set of coefficients $\xi$ that can be used to represent $F$ via Eq. (115). It is straightforward to transform the CRBs and the efficient estimators derived in this paper for $\theta$ to those for $\xi$ via Eqs. (118).

Appendix D Tangent space for the direct-imaging

model

Consider the tangent space $\mathcal{T}$ given by Eq. (17) and the score functions given by Eq. (37) for direct imaging. First note that $S\subset\mathcal{H}$ , as the Fisher information $J_{jj}=\langle S_{j},S_{j}\rangle=\nu(S_{j}^{2})$ is assumed to be finite for all $j$ . Recent calculations in quantum estimation theory suggest that this assumption is reasonable for any measurement Tsang (2019a). To prove $\mathcal{T}=\operatorname{\overline{span}}(S)=\mathcal{H}$ , the standard method is to show that the only element in $\mathcal{H}$ orthogonal to $S$ is [math] Reed and Simon (1980), that is,

[TABLE]

implies $h=0$ (almost everywhere with respect to $\bar{n}$ ). Here I list a few approaches with various levels of rigor.

The first approach is to consider the set of orthogonal polynomials

[TABLE]

where $A$ is a lower-triangular matrix with nonzero diagonal entries and can be determined by applying the Gram-Schmidt procedure to the monomials $\tilde{\phi}(x)$ Dunkl and Xu (2001). Under rather general conditions on $f$ , the polynomials form an orthonormal basis of $\mathcal{H}$ Dunkl and Xu (2001), viz.,

[TABLE]

and I can write Eq. (119) as

[TABLE]

or, more compactly,

[TABLE]

If the only solution to Eq. (123) is $w=0$ , then the only solution to Eq. (122) is $h=0$ . This is equivalent to the condition that $B^{\top}$ is injective.

Integration by parts yields

[TABLE]

where $C$ is the same as Eq. (45). Since both $A$ and $C$ are lower-triangular with nonzero diagonal entries, $B=AC$ is also lower-triangular with nonzero diagonal entries, and $B^{\top}$ has a well defined matrix inverse $(B^{\top})^{-1}=(B^{-1})^{\top}=A^{-\top}C^{-\top}$ in the sense that

[TABLE]

where $I$ is the identity matrix; see Appendix E for details. If the matrices were finite-dimensional, the existence of a matrix inverse would imply

[TABLE]

and the only solution to Eq. (123) would be $w=0$ . This proof is not entirely satisfactory however, as Eq. (126) assumes that the product of the infinite-dimensional matrices is associative. Associativity assumes that the order of the sums involved in the matrix product can be interchanged, but it cannot be guaranteed for infinite-dimensional matrices. In other words, the existence of a matrix inverse for $B^{\top}$ may not imply that $B^{\top}$ is injective.

A more rigorous approach is to define

[TABLE]

and notice that Eq. (119) implies

[TABLE]

The unique solution to Eq. (129) is $h=0$ if the family $\{H(x-y):y\in\mathcal{Y}\}$ satisfies a property called completeness in statistics Lehmann and Casella (1998). For example, if $H$ is Gaussian, $\{H\}$ is a full-rank exponential family for any open subset $\mathcal{Y}\subset\mathbb{R}$ and therefore complete Lehmann and Casella (1998). An even more rigorous formulation of this approach Bickel et al. (1993) is to treat $\langle h,\chi_{y}\rangle$ as an operator that maps $h\in\mathcal{H}$ to a function of $y$ in another Hilbert space, and then show that the null space of the operator consists of only $h=0$ . The proof again boils down to the requirement that $\{H\}$ should be complete; see Ref. (Bickel et al., 1993, Sec. 6.5).

Appendix E Inverse of an infinite-dimensional triangular matrix

Let $C$ be an infinite-dimensional matrix with entries indexed by $(j,k)\in\mathbb{N}_{0}^{2}$ . Define its formal matrix inverse $C^{-1}$ as another infinite-dimensional matrix that satisfies

[TABLE]

If $C$ is lower-triangular with nonzero diagonal entries, viz.,

[TABLE]

then $C^{-1}$ can be found by a recursive relation as follows. Define $C^{(s)}$ as the $s\times s$ upper-left submatrix of $C$ . Write $C^{(s+1)}$ and $(C^{-1})^{(s+1)}$ as the partitions

[TABLE]

Given $(C^{-1})^{(s)}=(C^{(s)})^{-1}$ ,

[TABLE]

and the recursion starts from $(C^{-1})^{(1)}=(C^{(1)})^{-1}$ with one element $(C^{-1})_{00}=1/C_{00}$ . The matrix inverse of an infinite-dimensional upper-triangular matrix can be defined in a similar way.

Although the product of infinite-dimensional matrices may not be associative, it can still be proved by induction that $D(Cu)=(DC)u$ for any vector $u$ if $D$ and $C$ are lower-triangular, because

[TABLE]

involves finite sums only. Thus it is safe to assume that $C^{-1}(Cu)=(C^{-1}C)u=u$ and $C(C^{-1}u)=(CC^{-1})u=u$ if $C$ is lower-triangular with nonzero diagonal entries.

Appendix F An alternative derivation

of the Cramér-Rao bound for direct imaging

Consider the problem described in Sec. V. Since the polynomials given by Eq. (120) are an orthonormal basis, the information matrix for the moment parameters can be expressed as

[TABLE]

meaning that $J=B^{\top}B$ , where $B=AC$ is given by Eq. (124). Ignoring the complications of multiplying infinite-dimensional matrices, the CRB becomes

[TABLE]

To evaluate $A^{-1}A^{-\top}$ , notice that the orthonormality of $a$ can be written as

[TABLE]

where $\tilde{\phi}$ is the monomials given by Eq. (40). In other words,

[TABLE]

giving

[TABLE]

This leads to Eq. (51) for the parameter $\beta=u^{\top}\theta$ .

Appendix G Alternative approaches

to the constrained Cramér-Rao bound

One way of deriving the constrained CRB if $\theta_{0}$ is known is to consider the information matrix $\tilde{J}$ with respect to the parameters $\vartheta=(\theta_{1},\theta_{2},\dots)^{\top}$ without $\theta_{0}$ . Then $\theta=(\theta_{0},\vartheta^{\top})^{\top}$ , and $\tilde{J}$ can be written as the submatrix of $J$ , or

[TABLE]

where $j$ is a column vector. Ignore the complications of dealing with infinite-dimensional matrices and partition $J^{-1}$ similarly as

[TABLE]

Then it is straightforward to show that

[TABLE]

Let $\tilde{\vartheta}=(\tilde{\theta}_{1},\tilde{\theta}_{2},\dots)^{\top}$ , and observe that $\tilde{\theta}_{0}=1/C_{00}$ from Eqs. (50), (45), and (39). Then Eq. (51) implies that

[TABLE]

Hence

[TABLE]

which implies Eq. (56) if the parameter of interest is defined as $\beta=u^{\top}\theta$ with $u_{0}=0$ .

Yet another way of deriving the constrained CRB can be found in Ref. Gorman and Hero (1990), which can be shown to lead to the same result here.

Appendix H Tangent space for the SPADE model

The proof is similar to the first approach in Appendix D. Consider $\mathcal{H}=\operatorname{\overline{span}}(a)$ in terms of an obvious orthonormal basis

[TABLE]

Any $h\in\mathcal{H}$ orthogonal to the $S$ given by Eq. (70) obeys

[TABLE]

which can be written as

[TABLE]

As $C$ is upper-triangular with nonzero diagonal entries and $f>0$ is assumed, $B^{\top}$ is lower-triangular with nonzero diagonal entries, and induction can be used to prove that the only solution to $B^{\top}w=0$ is $w=0$ , or equivalently $h=0$ . Hence $\mathcal{T}=\operatorname{\overline{span}}(S)=\mathcal{H}$ . The proof is easier than the one in Appendix D because $B^{\top}$ here is lower-triangular rather than upper-triangular.

An alternative proof, similar to the second approach in Appendix D and Ref. (Bickel et al., 1993, Sec. 6.5) but less fruitful in this case, is to define

[TABLE]

consider

[TABLE]

and use the completeness of $\{H(q|y):y\in\mathcal{Y}\}$ to prove the unique solution $h=0$ . If $H$ is Poisson, for example, then $\{H\}$ is a full-rank exponential family and therefore complete Lehmann and Casella (1998).

Appendix I An alternative derivation of the

Cramér-Rao bound for SPADE

Consider the problem described in Sec. VII. With the orthonormal basis given by Eq. (148) and the $B$ matrix given by Eq. (151), the information matrix with respect to the moment parameters can again be expressed as $J=B^{\top}B$ according to Eq. (136). With Eq. (151), $B^{-1}$ becomes

[TABLE]

Ignoring the complications of multiplying infinite-dimensional matrices, the CRB $J^{-1}=B^{-1}B^{-\top}$ is

[TABLE]

where $D$ is given by Eq. (74), and the CRB for $\beta=u^{\top}\theta$ coincides with Eq. (73).

Appendix J Calculations concerning

SPADE for a bandlimited point-spread function

Consider the point-spread function given by Eq. (80). The standard Legendre polynomials are defined in terms of

[TABLE]

such that the orthonormal version is

[TABLE]

The Fourier transform of the polynomial is (Olver et al., 2010, Eq. (18.17.19))

[TABLE]

where $j_{q}(y)$ is the spherical Bessel function of the first kind (Olver et al., 2010, Eq. (10.47.3)). Then Eq. (81) follows from Eq. (65) and (158).

Let

[TABLE]

From Ref. (Olver et al., 2010, Eq. (10.60.2)), one can derive the useful formula

[TABLE]

For example, since $P_{q}(1)=1$ , one can check that $\sum_{q=0}^{\infty}\tilde{H}(q|y)=1$ in accordance with Eq. (32). Using the facts

[TABLE]

it can also be shown that

[TABLE]

which leads to

[TABLE]

An influence function for estimating $\theta_{2j}$ is hence

[TABLE]

which obeys $\theta_{2j}=\nu(\tilde{\theta}_{2j})$ . To derive an explicit expression for $P_{q}^{(j)}(1)$ , consider the Rodrigues formula (Olver et al., 2010, Eq. (14.7.13))

[TABLE]

which leads to

[TABLE]

and Eq. (82) results.

To bound the moments of $\tilde{H}$ and $f$ , consider a lower bound on the binomial coefficient for $j\geq 1$ given by (Cormen et al., 2009, pp. 1186)

[TABLE]

such that each even moment of $\tilde{H}$ can be bounded as

[TABLE]

This means that each even moment of $f(q)$ is bounded as

[TABLE]

With $\nu(q^{0})=\nu(1)=\tau\theta_{0}$ , $\nu(q^{2j})<\infty$ for all $j\in\mathbb{N}_{0}$ as long as $\theta_{0}$ and $\theta_{2j}$ are finite. Odd moments can be bounded via the Cauchy-Schwartz inequality $[\nu(q^{j})]^{2}\leq\nu(1)\nu(q^{2j})$ . Hence all the moments of $f$ are finite as long as all the moments of $F$ are finite.

Bibliography78

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Born and Wolf (1999) Max Born and Emil Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Cambridge University Press, Cambridge, 1999).
2Goodman (2004) Joseph W. Goodman, Introduction to Fourier Optics (Mc Graw-Hill, New York, 2004).
3Mandel and Wolf (1995) Leonard Mandel and Emil Wolf, Optical Coherence and Quantum Optics (Cambridge University Press, Cambridge, 1995). · doi ↗
4Goodman (1985) Joseph W. Goodman, Statistical Optics (Wiley, New York, 1985).
5Lehmann and Casella (1998) E. L. Lehmann and George Casella, Theory of Point Estimation (Springer, New York, 1998). · doi ↗
6Farrell (1966) Edward J. Farrell, “Information Content of Photoelectric Star Images,” Journal of the Optical Society of America 56 , 578–587 (1966) . · doi ↗
7Tsai and Dunn (1979) Ming-Jer Tsai and Keh-Ping Dunn, Performance Limitations on Parameter Estimation of Closely Spaced Optical Targets Using Shot-Noise Detector Model , Tech. Rep. ADA 073462 (Lincoln Laboratory, MIT, 1979).
8Zmuidzinas (2003) Jonas Zmuidzinas, “Cramér–Rao sensitivity limits for astronomical instruments: implications for interferometer design,” Journal of the Optical Society of America A 20 , 218–233 (2003) . · doi ↗