Resolving starlight: a quantum perspective

Mankei Tsang

arXiv:1906.02064·quant-ph·April 10, 2020

TL;DR

This paper explores how quantum information theory can improve optical imaging by surpassing classical limits, offering new methods that approach quantum bounds for resolving closely spaced light sources.

Contribution

It introduces a quantum formalism for imaging that generalizes previous work, demonstrating potential for significant improvements in resolving subdiffraction objects.

Findings

01

Quantum-inspired methods outperform direct imaging for sub-Rayleigh sources.

02

Theoretical and experimental results show approaches near quantum limits.

03

Potential applications in astronomy and fluorescence microscopy.

Full text

Resolving starlight: a quantum perspective

Mankei Tsang

https://blog.nus.edu.sg/mankei/ Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583

Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117551

Abstract

The wave-particle duality of light introduces two fundamental problems to imaging, namely, the diffraction limit and the photon shot noise. Quantum information theory can tackle them both in one holistic formalism: model the light as a quantum object, consider any quantum measurement, and pick the one that gives the best statistics. While Helstrom pioneered the theory half a century ago and first applied it to incoherent imaging, it was not until recently that the approach offered a genuine surprise on the age-old topic by predicting a new class of superior imaging methods. For the resolution of two sub-Rayleigh sources, the new methods have been shown theoretically and experimentally to outperform direct imaging and approach the true quantum limits. Recent efforts to generalize the theory for an arbitrary number of sources suggest that, despite the existence of harsh quantum limits, the quantum-inspired methods can still offer significant improvements over direct imaging for subdiffraction objects, potentially benefiting many applications in astronomy as well as fluorescence microscopy.

I Ingredients of the resolution problem: diffraction, photon shot noise, statistics

In 1879 Lord Rayleigh proposed a criterion of resolution for incoherent imaging in terms of two point sources Rayleigh (1879): the sources are said to be unresolvable if they are so close that their images, blurred by diffraction, overlap significantly. To quote Feynman Feynman et al. (2013), however, “Rayleigh’s criterion is a rough idea in the first place,” and a better resolution can be achieved “if sufficiently careful measurements of the exact intensity distribution over the diffracted image spot can be made.” Thus another limiting factor is the noise in the intensity measurement, with the photon shot noise being the most fundamental source. Because of the particle nature of light, each camera pixel can record its energy in discrete quanta only, and ordinary light sources, including starlight and fluorescence, introduce further randomness to the quantum measurements Mandel and Wolf (1995).

To incorporate noise in the definition of resolution, the theory of statistical inference offers a rigorous framework den Dekker and van den Bos (1997); de Villiers and Pike (2016). For example, a measure of resolution can be defined in terms of parameter estimation: given a blurry and noisy image of two point sources, how well can one estimate their separation Falconi (1967); Tsai and Dunn (1979); Bettens et al. (1999); Van Aert et al. (2002); Ram et al. (2006)? Or it can be framed in terms of hypothesis testing: how well can one decide from the image whether there is one or two sources Harris (1964); Acuna and Horowitz (1997); Shahram and Milanfar (2004, 2006)? Such statistical treatments of resolution have garnered prominence in optical astronomy Farrell (1966); Falconi (1967); Lucy (1992a, b); Acuna and Horowitz (1997); Zmuidzinas (2003); Feigelson and Babu (2012) and fluorescence microscopy Ram et al. (2006); Deschout et al. (2014); Chao et al. (2016); von Diezmann et al. (2017); Zhou et al. (2019a), where the number of photons is limited and shot noise is part of life.

II Quantum detection and estimation theory

Imaging has grown into a multidisciplinary problem that straddles optics, quantum mechanics, statistics, and signal processing. In a Herculean effort that began in the 1960s, Helstrom merged the subjects into a theory of quantum detection and estimation Helstrom (1976), which marked the beginning of quantum information theory. His aim was to determine the best measurement, out of the infinite possibilities offered by quantum mechanics, that optimizes the performance of an inference task. For a given light source, the optimal performance then represents the most fundamental limit on the resolution, valid for any optics design that is allowed by quantum mechanics, as well as any computational technique in data postprocessing. In setting fundamental limits, Helstrom’s theory plays a role for sensing and imaging not unlike the second law of thermodynamics for engines, ruling out unphysical superresolution methods in the same manner the second law rules out perpetual-motion machines.

The mathematics was formidable, but Helstrom managed to apply his theory to a few simple scenarios of incoherent imaging. For example, he studied the problem of locating an incoherent point source from far-field measurements Helstrom (1970), but the result was unsurprising: the quantum limit is close to the ideal performance of direct imaging, which measures the intensity on the image plane, as depicted by Fig. 1. A more intriguing problem he studied was the decision between one or two incoherent sources Helstrom (1973). Helstrom computed the mathematical form of the optimal measurement and the resulting error probabilities, but he did not propose an experimental setup or show how much improvement the optimal measurement could offer over existing imaging methods. Helstrom himself was quite pessimistic Helstrom (1973): “The optimum strategies required in order to attain the minimum error probabilities calculated here require the measurement of certain complicated quantum-mechanical projection operators, which, though possible in principle, cannot be carried out by any known apparatus.”

Unfortunately, in all the problems studied by Helstrom, the improvements predicted by his theory seemed modest at best, rendering the question of quantum limits academic. Quantum opticians turned their attention to nonclassical light sources Kolobov (1999); Dowling (2008); Demkowicz-Dobrzański et al. (2015); Taylor and Bowen (2016); Pirandola et al. (2018); Moreau et al. (2019); Fabre and Treps (2019), while classical opticians turned their attention to near-field microscopy Betzig (2015); Pendry (2004), fluorescence control Betzig (2015); Moerner (2015); Hell (2015), and computational imaging de Villiers and Pike (2016). Helstrom’s work on incoherent imaging was all but forgotten.

Surprise came a few decades later. Applying quantum estimation theory to the problem of resolving two incoherent point sources, we recently discovered that substantial improvements via novel far-field measurements are indeed possible Tsang et al. (2016a). The theory has since been generalized for an arbitrary number of sources Tsang (2017, 2018a); Dutton et al. (2019); Tsang (2019a); Zhou and Jiang (2019); Tsang (2019b); Bonsma-Fisher et al. (2019). The implication is that, even for astronomy, where the sources are inaccessible, the new techniques can enhance the resolution beyond the limits of direct imaging—the de facto method developed by evolution for eons and honed by opticians for centuries. I present in the following an introduction to the breakthrough in Ref. Tsang et al. (2016a), as well as the rapid theoretical Tsang (2017, 2018a); Dutton et al. (2019); Tsang (2019a); Zhou and Jiang (2019); Tsang (2019b); Nair and Tsang (2016a); Tsang et al. (2016b); Nair and Tsang (2016b); Lupo and Pirandola (2016); Tsang (2018b); Ang et al. (2017); Lu et al. (2018); Řeháček et al. (2017); Yang et al. (2017); Kerviche et al. (2017); Chrostowski et al. (2017); Řeháček et al. (2017, 2018); Backlund et al. (2018); Napoli et al. (2019); Yu and Prasad (2018); Prasad and Yu (2019); Prasad (2019); Larson and Saleh (2018); Tsang and Nair (2019); Larson and Saleh (2019); Bonsma-Fisher et al. (2019); Grace et al. (2019); Bisketzi et al. (2019); Lupo et al. (2019); Lee and Ashok (2019); Gefen et al. (2019); Hradil et al. (2019); Len et al. (2020); Lupo (2020) and experimental Tang et al. (2016); Tham et al. (2017); Paúr et al. (2016); Yang et al. (2016); Donohue et al. (2018); Parniak et al. (2018); Paúr et al. (2018); Hassett et al. (2018); Zhou et al. (2019b); Paúr et al. (2019); Wadood et al. (2019); Řeháček et al. (2019) advances that followed.

III Rayleigh’s curse

For two incoherent point sources observed by direct imaging with photon shot noise, many studies have shown that the separation becomes harder to estimate once the sources violate Rayleigh’s criterion Falconi (1967); Tsai and Dunn (1979); Bettens et al. (1999); Van Aert et al. (2002); Ram et al. (2006). The central tool used in those studies is the Fisher information, which sets general lower bounds, called Cramér-Rao bounds, on the parameter-estimation error Lehmann and Casella (1998). The simplest Cramér-Rao bound (CRB) is

$$\textrm{MSE}(\theta)\geq\frac{1}{\textrm{FI}(\theta)},\tag{1}$$

where MSE is the mean-square error of any unbiased estimator, $\theta$ is the unknown parameter, and $\textrm{FI}(\theta)$ is the Fisher information; see Appendix A for precise definitions. The error can reach the Cramér-Rao bound in many situations, including an asymptotic limit where the sample size approaches infinity, the noise can be approximated as additive and Gaussian, and the maximum-likelihood estimator is used Lehmann and Casella (1998). Thus, the Fisher information is a useful measure of the sensitivity of the experiment to the unknown parameter.

Assume one-dimensional paraxial imaging Goodman (2004) for simplicity, as illustrated by Fig. 2, and Poisson noise, which is an excellent approximation for both optical astronomy Feigelson and Babu (2012); Zmuidzinas (2003); Goodman (1985) and fluorescence microscopy Pawley (2006). The Fisher information becomes

$$\textrm{FI}^{({\rm direct})}(\theta)=C(\theta)N,\tag{2}$$

where $\theta$ here is the separation, $N$ is the average photon number, and $C(\theta)$ is an $N$-independent prefactor that varies with $\theta$. Both $\theta$ and $C(\theta)$ are dimensionless if $\theta$ is normalized in Airy units (1 Airy unit is roughly $\lambda/\textrm{N.A.}$, where $\lambda$ is the wavelength and N.A. is the numerical aperture, or $\lambda/D$ for angular resolution, where $D$ is the aperture diameter Pawley (2006)). Equation (2) was earlier suggested by many as a fundamental measure of resolution for incoherent imaging Tsai and Dunn (1979); Bettens et al. (1999); Van Aert et al. (2002); Ram et al. (2006).

The details of $C(\theta)$ depend on the point-spread function, but the general behavior is as follows: If the sources are well separated relative to Rayleigh’s criterion ( $\theta\gg 1$ ), $C(\theta)$ is relatively constant, but when $\theta$ is close to Rayleigh’s criterion or starts to violate it ( $\theta\lesssim 1$ ), $C(\theta)$ decays to zero, causing the Cramér-Rao bound to blow up as $\theta\to 0$ . In other words, there is a progressive penalty on the Fisher information for the violation of Rayleigh’s criterion, as illustrated by Fig. 3 for a Gaussian point-spread function. In Ref. Tsang et al. (2016a), we called this penalty Rayleigh’s curse to distinguish it from Rayleigh’s criterion—sub-Rayleigh sources are resolvable, but the more they violate Rayleigh’s criterion, the harder it gets to estimate their separation.
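The decay of $C(\theta)$ can be checked numerically. The following is a minimal sketch, assuming a Gaussian point-spread function $|\psi(x)|^{2}$ with standard deviation `sigma` (a parameter introduced here for illustration, not the Airy-unit normalization of the text), two equally bright sources, and the standard Fisher information of a spatial Poisson process, $N\int dx\,[\partial f/\partial\theta]^{2}/f$. The function name and grid settings are hypothetical.

```python
import numpy as np

def direct_fisher_per_photon(theta, sigma=1.0, n_grid=40001):
    """FI/N = C(theta) for direct imaging of two equal incoherent point
    sources, assuming a Gaussian point-spread function of width sigma."""
    # Grid wide enough to hold both images plus 10 sigma of tail.
    half_width = theta / 2 + 10 * sigma
    x = np.linspace(-half_width, half_width, n_grid)
    dx = x[1] - x[0]

    def gauss(u):
        return np.exp(-u**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

    def dgauss(u):
        # Derivative of gauss with respect to its argument.
        return -u / sigma**2 * gauss(u)

    # Image-plane density f(x|theta) and its derivative with respect to theta.
    f = 0.5 * (gauss(x - theta / 2) + gauss(x + theta / 2))
    df = 0.25 * (dgauss(x + theta / 2) - dgauss(x - theta / 2))
    # Riemann sum of (df/dtheta)^2 / f over the image plane.
    return float(np.sum(df**2 / f) * dx)

for theta in (0.1, 0.5, 1.0, 2.0, 12.0):
    print(theta, direct_fisher_per_photon(theta))
```

Under these assumptions the value tends to $1/(4\sigma^{2})$ for well-separated sources and falls off roughly as $\theta^{2}/(8\sigma^{4})$ for small separations, which is the progressive penalty described above.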

IV Dispelling Rayleigh’s curse

Rayleigh’s curse happens if we measure the intensity on the image plane, but what if we allow any quantum measurement that may be sensitive to the phase as well? To find the quantum limit, we can use a quantum version of the Fisher information proposed by Helstrom Helstrom (1976), which sets an upper bound on the Fisher information for any measurement Nagaoka (1989); Braunstein and Caves (1994), as elaborated in Appendix B. We found that the Helstrom information (HI) for the separation estimation problem is given by Tsang et al. (2016a)

$$\textrm{FI}(\theta)\leq\textrm{HI}(\theta)=C(\infty)N.\tag{3}$$

Remarkably, $\textrm{HI}(\theta)$ is constant regardless of the separation and completely free of Rayleigh’s curse, as plotted in Fig. 3.

The constant Helstrom information would be no surprise if it were simply a loose upper bound; the million-dollar question is whether one can find a measurement that attains the limit. Mathematical studies following Helstrom’s work have shown in general that a quantum-limited measurement should exist, at least in the limit of infinite sample size Hayashi (2005); Fujiwara (2006). The mathematics offers little clue to the experimental implementation, however, and finding one in quantum estimation theory is often a matter of educated guessing.

Luckily we found one. Assuming a Gaussian point-spread function, we found that sorting the light on the image plane in terms of the Hermite-Gaussian modes, followed by photon counting in each mode, can lead to a Fisher information given by Tsang et al. (2016a)

$$\textrm{FI}^{({\rm SPADE})}(\theta)=C(\infty)N,\tag{4}$$

which attains the quantum limit and is free of Rayleigh’s curse for all $\theta$ . Figure 4 illustrates the setup. We called the measurement spatial-mode demultiplexing with the acronym SPADE, to follow the convention of giving catchy acronyms to superresolution methods Moerner (2015). Numerical simulations have shown that SPADE combined with a judicious estimator can give an error very close to the quantum bound $1/\textrm{HI}$ and substantially lower than that achievable by direct imaging Tsang et al. (2016a); Tsang (2018b). Further studies have proposed measurements that work for other point-spread functions Tsang et al. (2016a); Nair and Tsang (2016a); Řeháček et al. (2017); Kerviche et al. (2017).
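The constancy of the SPADE Fisher information can be verified with a few lines of code. This is a minimal sketch under assumptions that are not written out in the text above: a Gaussian point-spread function of width `sigma`, and Hermite-Gaussian mode probabilities $g_{q}=e^{-Q}Q^{q}/q!$ with $Q=\theta^{2}/(16\sigma^{2})$, the usual result for a displaced Gaussian beam. The Fisher information for independent Poisson counts is taken to be $N\sum_{q}[\partial g_{q}/\partial\theta]^{2}/g_{q}$. The function name and the truncation `q_max` are hypothetical.

```python
import math

def spade_fisher_per_photon(theta, sigma=1.0, q_max=60):
    """FI/N for Hermite-Gaussian mode sorting followed by photon counting.
    Assumes g_q = exp(-Q) Q^q / q! with Q = theta^2 / (16 sigma^2).
    The sum is truncated at q_max, which is adequate only for moderate Q."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    Q = theta**2 / (16 * sigma**2)
    dQ_dtheta = theta / (8 * sigma**2)
    total = 0.0
    for q in range(q_max + 1):
        # Mode probability, evaluated through logarithms for stability.
        g = math.exp(-Q + q * math.log(Q) - math.lgamma(q + 1))
        # d g_q / d theta = g_q * (q / Q - 1) * dQ/dtheta
        dg = g * (q / Q - 1.0) * dQ_dtheta
        if g > 0.0:  # skip terms that underflow to zero
            total += dg**2 / g
    return total

for theta in (0.01, 0.1, 1.0, 2.0):
    print(theta, spade_fisher_per_photon(theta))
```

Within this model the sum evaluates to $1/(4\sigma^{2})$ for every separation tried, in contrast to the direct-imaging value, which vanishes as $\theta\to 0$.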

V How SPADE works

To understand how SPADE can beat direct imaging and achieve the quantum limit, it is helpful to consider a simplified model of thermal light Tsang et al. (2016a) that is valid for optical frequencies and beyond, as described in the following. The model may sound heuristic, but it is possible to derive it from a quantum formalism by assuming a thermal quantum state Mandel and Wolf (1995), the paraxial optics model Yuen and Shapiro (1978), and an “ultraviolet” limit, as elaborated in Appendix C.

Treat each photon on the image plane as a quantum particle with wavefunction $\psi(x)$ , where $x$ is the image-plane coordinate normalized with respect to the magnification factor Goodman (2004). Direct imaging corresponds to a measurement of its position, obeying the probability density

$$f(x)=|\psi(x)|^{2},\tag{5}$$

by virtue of Born’s rule. It is also possible to measure the particle in any other orthonormal basis $\{\phi_{q}(x):q\in\mathbb{N}_{0}\}$, and the probability of finding the photon in the $q$th spatial mode is

$$g_{q}=\left|\int_{-\infty}^{\infty}dx\,\phi_{q}^{*}(x)\psi(x)\right|^{2}.\tag{6}$$

For incoherent imaging, the wavefunction of each photon is $\psi(x-X)$ , where $\psi$ is determined by the point-spread function of a diffraction-limited imaging system and the displacement $X$ depends on the position of the point source that emits the photon. Denoting the density of the incoherent sources as $F(X)$ , $X$ can be regarded as a random variable with $F(X)$ as its probability density. For direct imaging, the probability density on the image plane becomes

$$f(x)=\int_{-\infty}^{\infty}dX\,F(X)|\psi(x-X)|^{2},\tag{7}$$

which agrees with the classical theory of incoherent imaging Goodman (2004). In general, the probability of finding the photon in the $\phi_{q}(x)$ mode is

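$$g_{q}=\int_{-\infty}^{\infty}dX\,F(X)\left|\int_{-\infty}^{\infty}dx\,\phi_{q}^{*}(x)\psi(x-X)\right|^{2}.\tag{8}$$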

If we treat the arrivals of the photons at the spatial modes as a temporal Poisson process, then the photon counts integrated over time are independent Poisson random variables, each with mean and variance given by $Ng_{q}$ , where $N$ is the average photon number in all modes. For direct imaging, the photon statistics should be treated as a spatial Poisson process with mean intensity $Nf(x)$ Snyder and Miller (1991).

Consider two point sources, one at $X=-\theta/2$ and one at $X=\theta/2$ such that $F(X)=[\delta(X-\theta/2)+\delta(X+\theta/2)]/2$ . If their separation is deeply sub-Rayleigh ( $\theta\ll 1$ ), the wavefunctions can be approximated as

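$$\psi\left(x\pm\frac{\theta}{2}\right)\approx\psi(x)\pm\frac{\theta}{2}\frac{\partial\psi(x)}{\partial x},\tag{9}$$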

as depicted by Fig. 5. If $\psi(x)$ is even, $\partial\psi(x)/\partial x$ is odd, and they can be regarded as two orthogonal modes. To the first order, the mean photon count in the fundamental $\psi(x)$ mode is insensitive to the parameter $\theta$ , while the mean count in the derivative mode is the incoherent sum of the contributions from the two sources, or $\propto(\theta/2)^{2}+(-\theta/2)^{2}=\theta^{2}/2$ . If the sources were coherent and in-phase instead, their contributions to the derivative mode would cancel each other, leading to a much reduced signal Tsang (2015). In other words, the incoherence plays a key role in retaining a significant signal in the first order, and SPADE can extract this signal by measuring the derivative mode.
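The role of incoherence can be illustrated numerically. The sketch below assumes a Gaussian wavefunction whose squared modulus has standard deviation `sigma`, and takes the normalized derivative mode to be $(x/\sigma)\psi(x)$; the grid, the variable names, and the helper functions are all hypothetical choices made for this example. It compares the incoherent average of the single-source mode probabilities with the amplitude obtained when the two fields are first added in phase.

```python
import numpy as np

sigma = 1.0
x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]

def psi(u):
    """Gaussian wavefunction; |psi|^2 has standard deviation sigma."""
    return (2 * np.pi * sigma**2) ** -0.25 * np.exp(-u**2 / (4 * sigma**2))

phi0 = psi(x)                 # fundamental mode (even)
phi1 = (x / sigma) * psi(x)   # normalized derivative mode (odd)

def overlap(mode, field):
    """Riemann-sum approximation of the overlap integral of two real fields."""
    return float(np.sum(mode * field) * dx)

def mode_probabilities(theta):
    """Incoherent sources: average the probabilities of the two sources."""
    g0 = g1 = 0.0
    for X in (-theta / 2, theta / 2):
        g0 += 0.5 * overlap(phi0, psi(x - X)) ** 2
        g1 += 0.5 * overlap(phi1, psi(x - X)) ** 2
    return g0, g1

def coherent_derivative_amplitude(theta):
    """In-phase coherent sources: add the fields first, then project."""
    field = psi(x - theta / 2) + psi(x + theta / 2)
    return overlap(phi1, field)

theta = 0.2
g0, g1 = mode_probabilities(theta)
print("incoherent g0, g1:", g0, g1)
print("theta^2 / (16 sigma^2):", theta**2 / (16 * sigma**2))
print("coherent derivative-mode amplitude:", coherent_derivative_amplitude(theta))
```

In this model the incoherent derivative-mode probability is close to $\theta^{2}/(16\sigma^{2})$, while the in-phase coherent amplitude vanishes up to rounding error because the two odd contributions cancel.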

Another reason that SPADE can outperform direct imaging has to do with the fundamental mode $\psi(x)$ . It contains little signal, but it overlaps spatially with the derivative mode and contributes a background to the spatial intensity measured by direct imaging, increasing the variances of the photon counts at each pixel. By projecting the fundamental mode into a different channel, SPADE filters out this background noise and substantially improves the signal-to-noise ratio.

The heuristic discussion so far can be made more rigorous by considering the Fisher information and the Cramér-Rao bounds. Assume that the object distribution $F(X|\theta)$ and therefore $f(x|\theta)$ and $g_{q}(\theta)$ depend on $\theta$ . For the spatial Poisson process from direct imaging, the Fisher information is Snyder and Miller (1991)

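$$\textrm{FI}^{({\rm direct})}(\theta)=N\int_{-\infty}^{\infty}dx\,\frac{1}{f(x|\theta)}\left[\frac{\partial f(x|\theta)}{\partial\theta}\right]^{2}.\tag{10}$$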

For separation estimation with $\theta\ll 1$,

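$$f(x|\theta)\approx|\psi(x)|^{2}+\frac{\theta^{2}}{8}\frac{\partial^{2}|\psi(x)|^{2}}{\partial x^{2}}.\tag{11}$$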

The denominator in Eq. (10) approaches $|\psi(x)|^{2}$ as $\theta\to 0$ , meaning that the fundamental mode is the major noise contributor, and the Fisher information approaches zero as $\theta\to 0$ . For discrete Poisson variables on the other hand, the Fisher information is

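$$\textrm{FI}(\theta)=N\sum_{q}\frac{1}{g_{q}(\theta)}\left[\frac{\partial g_{q}(\theta)}{\partial\theta}\right]^{2}.\tag{12}$$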

For separation estimation, as long as $\phi_{1}(x)$ is orthogonal to $\psi(x)$ and has significant overlap with the derivative mode, $g_{1}(\theta)\propto\theta^{2}$ for $\theta\ll 1$, leading to a nonzero $[\partial g_{1}(\theta)/\partial\theta]^{2}/g_{1}(\theta)$ as $\theta\to 0$.
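As a short worked step, suppose $g_{1}(\theta)\approx a\theta^{2}$ for $\theta\ll 1$, where $a$ is a positive constant introduced here for illustration (its value depends on the point-spread function and on the choice of $\phi_{1}$). The derivative-mode term of the Poisson Fisher information is then

```latex
\frac{1}{g_{1}(\theta)}\left[\frac{\partial g_{1}(\theta)}{\partial\theta}\right]^{2}
\approx \frac{(2a\theta)^{2}}{a\theta^{2}} = 4a ,
```

which does not depend on $\theta$. The signal derivative vanishes linearly with $\theta$, but the Poisson variance $\propto g_{1}$ vanishes quadratically, so the ratio stays finite. For direct imaging, by contrast, the numerator is also $O(\theta^{2})$ while the denominator tends to the nonzero background $|\psi(x)|^{2}$.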

To summarize, SPADE relies on the subtle interplay between the coherence induced by diffraction, the incoherence of the sources, and the signal-dependent nature of photon shot noise. It would have been difficult to discover such a fortuitous possibility via conventional wisdom alone, but quantum estimation theory—and quantum information theory in general—have the advantage of being oblivious to conventional wisdom. The mathematics may look daunting, but it can sometimes give rise to new physics beyond our imagination.

VI Implementations of SPADE

To implement SPADE, different spatial modes should be coupled into physically separate channels before detection. This in principle requires only linear optics Morizur et al. (2010), but the most efficient implementation remains unclear. Many methods have been proposed and demonstrated, particularly for the purpose of mode-division multiplexing in optical communication Fabre and Treps (2019). Here I highlight a few methods that have been experimentally demonstrated for the two-point resolution problem.

VI.1 Interferometry

Nair proposed an interferometer called SLIVER (superlocalization via image-inversion interferometry) that can in principle achieve a quantum-limited Fisher information for $\theta\to 0$ and any even point-spread function Nair and Tsang (2016a). Although image-inversion interferometry has earlier been proposed and demonstrated to combat atmospheric turbulence for astronomy Roddier (1988) and to achieve a modest resolution improvement for general confocal microscopy Wicker and Heintzmann (2007); Wicker et al. (2009); Weigel et al. (2011a, b), its extraordinary precision for sub-Rayleigh resolution was hitherto not recognized.

The setup, depicted by Fig. 6, consists of a two-arm interferometer with spatial inversion in one arm. The inversion can be implemented via mirrors, lenses, or a Dove prism for example. As a result of the inversion and the interference at the second beamsplitter, all the even modes on the image plane are routed to one output port while the odd modes are routed to the other port. Hence, the fundamental mode $\psi(x)$ , as long as it is even, is separated from the odd derivative mode, which is detected at the other port. Tang, Durak, and Ling reported a proof-of-concept demonstration of SLIVER Tang et al. (2016), although their reported errors were not close to the quantum limit. Larson and coworkers recently reported a common-path configuration of the interferometer that may be more stable Larson et al. (2019).
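The even/odd routing can be sketched on a discretized field. The example below is an idealization, assuming lossless 50:50 beamsplitters and a perfect inversion, so that the two output fields are $[\psi(x)\pm\psi(-x)]/2$; the Gaussian wavefunction, the grid, and the function names are hypothetical choices for illustration.

```python
import numpy as np

sigma = 1.0
x = np.linspace(-10.0, 10.0, 2001)   # symmetric grid: reversing the array maps x -> -x
dx = x[1] - x[0]

def psi(u):
    """Gaussian wavefunction; |psi|^2 has standard deviation sigma."""
    return (2 * np.pi * sigma**2) ** -0.25 * np.exp(-u**2 / (4 * sigma**2))

def power(field):
    """Integrated intensity of a sampled field."""
    return float(np.sum(np.abs(field) ** 2) * dx)

def sliver_ports(field):
    """Return (even-port power, odd-port power) for an ideal
    image-inversion interferometer acting on the sampled field."""
    inverted = field[::-1]                # spatial inversion in one arm
    even_port = (field + inverted) / 2    # constructive output
    odd_port = (field - inverted) / 2     # destructive output
    return power(even_port), power(odd_port)

print("fundamental mode:", sliver_ports(psi(x)))
print("derivative mode: ", sliver_ports((x / sigma) * psi(x)))
print("displaced source:", sliver_ports(psi(x - 0.1)))
```

In this idealized model the fundamental mode leaves entirely through the even port and the derivative mode through the odd port, while a source displaced by a small $X$ sends a fraction of about $X^{2}/(4\sigma^{2})$ of its photons to the odd port.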

SLIVER works best for sub-Rayleigh separations but is suboptimal for larger separations. A variant of SLIVER called pix-SLIVER replaces the detectors by detector arrays and can work better for larger separations Nair and Tsang (2016b). Another way to generalize SLIVER is to think of image inversion as a special case of fractional Fourier transform (FRFT). A tree of FRFT interferometers, with the image-inversion interferometer at its root, can sort the Hermite-Gaussian modes and implement SPADE Xue et al. (2001). The interferometer-tree concept can be generalized to sort in any other basis if appropriate mode-dependent phases can be introduced Abouraddy et al. (2012); Martin et al. (2017).

Along this direction, Hassett and coworkers demonstrated a Michelson interferometer with variable FRFT in one arm and used it to infer the Hermite-Gaussian-mode spectrum $g_{q}$ of a shifted Gaussian beam Hassett et al. (2018). They suggested that the setup could be useful for estimating sub-Rayleigh separations, although its statistical performance remains to be studied. In another work, Zhou and coworkers demonstrated a binary radial-mode sorter that is also based on FRFT interferometry and used it to enhance the estimation of the axial separation between two sources Zhou et al. (2019b).

VI.2 SPLICE

Tham, Ferretti, and Steinberg proposed an elegant setup called SPLICE (super-resolved position localization by inversion of coherence along an edge) to capture the derivative mode Tham et al. (2017). SPLICE consists of a phase plate that introduces a $\pi$ phase shift to half of the image plane and a single-mode fiber, as illustrated by Fig. 7. An odd mode on the image plane is thus coupled into the fiber and detected, while all other modes orthogonal to it are rejected by the fiber. Despite the imperfect match between the odd mode and the derivative mode, Tham and coworkers were still able to demonstrate a mean-square error around five times the quantum bound and a significant improvement over direct imaging Tham et al. (2017).
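The mismatch between the SPLICE mode and the derivative mode can be estimated with a simple overlap integral. The sketch below assumes a Gaussian $\psi(x)$ and, as a simplification that may not match the experiment, a fiber mode equal to $\psi(x)$; the mode selected by the phase plate and fiber together is then $\textrm{sgn}(x)\psi(x)$. All variable names are hypothetical.

```python
import numpy as np

sigma = 1.0
x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]

# Gaussian wavefunction; |psi|^2 has standard deviation sigma.
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (4 * sigma**2))

derivative_mode = (x / sigma) * psi   # normalized derivative mode
splice_mode = np.sign(x) * psi        # assumed fiber mode behind a half-plane pi phase plate

# Fraction of derivative-mode photons that couple into the fiber.
coupling = float(np.sum(splice_mode * derivative_mode) * dx) ** 2
print(coupling, 2 / np.pi)
```

Under these assumptions the coupling is $2/\pi\approx 0.64$, so roughly a third of the derivative-mode signal is lost to the mode mismatch alone.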

The use of phase plates is, of course, routine in phase-contrast microscopy Goodman (2004); Lohmann et al. (1998), while the use of a half-plane $\pi$ -phase plate specifically also has a long history in coherent imaging Wolter (1950); Lohmann et al. (1998). The important distinctions here are that we are dealing with incoherent sources, the phase plate is placed at the image plane, and there is a fiber that performs judicious spatial-mode selection.

VI.3 Holograms

A hologram is capable of performing a spatial matched filter, and it can be designed such that the diffracted intensities at specific points in the far field are proportional to the modal spectrum $g_{q}$ Goodman (2004); Forbes et al. (2016). The use of such a hologram for separation estimation was demonstrated by Paúr and coworkers Paúr et al. (2016). Their reported mean-square errors were around twice the quantum bound, but it is important to note that they scaled the quantum bound with respect to the diffracted photon number, not the photon number before the hologram, meaning that the result did not take into account the low diffraction efficiency of their hologram. Efficient SPADE is possible with multiple holograms, however Fabre and Treps (2019).

VI.4 Point-spread-function shaping

In the context of direct imaging, the approximation given by Eq. (11) for $\theta\ll 1$ leads to

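$$\textrm{FI}^{({\rm direct})}(\theta)\approx N\int_{-\infty}^{\infty}dx\,\frac{\left[\frac{\theta}{4}\frac{\partial^{2}|\psi(x)|^{2}}{\partial x^{2}}\right]^{2}}{|\psi(x)|^{2}+\frac{\theta^{2}}{8}\frac{\partial^{2}|\psi(x)|^{2}}{\partial x^{2}}}.\tag{13}$$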

It is often assumed Bettens et al. (1999); Van Aert et al. (2002) that this can be approximated by

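$$\textrm{FI}^{({\rm direct})}(\theta)\approx\frac{N\theta^{2}}{16}\int_{-\infty}^{\infty}dx\,\frac{1}{|\psi(x)|^{2}}\left[\frac{\partial^{2}|\psi(x)|^{2}}{\partial x^{2}}\right]^{2},\tag{14}$$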

which scales quadratically with $\theta$ . This is indeed true if $|\psi(x)|^{2}$ is Gaussian, but it turns out that the integral in Eq. (14) may not converge if $|\psi(x)|^{2}$ has zeros, and one must go back to Eq. (13), which can give a linear scaling of $\textrm{FI}^{({\rm direct})}$ with $\theta$ instead. Paúr and coworkers exploited this phenomenon by introducing a signum phase mask at the pupil plane of a direct-imaging system, changing $\psi(x)$ from a Gaussian to an odd function with a zero in the middle Paúr et al. (2018). Although the resulting Fisher information still approaches zero for $\theta\to 0$ , they were able to demonstrate a significant improvement of the estimation accuracy with a simple change. Further experiments along the same line for spectroscopy have recently been reported Paúr et al. (2019).
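As a worked check of the quadratic scaling, assume that the approximation in Eq. (14) has the form $\textrm{FI}^{({\rm direct})}\approx(N\theta^{2}/16)\int dx\,[\partial^{2}|\psi|^{2}/\partial x^{2}]^{2}/|\psi|^{2}$, which is what results from inserting Eq. (11) into Eq. (10) and dropping the $\theta$-dependent term in the denominator. For a Gaussian $|\psi(x)|^{2}$ with standard deviation $\sigma$ (a symbol introduced here for illustration):

```latex
\begin{align}
|\psi(x)|^{2} &= \frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{x^{2}}{2\sigma^{2}}\right),
&
\frac{\partial^{2}|\psi(x)|^{2}}{\partial x^{2}}
&= \frac{1}{\sigma^{2}}\left(\frac{x^{2}}{\sigma^{2}}-1\right)|\psi(x)|^{2},
\\
\int_{-\infty}^{\infty}dx\,\frac{1}{|\psi(x)|^{2}}
\left[\frac{\partial^{2}|\psi(x)|^{2}}{\partial x^{2}}\right]^{2}
&= \frac{1}{\sigma^{4}}\int_{-\infty}^{\infty}dx\,|\psi(x)|^{2}
\left(\frac{x^{2}}{\sigma^{2}}-1\right)^{2}
= \frac{2}{\sigma^{4}},
&
\textrm{FI}^{({\rm direct})}(\theta) &\approx \frac{N\theta^{2}}{8\sigma^{4}} .
\end{align}
```

If instead $|\psi(x)|^{2}\approx a(x-x_{0})^{2}$ near a zero at $x_{0}$, the second derivative there is approximately $2a$ and the integrand behaves like $4a/(x-x_{0})^{2}$, which is not integrable. This is the divergence that forces a return to Eq. (13).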

VI.5 Heterodyne

Given the experimental difficulties of performing efficient SPADE, a seemingly appealing alternative is to perform heterodyne detection of the derivative mode by interfering the light with a shaped reference beam on a detector, as demonstrated by Yang and coworkers Yang et al. (2016). It was later found, however, that the homodyne or heterodyne Fisher information still suffers from Rayleigh’s curse for weak thermal light Yang et al. (2017). This can be attributed to the constant vacuum noise that plagues a heterodyne or homodyne detection regardless of the signal, compared with the Poisson variance that reduces with the signal for photon counting. A similar problem was discovered earlier in the context of stellar interferometry Townes (2000); Tsang (2011). The surprisingly poor performance of heterodyne detection demonstrates the importance of analyzing a measurement using rigorous quantum optics as well as statistics, even when dealing with classical light, to ensure an acceptable statistical performance.

VI.6 Sum-frequency generation

Donohue and coworkers implemented SPADE in the time or frequency domain for estimating the separation between optical pulses via an interesting nonlinear-optical technique: sum-frequency generation Donohue et al. (2018). If the light is combined with a strong local-oscillator pulse in a second-order nonlinear medium with the right phase matching, the Hamiltonian of the sum-frequency generation is the same as that of linear optics Eckstein et al. (2011), and a temporal or spectral mode projection can be implemented if the local oscillator has the desired mode shape and the up-converted signal is measured. While the efficiency of their measurement was only 0.7%, the principle was clearly demonstrated in their experiment.

VI.7 Two-photon measurement

Last but not least, I should mention an even more radical proposal by Parniak and coworkers, which uses a two-photon measurement to estimate the centroid and the separation of two sources simultaneously near the quantum limit Parniak et al. (2018). Its applicability to usual light sources is questionable, but it demonstrates the fact that our model of linear optics and Poisson statistics does not encompass all the possibilities offered by quantum mechanics, and there exist multiphoton measurements that can offer advantages in multiparameter estimation, at least in principle.

VII Extended sources

VII.1 Estimation of the second moment

While the two-point problem is historic and significant, it has rather limited applications, and the important next step is to apply the concepts developed so far to more general objects. Suppose now that the number of point sources is arbitrary, and the object intensity is given in general by $F(X)$ . Similar to the sub-Rayleigh approximation earlier, here I focus on a subdiffraction regime where the object width around $X=0$ , defined as $\Delta$ , is much smaller than the width of the point-spread function, or $\Delta\ll 1$ . Otherwise, $F(X)$ is assumed to be unknown to the experimenter. Similar to Eq. (9), the photon wavefunction due to each point $X$ within the object can be approximated as

$$\psi(x-X)\approx\psi(x)-X\frac{\partial\psi(x)}{\partial x}.\tag{15}$$

Summing the incoherent contributions from all the points via Eq. (8), the mean photon count in the derivative mode $\phi_{1}(x)\propto\partial\psi(x)/\partial x$ is

$$Ng_{1}\approx Nc_{1}^{2}\int_{-\infty}^{\infty}dX\,X^{2}F(X),\tag{16}$$

where $c_{1}$ is a constant and $\int_{-\infty}^{\infty}dXX^{2}F(X)$ is the second moment of $F(X)$ . Figure 8 illustrates this concept for multiple point sources. Thus we can expect SPADE to enhance the estimation of the second moment for any subdiffraction object in the same way it enhances the two-point resolution. As the second moment can be related to the width of $F(X)$ , it should not be surprising that SPADE can also enhance the estimation of the object size Tsang (2017); Dutton et al. (2019).
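The link between the derivative-mode count and the second moment can be checked for a small cluster of point sources. This is a minimal sketch assuming a Gaussian $\psi(x)$ of width `sigma`, for which the overlap of the normalized derivative mode with $\psi(x-X)$ is approximately $X/(2\sigma)$, so that the mode probability should be close to the second moment divided by $4\sigma^{2}$. The source positions, weights, and function names are made up for illustration.

```python
import numpy as np

sigma = 1.0
x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]

def psi(u):
    """Gaussian wavefunction; |psi|^2 has standard deviation sigma."""
    return (2 * np.pi * sigma**2) ** -0.25 * np.exp(-u**2 / (4 * sigma**2))

phi1 = (x / sigma) * psi(x)   # normalized derivative mode

def derivative_mode_probability(positions, weights):
    """Incoherent sum over point sources of the probability of finding
    a photon in the derivative mode."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()    # normalize the object distribution
    g1 = 0.0
    for X, w in zip(positions, weights):
        amplitude = np.sum(phi1 * psi(x - X)) * dx
        g1 += w * amplitude**2
    return float(g1)

# Hypothetical subdiffraction object: four unequal sources within |X| <= 0.1.
positions = np.array([-0.08, -0.02, 0.03, 0.10])
weights = np.array([1.0, 2.0, 0.5, 1.5])

g1 = derivative_mode_probability(positions, weights)
second_moment = float(np.sum(weights / weights.sum() * positions**2))
print(g1, second_moment / (4 * sigma**2))
```

For an object this small the two printed numbers agree to within a fraction of a percent, and the residual difference comes from the neglected higher-order terms in the expansion of $\psi(x-X)$.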

VII.2 Even moments

To go another step further, let us expand $\psi(x-X)$ up to the $q$ th order. It is more convenient to work in the spatial frequency domain, as defined by

[TABLE]

which leads to

[TABLE]

A natural orthonormal basis that includes the fundamental mode $\psi(x)\to\Psi(k)$ and the derivative mode $-\partial\psi(x)/\partial x\to-ik\Psi(k)$ can be defined as Řeháček et al. (2017)

[TABLE]

where $\{b_{q}(k)\}$ are the orthogonal polynomials obtained by applying the Gram-Schmidt process Debnath and Mikusiński (2005) to monomials $\{1,k,k^{2},\dots\}$ with respect to the weighted inner product Dunkl and Xu (2014)

[TABLE]

leading to $\langle b_{q}(k),b_{p}(k)\rangle=\int_{-\infty}^{\infty}dk\Phi_{q}^{*}(k)\Phi_{p}(k)=\delta_{qp}$ . Appendix D gives a brief review of the Gram-Schmidt process. The basis $\{\phi_{q}(x)\}$ is called the point-spread-function-adapted basis Řeháček et al. (2017), or the PAD basis for short Tsang (2018a). For example, if $|\Psi(k)|^{2}$ is Gaussian, then $\{b_{q}(k)\}$ are the Hermite polynomials. An important property of $b_{q}(k)$ that follows from the Gram-Schmidt process is that $\langle b_{q}(k),k^{p}\rangle=0$ if $p<q$ . The overlap function in Eq. (8) becomes

[TABLE]

where $c_{q}$ is a real constant. In other words, $\Phi_{q}(k)$ is orthogonal to all the terms in Eq. (18) except the last $q$ th-order term (and the neglected higher-order terms). The mean photon count given by Eq. (8) becomes

[TABLE]

Similar to the relation between the derivative mode and the second moment, each PAD mode can access an even moment while rejecting the background noise from all the lower moments Tsang (2018a). Hence, SPADE with respect to the PAD basis can be expected to enhance the estimation of all even moments.
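The construction of $\{b_{q}(k)\}$ can be illustrated with a short numerical sketch. It assumes a zero-mean Gaussian weight $|\Psi(k)|^{2}$ with standard deviation `s` (an illustrative parameter, not a value taken from the text), represents polynomials as coefficient lists, and uses the closed-form Gaussian moments for the inner product. With `s = 1` the output should be the normalized probabilists' Hermite polynomials.

```python
import math

def gaussian_moment(n, s=1.0):
    """n-th moment of a zero-mean Gaussian weight with standard deviation s."""
    if n % 2:
        return 0.0
    return s**n * math.prod(range(1, n, 2))  # (n-1)!! * s^n

def inner(u, v, s=1.0):
    """Weighted inner product of two real polynomials, each given as a
    list of coefficients with the lowest order first."""
    return sum(a * b * gaussian_moment(i + j, s)
               for i, a in enumerate(u) for j, b in enumerate(v))

def gram_schmidt(order, s=1.0):
    """Orthonormalize the monomials 1, k, ..., k^order."""
    basis = []
    for q in range(order + 1):
        p = [0.0] * q + [1.0]            # start from the monomial k^q
        for b in basis:                  # remove components along lower b's
            proj = inner(b, p, s)
            for i, coeff in enumerate(b):
                p[i] -= proj * coeff
        norm = math.sqrt(inner(p, p, s))
        basis.append([c / norm for c in p])
    return basis

b = gram_schmidt(3)
for q, coeffs in enumerate(b):
    print(q, coeffs)

# Property used in the text: b_q is orthogonal to k^p for every p < q.
overlap_b3_k2 = inner(b[3], [0.0, 0.0, 1.0])
```

The last line checks the orthogonality property $\langle b_{q}(k),k^{p}\rangle=0$ for $p<q$ in one instance, which is what lets each PAD mode reject the lower-order terms.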

If $\psi(x)$ is Gaussian, the PAD basis becomes the Hermite-Gaussian basis, and its sensitivity to even moments was noted in Refs. Yang et al. (2016); Tsang (2017). The general PAD basis was proposed in Refs. Řeháček et al. (2017); Kerviche et al. (2017) for the two-point problem and applied to general imaging in Refs. Tsang (2018a); Zhou and Jiang (2019); Tsang (2019b). The use of SPLICE for moment estimation was recently proposed by Bonsma-Fisher and coworkers Bonsma-Fisher et al. (2019).

VII.3 Error analysis

Define the moment parameters as

[TABLE]

where $\mu\in\mathbb{N}$ denotes the moment order. Appendix E introduces the multiparameter-estimation theory in more detail. The mean and variance of the photon count $n_{q}$ in each PAD mode are

[TABLE]

so the estimator $\check{\theta}_{2q}=n_{q}/(Nc_{q}^{2})$ is approximately unbiased, and the mean-square error is Tsang (2017, 2018a, 2019b)

[TABLE]

where the subscript $2q$ denotes the error for the $\theta_{2q}$ parameter, the big-O notation denotes terms on the order of the argument, and $\theta_{\mu}=O(\Delta^{\mu})$ . For direct imaging on the other hand, the Cramér-Rao bound for any moment is Tsang (2017, 2018a, 2019b)

[TABLE]

so SPADE can achieve much lower errors for the even moments in the $\Delta\ll 1$ subdiffraction regime. The exact Cramér-Rao bounds for both SPADE and direct imaging, as well as the unbiased estimators to achieve them, have been derived recently in Ref. Tsang (2019b) via semiparametric methods and are consistent with the approximate results here.

As large as the enhancement seems, the signal-to-noise ratio (SNR), defined as

[TABLE]

offers a more sobering perspective, as the signal $\theta_{\mu}^{2}=O(\Delta^{2\mu})$ is an even smaller number. For SPADE and even moments, the SNR turns out to be equal to the mean photon count in a PAD mode, or

[TABLE]

which decreases for smaller $\Delta$ and higher moments. The degradation of the SNR can be attributed to the inherently low efficiency of a subdiffraction source coupling into a higher-order mode. While this shows that SPADE has its own limitations, the fact remains that direct imaging is even worse, with an SNR given by

[TABLE]

which is $NO(\Delta^{4q})$ for $\mu=2q$ . With enough photons, the enhancements offered by SPADE can still be useful, especially for the lower moments.
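The statement that the SPADE SNR equals the mean photon count in a PAD mode can be tested with a small Monte Carlo sketch. The assumptions are a Gaussian point-spread function of unit width (so that $c_{1}^{2}=1/4$), two equal point sources at $X=\pm 0.1$, and Poisson counts in the $q=1$ mode. The sampler `poisson_sample` and all numerical values are hypothetical choices made for illustration.

```python
import math
import random

def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for the modest mean used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(0)
N = 100_000                      # mean total photon number
c1_sq = 0.25                     # assumption: Gaussian PSF of unit width
X = 0.1                          # two equal sources at +X and -X
theta2 = X**2                    # second moment of the object
Q = X**2 / 4.0
g1 = math.exp(-Q) * Q            # exact fraction of photons in the q = 1 mode

trials = 2000
estimates = [poisson_sample(N * g1, rng) / (N * c1_sq) for _ in range(trials)]

mean_est = sum(estimates) / trials
mse = sum((e - theta2) ** 2 for e in estimates) / trials
predicted_mse = theta2 / (N * c1_sq)         # leading-order SPADE error
snr_predicted = theta2**2 / predicted_mse    # equals N * c1_sq * theta2

print(mean_est, theta2)
print(mse, predicted_mse)
print(snr_predicted, N * g1)
```

In this sketch the estimator is nearly unbiased, the empirical error follows $\theta_{2}/(Nc_{1}^{2})$, and the predicted SNR is close to the mean count $Ng_{1}$ in the mode, all up to the small $e^{-Q}$ correction and sampling noise.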

VII.4 Odd moments

To estimate an odd moment, consider projections into the pair of so-called iPAD modes

[TABLE]

which result from the interference of two adjacent PAD modes Tsang (2018a). It makes intuitive sense that, if each $\phi_{q}$ mode is sensitive to the $2q$ th moment, then a superposition of two adjacent PAD modes should be sensitive to an odd moment in-between. Expanding $\psi(x-X)$ up to the $(q+1)$ th order and following the same steps as Eqs. (21) and (22), the overlap function becomes

[TABLE]

where $|\Psi(k)|^{2}$ is assumed to be even such that $\{b_{q}(k)\}$ are alternately even and odd, leading to $\langle b_{q}(k),k^{q+1}\rangle=0$ . Let the output counts be $n_{q}^{(\pm)}$ . The mean counts are

[TABLE]

Subtracting one count from the other, the mean of the difference is

[TABLE]

so an estimator of the odd moment $\theta_{2q+1}$ can be constructed as $\check{\theta}_{2q+1}=(n_{q}^{(+)}-n_{q}^{(-)})/(2Nc_{q}c_{q+1})$ . The variance of $n_{q}^{(+)}-n_{q}^{(-)}$ is $N(g_{q}^{(+)}+g_{q}^{(-)})\approx N(c_{q}^{2}\theta_{2q}+c_{q+1}^{2}\theta_{2q+2})$ , so the mean-square error becomes Tsang (2017, 2018a)

[TABLE]

and the SNR becomes

[TABLE]

For the first moment ( $q=0$ ), the error is the same as the well known $O(1)/N$ error for point-source localization Farrell (1966); Deschout et al. (2014). For the third and higher moments, however, there is significant enhancement over direct imaging. Note also that $n_{q}^{(+)}+n_{q}^{(-)}$ can give information about the even moments as well.
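The scalings quoted in this subsection can be traced in a few lines from the estimator and the variance stated above. The following is a sketch that neglects the bias, treats the constants $c_{q}$ as $O(1)$, and uses $\theta_{\mu}=O(\Delta^{\mu})$:

```latex
\begin{align}
\mathrm{MSE}_{2q+1}
&\approx \frac{\mathrm{Var}\big(n_{q}^{(+)}-n_{q}^{(-)}\big)}{(2Nc_{q}c_{q+1})^{2}}
\approx \frac{c_{q}^{2}\theta_{2q}+c_{q+1}^{2}\theta_{2q+2}}{4Nc_{q}^{2}c_{q+1}^{2}}
= \frac{O(\Delta^{2q})}{N},
\\
\mathrm{SNR}_{2q+1}
&= \frac{\theta_{2q+1}^{2}}{\mathrm{MSE}_{2q+1}}
= \frac{O(\Delta^{4q+2})}{O(\Delta^{2q})/N}
= N\,O(\Delta^{2q+2}).
\end{align}
```

For $q=0$ the numerator is dominated by $c_{0}^{2}\theta_{0}$, which does not shrink with $\Delta$, so the error stays at $O(1)/N$. For $q\geq 1$ the error falls as $\Delta^{2q}$, which is the source of the enhancement over an $O(1)/N$ direct-imaging error.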

VII.5 Fourier object analysis via moments

The moments can be used in a (generalized) Fourier analysis that may be more familiar to opticians de Villiers and Pike (2016). Suppose that $F(X)$ can be expanded as

[TABLE]

where $G(X)$ is a nonnegative reference density, $\{h_{\mu}(X)=\sum_{\nu=0}^{\mu}H_{\mu\nu}X^{\nu}:\mu\in\mathbb{N}_{0}\}$ are orthogonal polynomials that satisfy

[TABLE]

and $\{\tilde{F}_{\mu}\}$ are generalized Fourier coefficients. Each $h_{\mu}(X)$ has $\mu$ distinct zeros on the support of $G(X)$ Dunkl and Xu (2014), so each $h_{\mu}(X)G(X)$ can be regarded as a wavelet that exhibits localized oscillations. The Fourier coefficients can be expressed as

[TABLE]

In other words, each Fourier coefficient of order $\mu$ can be reconstructed from moments up to order $\mu$ . Thus the number of accurately estimated moments can be regarded as a measure of resolution, and SPADE can help by bringing in more accurate moments and increasing the number of obtainable Fourier coefficients for a subdiffraction object.

With a finite number of moments or Fourier coefficients and no other prior information, the reconstruction of $F(X)$ is ill-posed and requires regularization de Villiers and Pike (2016). Many linear or nonlinear algorithms can be used, depending on the application de Villiers and Pike (2016).
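The map from moments to generalized Fourier coefficients is a triangular linear transform, which a short sketch can make concrete. It assumes that the polynomials are orthonormal with respect to $G(X)$, so that $\tilde{F}_{\mu}=\sum_{\nu\leq\mu}H_{\mu\nu}\theta_{\nu}$, and it takes $G(X)$ to be a Gaussian of standard deviation `s`, for which the $h_{\mu}$ are rescaled Hermite polynomials. The reference width and the example object are hypothetical.

```python
import math

def hermite_rows(s=1.0):
    """Coefficients H[mu][nu] of h_mu(X) = sum_nu H[mu][nu] * X**nu for
    mu = 0, 1, 2, orthonormal with respect to a zero-mean Gaussian
    reference density G with standard deviation s (an assumed choice)."""
    r2 = math.sqrt(2.0)
    return [
        [1.0],
        [0.0, 1.0 / s],
        [-1.0 / r2, 0.0, 1.0 / (r2 * s**2)],
    ]

def fourier_coefficients(H, moments):
    """Coefficient of order mu uses only the moments up to order mu."""
    return [sum(h * moments[nu] for nu, h in enumerate(row)) for row in H]

s = 0.05                              # hypothetical reference width
H = hermite_rows(s)

# Sanity check: if the object equals the reference density, only the
# zeroth coefficient survives. Moments of G are 1, 0, s**2.
coeffs_reference = fourier_coefficients(H, [1.0, 0.0, s**2])

# Hypothetical object: two equal point sources at +a and -a.
a = 0.08
coeffs_two_point = fourier_coefficients(H, [1.0, 0.0, a**2])

print(coeffs_reference)
print(coeffs_two_point)
```

Because each row of `H` stops at order $\mu$, an error in a high moment never contaminates a lower coefficient, so the number of usable coefficients is set by the number of accurately estimated moments.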

VII.6 Quantum limits

Through the Helstrom information, we have learned earlier that SPADE is optimal for estimating the separation of two point sources. References Helstrom (1970); Tsang (2017) show that direct imaging is close to optimal for locating a subdiffraction object with a known shape, while Ref. Tsang (2017) also shows that SPADE is close to optimal for estimating its size. Generalizing such results for arbitrary moments is much more difficult, as there are now an infinite number of parameters and an infinite number of spatial modes. Zhou and Jiang Zhou and Jiang (2019) showed essentially that any measurement should give a Fisher information that scales with $\Delta$ as

[TABLE]

where $\mu_{1}$ is an integer. With the Cramér-Rao bound $\textrm{MSE}_{\mu}\geq 1/\textrm{FI}_{\mu}$ , the SNR should scale as

[TABLE]

where $\mu_{2}$ is another integer. This means that, for a given $\mu$ , the SNR must decrease for smaller $\Delta$ , and the decrease is faster for higher $\mu$ . The best scaling with $\Delta$ is achieved at $\mu_{1}=\mu_{2}=\mu$ , matching the scaling of the SPADE error given by Eq. (26) for the even moments. Zhou and Jiang did not provide a tractable bound on the prefactor of Eq. (40), however, so it remains a question whether SPADE is at all close to the quantum limit in absolute terms, or there may yet be superior measurements.

Using more standard quantum estimation theory, Ref. Tsang (2019a) proves a quantum limit given by

[TABLE]

where $\textrm{HI}^{\prime}$ is an absolute limit that does not depend on the measurement and can be approximated analytically or numerically. The scaling of $1/\textrm{HI}^{\prime}_{\mu}$ with $\Delta$ matches the errors of SPADE given by Eqs. (26) and (35), suggesting that SPADE is close to quantum-optimal for both even and odd moments, but a more quantitative comparison of the quantum limit with the SPADE performance remains to be done. A limit on the SNR is

[TABLE]

For a given subdiffraction object, Ref. Tsang (2019a) also shows that $\theta_{\mu}^{2}\textrm{HI}^{\prime}_{\mu}$ must decay quickly with higher $\mu$ , meaning that higher moments are fundamentally more difficult to estimate.

VIII Other generalizations

VIII.1 Unknown centroid

A crucial assumption in the preceding discussion is that the object is highly concentrated near a known coordinate $X=0$ , and the SPADE device is ideally aligned with $X=0$ . To put it the other way, $\Delta$ should be regarded as the object width plus any misalignment of SPADE with the object centroid, and misalignment can reduce the enhancement by increasing the effective $\Delta$ . As direct imaging can locate the centroid accurately, the misalignment can be minimized if the object of interest has been imaged before and its centroid is already known accurately, as is often the case in astronomy. Otherwise, some overhead photons should be used to locate the centroid first. Grace and coworkers found that, despite the overhead, SPADE can still offer significant enhancements of the two-point resolution over direct imaging with the same total photon number Grace et al. (2019).

In principle, it turns out to be possible to estimate the centroid and the separation simultaneously at the quantum limit if a multiphoton measurement is performed, as demonstrated by Parniak and coworkers Parniak et al. (2018); Chrostowski et al. (2017), but the applicability of their measurement to usual light sources is questionable.

VIII.2 Strong thermal light

While the model of weak thermal light and Poisson statistics works well for astronomical or fluorescent sources at optical frequencies, thermal sources at lower frequencies or scattered laser sources can exhibit super-Poisson statistics Mandel and Wolf (1995). Nair computed the Helstrom information for separation estimation with the exact thermal state and also proposed variations of SPADE and SLIVER to approach it Nair and Tsang (2016b). Lupo and Pirandola computed the quantum limit for the same problem but assumed arbitrary quantum states, including the thermal state as a special case Lupo and Pirandola (2016). Yang and coworkers studied the use of mode homodyne or heterodyne detection for the two-point problem and found that, although it is not competitive for weak thermal light, it can offer an enhancement over direct imaging for strong thermal light Yang et al. (2017).

For radio and microwave frequencies, photon shot noise is negligible at typical temperatures, and heterodyne detection in any spatial-mode basis is quantum-optimal in the low-frequency limit (Tsang, 2019a, Appendix A2). As amplitude measurements via antennas are already the standard detection method there and they are usually contaminated with substantial excess noise, the ideas here are not relevant to those frequencies unfortunately.

VIII.3 Two point sources with unequal brightnesses

Řeháček and coworkers studied the quantum limits and the optimal measurements for two point sources with unequal brightnesses Řeháček et al. (2017, 2018). They found that, while significant enhancements over direct imaging remain possible, the performance gets worse for unequal sources. In hindsight, this is perhaps not surprising, as moments up to the third are needed to fully parametrize unequal sources and the SNR for the third moment is fundamentally poorer. The use of SPLICE for this case was also studied by Bonsma-Fisher and coworkers Bonsma-Fisher et al. (2019), while the three-dimensional case was recently studied by Prasad Prasad (2019).

VIII.4 More than two point sources

Bisketzi and coworkers Bisketzi et al. (2019) and Lupo, Huang, and Kok Lupo et al. (2019) recently proposed methods to compute the quantum limit to the localization of more than two point sources. Bisketzi and coworkers found numerically that, regardless of the number of sources, the Helstrom information matrix retains only two nonzero eigenvalues as the source separations approach zero. This result is complementary to—and consistent with—existing results on moment estimation Tsang (2019a); Zhou and Jiang (2019), demonstrating the harsh quantum limits to imaging beyond centroid and size estimation. As the location parameters they considered are related nonlinearly to the moment parameters, the Helstrom information matrix transforms in a nontrivial way Hayashi (2017), and a more quantitative comparison of Ref. Bisketzi et al. (2019) with Refs. Tsang (2019a); Zhou and Jiang (2019) will require further effort.

Lupo and coworkers also studied the achievability of the general quantum limit via interferometers Lupo et al. (2019). More work remains to be done to ascertain whether their proposed interferometer design can be implemented without knowing the unknown parameters.

VIII.5 Excess detector noise

If the detectors are contaminated with excess noise besides photon shot noise, the estimation performance necessarily suffers. Len and coworkers studied the Fisher information of SPADE in the presence of such noise Len et al. (2020), while Lupo studied the quantum limits Lupo (2020). A fair comparison of these results with noisy direct imaging remains to be done, however. Considering that the ideal model of direct imaging assumes an infinitesimal pixel size, an infinite number of pixels, no excess noise, and perfect calibration of all pixels, imperfections in real life may well be even more detrimental to direct imaging.

VIII.6 Partially coherent sources

Larson and Saleh studied the separation estimation problem for two partially coherent sources and suggested that Rayleigh’s curse would recur Larson and Saleh (2018, 2019). Their work has been challenged by Refs. Tsang and Nair (2019); Lee and Ashok (2019); Wadood et al. (2019), however. Reference Tsang and Nair (2019) points out a few problems with Larson and Saleh’s analysis, such as the use of a formula for the Helstrom information that becomes questionable for partially coherent sources. References Tsang and Nair (2019); Lee and Ashok (2019); Wadood et al. (2019) also show that SPADE can overcome the curse as long as the sources are not highly correlated, contrary to Larson and Saleh’s claim. Another interesting work on this topic was done by Hradil and coworkers Hradil et al. (2019), who also used the questionable formula; see Appendix C for details. In any case, the debate is irrelevant to observational astronomy and fluorescence microscopy, where there is no sound reason to doubt the established model of spatially incoherent sources Goodman (1985); Pawley (2006).

VIII.7 Two-dimensional imaging

Although I have so far focused on imaging in one dimension for pedagogy, the same principles carry over to two dimensions. For two point sources, there are now two parameters for their vectoral separation. The quantum limits for the two parameters are the same as that for the one-dimensional case, and SPADE with respect to the transverse-electromagnetic (TEM) modes or a pair of SLIVER devices can still estimate the vectoral separation near the quantum limit Ang et al. (2017). For extended sources in two dimensions, a generalization of the PAD and iPAD modes has been studied in Refs. Tsang (2017, 2018a); Zhou and Jiang (2019), and quantum limits have been studied in Refs. Tsang (2017); Zhou and Jiang (2019).

VIII.8 Three-dimensional imaging

Reference Tsang (2015) studies quantum limits to the three-dimensional localization of one point source as well as two coherent sources using the full vectoral electromagnetics model (the discussion of incoherent sources there is flawed and superseded by Ref. Tsang et al. (2016a)). In the context of the paraxial model on the other hand, the axial dimension requires special treatment Goodman (2004). For the axial localization of one point source, Řeháček and coworkers demonstrated that direct imaging with a judicious defocus, a common technique in localization microscopy von Diezmann et al. (2017); Zhou et al. (2019a), can attain the quantum limit Řeháček et al. (2019). Backlund, Shechtman, and Walsworth computed the quantum limit to the three-dimensional localization of a point source using a scalar wave model and proposed special interferometers to achieve it Backlund et al. (2018). Yu and Prasad Yu and Prasad (2018); Prasad and Yu (2019); Prasad (2019) and Napoli and coworkers Napoli et al. (2019) studied the same problem but for two incoherent sources. Zhou and coworkers recently demonstrated a FRFT interferometer to enhance the estimation of the axial separation between two sources Zhou et al. (2019b).

VIII.9 Spectroscopy

Donohue and coworkers demonstrated mode-selective measurements to enhance time and frequency estimation for incoherent optical pulses Donohue et al. (2018). On a more mathematical level, the quantum model of a photon from incoherent sources coincides with that of a quantum probe subject to random displacements, as pointed out by Ref. Tsang (2019a), so noise spectroscopy with optomechanics or spin ensembles is another potential application of the theory Ng et al. (2016); Gefen et al. (2019).

VIII.10 Biased estimators

The simplest form of the Cramér-Rao bound is applicable to unbiased estimators only, and it turns out that biased estimators may violate it significantly Lehmann and Casella (1998). For example, the Cramér-Rao bound for separation estimation with direct imaging blows up to infinity as $\theta\to 0$ , but the maximum-likelihood estimator, being biased for this problem, can still achieve a finite error for all $\theta$ Huang et al. (2011); Tham et al. (2017); Tang et al. (2016). For SPADE, the maximum-likelihood estimator can also violate the Cramér-Rao bound and give a vanishing error as $\theta\to 0$ Tsang et al. (2016a). Given these violations, one may wonder if the Cramér-Rao bound is meaningful outside the theoretical construct of asymptotic statistics Lehmann and Casella (1998) after all. The loophole can be fixed by using a Bayesian version of the Cramér-Rao bound Van Trees (2001) that is valid for any biased or unbiased estimator. Reference Tsang (2018b) shows that, from the Bayesian and minimax perspectives, there remains a significant performance gap between direct imaging and SPADE for separation estimation, even if biased estimators are permitted.

VIII.11 One-versus-two hypothesis testing

Another way of defining the two-point resolution is to consider the error probabilities of deciding whether there is one point source or two point sources with the same total brightness. As mentioned in Sec. II, Helstrom performed a pioneering study of this problem using his quantum detection theory Helstrom (1973), but his proposed measurement depends on the separation in the two-source hypothesis, he did not suggest any experimental setup to realize it, and he did not show how much improvement it could offer. In the context of direct imaging, the problem was also studied in Refs. Harris (1964); Acuna and Horowitz (1997); Shahram and Milanfar (2004, 2006).

Coming full circle, Lu and coworkers recently showed that the quantum limit to the hypothesis-testing problem is indeed a substantial improvement over direct imaging, and both SPADE and SLIVER can reach the quantum limit in the sub-Rayleigh regime, without knowing the separation in advance Lu et al. (2018).

IX Comparison with other imaging techniques

In the wider context of imaging research, SPADE is but one of the countless superresolution proposals in the literature. It nonetheless possesses many unique advantages and avoids some common pitfalls of prior ideas, thanks to its firm footing in quantum optics and statistics. Its advantages over direct imaging and computational techniques have already been emphasized in previous sections, and here I highlight some other important or popular ideas in imaging and how SPADE compares.

IX.1 Stellar interferometry

SPADE perhaps bears the most resemblance to stellar interferometry Goodman (1985); Labeyrie et al. (2006); Roddier (1988), as they are both examples of applying coherent optical processing to incoherent imaging. In particular, SLIVER resembles the folding and rotation-shearing interferometers in optical astronomy, the only difference being that the former is placed at the image plane and the latter usually at the pupil plane Roddier (1988). Conventional wisdom suggests, however, that the main advantage of stellar interferometry lies in its robustness against atmospheric turbulence Goodman (1985); Labeyrie et al. (2006); Roddier (1988). To quote Goodman Goodman (1985): “The reader may well wonder why the Fizeau stellar interferometer, which uses only a portion of the telescope aperture, is in any way preferred to the full telescope aperture in this task of measuring the angular diameter of a distant object. The answer lies in the effects of the random spatial and temporal fluctuations of the earth’s atmosphere (‘atmospheric seeing’)… It is easier to detect the vanishing of the contrast of a fringe in the presence of atmospheric fluctuations than it is to determine the diameter of an object from its highly blurred image.” Zmuidzinas Zmuidzinas (2003) also suggests that “the imperfect beam patterns of sparse-aperture interferometers extract a sensitivity penalty as compared with filled-aperture telescopes, even after accounting for the differences in collecting areas.” No work before ours recognized that interferometry can outperform direct imaging on statistical terms for diffraction-limited, filled-aperture telescopes.

Another use of stellar interferometry is to increase the baseline by coherently combining light from multiple apertures Labeyrie et al. (2006). Our theory can also be applied to this multi-aperture scenario if we take the optical transfer function $\Psi(k)$ defined by Eq. (17) to be the total aperture function for all the apertures. While conventional interferometer designs call for the interference of light from pairs of apertures Labeyrie et al. (2006) or the mimicking of image-formation optics Labeyrie et al. (2006); Zmuidzinas (2003), our theory offers the novel insight that demultiplexing the light in terms of the PAD or iPAD modes associated with $\Psi(k)$ can bring substantial advantages. This perspective generalizes the recent studies on the quantum optimality of stellar interferometry Tsang (2011); Pearce et al. (2017); Howard et al. (2019); Lupo et al. (2019).

Another idea that sounds similar to SLIVER is nulling interferometry Labeyrie et al. (2006), which was proposed for the specific purpose of exoplanet detection. The idea there is to remove the light from a bright star via destructive interference while leaving the light from a nearby planet intact, but its fundamental statistical performance in the subdiffraction regime has not been studied to our knowledge. It remains an open question whether nulling interferometry or similar ideas in astronomy perform similarly to SLIVER or SPADE, and how the quantum-inspired techniques and the quantum limits may benefit important astronomical applications in practice, such as exoplanet detection.

IX.2 Multiphoton coincidence

While modern stellar interferometers all rely on amplitude interference Labeyrie et al. (2006), also called $g^{(1)}$ measurements in quantum optics, the intensity interferometer by Hanbury Brown and Twiss—a $g^{(2)}$ measurement—deserves a mention as well, considering that it inspired the foundation of quantum optics Mandel and Wolf (1995) and is still being held in high regard by quantum opticians. In astronomy, however, the intensity interferometer has in fact been obsolete for decades because of its poor SNR Goodman (1985); Labeyrie et al. (2006). It relies on the postselection of two-photon-coincidence events, which are much rarer than the one-photon events used in amplitude interferometry and therefore must give much less information in principle. For example, Davis and Tango reported an amplitude interferometer that obtained similar results to those from the intensity interferometer, using only $\sim$ 2% of the observation time Davis and Tango (1986). For microscopy, the use of multiphoton coincidence has recently been demonstrated in some heroic experiments Genovese (2016); Schneider et al. (2018); Tenne et al. (2019); Berchera and Degiovanni (2019), but again its statistical performance needs to be studied more carefully. SPADE, on the other hand, is a $g^{(1)}$ measurement that relies on the much more abundant one-photon events without the need for coincidence detection and its statistical performance has been proved rigorously.

IX.3 Electron microscopy and near-field microscopy

If the object is on a surface and accessible, then no technique can compete with electron microscopy, atomic force microscopy, and scanning-tunneling microscopy in terms of resolution. Those techniques impose stringent requirements on the sample, however, and that is why optical microscopy remains useful, especially for biological imaging, as it is able to image biological samples in a more natural environment and provide protein-specific contrast via fluorophore tagging.

In terms of optics, near-field techniques have not been successful because of the short depth of focus and other technical challenges Betzig (2015). In recent years, the use of plasmonics and metamaterials to enhance the near field Pendry (2004) has also attracted immense interest in academia, but the requirement of close proximity to the object and the impact of loss remain showstoppers in practice Khurgin (2015).

Being a far-field technique, SPADE is more compatible with biological imaging, not to mention its unique capability for astronomy and remote sensing. Unlike metamaterials, SPADE requires only low-loss optical components and there is no stringent requirement on their feature size, so fabrication is more straightforward.

Given the theoretical similarity between optical imaging and electron microscopy Bettens et al. (1999); Van Aert et al. (2002), the application of SPADE to the latter is possible in principle and indeed tantalizing, but more research concerning its implementation for electrons needs to be done.

IX.4 Superresolution fluorescence microscopy

Far-field superresolution techniques such as PALM and STED have been hugely successful in biological fluorescence microscopy Hell (2015); Moerner (2015); Betzig (2015), but many of them rely on sophisticated control of the source emission, which introduces many other problems, such as the need for special fluorophores, slow speed in the case of PALM, and phototoxicity in the case of STED. SPADE, on the other hand, is a passive far-field measurement that can complement or supersede the superresolution techniques by extracting more information from the light or alleviating the need for source control. The combination of SPADE with microscope configurations, such as confocal and structured illumination Pawley (2006), awaits further research.

IX.5 Nonclassical light

The application of nonclassical light to sensing and imaging has been an active research topic in quantum optics for many decades Kolobov (1999); Dowling (2008); Demkowicz-Dobrzański et al. (2015); Taylor and Bowen (2016); Pirandola et al. (2018); Moreau et al. (2019); Fabre and Treps (2019). It is now well known, however, that nonclassical light is extremely fragile against loss and decoherence Demkowicz-Dobrzański et al. (2015); Taylor and Bowen (2016), and any theoretical advantage can be easily lost in practice, not to mention that the efficient generation and detection of nonclassical light remain very challenging. More recent proposals, such as quantum illumination and quantum reading Pirandola et al. (2018), apply to high-noise scenarios, but the achievable improvement turns out to be quite modest even in theory Tan et al. (2008).

As SPADE works with classical light, linear optics, and photon counting, loss and other imperfections are not nearly as detrimental. If we are to believe that the second quantum revolution is near and applications using nonclassical resources will soon be widespread Dowling and Milburn (2003), then SPADE should be an even surer bet.

For astronomy, obviously the light sources cannot be controlled, but the use of entangled photons and quantum repeaters has been proposed to teleport photons in stellar interferometry and increase its baseline Gottesman et al. (2012); Khabiboulline et al. (2019). Unfortunately, quantum repeaters are nowhere near practical yet, and conventional linear optical devices remain the best option in the foreseeable future.

IX.6 Superoscillation, amplification, postselection

There are so many other superresolution ideas that going through them all would not be feasible. I list here only a few more: superoscillation Rogers and Zheludev (2013), amplification Kellerer and Ribak (2016), and postselection Rafsanjani et al. (2017). They either require steep trade-offs with the SNR or have questionable statistics Prasad (1994); Lantz (2017). These examples once again demonstrate the importance of a rigorous analysis using quantum optics and statistics. It is important to keep in mind that superresolution is possible even with direct imaging and data processing, and it is ultimately limited by the SNR de Villiers and Pike (2016). A superresolution technique is viable only if it can beat direct imaging on statistical terms.

X Conclusion

Just as the design of engines must go beyond mechanics and consult thermodynamics, the design of optical sensing and imaging systems must go beyond electromagnetics and consult statistics. With the increasingly dominant role of photon shot noise in modern applications, quantum mechanics is also relevant. Quantum information theory can tackle all these subjects in one unified formalism, setting limits to what we can do, and also telling us how much further we can go. For incoherent imaging, it gives us the pleasant surprise that there is still plenty of room for improvement, and we just need to find a way to achieve it. We found one in the form of SPADE, which requires only low-loss linear optics and photon counting. While we started with the simple model of two point sources, we have since generalized the theory to deal with any subdiffraction object, showing that substantial improvements remain possible. The theoretical groundwork has been laid, proof-of-principle experiments have been done, and applications in astronomy and fluorescence microscopy can now be envisioned. Special-purpose applications that require only the low-order moments, such as two-point resolution and object-size estimation, should be the first to benefit, while more general imaging protocols will require further research.

Many open problems still remain. On the theoretical side, the exact quantum limits to general imaging and the optimal measurements to achieve them remain unclear. The theory for three-dimensional imaging and spectroscopy remains underdeveloped. On the practical side, an efficient implementation of SPADE at the right wavelengths is needed for applications. The performance of SPADE in the presence of atmospheric turbulence and other technical noises also needs to be assessed. Fortunately, adaptive optics Esposito et al. (2011), photodetectors Michalet et al. (2013), and photonics in general have become so good in recent years that we can be optimistic about reaching the quantum limits in the near future.

Acknowledgments

I am grateful for the seminal contributions of the authors of Refs. Tsang et al. (2016a); Tsang (2017, 2018a); Dutton et al. (2019); Tsang (2019a); Zhou and Jiang (2019); Tsang (2019b); Nair and Tsang (2016a); Tsang et al. (2016b); Nair and Tsang (2016b); Lupo and Pirandola (2016); Tsang (2018b); Ang et al. (2017); Lu et al. (2018); Řeháček et al. (2017); Yang et al. (2017); Kerviche et al. (2017); Chrostowski et al. (2017); Řeháček et al. (2017, 2018); Backlund et al. (2018); Napoli et al. (2019); Yu and Prasad (2018); Prasad and Yu (2019); Prasad (2019); Larson and Saleh (2018); Tsang and Nair (2019); Larson and Saleh (2019); Bonsma-Fisher et al. (2019); Grace et al. (2019); Bisketzi et al. (2019); Lupo et al. (2019); Lee and Ashok (2019); Gefen et al. (2019); Hradil et al. (2019); Len et al. (2020); Lupo (2020); Tang et al. (2016); Tham et al. (2017); Paúr et al. (2016); Yang et al. (2016); Donohue et al. (2018); Parniak et al. (2018); Paúr et al. (2018); Hassett et al. (2018); Zhou et al. (2019b); Paúr et al. (2019); Wadood et al. (2019); Řeháček et al. (2019), especially the crucial roles of Ranjith Nair, Xiao-Ming Lu, and Shan Zheng Ang in our early papers. I also acknowledge useful discussions with Luis Sánchez-Soto, Jaroslav Řeháček, Zdeněk Hradil, Saikat Guha, and Cosmo Lupo in the course of writing this manuscript. This work is supported by the Singapore National Research Foundation under Project No. QEP-P7.

Appendix A Cramér-Rao bound and Fisher information

Let $\{P_{Y}(y|\theta)>0:y\in\Omega,\theta\in\Theta\subseteq\mathbb{R}\}$ be a family of probability distributions for an observed random variable $Y$ , where $\theta$ is an unknown scalar parameter and the support $\Omega$ is assumed to be countable and common to all distributions for simplicity. Let $\check{\theta}(Y)$ be an estimator of $\theta$ . Define the mean-square error as

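$$\textrm{MSE}(\theta)\equiv\mathbb{E}\left\{\left[\check{\theta}(Y)-\theta\right]^{2}\right\}=\sum_{y\in\Omega}P_{Y}(y|\theta)\left[\check{\theta}(y)-\theta\right]^{2},$$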

where $\mathbb{E}$ denotes the expectation. The unbiased condition is

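$$\mathbb{E}\left[\check{\theta}(Y)\right]=\sum_{y\in\Omega}P_{Y}(y|\theta)\check{\theta}(y)=\theta\quad\textrm{for all }\theta\in\Theta.$$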

Under certain regularity conditions on the distributions, the Cramér-Rao bound given by Eq. (1) holds for any unbiased estimator, where the Fisher information is Lehmann and Casella (1998)

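$$\textrm{FI}(\theta)\equiv\sum_{y\in\Omega}\frac{1}{P_{Y}(y|\theta)}\left[\frac{\partial P_{Y}(y|\theta)}{\partial\theta}\right]^{2}.$$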

Generalization for probability densities is straightforward Lehmann and Casella (1998).
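As a numerical illustration of the Fisher information and the Cramér-Rao bound on a countable support, the following Python sketch evaluates the Fisher information of a binomial family by finite differences and compares it with the mean-square error of the sample-mean estimator. The binomial family, the function names, and the parameter values are assumptions chosen for illustration; they do not appear in the paper.

```python
import math


def binomial_pmf(y, n, theta):
    """P_Y(y | theta) for a binomial family with n trials (illustrative choice)."""
    return math.comb(n, y) * theta**y * (1.0 - theta) ** (n - y)


def fisher_information(pmf, support, theta, step=1e-5):
    """FI(theta) = sum_y (dP/dtheta)^2 / P, using a central finite difference."""
    total = 0.0
    for y in support:
        p = pmf(y, theta)
        dp = (pmf(y, theta + step) - pmf(y, theta - step)) / (2.0 * step)
        total += dp * dp / p
    return total


n_trials = 20          # hypothetical number of trials
theta = 0.3            # hypothetical true parameter
support = range(n_trials + 1)

fi_numeric = fisher_information(
    lambda y, t: binomial_pmf(y, n_trials, t), support, theta
)
fi_exact = n_trials / (theta * (1.0 - theta))  # known closed form for this family

# The estimator Y/n is unbiased; its mean-square error equals 1/FI here,
# so it saturates the Cramér-Rao bound for this particular family.
mse_sample_mean = sum(
    binomial_pmf(y, n_trials, theta) * (y / n_trials - theta) ** 2 for y in support
)

print(fi_numeric, fi_exact)
print(mse_sample_mean, 1.0 / fi_exact)
```

The binomial case is convenient because the bound is attained exactly; for most families an unbiased estimator only satisfies the inequality.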

Appendix B Helstrom information

Let $\{\rho(\theta):\theta\in\Theta\subseteq\mathbb{R}\}$ be a family of density operators for a quantum object. Under a quantum measurement, the generalized Born’s rule is given by

$$P_{Y}(y|\theta)=\operatorname{tr}E_{Y}(y)\rho(\theta),$$

where $\operatorname{tr}$ denotes the operator trace and $E_{Y}(y)$ is called the positive operator-valued measure (POVM), which models the measurement statistics Hayashi (2017). Define the Helstrom information as Helstrom (1976)

$$\textrm{HI}(\theta)\equiv\operatorname{tr}\rho L^{2},\tag{48}$$

where $L$ is a solution to

$$\frac{\partial\rho}{\partial\theta}=\frac{1}{2}\left(\rho L+L\rho\right).\tag{49}$$

For any POVM, Helstrom proved $\textrm{MSE}\geq\textrm{HI}^{-1}$ Helstrom (1976), while Nagaoka Nagaoka (1989) and Braunstein and Caves Braunstein and Caves (1994) proved

$$\textrm{FI}(\theta)\leq\textrm{HI}(\theta).\tag{50}$$

Although they also proved that $\max_{E_{Y}}\textrm{FI}(\theta)=\textrm{HI}(\theta)$ and that a projective measurement in the eigenbasis of $L$ gives an optimal POVM, it is important to keep in mind that $L$ is a function of $\theta$ , and the optimal POVM derived from it at one value of $\theta$ may be suboptimal at other values. In practice, $\theta$ is of course unknown, and there is no guarantee that one can find a POVM that is optimal across a range of $\theta$ . A solution, proposed by Nagaoka and refined by Hayashi and Matsumoto Hayashi (2005) and Fujiwara Fujiwara (2006), is to consider repeated adaptive measurements; they showed that the total Fisher information of such measurements can approach the Helstrom information in the limit of infinitely many measurements under certain technical conditions.
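The relation between the Fisher and Helstrom informations can be checked numerically on a small example. The sketch below is a minimal illustration that assumes a one-qubit family whose Bloch vector has length $r$ and is rotated by $\theta$ ; this family, the value $r=0.8$ , and all function names are hypothetical and not taken from the paper. It solves for $L$ in the eigenbasis of $\rho$ , evaluates $\operatorname{tr}\rho L^{2}$ , and compares the Fisher information of two projective measurements.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rho(theta, r=0.8):
    """Illustrative qubit family: Bloch vector of length r rotated by theta."""
    return 0.5 * (np.eye(2) + r * (np.cos(theta) * SIGMA_Z + np.sin(theta) * SIGMA_X))


def drho(theta, r=0.8):
    """Analytic derivative of rho with respect to theta."""
    return 0.5 * r * (-np.sin(theta) * SIGMA_Z + np.cos(theta) * SIGMA_X)


def sld(rho_mat, drho_mat):
    """Solve drho = (rho L + L rho)/2 in the eigenbasis of rho (full-rank rho assumed)."""
    evals, evecs = np.linalg.eigh(rho_mat)
    d = evecs.conj().T @ drho_mat @ evecs
    l_eig = 2.0 * d / (evals[:, None] + evals[None, :])
    return evecs @ l_eig @ evecs.conj().T


def helstrom_information(rho_mat, l_mat):
    """HI = tr(rho L^2)."""
    return float(np.real(np.trace(rho_mat @ l_mat @ l_mat)))


def fisher_information_projective(basis, rho_mat, drho_mat):
    """FI for a projective measurement whose outcomes are the columns of `basis`."""
    fi = 0.0
    for k in range(basis.shape[1]):
        v = basis[:, k]
        p = np.real(v.conj() @ rho_mat @ v)
        dp = np.real(v.conj() @ drho_mat @ v)
        fi += dp**2 / p
    return float(fi)


theta = 0.4
rho_mat, drho_mat = rho(theta), drho(theta)
l_mat = sld(rho_mat, drho_mat)

hi_value = helstrom_information(rho_mat, l_mat)
_, l_basis = np.linalg.eigh(l_mat)
fi_optimal = fisher_information_projective(l_basis, rho_mat, drho_mat)
fi_fixed = fisher_information_projective(np.eye(2, dtype=complex), rho_mat, drho_mat)

print(hi_value, fi_optimal, fi_fixed)
```

For this family the eigenbasis of $L$ rotates with $\theta$ , so the fixed $\sigma_{z}$ measurement falls short of the Helstrom information, which illustrates why an optimal POVM at one parameter value need not be optimal elsewhere.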

Appendix C Thermal state in the ultraviolet limit

Consider thermal light in one temporal mode and multiple spatial modes, and let $\{a_{0},a_{1},\dots\}$ be the annihilation operators for the spatial modes. As first proposed by Glauber Glauber (2006), the thermal state is Helstrom (1976)

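$$\sigma=\int d^{2}\alpha\,\Phi(\alpha)\ket{\alpha}\bra{\alpha},\quad\Phi(\alpha)=\frac{1}{\det(\pi\Gamma)}\exp\left(-\alpha^{\dagger}\Gamma^{-1}\alpha\right),$$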

where $\alpha=(\alpha_{0},\alpha_{1},\dots)^{\top}$ is a column vector of zero-mean complex Gaussian random variables with probability density $\Phi$ , $\top$ denotes the transpose, $\dagger$ denotes the conjugate transpose, $\ket{\alpha}$ is a multimode coherent state that obeys $a_{q}\ket{\alpha}=\alpha_{q}\ket{\alpha}$ , and $\Gamma$ is the mutual coherence matrix Mandel and Wolf (1995). In particular, the first moments of $\alpha$ are given by

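$$\mathbb{E}(\alpha)=0,\quad\mathbb{E}(\alpha\alpha^{\top})=0,\quad\mathbb{E}(\alpha\alpha^{\dagger})=\Gamma.$$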

The photon-counting distribution is

[TABLE]

where $\ket{n}$ is a Fock state and $\ket{{\rm vac}}$ is the vacuum state. Equation (54) agrees with the semiclassical theory by Mandel Mandel and Wolf (1995). With $M$ temporal modes, the density operator can be modeled as $M$ copies of $\sigma$ , or

$$\rho=\sigma^{\otimes M}.$$

To simplify the thermal state for optical frequencies, let

$$\epsilon\equiv\operatorname{tr}\Gamma$$

be the average photon number per temporal mode and

$$g\equiv\frac{\Gamma}{\operatorname{tr}\Gamma}$$

be the normalized mutual coherence matrix. Define the ultraviolet limit as $\epsilon\to 0$ while holding $N=M\epsilon$ constant. The zero-photon probability per temporal mode is

$$\bra{{\rm vac}}\sigma\ket{{\rm vac}}=\frac{1}{\det(I+\Gamma)}=1-\epsilon+O(\epsilon^{2}),$$

the one-photon probability is

[TABLE]

where the diagonal entries of a matrix are abbreviated as $g_{qq}=g_{q}$ , and the probability of two or more photons is $O(\epsilon^{2})$ . The photon counts summed over $M$ temporal modes hence become Poisson in the ultraviolet limit Goodman (1985). A simplified quantum model in this limit is Tsang (2011); Tsang et al. (2016a)

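$$\rho=\sigma^{\otimes M},\quad\sigma=(1-\epsilon)\ket{{\rm vac}}\bra{{\rm vac}}+\epsilon\rho_{1}+O(\epsilon^{2}),$$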

where the one-photon density operator is

[TABLE]

For paraxial incoherent imaging in particular Tsang (2017),

[TABLE]

where $\hat{k}$ is the spatial-frequency or momentum operator, $\ket{\psi}$ is the one-photon state with spatial wavefunction $\braket{x}{\psi}=\psi(x)$ , and $\ket{x}$ is the one-photon position eigenket that obeys $\braket{x}{x^{\prime}}=\delta(x-x^{\prime})$ . $f(x)=\bra{x}\rho_{1}\ket{x}$ gives Eq. (7), while $g_{q}=\bra{\phi_{q}}\rho_{1}\ket{\phi_{q}}$ gives Eq. (8). If $f$ and $g$ depend on $\theta$ (but $\epsilon$ does not), the Fisher information for the Poisson processes is given by Eqs. (10) and (12).

The ultraviolet limit and the neglect of $O(\epsilon^{2})$ terms mean that multiphoton coincidence events and bunching effects are ignored Goodman (1985). Besides thermal sources, the model here also applies to any incoherent sources, such as fluorescent sources Pawley (2006) or even electrons Bettens et al. (1999); Van Aert et al. (2002), as long as they obey an incoherent-imaging model with Poisson counting statistics.
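The approach to Poisson statistics in the ultraviolet limit can be illustrated numerically. The sketch below makes a simplifying assumption that is not in the text: it considers a single spatial mode, so each temporal mode has a Bose-Einstein count with mean $\epsilon$ and the total over $M$ modes is negative-binomial. It then measures the total-variation distance to a Poisson distribution with mean $N=M\epsilon$ as $\epsilon\to 0$ at fixed $N$ . The function names and the value $N=5$ are hypothetical.

```python
import math


def thermal_count_pmf(n, num_modes, eps):
    """Total count over num_modes temporal modes, each Bose-Einstein with mean eps.

    Single spatial mode assumed, so the sum is negative-binomial.
    """
    log_p = (
        math.lgamma(n + num_modes) - math.lgamma(num_modes) - math.lgamma(n + 1)
        + n * math.log(eps / (1.0 + eps))
        - num_modes * math.log1p(eps)
    )
    return math.exp(log_p)


def poisson_pmf(n, mean):
    """Poisson probability of n counts with the given mean."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))


def total_variation(num_modes, eps, n_max=200):
    """Total-variation distance between the two count distributions (truncated at n_max)."""
    mean = num_modes * eps
    return 0.5 * sum(
        abs(thermal_count_pmf(n, num_modes, eps) - poisson_pmf(n, mean))
        for n in range(n_max + 1)
    )


N = 5.0  # fixed average total photon number (illustrative value)
distances = []
for M in (10, 100, 1000, 10000):
    eps = N / M  # average photon number per temporal mode
    distances.append(total_variation(M, eps))

print(distances)
```

The distance shrinks roughly in proportion to $\epsilon$ , consistent with bunching being an $O(\epsilon^{2})$ effect per temporal mode.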

For the thermal state given by Eqs. (51) and (52), Helstrom showed that Helstrom (1976)

[TABLE]

where $\Upsilon$ is a solution to

[TABLE]

and $I$ is the identity matrix. Reference (Tsang, 2019a, Appendix A) shows that the information given by Eqs. (65) and (66) on a per-photon basis is upper-bounded by its ultraviolet limit, which coincides with the information computed for the one-photon density operator $\rho_{1}$ given by Eq. (63) if $\epsilon$ does not depend on $\theta$ , viz.,

[TABLE]

With $M$ temporal modes, the Helstrom bound is multiplied by $M$ Hayashi (2017), so $\textrm{HI}^{(\rho)}=M\textrm{HI}^{(\sigma)}$ , and the total information in the ultraviolet limit becomes

[TABLE]

which means that $\textrm{HI}^{(\rho_{1})}$ also serves as a limit for thermal states with arbitrary $\epsilon$ if $\epsilon$ does not depend on $\theta$ .

If $\epsilon$ depends on $\theta$ , which may happen with partially coherent sources Tsang and Nair (2019), one must be more careful and go back to Eqs. (65) and (66). For $\epsilon\ll 1$ , $I+\Gamma\approx I$ , and Eq. (66) can be approximated as

[TABLE]

Equations (65) and (69), in terms of the mutual coherence matrix $\Gamma$ , resemble Eqs. (48) and (49) in terms of the density operator $\rho$ . Notice, however, that Eqs. (65) and (69) are in terms of the unnormalized $\Gamma$ . References Larson and Saleh (2018); Hradil et al. (2019), on the other hand, use the normalized version $g=\Gamma/\operatorname{tr}\Gamma$ in the formulas and may have produced unphysical results for partially coherent sources.

Appendix D Gram-Schmidt process

Consider an inner-product space equipped with an inner product $\langle u,v\rangle$ between two elements $u$ and $v$ and a norm $\lVert u\rVert=\sqrt{\langle u,u\rangle}$ . An illustrative example is the space of Euclidean vectors in $\mathbb{R}^{d}$ , with the dot product as the inner product and the vector length as the norm. Given a set of linearly independent elements $S=\{u_{0},u_{1},\dots\}$ , the Gram-Schmidt process produces an orthonormal basis $\{b_{0},b_{1},\dots\}$ for the space spanned by $S$ Debnath and Mikusiński (2005). The process starts with

$$b_{0}=\frac{u_{0}}{\lVert u_{0}\rVert}.$$

Then, for each $q=1,2,\dots$ ,

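$$v_{q}=u_{q}-\sum_{p=0}^{q-1}\langle b_{p},u_{q}\rangle b_{p},\quad b_{q}=\frac{v_{q}}{\lVert v_{q}\rVert}.$$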

$\lVert b_{q}\rVert=\sqrt{\langle b_{q},b_{q}\rangle}=1$ by design. One can check that $v_{q}$ and $b_{q}$ are orthogonal to $\{b_{0},\dots,b_{q-1}\}$ . It follows that $\{b_{0},\dots,b_{q}\}$ is an orthonormal basis with

$$\langle b_{p},b_{r}\rangle=\delta_{pr},\quad p,r\in\{0,1,\dots,q\}.$$

Since the space spanned by $\{b_{0},\dots,b_{q-1}\}$ is the same as the space spanned by $\{u_{0},\dots,u_{q-1}\}$ , each $b_{q}$ is also orthogonal to $\{u_{0},\dots,u_{q-1}\}$ .
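The process can be sketched in a few lines of Python for the illustrative case of Euclidean vectors with the dot product as the inner product. The function name `gram_schmidt`, the tolerance, and the example vectors are assumptions made for this illustration. The sketch uses the classical form of the process, which is adequate for small well-conditioned examples.

```python
import numpy as np


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize the rows of `vectors` in order (classical Gram-Schmidt).

    Assumes real Euclidean vectors with the dot product as the inner product.
    """
    basis = []
    for u in np.asarray(vectors, dtype=float):
        # Subtract the projections of u onto the basis elements found so far.
        v = u - sum((np.dot(b, u) * b for b in basis), np.zeros_like(u))
        norm = np.linalg.norm(v)
        if norm < tol:
            raise ValueError("input vectors are linearly dependent")
        basis.append(v / norm)
    return np.array(basis)


# Three linearly independent vectors in R^3 (hypothetical example).
u = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
b = gram_schmidt(u)

gram = b @ b.T  # Gram matrix of the output; identity if the basis is orthonormal
print(np.round(gram, 12))
print(b[2] @ u[0], b[2] @ u[1])  # b_2 is orthogonal to u_0 and u_1
```

The last line checks the property stated above: each $b_{q}$ is orthogonal not only to the earlier basis elements but also to the earlier input elements.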

Appendix E Multiparameter estimation

Now suppose that $\theta\in\Theta\subseteq\mathbb{R}^{K}$ is a column vector of parameters, and the estimator is also a vector. Define the mean-square error covariance matrix as

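$$\textrm{MSE}_{\mu\nu}(\theta)\equiv\mathbb{E}\left\{\left[\check{\theta}_{\mu}(Y)-\theta_{\mu}\right]\left[\check{\theta}_{\nu}(Y)-\theta_{\nu}\right]\right\}.$$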

Diagonal entries of a matrix are again abbreviated as $\textrm{MSE}_{\mu\mu}=\textrm{MSE}_{\mu}$ . The multiparameter Cramér-Rao bound Lehmann and Casella (1998) can be expressed as the matrix inequality

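$$\textrm{MSE}\geq\textrm{CRB}\equiv\textrm{FI}^{-1},\quad\textrm{FI}_{\mu\nu}\equiv\sum_{y\in\Omega}\frac{1}{P_{Y}(y|\theta)}\frac{\partial P_{Y}(y|\theta)}{\partial\theta_{\mu}}\frac{\partial P_{Y}(y|\theta)}{\partial\theta_{\nu}}.$$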

The matrix inequality means that $\textrm{MSE}-\textrm{CRB}$ is positive-semidefinite Horn and Johnson (1985), or equivalently $u^{\top}(\textrm{MSE}-\textrm{CRB})u\geq 0$ for any real column vector $u$ . For example, the multiparameter Cramér-Rao bounds for two point sources and more general objects measured with direct imaging and SPADE have been derived in Refs. Tsang et al. (2016a); Ang et al. (2017); Tsang (2017, 2018a, 2019b).

The Helstrom information matrix is defined as

$$\textrm{HI}_{\mu\nu}\equiv\operatorname{Re}\mathcal{H}_{\mu\nu},\quad\mathcal{H}_{\mu\nu}\equiv\operatorname{tr}\rho L_{\mu}L_{\nu},\quad\frac{\partial\rho}{\partial\theta_{\mu}}=\frac{1}{2}\left(\rho L_{\mu}+L_{\mu}\rho\right).$$

The matrices can be shown to inherit all the properties of their scalar version by substituting the directional derivative $\partial/\partial\theta=\sum_{\mu}u_{\mu}\partial/\partial\theta_{\mu}$ and $L=\sum_{\mu}u_{\mu}L_{\mu}$ for an arbitrary real vector $u$ . For example, upon the substitutions, the scalar Fisher information becomes $u^{\top}\textrm{FI}u$ and the scalar Helstrom information becomes

$$\operatorname{tr}\rho\Big(\sum_{\mu}u_{\mu}L_{\mu}\Big)^{2}=\sum_{\mu,\nu}u_{\mu}\mathcal{H}_{\mu\nu}u_{\nu}=u^{\top}\mathcal{H}u=u^{\top}\textrm{HI}u,$$

where I have used the fact that, since $u^{\top}\mathcal{H}u$ and $u$ are real, $u^{\top}\mathcal{H}u=\operatorname{Re}(\sum_{\mu,\nu}u_{\mu}\mathcal{H}_{\mu\nu}u_{\nu})=\sum_{\mu,\nu}u_{\mu}\operatorname{Re}(\mathcal{H}_{\mu\nu})u_{\nu}=\sum_{\mu,\nu}u_{\mu}\textrm{HI}_{\mu\nu}u_{\nu}$ . The Nagaoka bound given by Eq. (50) becomes $u^{\top}\textrm{FI}u\leq u^{\top}\textrm{HI}u$ , meaning that Eq. (50) still holds as a matrix inequality. A consequence of the matrix inequality is that the inverses obey the reverse relation Horn and Johnson (1985), so the Nagaoka bound leads to

$$\textrm{MSE}\geq\textrm{FI}^{-1}\geq\textrm{HI}^{-1}.$$
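The reversal of a matrix inequality under inversion can be checked numerically. In the sketch below, the matrices `fi` and `hi` are random positive-definite stand-ins, not Fisher or Helstrom matrices computed from any physical model; they are built so that `hi - fi` is positive-semidefinite by construction. All names and sizes are assumptions for illustration.

```python
import numpy as np


def is_positive_semidefinite(matrix, tol=1e-10):
    """True if the symmetric part of `matrix` has no eigenvalue below -tol."""
    sym = 0.5 * (matrix + matrix.T)
    return bool(np.linalg.eigvalsh(sym).min() >= -tol)


rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3))
c = rng.normal(size=(3, 3))

fi = a @ a.T + 0.1 * np.eye(3)  # positive-definite stand-in for FI
hi = fi + c @ c.T               # hi - fi = c c^T is positive-semidefinite

forward = is_positive_semidefinite(hi - fi)
reverse = is_positive_semidefinite(np.linalg.inv(fi) - np.linalg.inv(hi))

# The diagonal entries give the bounds on the individual parameters:
# each (FI^{-1})_{mu mu} is at least (HI^{-1})_{mu mu}.
diagonal_ok = bool(
    np.all(np.diag(np.linalg.inv(fi)) >= np.diag(np.linalg.inv(hi)) - 1e-12)
)

print(forward, reverse, diagonal_ok)
```

Taking $u$ to be a unit vector along one coordinate shows why the matrix inequality implies the bound on each diagonal entry.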

Bibliography

1. Rayleigh (1879) Lord Rayleigh, “XXXI. Investigations in optics, with special reference to the spectroscope,” Philosophical Magazine Series 5 8, 261–274 (1879).
2. Feynman et al. (2013) Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics, Vol. 1 (California Institute of Technology, Pasadena, 2013) Chap. 30. Diffraction.
3. Mandel and Wolf (1995) Leonard Mandel and Emil Wolf, Optical Coherence and Quantum Optics (Cambridge University Press, Cambridge, 1995).
4. den Dekker and van den Bos (1997) A. J. den Dekker and A. van den Bos, “Resolution: a survey,” Journal of the Optical Society of America A 14, 547–557 (1997).
5. de Villiers and Pike (2016) Geoffrey de Villiers and E. Roy Pike, The Limits of Resolution (CRC Press, Boca Raton, 2016).
6. Falconi (1967) Oscar Falconi, “Limits to which Double Lines, Double Stars, and Disks can be Resolved and Measured,” J. Opt. Soc. Am. 57, 987–993 (1967).
7. Tsai and Dunn (1979) Ming-Jer Tsai and Keh-Ping Dunn, Performance Limitations on Parameter Estimation of Closely Spaced Optical Targets Using Shot-Noise Detector Model, Tech. Rep. ADA 073462 (Lincoln Laboratory, MIT, 1979).
8. Bettens et al. (1999) E. Bettens, D. Van Dyck, A. J. den Dekker, J. Sijbers, and A. van den Bos, “Model-based two-object resolution from observations having counting statistics,” Ultramicroscopy 77, 37–48 (1999).