Coronagraphic phase diversity through residual turbulence: performance study and experimental validation

Olivier Herscovici-Schiller; Jean-François Sauvage; Laurent M. Mugnier; Kjetil Dohlen; Arthur Vigan

arXiv:1907.07038·astro-ph.IM·July 31, 2019


TL;DR

This paper extends coronagraphic phase diversity to estimate quasi-static aberrations amidst residual turbulence, demonstrating through simulations and lab experiments that aberrations can be corrected during observations, improving exoplanet imaging.

Contribution

It introduces an extension of COFFEE for residual turbulence, validated with simulations and laboratory experiments for real-time aberration correction.

Findings

1. Coronagraphic phase diversity can estimate quasi-static aberrations in the presence of residual turbulence.

2. Simulations show promising performance on current-generation instruments.

3. Laboratory experiments confirm that real-time aberration correction is feasible.


Equations

$$\mathcal{J}(\phi)=\sum_{k,x,y}\frac{\left\|i(k,x,y)-m(\phi_{\mathrm{up}},\phi_{\mathrm{down}},k,x,y)\right\|^{2}}{2\sigma^{2}(k,x,y)}+\mathcal{R}(\phi_{\mathrm{up}})+\mathcal{R}(\phi_{\mathrm{down}}).$$

$$\mathcal{R}(\phi)=\frac{1}{2\sigma_{\nabla\phi}^{2}}\sum_{x,y}\left\|\nabla\phi\right\|^{2}(x,y),$$

$$m(\phi,k,x,y)=f_{k}\times\left[h_{\mathrm{det}}\star h_{\mathrm{c}}(\phi+\phi_{k})\right](x,y)+b_{k},$$

$$h_{\mathrm{a}}(D_{\phi})=\mathcal{F}^{-1}\left[\exp\left(-\frac{1}{2}D_{\phi}\right)\right],$$

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\iint h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})\,\mathrm{d}\alpha',$$

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\substack{\alpha'_{x}=-N/2\\ \alpha'_{x}\notin[-M/2,M/2]}}^{N/2}\sum_{\substack{\alpha'_{y}=-N/2\\ \alpha'_{y}\notin[-M/2,M/2]}}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})\approx\sum_{\substack{\alpha'_{x}=-N/2\\ \alpha'_{x}\notin[-M/2,M/2]}}^{N/2}\sum_{\substack{\alpha'_{y}=-N/2\\ \alpha'_{y}\notin[-M/2,M/2]}}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

$$h_{\mathrm{a}}^{0}(\alpha')=\begin{cases}0&\text{if }\alpha'\in[-M/2,M/2]\times[-M/2,M/2]\\ h_{\mathrm{a}}(\alpha')&\text{otherwise.}\end{cases}$$

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}^{0}(\alpha';D_{\phi})\times h(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}),$$

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\left[h_{\mathrm{a}}^{0}(D_{\phi})\star h(\phi_{\mathrm{up}},\phi_{\mathrm{down}})\right](\alpha)+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

$$\frac{\partial\mathcal{R}}{\partial\phi}(\phi)=-\frac{1}{\sigma_{\nabla\phi}^{2}}\Delta\phi\quad\text{for}\quad\mathcal{R}(\phi)=\frac{1}{2\sigma_{\nabla\phi}^{2}}\sum_{x,y}\left\|\nabla\phi\right\|^{2}(x,y).$$

$$\mathcal{D}=\sum_{k,x,y}\frac{\left\|i(k,x,y)-m(\phi_{\mathrm{up}},\phi_{\mathrm{down}},k,x,y)\right\|^{2}}{2\sigma^{2}(k,x,y)}.$$

$$\mathcal{D}=\sum_{k,x,y}\frac{\left\|i(k,x,y)-f_{k}\times\left[h_{\mathrm{det}}\star h_{\mathrm{lec}}(\phi_{\mathrm{up}}+\phi_{k},\phi_{\mathrm{down}})\right](x,y)-b_{k}\right\|^{2}}{2\sigma^{2}(k,x,y)},$$

$$\frac{\partial\mathcal{D}}{\partial\phi}=\sum_{l}\sum_{m}\frac{\partial\mathcal{D}}{\partial h_{\mathrm{lec}}(l,m)}\frac{\partial h_{\mathrm{lec}}(l,m)}{\partial\phi}.$$

$$\frac{\partial\mathcal{D}}{\partial\phi}=\sum_{l}\sum_{m}\frac{\partial\mathcal{D}}{\partial h_{\mathrm{lec}}(l,m)}\times\frac{\partial}{\partial\phi}\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(l,m;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

$$\frac{\partial\mathcal{D}}{\partial\phi}=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times\sum_{l}\sum_{m}\frac{\partial\mathcal{D}}{\partial h_{\mathrm{lec}}(l,m)}\times\frac{\partial}{\partial\phi}h_{\mathrm{c}}(l,m;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

$$sf=\frac{1000}{107\pm 1}=9.43\pm 0.1.$$

$$r_{L}=\frac{99.5\pm 0.5}{107\pm 1}=0.93\pm 0.01.$$


Full text

Coronagraphic phase diversity through residual turbulence: performance study and experimental validation

Olivier Herscovici-Schiller,1 Jean-François Sauvage,2,3 Laurent M. Mugnier,2 Kjetil Dohlen3 and Arthur Vigan3

1 DTIS, ONERA, Université Paris Saclay, F-91123 Palaiseau – France

2 DOTA, ONERA, Université Paris Saclay, F-92322 Châtillon – France

3 Laboratoire d’Astrophysique de Marseille UMR 7326, Aix-Marseille Université, CNRS, 13388 Marseille, France E-mail: [email protected] (OHS)

(Accepted 2019 July 15. Received 2019 July 15; in original form 2019 January 17)

Abstract

Quasi-static aberrations in coronagraphic systems are the ultimate limitation to the capabilities of exoplanet imagers, both ground-based and space-based. These aberrations – which can be due to various causes such as optics alignment or moving optical parts during the observing sequence – create light residuals called speckles in the focal plane that might be mistaken for a planet. For ground-based instruments, residual turbulent wavefront errors due to partial adaptive optics correction add a further difficulty to the challenge of measuring aberrations in the presence of a coronagraph. In this paper, we present an extension of COFFEE, the coronagraphic phase diversity, to the estimation of quasi-static aberrations in the presence of adaptive optics-corrected residual turbulence. We perform realistic numerical simulations to assess the performance that can be expected on an instrument of the current generation. We perform the first experimental validation in the laboratory, which demonstrates that quasi-static aberrations can be corrected during the observations by means of coronagraphic phase diversity.

keywords:

instrumentation: high angular resolution – instrumentation: adaptive optics – techniques: high angular resolution – techniques: image processing – turbulence – methods: data analysis


1 Introduction

Imaging exoplanets is a challenging task. Current ground-based exoplanet imagers such as SPHERE or GPI reach contrast levels of $10^{-6}$ in H-band at separations of 0.5 arcsecond (Beuzit et al., 2008; Macintosh et al., 2008). Such high-contrast imaging is enabled by the use of coronagraphs, devices which reject the on-axis starlight but let off-axis light from disks or planets pass through. However, any optical aberration in the instrument causes light to leak through the coronagraph, which results in speckles appearing in the focal plane of the telescope. These speckles constitute the ultimate contrast limit for exoplanet imagers.

Since the quasi-static aberrations evolve slowly during the night (Martinez et al., 2012, 2013), the full contrast capacity of exoplanet imagers can be reached only if a correction is applied during the night. A few methods have been developed to measure the quasi-static aberrations (N’Diaye et al., 2013; Galicher et al., 2008). However, none of them is currently suited to use during the observations of a ground-based instrument, although some tests are currently taking place (Vigan et al., 2018). Our goal is to show that the coronagraphic phase diversity, COFFEE, is suited to the measurement of quasi-static phase aberrations in the presence of adaptive-optics-corrected residual turbulence. We start with the adaptation of the algorithm. We continue with a performance assessment consisting of numerical studies of the sensitivity of the method to various sources of perturbation. We finish with the description of a laboratory validation of the coronagraphic phase diversity in the presence of residual turbulence on the MITHiC bench at the Laboratoire d’Astrophysique de Marseille.

2 Formalism of the coronagraphic phase diversity in the presence of residual adaptive-optics-corrected turbulence

In this section, we recall the formalism of coronagraphic phase diversity and we present its adaptation to the presence of adaptive-optics-corrected turbulence.

2.1 COFFEE, the coronagraphic phase diversity

COFFEE, the coronagraphic phase diversity, is a post-coronagraphic wavefront sensor (Sauvage et al., 2012; Paul et al., 2013a). As its name suggests, it is a flavour of the phase diversity method (Gonsalves, 1982; Mugnier et al., 2006). The principle is to use information encoded in images produced by the scientific instrument to retrieve the aberrations of the instrument. Unfortunately, a single image is not enough to do so. In phase diversity, two (or more) images are used. One is called the focused image. The other one, called the diversity image, is taken while a known aberration, called the diversity phase, is deliberately introduced in the system. In the context of coronagraphic phase diversity for ground-based instruments, the diversity phase can be easily introduced by the deformable mirror of the adaptive optics system.

From a mathematical point of view, COFFEE is a maximum a posteriori estimator under a non-homogeneous Gaussian noise assumption. The principle is to find the upstream phase aberration $\widehat{\phi_{\mathrm{up}}}$ and the downstream phase aberration $\widehat{\phi_{\mathrm{down}}}$ that minimise the cost function $\mathcal{J}$:

$$\mathcal{J}(\phi)=\sum_{k,x,y}\frac{\left\|i(k,x,y)-m(\phi_{\mathrm{up}},\phi_{\mathrm{down}},k,x,y)\right\|^{2}}{2\sigma^{2}(k,x,y)}+\mathcal{R}(\phi_{\mathrm{up}})+\mathcal{R}(\phi_{\mathrm{down}}).$$

In the first term of the right-hand side, $i$ is an experimental image produced by the detector. Its counterpart $m$ is a numerical model of the image, which takes into account the sought quasi-static aberrations $\phi_{\mathrm{up}}$ and $\phi_{\mathrm{down}}$. The indexes of the sum are $x$ and $y$, which are the coordinates of the pixels in the images, and $k$, which distinguishes between the focused image and the diversity image. The denominator $\sigma^{2}$ is the noise variance of pixel $x,y$ in the image $i(k)$. This first term is a noise-weighted distance between the experimental data and the output of a model. The second term of the right-hand side, $\mathcal{R}$, is a regularisation term, often taken as

$$\mathcal{R}(\phi)=\frac{1}{2\sigma_{\nabla\phi}^{2}}\sum_{x,y}\left\|\nabla\phi\right\|^{2}(x,y),$$

where $\nabla\phi$ is the spatial gradient of $\phi$. This second term represents prior knowledge of the statistics of the aberrations. More precisely, this term penalises the spatial gradients of the phase, which smooths the reconstructed phase and attenuates the noise propagation in the wavefront. This prevents unrealistically high spatial frequencies from appearing in the reconstructed wavefront.
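As a minimal illustration of the criterion described above, the following NumPy sketch evaluates the noise-weighted data term and a gradient regulariser on toy arrays. The function names, the array layout (images stacked along a first axis indexed by $k$) and the forward-difference discretisation of $\nabla\phi$ are assumptions made for this example; they are not taken from the COFFEE implementation.

```python
import numpy as np

def gradient_regulariser(phi, sigma_grad):
    """R(phi) = sum ||grad phi||^2 / (2 sigma_grad^2), forward differences (assumed)."""
    dx = np.diff(phi, axis=1)
    dy = np.diff(phi, axis=0)
    return (np.sum(dx**2) + np.sum(dy**2)) / (2.0 * sigma_grad**2)

def criterion(images, models, noise_var, phi_up, phi_down, sigma_grad):
    """J = sum_{k,x,y} |i - m|^2 / (2 sigma^2) + R(phi_up) + R(phi_down)."""
    data_term = np.sum((images - models) ** 2 / (2.0 * noise_var))
    return (data_term
            + gradient_regulariser(phi_up, sigma_grad)
            + gradient_regulariser(phi_down, sigma_grad))

# Toy data: two 4x4 images (focused and diversity) with unit noise variance.
images = np.ones((2, 4, 4))
models = np.zeros((2, 4, 4))
noise_var = np.ones((2, 4, 4))
flat = np.zeros((8, 8))                  # phase without gradient: R = 0
ramp = np.tile(np.arange(8.0), (8, 1))   # unit slope along x: R > 0

j_flat = criterion(images, models, noise_var, flat, flat, sigma_grad=1.0)
j_ramp = criterion(images, models, noise_var, ramp, flat, sigma_grad=1.0)
```

In this toy case the data term is the same for both phases, so the difference between `j_ramp` and `j_flat` is entirely due to the penalty on the phase gradient.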

If the images are not strictly monochromatic, the impact of the spectral width is negligible for $\Delta\lambda/\lambda<15\%$ (Meynadier et al., 1999). If the image is broad-band, then the numerical model of the image must be calculated for several wavelengths (Seldin et al., 2000). This increases the computation cost of the numerical model in proportion to the width of the spectral band. However, the calculations at different wavelengths can be done in parallel, so the increase in the number of computations does not lengthen the computation if a multi-core computer can be used.

2.2 Implementation: taking turbulence into account

The numerical implementation of COFFEE in a turbulence-free case is described in Paul et al. (2013a) and Paul et al. (2013b). The expression of $m$ in COFFEE is

$$m(\phi,k,x,y)=f_{k}\times\left[h_{\mathrm{det}}\star h_{\mathrm{c}}(\phi+\phi_{k})\right](x,y)+b_{k},$$

where $f_{k}$ is the incoming flux, $h_{\mathrm{det}}$ is the response of the detector, $\star$ is the convolution operator, $h_{\mathrm{c}}$ is the point spread function of the coronagraphic instrument, $\phi$ is the static aberration that we seek to retrieve, $\phi_{k}$ is zero for the focused image and is the diversity phase in the diversity image, and $b_{k}$ is a constant background.

In order to take the effect of atmospheric turbulence into account, the model of data formation $m$ must account for the impact of turbulence. To do so, we developed an analytic expression for coronagraphic imaging through turbulence (Herscovici-Schiller et al., 2017) to use as the point spread function of the instrument. The optical impulse response of the coronagraphic instrument in the presence of (residual) turbulence is denoted $h_{\mathrm{lec}}$ – the index stands for “long exposure coronagraphic”. We still denote by $h_{\mathrm{c}}$ the impulse response of the coronagraphic instrument without turbulence. By analogy with the atmospheric transfer function (Roddier, 1981) as described by Herscovici-Schiller et al. (2017), we denote by $h_{\mathrm{a}}(D_{\phi})$ the atmospheric point spread function, defined as

$$h_{\mathrm{a}}(D_{\phi})=\mathcal{F}^{-1}\left[\exp\left(-\frac{1}{2}D_{\phi}\right)\right],$$

where $D_{\phi}$ is the phase structure function of the (residual) atmospheric turbulence. If the turbulence is assumed to be stationary and ergodic, then, for an exposure time much longer than the characteristic time of turbulence,

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\iint h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})\,\mathrm{d}\alpha',$$

where $\alpha$ is a two-dimensional angular position in the focal plane, $\phi_{\mathrm{up}}$ is the static aberration upstream of the coronagraph, $\phi_{\mathrm{down}}$ is the static aberration downstream of the coronagraph, and Id is the identity function of $\mathbb{R}^{2}$. In order to use this forward model in COFFEE, one needs to compute it efficiently, and to compute its gradients.

The numeric model of the instrument is computed by performing a discrete sum to approximate the integral, using the same Fourier optics model for $h_{\mathrm{c}}$ as in Paul et al. (2013a), as long as an estimate of $D_{\phi}$ is available, for example using one of the methods described in Sauvage et al. (2012) or Véran et al. (1997). If the point spread function is sampled on $N\times N$ pixels, a natural choice is to calculate the numeric point spread function as

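$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$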
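The discrete sum over tip-tilts can be sketched in a few lines of NumPy. The example below is only a toy under stated assumptions: the three-plane Lyot model `toy_coronagraphic_psf`, the saturating structure function `d_phi`, the grid size and the sign convention of the tilt are all hypothetical choices made for illustration, and the sum runs over the $N$ samples of the FFT grid. It is not the Fourier optics model of Paul et al. (2013a).

```python
import numpy as np

def cfft2(a):
    """FFT with the origin at the array centre."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(a)))

def cifft2(a):
    """Inverse FFT with the origin at the array centre."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(a)))

def toy_coronagraphic_psf(phi_up, phi_down, pupil, focal_mask, lyot_stop):
    """Toy h_c: pupil -> focal-plane mask -> Lyot stop -> detector (assumed model)."""
    focal = cfft2(pupil * np.exp(1j * phi_up))
    lyot = cifft2(focal * focal_mask) * lyot_stop * np.exp(1j * phi_down)
    return np.abs(cfft2(lyot)) ** 2

def atmospheric_psf(d_phi):
    """h_a = F^-1[exp(-D_phi / 2)], with D_phi sampled on a centred grid."""
    return cifft2(np.exp(-0.5 * d_phi)).real

def long_exposure_psf(phi_up, phi_down, d_phi, pupil, focal_mask, lyot_stop):
    """h_lec: sum over tilts alpha' of h_a(alpha') * h_c(phi_up + tilt, phi_down)."""
    n = pupil.shape[0]
    c = np.arange(n) - n // 2
    x, y = np.meshgrid(c, c)
    h_a = atmospheric_psf(d_phi)
    h_lec = np.zeros((n, n))
    for iy, ay in enumerate(c):
        for ix, ax in enumerate(c):
            # A tilt of (ax, ay) focal-plane pixels is a linear phase ramp in the pupil.
            tilt = 2.0 * np.pi * (ax * x + ay * y) / n
            h_lec += h_a[iy, ix] * toy_coronagraphic_psf(
                phi_up + tilt, phi_down, pupil, focal_mask, lyot_stop)
    return h_lec

n = 16
c = np.arange(n) - n // 2
x, y = np.meshgrid(c, c)
r = np.hypot(x, y)
pupil = (r <= n / 4).astype(float)             # sampling of 2 pixels per lambda/D
lyot_stop = (r <= 0.95 * n / 4).astype(float)
focal_mask = (r > 4).astype(float)             # opaque disc of radius 2 lambda/D
phi_up = 0.1 * np.sin(2 * np.pi * x / n) * pupil
phi_down = np.zeros((n, n))
d_phi = 0.5 * (1.0 - np.exp(-(r / 3.0) ** 2))  # hypothetical structure function

h_c = toy_coronagraphic_psf(phi_up, phi_down, pupil, focal_mask, lyot_stop)
h_lec_no_turbulence = long_exposure_psf(
    phi_up, phi_down, np.zeros((n, n)), pupil, focal_mask, lyot_stop)
h_lec = long_exposure_psf(phi_up, phi_down, d_phi, pupil, focal_mask, lyot_stop)
```

With a null structure function, $h_{\mathrm{a}}$ reduces to a single central sample and the sketch returns the turbulence-free coronagraphic point spread function, which is a convenient sanity check. The double loop makes the $N^{2}$ evaluations of $h_{\mathrm{c}}$ explicit.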

2.3 A physical approximation that reduces computing costs

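The resulting cost of calculation is then $N^{2}$ times the cost of calculating $h_{\mathrm{c}}$. This cost can be considerably reduced if the coronagraph is a Lyot-type coronagraph such as Lyot’s opaque mask or Roddier’s phase mask coronagraphs. Indeed, for those coronagraphs consisting of small phase or amplitude features strictly located in the vicinity of the stellar image, the influence of the focal mask on the point spread function is essentially negligible when there is a strong upstream tip-tilt. Let us define a central square region of side $M$ in the focal plane of the coronagraphic mask. If the light beam is centred outside of this central square, the coronagraph is assumed to have no influence on the beam. Then, the previous equation can be split into two regions:

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\substack{\alpha'_{x}=-N/2\\ \alpha'_{x}\notin[-M/2,M/2]}}^{N/2}\sum_{\substack{\alpha'_{y}=-N/2\\ \alpha'_{y}\notin[-M/2,M/2]}}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

Since the double sum on the first line exists only for strong tip-tilts, the value of $h_{\mathrm{c}}$ in it is very close to the value of a non-coronagraphic point spread function, denoted $h$:

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})\approx\sum_{\substack{\alpha'_{x}=-N/2\\ \alpha'_{x}\notin[-M/2,M/2]}}^{N/2}\sum_{\substack{\alpha'_{y}=-N/2\\ \alpha'_{y}\notin[-M/2,M/2]}}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

Now, we can give a convolutive structure to the first double sum. Let us define $h_{\mathrm{a}}^{0}$ as

$$h_{\mathrm{a}}^{0}(\alpha')=\begin{cases}0&\text{if }\alpha'\in[-M/2,M/2]\times[-M/2,M/2]\\ h_{\mathrm{a}}(\alpha')&\text{otherwise.}\end{cases}$$

Then, Equation (8) can be rewritten as:

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}^{0}(\alpha';D_{\phi})\times h(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}})+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}),$$

which amounts to a convolution in the first double sum:

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\left[h_{\mathrm{a}}^{0}(D_{\phi})\star h(\phi_{\mathrm{up}},\phi_{\mathrm{down}})\right](\alpha)+\sum_{\alpha'_{x}=-M/2}^{M/2}\sum_{\alpha'_{y}=-M/2}^{M/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$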

Since the cost of calculating the convolution product is negligible in comparison with the cost of calculating the double sum, the total cost of the calculation has diminished from $N^{2}$ times the cost of calculating $h_{\mathrm{c}}$ to $M^{2}$ times the cost of calculating $h_{\mathrm{c}}$. An important point is then to determine the quality of the approximation as a function of the side of the exact calculation zone, $M$. Figure 1 presents the energy in the difference between the field calculated without approximation and the field calculated using Equation (13). Various sizes $M$, varying from 1 to 41, are considered. The coronagraph that is considered here is a Lyot coronagraph of radius $2\lambda/D$. The root mean square of the upstream aberration is 100 nm at a wavelength $\lambda=1589$ nm, and the phase structure function is representative of SPHERE’s SAXO adaptive optics system. The normalisation factor is taken such that the coronagraphic point spread function without any aberration, without turbulence, has a total energy of 1. The sampling factor is chosen to satisfy exactly the Shannon–Nyquist condition, that is to say that a numeric field of $M$ pixels corresponds to an optical angular size of $M\lambda/2D$.

There are two regimes of loss of precision due to the approximation. In the first regime, the approximation is crude because the size $M$ is insufficient, and the error decreases greatly with any increase in $M$ . In the second regime, where $M>10$ , the exact calculation is performed on a zone of extension $5\lambda/D$ . Beyond this zone, the error committed by approximating a tilted coronagraphic point spread function by a non-coronagraphic one becomes negligible. Indeed, even if there is a Lyot (or Roddier & Roddier) coronagraph, a light beam tilted by more than $5\lambda/2D$ will essentially not be modified by the coronagraph. Consequently, in the second regime, the approximation error is very low, and decreases more slowly.

In conclusion, for coronagraphs such as the Lyot coronagraph or the Roddier & Roddier coronagraph, and in particular the popular APLC, the cost of a long-exposure coronagraphic image simulation can be brought down to about a hundred times the cost of a short-exposure coronagraphic image. In practice, the simulation of a $1024\times 1024$-pixel long-exposure coronagraphic image results in a calculation time of less than five seconds on a single core of an office computer equipped with a 2.6 GHz Intel processor. In principle, the approximation could be adapted to a four-quadrant phase mask coronagraph, where the convolutive approximation could be done far from the transitions.
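The convolutive structure used in this subsection can be checked numerically on a small example. The sketch below assumes that, on a periodic grid, a tilted non-coronagraphic point spread function is simply a shifted copy of $h$, so that the weighted sum over tilts is a circular convolution. The arrays are random stand-ins for $h_{\mathrm{a}}$ and $h$, and all function names are hypothetical.

```python
import numpy as np

def zero_central_square(h_a, m):
    """h_a^0: copy of h_a with the central square alpha' in [-m/2, m/2]^2 set to zero."""
    c = h_a.shape[0] // 2
    h_a0 = h_a.copy()
    h_a0[c - m // 2: c + m // 2 + 1, c - m // 2: c + m // 2 + 1] = 0.0
    return h_a0

def convolve_centred(a, b):
    """Circular convolution of two centred 2-D arrays, computed with FFTs."""
    fa = np.fft.fft2(np.fft.ifftshift(a))
    fb = np.fft.fft2(np.fft.ifftshift(b))
    return np.fft.fftshift(np.fft.ifft2(fa * fb)).real

def sum_of_shifted_psfs(weights, psf):
    """Direct double sum: each tilt alpha' contributes a shifted copy of the PSF."""
    n = psf.shape[0]
    c = n // 2
    out = np.zeros_like(psf)
    for iy in range(n):
        for ix in range(n):
            out += weights[iy, ix] * np.roll(psf, (iy - c, ix - c), axis=(0, 1))
    return out

rng = np.random.default_rng(1)
n, m = 16, 4
h_a = rng.random((n, n))
h_a /= h_a.sum()              # stand-in for the atmospheric PSF
h = rng.random((n, n))        # stand-in for the non-coronagraphic PSF

h_a0 = zero_central_square(h_a, m)
direct = sum_of_shifted_psfs(h_a0, h)      # n^2 shifted copies
fast = convolve_centred(h_a0, h)           # one FFT-based convolution
exact_terms = int(np.count_nonzero(h_a0 == 0.0))  # tilts left to the exact h_c sum
```

Here `exact_terms` counts the tilts inside the central square, which are the only ones that still require a full coronagraphic propagation; the remaining terms are handled by the single convolution.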

2.4 Calculating the gradients

The COFFEE criterion given by Equation (1) can be separated into two parts. The gradient of the regularisation part, $\mathcal{R}$ , is easier to calculate, for the inclusion of turbulence into the model does not change it. As can be found in Paul (2014)

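$$\frac{\partial\mathcal{R}}{\partial\phi}(\phi)=-\frac{1}{\sigma_{\nabla\phi}^{2}}\Delta\phi\quad\text{for}\quad\mathcal{R}(\phi)=\frac{1}{2\sigma_{\nabla\phi}^{2}}\sum_{x,y}\left\|\nabla\phi\right\|^{2}(x,y).$$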
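The relation between the gradient of the regularisation term and the Laplacian of the phase can be verified by finite differences. The following sketch assumes a forward-difference gradient with periodic boundaries, for which the discrete identity is exact; the boundary handling and the function names are choices made for this example only.

```python
import numpy as np

def regulariser(phi, sigma_grad):
    """R(phi) with forward differences and periodic boundaries (assumed discretisation)."""
    dx = np.roll(phi, -1, axis=1) - phi
    dy = np.roll(phi, -1, axis=0) - phi
    return (np.sum(dx**2) + np.sum(dy**2)) / (2.0 * sigma_grad**2)

def regulariser_gradient(phi, sigma_grad):
    """Analytic gradient: minus the discrete five-point Laplacian over sigma_grad^2."""
    laplacian = (np.roll(phi, 1, axis=0) + np.roll(phi, -1, axis=0)
                 + np.roll(phi, 1, axis=1) + np.roll(phi, -1, axis=1)
                 - 4.0 * phi)
    return -laplacian / sigma_grad**2

rng = np.random.default_rng(0)
phi = rng.standard_normal((6, 6))
sigma_grad = 0.7
analytic = regulariser_gradient(phi, sigma_grad)

# Central finite differences, one pixel at a time.
eps = 1e-6
numeric = np.zeros_like(phi)
for idx in np.ndindex(phi.shape):
    step = np.zeros_like(phi)
    step[idx] = eps
    numeric[idx] = (regulariser(phi + step, sigma_grad)
                    - regulariser(phi - step, sigma_grad)) / (2.0 * eps)
```

The two arrays `analytic` and `numeric` agree to within the rounding error of the finite-difference scheme.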

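Let us define $\mathcal{D}$ as the other part of the criterion, that is, the distance between the model $m$ and the data $i$:

$$\mathcal{D}=\sum_{k,x,y}\frac{\left\|i(k,x,y)-m(\phi_{\mathrm{up}},\phi_{\mathrm{down}},k,x,y)\right\|^{2}}{2\sigma^{2}(k,x,y)}.$$

Then, using the complete expression of $m$, $\mathcal{D}$ is

$$\mathcal{D}=\sum_{k,x,y}\frac{\left\|i(k,x,y)-f_{k}\times\left[h_{\mathrm{det}}\star h_{\mathrm{lec}}(\phi_{\mathrm{up}}+\phi_{k},\phi_{\mathrm{down}})\right](x,y)-b_{k}\right\|^{2}}{2\sigma^{2}(k,x,y)},$$

where

$$h_{\mathrm{lec}}(\alpha;\phi_{\mathrm{up}},\phi_{\mathrm{down}},D_{\phi})=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(\alpha;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

The gradient of $\mathcal{D}$ with respect to $\phi$ can be calculated using the long-exposure coronagraphic point spread function as an intermediate variable:

$$\frac{\partial\mathcal{D}}{\partial\phi}=\sum_{l}\sum_{m}\frac{\partial\mathcal{D}}{\partial h_{\mathrm{lec}}(l,m)}\frac{\partial h_{\mathrm{lec}}(l,m)}{\partial\phi}.$$

By using the analytic expression for $h_{\mathrm{lec}}$, this gradient is also

$$\frac{\partial\mathcal{D}}{\partial\phi}=\sum_{l}\sum_{m}\frac{\partial\mathcal{D}}{\partial h_{\mathrm{lec}}(l,m)}\times\frac{\partial}{\partial\phi}\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times h_{\mathrm{c}}(l,m;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$

Now, a rearrangement of the summation operators leads to

$$\frac{\partial\mathcal{D}}{\partial\phi}=\sum_{\alpha'_{x}=-N/2}^{N/2}\sum_{\alpha'_{y}=-N/2}^{N/2}h_{\mathrm{a}}(\alpha';D_{\phi})\times\sum_{l}\sum_{m}\frac{\partial\mathcal{D}}{\partial h_{\mathrm{lec}}(l,m)}\times\frac{\partial}{\partial\phi}h_{\mathrm{c}}(l,m;\phi_{\mathrm{up}}+2\pi\alpha'\cdot\mathrm{Id},\phi_{\mathrm{down}}).$$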

One can note that the second line of this last expression is the gradient of the non-regularised criterion in the absence of turbulence, whose expression is given in the appendix of Paul et al. (2013a). Consequently, the structure of the calculation of this gradient is similar to the calculation of the point spread function itself. Thus, its calculation can also benefit from the acceleration described in the previous subsection.

Now that the formalism and implementation of coronagraphic phase diversity in the presence of turbulence are described, let us move on to a numerical study of the robustness of the method.

3 Numerical study of the robustness of the method

In this section, we perform numerical simulations to study the impact on the quality of the COFFEE reconstruction of various discrepancies between the model $m$ and the actual imaging process leading to $i$ . A simulation result when there is no discrepancy is presented in Herscovici-Schiller et al. (2017). We consider successive discrepancies in order to obtain a realistic simulation of the performance that can be expected of the technique on a real instrument.

3.1 Parameters of the simulations

Throughout this section, the phase aberrations are expressed at a wavelength $\lambda=1589$ nm. The simulation is purely monochromatic at this wavelength. The input parameters of the simulations include a phase structure function $D_{\phi}$ that is representative of turbulence at Paranal after correction by SPHERE’s adaptive optics SAXO. The upstream phase aberration, $\phi_{\mathrm{up}}$, has a root mean square of 50 nm, with an energy spectral density that follows an $f^{-2}$ law. This upstream phase aberration is what we wish to estimate with COFFEE in the simulations. The downstream phase aberration, $\phi_{\mathrm{down}}$, has a root mean square of 20 nm, with the same energy spectral density as the upstream aberration. The coronagraph is an unapodized Lyot coronagraph, with a ratio of 95 % between the diameter of the entrance pupil and the diameter of the Lyot stop. There are no amplitude aberrations in the system. The total flux of the source is taken at $10^{9}$ photons, and the root mean square of the electronic noise of the detector is one electron per pixel. The diversity phase is assumed to be perfectly known. It is taken as a pure defocus, with root mean square 125 nm. The images in the focal plane have $128\times 128$ pixels. The value of the sampling is chosen as 2, so the Shannon–Nyquist condition is respected, and the estimated phases are sampled over $64\times 64$ pixels.
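For readers who wish to reproduce a comparable input, the following sketch draws a random phase map whose energy spectral density follows an $f^{-2}$ law and rescales it to a 50 nm root mean square. The Fourier filtering recipe, the normalisation over the full square grid rather than over the pupil, and the function name are assumptions of this example; the paper does not specify how its phase maps were generated.

```python
import numpy as np

def power_law_phase(n, rms_nm, exponent=-2.0, seed=0):
    """Random phase map (in nm) with spectral density ~ f**exponent and a given RMS."""
    rng = np.random.default_rng(seed)
    fx = np.fft.fftfreq(n)
    f = np.hypot(*np.meshgrid(fx, fx))
    f[0, 0] = np.inf                       # suppress the piston term
    amplitude = f ** (exponent / 2.0)      # amplitude is the square root of the density
    noise = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    phase = np.fft.ifft2(noise * amplitude).real
    phase -= phase.mean()
    return phase * rms_nm / phase.std()

image_size, sampling = 128, 2
pupil_size = image_size // sampling        # 64 pixels across the estimated phase
wavelength_nm = 1589.0

phi_up_nm = power_law_phase(pupil_size, rms_nm=50.0, seed=0)
phi_down_nm = power_law_phase(pupil_size, rms_nm=20.0, seed=1)
phi_up_rad = 2.0 * np.pi * phi_up_nm / wavelength_nm   # about 0.2 rad RMS
```

Expressed in radians at 1589 nm, the 50 nm upstream aberration corresponds to roughly 0.2 rad root mean square.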

3.2 Sensitivity to an error on the phase structure function

The analytic model that we use in the reconstruction algorithm requires the knowledge of the phase structure function of the post-adaptive optics residuals.

During operations, the statistics of the adaptive optics-corrected turbulence can be estimated using several techniques (Véran et al., 1997; Sauvage et al., 2012). A first family is telemetry techniques that give access to such data as the seeing or the wind speed. Those data can then be used jointly with the parameters of the adaptive optics system in a numeric simulation whose output is the phase structure function $D_{\phi}$ . A second family of techniques is to use such real-time measurements of the adaptive optics system as residual slopes and command voltages. Since this section is concerned with performing simulations, we used the first option to produce the phase structure function.

Whichever technique is chosen, the phase structure function will not be perfectly known in practice. In this subsection, we test how an error on the parameter $D_{\phi}$ impacts the quality of the reconstruction. We generate data using the parameters detailed in the previous subsections. We then perform COFFEE reconstructions using different phase structure functions, which are obtained by multiplying the “true” phase structure function $D_{\phi}$ by a factor $1-p$. For the sake of legibility, a reconstruction performed using $(1-p)\times D_{\phi}$ as phase structure function is called a reconstruction with an error of magnitude $p$ on $D_{\phi}$. This corresponds to a reconstruction where the seeing is underestimated, but with a correct knowledge of the wind velocity and the magnitude of the source. Indeed, $D_{\phi}$ reflects the phase residual statistics after the AO system, and this residual is directly related to the Fried parameter $r_{0}$ (Rigaut et al., 1998; Fétick et al., 2018), as the adaptive optics loop acts mainly as a filter. A more detailed analysis should also address the impact of a wind difference (hence only impacting the on-axis residuals), of a boiling effect, of an evolution of the mis-registration, or of additional dead actuators; here we perform a demonstration of principle that accounts only for $r_{0}$, which is the main contributor to the final correction error of the adaptive optics loop and hence produces a scaling of $D_{\phi}$.

Figure 2 displays the root mean square error between the estimated upstream phase, $\widehat{\phi}_{\mathrm{up}}$, and the true upstream phase, $\phi_{\mathrm{up}}$, as a function of the error on $D_{\phi}$, for five different values of the error $p$. The error evolves in the same way for $p<0$. On SAXO, the adaptive optics of SPHERE, one can expect an error ranging from $5\%$ to $15\%$. Consequently, one can expect an error of one to three nanometres on the part of the aberrations that can be corrected by the deformable mirror. Thus, COFFEE is expected to be a good candidate to measure and correct the quasi-static aberrations several times per night, using on-sky measurements.

3.3 Sensitivity to the noise level

The incoming photonic flux has a critical impact on the noise level in the images. In this subsection, we study the impact of the photonic flux on the estimation quality of $\phi_{\mathrm{up}}$. We perform this study with a 10 % error level on $D_{\phi}$, for we do not have proof that the various causes of estimation errors are independent. When the light flux increases, the noise decreases, so we expect the quality of the estimation to increase. This expectation is confirmed by the results shown in Fig. 3. It shows that the estimation error decreases with an increase in the flux. Moreover, it shows that the quality of the estimation becomes much less sensitive to an increase in the incoming light flux beyond about one million incoming photons in the pupil.

We can explain the order of magnitude of this critical point in a simple manner. The coronagraph blocks about $90\%$ of the incoming $10^{6}$ photons. About $10^{5}$ reach the $128\times 128=16,384$ pixels of the detector, which amounts to an average of about 6 photons per pixel. Since the electronic noise is one electron per pixel, we deduce that the effect of an increase in the total flux is reduced if the average flux per pixel is such that the signal to noise ratio is higher than 5, which is easily reachable in a reasonable amount of time for an H magnitude of nine to twelve.

Let us consider the case of an observation by the Very Large Telescope of a star of magnitude 15 in the visible. The photonic flux on the detector without a coronagraph is about $2\times 10^{5}$ photons per second in H-band. Since the data that we aim to use are typically exposures of several hundred seconds, the resulting number of photons is in the range of $10^{7}$, which is quite enough to prevent the estimation from being limited by the noise level.
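The orders of magnitude quoted in this subsection can be reproduced with a few lines of arithmetic. In the sketch below, the 300-second exposure is a hypothetical value chosen to stand for “several hundred seconds”; the other numbers are those given in the text.

```python
# Photon budget at the critical flux level discussed above.
photons_in_pupil = 1e6
coronagraph_transmission = 0.10        # the coronagraph blocks about 90 % of the light
detector_pixels = 128 * 128            # 16,384 pixels

photons_on_detector = photons_in_pupil * coronagraph_transmission
photons_per_pixel = photons_on_detector / detector_pixels   # about 6 photons per pixel

# Photon budget for the magnitude-15 star observed in H-band.
flux_per_second = 2e5                  # photons per second, without coronagraph
seconds_to_reach_1e7 = 1e7 / flux_per_second                # 50 s
exposure_s = 300.0                     # hypothetical exposure time
photons_collected = flux_per_second * exposure_s            # 6e7 photons
```

Under these assumptions, an exposure of a few hundred seconds collects a few $10^{7}$ photons, well above the level of about $10^{6}$ photons at which the estimation error stops improving quickly.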

3.4 Sensitivity to the presence of a planet in the data

The data formation model that we use in COFFEE relies on the light propagation from a point source. However, for practical on-sky implementations, COFFEE must be able to estimate aberrations while observing planets. Here, we present COFFEE estimates performed on simulated images where there is a planet. We chose realistic parameters: we kept a $10\%$ error level on $D_{\phi}$, and an incoming light flux of $10^{6}$ photons in the pupil. The planet is located at an angle of $3.5\lambda/D$ from the star. We simulated planets with planet-to-star flux ratios of $10^{-3}$, $10^{-4}$, and $10^{-5}$, so that we could test the impact of the planet flux on the estimation error.

Figure 4 presents COFFEE estimates with no planet in the data, and with planets whose fluxes are $10^{-5}$ , $10^{-4}$ , and $10^{-3}$ with respect to the star.

In the case where the flux ratio is $10^{-5}$ , the phase estimation is almost unperturbed by the planet: the root mean square of the difference between the reconstruction with a planet and the reconstruction without planet is only 0.35 nm. This difference is mainly due to the fact that the realisations of the noise in the data are different.

In the case where the flux ratio is $10^{-3}$ , the phase estimation is strongly perturbed by the planet. In that case, the root mean square of the difference between the reconstruction with a planet and the reconstruction without planet is 6.0 nm. The error is mainly a sinusoidal structure whose image through the data formation model generates a strong speckle that looks like a planet.

In the intermediate where the flux ratio is $10^{-4}$ , the phase estimation is slightly perturbed by the planet. In that case, the root mean square of the difference between the reconstruction with a planet and the reconstruction without planet is 0.5 nm.

These results show that COFFEE may or may not mistake a planet in the data for a speckle, depending on its light flux. This influence of the light flux of the planet can be explained by comparing it with the regularisation level. If the flux of the planet is high, the associated noise in the pixels that image it is low, and the planet then has an important impact on the criterion, leading to a sinusoidal estimation for $\phi_{\mathrm{up}}$ . On the opposite, if the flux of the planet is low, the associated noise in the pixels that image it is high, and the planed then has a low impact in the criterion, so the regularisation will prevent the emergence of an artefact. This is especially obvious if one compares Figs 5 and6.

We conclude that, for a planet whose flux relative to its star is less than $10^{-4}$ , its presence in the data accounts for an estimation error of less than half a nanometre root mean square in the SPHERE-like case of a fifty-nanometre upstream aberration. If, however, a planet of flux ratio higher than $10^{-4}$ happened to be in the field, then it would be clearly visible in the raw data (compare Figs 5 and 6). In that case, one could choose to ignore a small region around the planet in the COFFEE input data. The alternative is to implement a model of the planet as a point source in the COFFEE algorithm. Even in the presence of an imperfect knowledge of the phase structure function, and with a limited exposure time, the expected root mean square of the estimation error is of the order of a nanometre in the adaptive optics-corrected zone, for a 50-nanometre root mean square phase aberration. With this encouraging simulation result in mind, we proceed to the laboratory validation.

4 Laboratory validation of the coronagraphic phase diversity in the presence of residual turbulence

4.1 Strategy of validation

4.1.1 Aim

The aim of our experiment was to use the coronagraphic phase diversity to estimate a static phase aberration upstream of a coronagraph by using post-coronagraphic images as input data. The experiment was performed in a controlled environment, the MITHiC testbed.

4.1.2 The MITHiC testbed

MITHiC is the Marseille Imaging Testbed for High Contrast imaging. It has been developed at the Laboratoire d’Astrophysique de Marseille (LAM) for almost ten years (N’Diaye et al., 2012). It is schematically described on Fig. 7.

The light source is a super-luminescent diode. It emits light in a narrow spectral band centred around $\lambda=677$ nm. In all our data processing, it is considered as a monochromatic light source. The light is linearly polarised and injected onto the bench through an entrance pupil.

It is propagated to a second pupil plane, where it passes through a rotating transparent phase screen. On this screen are engraved random path differences whose statistics are those of atmospheric turbulence corrected by SAXO, the adaptive optics system of the SPHERE instrument. The scale of these path differences is such that the 45-nm-RMS phase shift that they create at the working wavelength of 677 nm is the same as the 100-nm-RMS phase shift created by the SAXO-corrected atmospheric turbulence at 1,600 nm, which is a typical observational wavelength for SPHERE. The phase screens used on MITHiC to produce the wavefront errors (both the residual turbulence and dedicated static patterns) were specified by LAM and manufactured by the SILIOS company through a pixel map interface. LAM provided the exact phase map (pixel-by-pixel depth, in nanometres) to be engraved on the phase screen. The screens manufactured by SILIOS were checked at LAM with a high-resolution ZYGO interferometer after delivery, and they are correct at the nanometric level, which guarantees that the statistics, the power law, and the RMS across the aperture are the expected ones.

In a third pupil plane, the light meets the surface of a spatial light modulator. The presence of this element is the reason why the light is polarised in the first place. The spatial light modulator is used as a high resolution deformable mirror.

The light is then split into two paths using a beam splitter. The auxiliary path, or ZELDA path (N’Diaye et al., 2013), can be used to perform ZELDA experiments, or to use a HASO wavefront sensor as a calibration reference, which is what we did. The main path, which is the one of interest for us here, comprises a Roddier & Roddier focal plane mask, and a Lyot stop in the next pupil plane.

Finally, the light reaches a focal plane camera. This focal plane can be turned into a pupil plane by introducing a movable lens in the light beam.

4.1.3 Validation strategy

Our goal was to validate aberration estimation using coronagraphic phase diversity in the presence of residual adaptive optics-corrected turbulence. The plainest strategy imaginable would have been to introduce a known pupil-plane phase aberration upstream of the coronagraph, take focused and diversity images, process them with COFFEE, and compare the output of COFFEE to the known aberration. However, this plain strategy would have required the optical testbed to be absolutely perfect, that is to say, the introduced aberration to be perfectly known. Since aberrations always exist on the bench, and COFFEE is an absolute wavefront sensor, this was not feasible. So we proceeded in two steps, using a differential estimation strategy, as for example in Herscovici-Schiller et al. (2018).

First, we did not introduce any aberration. Let us call $\phi_{\mathrm{up}}^{0}$ the static aberration on the bench. We took a focused image and a diversity image, which we collectively denote by $\mathbf{i^{0}}$ . We used $\mathbf{i^{0}}$ as an input in COFFEE, the output being our estimate of the static aberration on the bench, $\widehat{\phi_{\mathrm{up}}^{0}}$ .

Then, we used the spatial light modulator to introduce a known aberration, $\phi_{\mathrm{up}}^{F}$ . This did not suppress the aberration $\phi_{\mathrm{up}}^{0}$ on the bench, so the resulting total aberration on the bench was $\phi_{\mathrm{up}}^{0}+\phi_{\mathrm{up}}^{F}$ . We took a focused image and a diversity image, which we collectively denote by $\mathbf{i^{1}}$ . We used $\mathbf{i^{1}}$ as an input in COFFEE, the output being our estimate of the total aberration on the bench, $\widehat{\phi_{\mathrm{up}}^{0}+\phi_{\mathrm{up}}^{F}}$ .

The last step is to compute the difference $\widehat{\phi_{\mathrm{up}}^{0}+\phi_{\mathrm{up}}^{F}}-\widehat{\phi_{\mathrm{up}}^{0}}$ , which we use as the estimate $\widehat{\phi_{\mathrm{up}}^{F}}$ of $\phi_{\mathrm{up}}^{F}$ . This validation strategy is described symbolically on Fig. 8.
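The differential strategy can be illustrated with a toy numerical sketch. Everything below is hypothetical: `toy_estimator` is a stand-in that adds the same error to both estimates and is not the COFFEE algorithm, and the phase maps are arbitrary. The sketch only shows that any contribution common to the two estimates (the bench aberration and any shared error) cancels in the difference; errors that differ between the two measurements would not cancel.

```python
import numpy as np

def rms(phase, mask):
    """Root mean square of a phase map over the pupil, piston removed."""
    values = phase[mask]
    return float(np.sqrt(np.mean((values - values.mean()) ** 2)))

rng = np.random.default_rng(0)
n = 64
yy, xx = np.mgrid[:n, :n]
pupil = (xx - (n - 1) / 2) ** 2 + (yy - (n - 1) / 2) ** 2 <= (n / 2) ** 2

phi_0 = rng.normal(0.0, 24.0, (n, n))        # toy bench aberration, in nm
phi_f = np.zeros((n, n))
phi_f[20:44, 24:28] = 11.0                   # toy introduced pattern, in nm
common_error = rng.normal(0.0, 5.0, (n, n))  # hypothetical error shared by both estimates

def toy_estimator(phase):
    """Stand-in for a wavefront sensor; NOT the COFFEE algorithm."""
    return phase + common_error

estimate_reference = toy_estimator(phi_0)        # plays the role of the estimate of phi_0
estimate_total = toy_estimator(phi_0 + phi_f)    # plays the role of the estimate of phi_0 + phi_F
estimate_f = estimate_total - estimate_reference # differential estimate of phi_F

print(rms(estimate_f - phi_f, pupil))  # residual of the differential estimate
```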

4.2 Calibration of the parameters of the optical model

Since COFFEE relies on a model of image formation, some calibrations are necessary to perform an estimation. We detail these calibrations here.

4.2.1 Sampling on the detector

To determine the sampling on the detector, we took a non-coronagraphic image whose size is $1,000\times 1,000$ pixels. The modulus of its Fourier transform is the modulation transfer function. This modulation transfer function, whose circular average is shown on Fig. 9, goes to zero at spatial frequency pixel number 107. Consequently, the sampling factor $sf$ on the detector is

$$sf=\frac{1000}{107}\approx 9.35.$$

In order to shorten calculation times, the data used as input to COFFEE were under-sampled by a factor 4. The resulting sampling being higher than 2, the Shannon–Nyquist condition is satisfied, so there is no loss of information in the reconstruction.
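The arithmetic behind this calibration can be checked with a short sketch. It assumes that the sampling factor is the image width in pixels divided by the pixel index at which the modulation transfer function reaches zero, so that a value of 2 corresponds to the Shannon–Nyquist limit; the function name `sampling_factor` is hypothetical.

```python
def sampling_factor(image_size_px, mtf_cutoff_px):
    """Pixels per lambda/D, assuming the MTF of an image_size_px-wide image
    reaches zero at spatial frequency pixel mtf_cutoff_px."""
    return image_size_px / mtf_cutoff_px

sf = sampling_factor(1000, 107)   # values quoted in the text
sf_undersampled = sf / 4          # after under-sampling by a factor 4
satisfies_shannon = sf_undersampled >= 2.0

print(f"sf = {sf:.2f}, after under-sampling = {sf_undersampled:.2f}, "
      f"Shannon satisfied: {satisfies_shannon}")
```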

4.2.2 Lyot ratio

The Lyot ratio is defined as the ratio between the diameter of the entrance pupil and the diameter of the Lyot stop. It is a necessary parameter for the direct model. In order to determine it, we introduced the Lyot stop, but not the Roddier & Roddier focal mask, on the bench. Then, just as for the determination of the sampling on the detector, we took an image and examined the cutoff of the corresponding modulation transfer function. The cutoff is located between pixels 99 and 100. Since the Lyot ratio is proportional to the ratio of the cutoff frequencies, the Lyot ratio is

$$\frac{99.5}{107}\approx 0.93.$$

4.2.3 Detector noise

The only information on the electronic noise of the camera given by the manufacturer is that its root mean square is less than 5 electrons per pixel (Photometrics, 2014). A previous calibration had found a root mean square value of one electron per pixel. We calculated the root mean square of a stack of 4000 images taken in complete darkness, which yielded a root mean square of 1.6 electrons per pixel. We use this value $\sigma_{\text{det}}$ as the detector noise in the denominator in Equation 1. The exact formula for the denominator is $\sigma^{2}=\sigma^{2}_{\text{photon}}+\sigma^{2}_{\text{det}}$ , where $\sigma^{2}_{\text{photon}}$ is directly estimated from the images (Mugnier et al., 2004).
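As an illustration of this calibration, the following sketch estimates $\sigma_{\text{det}}$ from a simulated stack of dark frames and builds the noise variance map $\sigma^{2}=\sigma^{2}_{\text{photon}}+\sigma^{2}_{\text{det}}$ . The dark frames are synthetic Gaussian noise, not bench data, and the photon variance is approximated by the image itself, clipped at zero, which is an assumption about how the Poisson variance is estimated from the images; `noise_variance_map` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)
TRUE_SIGMA_DET = 1.6  # electrons RMS per pixel, value measured in the text

# Synthetic dark frames (hypothetical stand-in for the 4000 dark images).
dark_stack = rng.normal(0.0, TRUE_SIGMA_DET, size=(4000, 16, 16))

# Temporal standard deviation of each pixel, then quadratic mean over pixels.
per_pixel_std = dark_stack.std(axis=0, ddof=1)
sigma_det = float(np.sqrt(np.mean(per_pixel_std ** 2)))

def noise_variance_map(image_e, sigma_det):
    """sigma^2 = sigma_photon^2 + sigma_det^2 for an image in electrons.

    Assumption: the photon (Poisson) variance is estimated by the image
    itself, clipped at zero so that the variance stays positive.
    """
    photon_variance = np.clip(image_e, 0.0, None)
    return photon_variance + sigma_det ** 2

print(f"estimated detector noise: {sigma_det:.2f} electrons RMS")
```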

4.2.4 Phase structure function

A key parameter of the direct model in the presence of turbulence is the phase structure function of the post-adaptive optics turbulence. In order to estimate it in our case, we took the specification file of the rotating phase screen, and calculated a variance over as many distinct realisations as there were non-overlapping disks in the adaptive optics-corrected turbulence path difference strip on the phase screen. The absolute value of the corresponding atmospheric point spread function (defined by Eq. 4), $\left|\mathcal{F}\left[\exp\left(-D_{\phi}/2\right)\right]\right|$ , is displayed on Fig. 10. The correction limit of the adaptive optics at a radius of 20 $\lambda/D$ is clearly visible.
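A minimal sketch of such an estimation is given below. It makes assumptions that go beyond the text: the phase is treated as stationary and periodic, so that the structure function can be computed from the circular autocovariance, $D_{\phi}(\rho)=2\left[B(0)-B(\rho)\right]$ , averaged over realisations. The input screens are white-noise placeholders rather than SAXO-like statistics, and the function names are hypothetical.

```python
import numpy as np

def structure_function(screens):
    """Estimate D_phi(rho) from a (K, N, N) stack of phase screens in radians.

    Assumption: stationary, periodic phase, so that
    D_phi(rho) = 2 * (B(0) - B(rho)), with B the circular autocovariance.
    """
    screens = screens - screens.mean(axis=(1, 2), keepdims=True)
    power = np.abs(np.fft.fft2(screens)) ** 2            # per-screen power spectra
    n_pix = screens.shape[1] * screens.shape[2]
    autocov = np.fft.ifft2(power.mean(axis=0)).real / n_pix  # zero shift at [0, 0]
    return 2.0 * (autocov[0, 0] - autocov)

def atmospheric_psf(d_phi):
    """|F[exp(-D_phi / 2)]|, with the zero frequency moved to the array centre."""
    return np.abs(np.fft.fftshift(np.fft.fft2(np.exp(-d_phi / 2.0))))

rng = np.random.default_rng(2)
toy_screens = rng.normal(0.0, 0.5, size=(50, 32, 32))  # white-noise placeholder
d_phi = structure_function(toy_screens)
psf_atm = atmospheric_psf(d_phi)
```

For white noise of variance $\sigma^{2}$ , the estimated $D_{\phi}$ is zero at zero shift and close to $2\sigma^{2}$ elsewhere, which gives a simple sanity check of the estimator before applying it to realistic screens.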

4.2.5 Uncertainty on the introduced diversity

The diversity phase was introduced by the spatial light modulator. Since any error on the diversity phase impacts the quality of the estimation, it is important to check the exactness of the introduced diversity. In order to do so, we first introduced a command for a 200 nm defocus on the spatial light modulator. We measured the defocus that was effectively introduced with an Imagine Optic HASO wavefront sensor. The HASO measurement, displayed on Fig. 11, shows that the introduced phase diversity is really a pure defocus. When projected on the first thirty-two Zernike polynomials, the HASO measurement yields a 196 nm root mean square aberration, 195 nm of which is concentrated in the defocus. This shows that the phase produced by the spatial light modulator is close to that which it is commanded to produce.

Another important matter is the linearity of the response of the spatial light modulator. We tested the amplitude of the defocus, as measured by the HASO, for various commands. Figure 12 shows the excellent linearity of the response, with a linear correlation coefficient of 0.994. Consequently, for a defocus diversity whose root mean square is 75 nanometres, the error on the diversity should have an impact of less than a nanometre on the estimation (Blanc et al., 2003).
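A linearity check of this kind can be sketched as a straight-line fit together with a linear correlation coefficient. The command and measurement values below are hypothetical and are not the data of Fig. 12; they only show how such a check could be computed.

```python
import numpy as np

# Hypothetical defocus commands and wavefront sensor measurements, in nm RMS.
commands_nm = np.array([0.0, 50.0, 100.0, 150.0, 200.0])
measured_nm = np.array([1.0, 48.0, 99.0, 147.0, 196.0])

# Least-squares straight line: measured = slope * command + intercept.
slope, intercept = np.polyfit(commands_nm, measured_nm, 1)

# Linear (Pearson) correlation coefficient between command and measurement.
correlation = float(np.corrcoef(commands_nm, measured_nm)[0, 1])

print(f"slope = {slope:.3f}, intercept = {intercept:.2f} nm, r = {correlation:.4f}")
```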

4.3 Estimation of the reference wavefront

4.3.1 Data

We aligned the coronagraph on the bench, activated the turbulence simulator, and turned on the light source. The spatial light modulator is sent a flat command. The resulting phase aberration is denoted by $\phi_{\mathrm{up}}^{0}$ on Fig. 8. A stack of 400 images is saved. Each exposure lasts 0.035 s, with a wait of 0.020 s between exposures. After that, the light source is turned off, and a stack of 400 images is saved in the same conditions. The first stack (the data stack) is then averaged; the median of each pixel of the second stack (the background stack) is calculated; and then the background is subtracted from the averaged data. The resulting data are then under-sampled by a factor of 4 for the sake of calculation speed, and constitute the data $\mathbf{i_{\mathrm{foc}}^{0}}$ that will be input into COFFEE.

Then the whole procedure is repeated, this time with the spatial light modulator being sent a 75 nm defocus. The resulting data, $\mathbf{i_{\mathrm{div}}^{0}}$ , joins $\mathbf{i_{\mathrm{foc}}^{0}}$ to constitute $\mathbf{i^{0}}$ , the complete COFFEE input for the reconstruction. This input is displayed on the bottom part of Fig. 13.
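The reduction steps described above can be sketched as follows. The text does not specify how the under-sampling by a factor 4 is carried out; the sketch assumes that it is done by summing blocks of 4 × 4 pixels, which is only one possible choice. The function name `reduce_stack` and the synthetic stacks are hypothetical.

```python
import numpy as np

def reduce_stack(data_stack, background_stack, bin_factor=4):
    """Average the data stack, subtract the per-pixel median background,
    then bin the result by bin_factor.

    Assumption: under-sampling is done by summing bin_factor x bin_factor blocks.
    """
    mean_data = data_stack.mean(axis=0)
    background = np.median(background_stack, axis=0)
    image = mean_data - background
    ny, nx = image.shape
    ny_b, nx_b = ny // bin_factor, nx // bin_factor
    image = image[: ny_b * bin_factor, : nx_b * bin_factor]  # crop to a multiple
    return image.reshape(ny_b, bin_factor, nx_b, bin_factor).sum(axis=(1, 3))

# Synthetic example: 400 frames of 16 x 16 pixels with and without light.
rng = np.random.default_rng(3)
background_stack = rng.normal(2.0, 1.6, size=(400, 16, 16))
data_stack = 10.0 + rng.normal(2.0, 1.6, size=(400, 16, 16))
i_foc = reduce_stack(data_stack, background_stack)
print(i_foc.shape)
```

The same function would be applied to the diversity stacks to obtain the second image of the pair.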

4.3.2 Reconstruction

We perform an estimation of $\phi_{\mathrm{up}}^{0}$ using $\mathbf{i^{0}}$ as input data, along with the model of the bench calibrated in the previous section. The estimate $\widehat{\phi^{0}_{\mathrm{up}}}$ of the reference wavefront $\phi^{0}_{\mathrm{up}}$ is shown on Fig. 14. Its root mean square is 24 nm.

4.4 Estimation of a high-frequency aberration

4.4.1 Data

The spatial light modulator is now sent an F-shape command. The corresponding pupil image (taken when no coronagraph was in place) is displayed on Fig. 15. The root mean square of this command is 11 nm.

The resulting phase aberration is denoted by $\phi_{\mathrm{up}}^{F}$ on Fig. 8. Just like before, a stack of 400 images is saved. Each exposure lasts 0.035 s, with a wait of 0.020 s between exposures. After that, the light source is turned off, and a stack of 400 images is saved in the same conditions. The first stack (the data stack) is then averaged; the median of each pixel of the second stack (the background stack) is calculated; and then the background is subtracted from the averaged data. The resulting data are then under-sampled by a factor of 4 for the sake of calculation speed, and constitute the data $\mathbf{i_{\mathrm{foc}}^{1}}$ that will be input into COFFEE.

Then the whole procedure is repeated, this time with the spatial light modulator being sent a 75 nm defocus on top of the F-shape aberration. The resulting data, $\mathbf{i_{\mathrm{div}}^{1}}$ , joins $\mathbf{i_{\mathrm{foc}}^{1}}$ to constitute $\mathbf{i^{1}}$ , the complete COFFEE input for the reconstruction. This input is displayed on the top part of Fig. 13.

4.4.2 Reconstruction

We perform an estimation of $\phi_{\mathrm{up}}^{0}+\phi_{\mathrm{up}}^{F}$ using $\mathbf{i^{1}}$ as input data, along with the model of the bench calibrated in the previous section. The estimate $\widehat{\phi^{0}_{\mathrm{up}}+\phi_{\mathrm{up}}^{F}}$ of the high-frequency wavefront $\phi^{0}_{\mathrm{up}}+\phi_{\mathrm{up}}^{F}$ is shown on Fig. 16. Its root mean square is 28 nm.

4.4.3 Differential reconstruction

The difference between $\widehat{\phi_{\mathrm{up}}^{0}+\phi_{\mathrm{up}}^{F}}$ and $\widehat{\phi_{\mathrm{up}}^{0}}$ is our estimate $\widehat{\phi_{\mathrm{up}}^{F}}$ of the introduced F-shape aberration, $\phi_{\mathrm{up}}^{F}$ . It is displayed on Fig. 17. The estimate has a root mean square of 13 nanometres, whereas the introduced aberration has a root mean square of 11 nanometres. We conclude that we are able to reconstruct a high-frequency aberration of about 10 nanometres with a 2-nanometre accuracy, using the scientific camera of a coronagraphic system in the presence of turbulence.

The total computing time necessary for the reconstruction of the aberration is about two and a half hours on a single core of an office computer equipped with a 2.6 GHz Intel processor, running a COFFEE program written in Interactive Data Language, without much optimisation. Therefore, using a modern computer running an optimised code, the compensation of quasi-static aberrations could be performed on an instrument such as SPHERE at least once per hour.

5 Conclusions

We have presented coronagraphic phase diversity through turbulence as a post-coronagraphic wavefront sensor adapted to high-precision wavefront measurement in the presence of adaptive optics-corrected turbulence. We have performed realistic simulations which show that the estimation error when used on a high-contrast system such as SPHERE should not exceed a few nanometres, whereas the quasi-static aberration of SPHERE is about 50 nanometres. Finally, we have performed a laboratory demonstration of the validity of the technique, reconstructing a high-frequency aberration with a precision of about two nanometres through a coronagraph and turbulence. Our team’s next step is to perform a measurement and correction of quasi-static aberrations on SPHERE, thanks to data collection by A. Vigan and M. N’Diaye. Since we demonstrated coronagraphic phase diversity in the presence of residual adaptive-optics-corrected turbulence, since there are no intrinsic chromatic limitations with COFFEE, and since the execution time of the program allows it to be executed several times during the night, we hope to soon prove that it is efficient on-sky.

Acknowledgements

The PhD work of O. Herscovici-Schiller was co-funded by CNES and ONERA. We thank J.-M. Le Duigou for his support. This work received funding from the E.U. under FP7 Grant Agreement No. 312430 OPTICON, from the CNRS (Défi Imag’In), and from ONERA in the framework of the VASCO research project. We thank Raphaël Galicher for helpful criticism of a first draft of this paper. We thank the reviewer for their careful review and constructive comments, which helped us to improve this article.

Bibliography

Beuzit J.-L., et al., 2008, in Ground-based and airborne instrumentation for astronomy II. p. 701418
Blanc A., Fusco T., Hartung M., Mugnier L., Rousset G., 2003, Astronomy & Astrophysics, 399, 373
Fétick R. J. L., Neichel B., Mugnier L. M., Montmerle-Bonnefois A., Fusco T., 2018, Monthly Notices of the Royal Astronomical Society, 481, 5210
Galicher R., Baudoz P., Rousset G., 2008, Astronomy & Astrophysics, 488, L9
Gonsalves R. A., 1982, Optical Engineering, 21, 215829
Herscovici-Schiller O., Mugnier L. M., Sauvage J.-F., 2017, Monthly Notices of the Royal Astronomical Society: Letters, 467, L105
Herscovici-Schiller O., Mugnier L. M., Baudoz P., Galicher R., Sauvage J.-F., Paul B., 2018, Astronomy & Astrophysics, 614, A142
Leboulleux L., 2018, Thèse de doctorat, Université d’Aix-Marseille