Free log-likelihood as an unbiased metric for coherent diffraction imaging

Vincent Favre-Nicolin; Steven Leake; Yuriy Chushkin

arXiv:1904.07056·cond-mat.mtrl-sci·August 27, 2020

TL;DR

This paper introduces a 'free' log-likelihood metric for the unbiased evaluation of coherent diffraction imaging reconstructions, enabling validation across diverse samples and large datasets without prior knowledge of the object support.

Contribution

The authors propose a novel 'free' log-likelihood indicator and eigen-decomposition analysis for unbiased solution validation in CDI, applicable to various materials and experimental conditions.

Findings

01 The method provides an unbiased validation metric for CDI reconstructions.

02 It is effective on experimental data from both test patterns and biological samples.

03 It is applicable to high-throughput datasets from advanced synchrotron and XFEL sources.

Tables

Table 1. Figures of merit for CDI analysis: $E_{o}^{2}$ is the object-domain error [4, 21], where $\Omega$ denotes the object support (i.e. the area or volume where the object lies), and $\rho_{i}$ is the density inside the object. $E_{F}^{2}$ is the Fourier-space domain error [4, 24, 21] comparing calculated and observed amplitudes (this formula can also be used based on intensities). $LLK$ is the Poisson log-likelihood [23], where $I^{obs}_{i}$ and $I^{calc}_{i}$ are, respectively, the observed and calculated intensity at pixel $i$ of the diffraction data. $LLK_{free}$ is the Poisson log-likelihood computed only over a ’free’ set of pixels. The figures correspond either to the three solutions shown in Fig. 1, or to the eigen- and average solutions shown in Fig. 3. Only $LLK_{free}$ correctly discriminates between individual reconstructions.

Figure of merit	tight (1b)	+2 pixels (1c)	large (1d)	eigen-10 (3a)	average-10 (3b)	eigen-4
$nb_{support}$	4353	10460	17616	-	-	-
$E_{o}^{2} = \sum_{i \notin \Omega} |\rho_{i}|^{2} / \sum_{i} |\rho_{i}|^{2}$	4.9e-3	3.3e-3	4.5e-3	1.4e-2	3e-2	6e-3
$E_{F}^{2} = \sum_{i} |F_{i}^{calc} - F_{i}^{obs}|^{2} / \sum_{i} |F_{i}^{obs}|^{2}$	0.8559	0.8557	0.8558	0.8590	0.8610	0.8570
$LLK = -\frac{1}{N} \sum_{i} \log \left[ \frac{(I_{i}^{calc})^{I_{i}^{obs}}}{I_{i}^{obs}!} e^{-I_{i}^{calc}} \right]$	68	46	45	100	148	82
$LLK_{free}$	83	122	323	98	149	80

Full text

Free log-likelihood as an unbiased metric for coherent diffraction imaging

Vincent Favre-Nicolin

ESRF, The European Synchrotron, 71 Avenue des Martyrs, 38000 Grenoble, France

Univ. Grenoble Alpes, Grenoble, France

Steven Leake

ESRF, The European Synchrotron, 71 Avenue des Martyrs, 38000 Grenoble, France

Yuriy Chushkin

ESRF, The European Synchrotron, 71 Avenue des Martyrs, 38000 Grenoble, France

Abstract

Coherent Diffraction Imaging (CDI), a technique where an object is reconstructed from a single (2D or 3D) diffraction pattern, recovers the lost diffraction phases without a priori knowledge of the extent (support) of the object. The uncertainty of the object support can lead to over-fitting and prevents an unambiguous metric evaluation of solutions. We propose to use a ’free’ log-likelihood indicator, where a small percentage of points are masked from the reconstruction algorithms, as an unbiased metric to evaluate the validity of computed solutions, independent of the sample studied. We also show how a set of solutions can be analysed through an eigen-decomposition to yield a better estimate of the real object. Example analysis on experimental data is presented both for a test pattern dataset, and the diffraction pattern from a live cyanobacteria cell. The method allows the validation of reconstructions on a wide range of materials (hard condensed or biological), and should be particularly relevant for 4th generation synchrotrons and X-ray free electron lasers, where large, high-throughput datasets require a method for unsupervised data evaluation.

Introduction

Coherent Diffraction Imaging (CDI) is a technique that exploits the coherence properties of a light source, for instance, synchrotron generated X-ray beams, to reconstruct two- or three-dimensional objects from their diffraction pattern alone [1, 2, 3, 4, 5]. Such an approach can also be used with soft X-ray sources [6] and coherent electron beams [7, 8] and when employed in the Bragg geometry yields quantitative strain information [9, 10, 11, 12, 13]. CDI has been successfully used on a wide range of samples, from single cells [14] to inorganic particles [15, 16], with applications exploiting the temporal properties of X-ray Free Electron Lasers to viruses [17] and time-resolved strain analysis [18].

As CDI is based on the measurement of the far-field diffraction pattern of a single object, the reconstruction is only possible if the diffraction pattern is recorded at a spacing finer than the Nyquist frequency (this condition is called oversampling)[19, 1, 3]. This is easily done experimentally if the sample size can be estimated, and a variety of algorithms (Error Reduction (ER), Hybrid Input-Output (HIO), Relaxed Averaged Alternating Reflectors (RAAR), Charge Flipping (CF) etc.) can be used to phase the diffraction pattern and reconstruct the object [5, 20, 21].

However, the weakness of CDI lies in the absence of reliable figures of merit to assess the quality of the reconstructed objects. In principle, it is easy to define a figure of merit by comparing the observed diffraction pattern to the calculated one. But as the diffraction pattern is oversampled and the actual size and shape of the object are unknown, it is easy to create incorrect solutions which involve an object size larger than the real one (i.e. with many extra free parameters), and thus yield a better figure of merit by over-fitting.

In this article, we propose a free log-likelihood as an objective figure of merit that outperforms those that exist in the literature. Then we show how it can be applied to evaluate the solutions and combine them to obtain the final optimal reconstruction.

All the data and the python notebooks used to generate the figures in this article are available from [22].

Figures of merit

A number of figures of merit have been used in the literature, working either in the object or Fourier-space domain. A list of the most used figures of merit is shown in table 1. Note that the most quantitative approach to a reliable figure of merit was introduced for Ptychography [23], using a likelihood based analysis that considers Poisson noise.
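As an illustration, the three formula-based figures of merit of Table 1 can be written in a few lines of NumPy. This is only a minimal sketch, not the PyNX implementation: the function names, the `eps` guard against `log(0)` and the convention that `f_calc` and `f_obs` are amplitude arrays are assumptions made for this example.

```python
import math
import numpy as np

# log(k!) evaluated element-wise through the log-gamma function
_log_factorial = np.vectorize(lambda k: math.lgamma(k + 1.0), otypes=[float])


def object_error(rho, support):
    """E_o^2: fraction of the squared density lying outside the support."""
    rho2 = np.abs(rho) ** 2
    return rho2[~support].sum() / rho2.sum()


def fourier_error(f_calc, f_obs):
    """E_F^2: squared mismatch between calculated and observed amplitudes."""
    return ((np.abs(f_calc) - f_obs) ** 2).sum() / (f_obs ** 2).sum()


def poisson_llk(i_calc, i_obs, eps=1e-12):
    """LLK: negative Poisson log-likelihood, averaged over the pixels."""
    i_calc = np.maximum(i_calc, eps)  # illustrative guard against log(0)
    log_p = i_obs * np.log(i_calc) - _log_factorial(i_obs) - i_calc
    return -log_p.mean()
```

With these definitions, lower values indicate better agreement for all three quantities, which is the convention used when comparing the columns of Table 1.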

In order to evaluate how discriminating these figures are, we used a 2D diffraction dataset from an ESRF logo sample, recorded at the ID10 beamline [15] (see [25] for experimental details). A center of symmetry was applied to the diffraction pattern in order to reduce the number of unobserved pixels behind the beamstop: in the case of a homogeneous object with a constant thickness, the projected electronic density is constant, i.e. the object is real up to a constant phase factor, and the diffracted intensity (Fourier Transform) is therefore centro-symmetric. Fig. 1 shows the diffraction data and several reconstructed solutions, which were obtained by starting from a random object with a fixed support and applying 400 cycles of Hybrid Input-Output followed by 200 cycles of Error Reduction. Note that the actual algorithm used here is irrelevant, as any global optimisation scheme will produce a distribution of solutions, and the aim of this article is to provide a way to assess their quality. We deliberately did not impose positivity, in order to have a large enough range of solutions to evaluate.

The three solutions were generated using (b) a tight support (4353 pixels), and loose supports that radially expand the tight support by (c) two pixels, or (d) seven pixels. As is clearly seen in Fig. 1, the best result is obtained with a tight support, and quickly degrades for larger supports.

As can be seen in table 1, none of the usual figures of merit is discriminating: while the solution in Fig. 1b) obtained from a tight support is obviously the best and corresponds to the scanned sample [25], both the object-domain error and the Fourier-based metric $E_{F}^{2}$ show little difference between the solutions, and are higher for the correct tight support solution. This is even clearer for the $LLK$, which is worse for the tight solution (b) even though it is the best result. The main reason for these conflicting results is that the more points there are in the support, the more free parameters the algorithm can use to better fit the diffraction data - Fig. 1d) shows a clear case of over-fitting.

From this example one concludes that it is essential to simultaneously achieve both: a good fit between the calculated and observed diffraction pattern, and a tight support around the object. This has long been known and efficient algorithms, such as the shrink-wrap approach [5], exist to produce a tight support which will normally yield a unique solution in two or three dimensions [26, 27, 28, 29, 30, 31].

Other constraints or figures of merit can be used based on physical properties, but they rely strongly on a priori knowledge of the object and thus are limited in application. Examples include positivity of the reconstructed object (e.g. when operating in the small-angle regime for a thin object, or in the Bragg regime for an unstrained nanocrystal), or enforcing phase, density or other ad hoc constraints [32].

The lack of an unambiguous evaluation metric often leads to the need to visually inspect, evaluate and select solutions, which is very slow, and hence impractical. As brighter X-ray sources pave the way for serial CDI experiments with larger data throughput [33, 34], the need for a discriminating figure of merit to sort the computed solutions without requiring a visual inspection is paramount.

Free log-likelihood

The risk of misleading figures of merit due to over-fitting also exists e.g. in macro-molecular crystallography, where a large number of parameters (the atomic positions) must be refined against the available diffraction data: in order to avoid over-fitting, a free R-factor was introduced [35, 36, 37] and has since been used by the community. This consists of setting aside a small percentage of diffraction data (the ’free’ set) and refining the structure against the remaining ’working set’ of data. The $R_{free}$ is then evaluated only by comparing the calculated diffraction data against the ’free’ set, producing an unbiased figure of merit. This approach is more generally known as jack-knifing [38, 39], and was developed for unbiased statistical evaluation.

This approach is even more appealing for CDI because of the large uncertainty of the support area (contrary to refinement in crystallography where the atomic sequence or chemical formula is usually known). We have implemented this in the PyNX accelerated coherent imaging toolkit [40], which can be used for CDI analysis:

• $\approx$ 5% of the observed diffraction data is set aside in a ’free’ set of pixels, which are grouped in islands of radius 3 pixels (volumes in 3D), to make sure that correlations between neighbouring pixels do not create a strong relationship between the working set and the ’free’ set. This is needed because the diffraction data is oversampled [1, 41], and thus neighbouring pixels are not completely independent. Note that the proposed size of the islands corresponds to a typical oversampling ratio in CDI experiments, but could be tuned for each experiment, and even for each direction. The effect of the island size can be seen in the figures available as supplementary information.

• The free pixels are randomly located in the dataset, only excluding the center of the diffraction pattern (5% of the maximum radius): this area can include a large number of photons, and masking those can hinder the initial estimate of the object.

• When performing the usual projection algorithms [20], in the Fourier update step (replacing the calculated amplitudes with the observed ones while keeping the calculated phases), pixels in the ’free’ set are considered masked and keep their calculated complex value.
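The three points above can be sketched in NumPy as follows. This is a hedged 2D illustration rather than the PyNX code: the function names, the choice of half the smallest array dimension as the "maximum radius", and the assumption that the observed amplitudes are stored with the direct beam at the array centre are all choices made for this example.

```python
import numpy as np


def make_free_mask(shape, fraction=0.05, island_radius=3,
                   centre_exclusion=0.05, rng=None):
    """Boolean 2D mask of 'free' pixels grouped in circular islands."""
    rng = np.random.default_rng(rng)
    ny, nx = shape
    cy, cx = ny // 2, nx // 2
    r_max = min(ny, nx) / 2  # assumed definition of the maximum radius
    # Offsets of the pixels belonging to one island
    dy, dx = np.mgrid[-island_radius:island_radius + 1,
                      -island_radius:island_radius + 1]
    inside = dy ** 2 + dx ** 2 <= island_radius ** 2
    dy, dx = dy[inside], dx[inside]
    mask = np.zeros(shape, dtype=bool)
    while mask.sum() < fraction * ny * nx:
        iy, ix = rng.integers(0, ny), rng.integers(0, nx)
        # Reject islands that would overlap the central, photon-rich area
        if np.hypot(iy - cy, ix - cx) < centre_exclusion * r_max + island_radius:
            continue
        yy, xx = iy + dy, ix + dx
        ok = (yy >= 0) & (yy < ny) & (xx >= 0) & (xx < nx)
        mask[yy[ok], xx[ok]] = True
    return mask


def fourier_projection(obj, amp_obs, free_mask):
    """Fourier update step: impose observed amplitudes, keep calculated
    phases, and leave the free pixels at their calculated complex value."""
    f_calc = np.fft.fftshift(np.fft.fft2(obj))
    f_new = amp_obs * np.exp(1j * np.angle(f_calc))
    f_new[free_mask] = f_calc[free_mask]
    return np.fft.ifft2(np.fft.ifftshift(f_new))
```

The same mask would then be reused for every reconstruction that is to be compared, so that all $LLK_{free}$ values refer to the same set of pixels.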

In the following we will only report the free Poisson log-likelihood $LLK_{free}$ figure of merit, because the photon counting properties of modern X-ray detectors theoretically make Poisson log-likelihood the natural choice. However, we would like to point out that the ensuing discussion and examples would be identical with any other noise model or Fourier based metric.

As is shown in Table 1, the free log-likelihood correctly discriminates between the three solutions (tight support, +2 pixels, +7 pixels).

We conducted a more systematic study using the same dataset as for Fig. 1 and performed 1000 optimisations using a similar approach: starting from a random object with an initial tight support expanded by a radius of 7 pixels, then performing 400 HIO cycles followed by 200 ER cycles, updating the support every 20 cycles, with a support threshold [5] randomly chosen between 0.25 and 0.4. Note that we did not impose positivity: that and the random threshold values lead to a wide range of solutions for statistical purposes. With a positivity constraint, most solutions would have been much closer to the optimal one. Both $LLK$ and $LLK_{free}$ are plotted against the final number of pixels in the support for all solutions in Fig. 2. In this graph the normal $LLK$ (measured against the ’working’ set) monotonically decreases as the number of points in the support increases, even if a kink is clearly visible in the curve. $LLK_{free}$, however, displays a clear minimum around 4300 pixels, which corresponds to an optimal solution similar to that in Fig. 1b.

While this demonstrates the capability of $LLK_{free}$ to serve as an unbiased figure of merit, we have identified two limitations. First, nothing prevents an incorrect model from having a low $LLK_{free}$ by chance - however, as is shown in Fig. 2 with 1000 generated solutions, this is statistically improbable. Second, $LLK_{free}$ is not an absolute figure of merit, and a single value cannot indicate the validity of the solution, as it is dependent on the counting statistics (e.g. a dataset with many zero-valued points will generate both a low $LLK$ and a low $LLK_{free}$). Finding an optimal solution relies on (i) generating a number (typically at least 20) of solutions and then (ii) selecting the ones with the lowest $LLK_{free}$, for the same set of free pixels. A more complete validation requires a statistical analysis of the best solutions, as will now be explained.
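Steps (i) and (ii) amount to scoring each solution on the free pixels only and keeping the lowest values. The sketch below assumes that the solutions are 2D arrays scaled so that their squared Fourier amplitudes are directly comparable to the observed photon counts, and that the observed intensity is stored with the direct beam at the array centre; the function names and the `eps` guard are hypothetical.

```python
import math
import numpy as np

# log(k!) evaluated element-wise through the log-gamma function
_log_factorial = np.vectorize(lambda k: math.lgamma(k + 1.0), otypes=[float])


def llk_free(obj, i_obs, free_mask, eps=1e-12):
    """Poisson log-likelihood evaluated on the free pixels only."""
    i_calc = np.abs(np.fft.fftshift(np.fft.fft2(obj))) ** 2
    ic = np.maximum(i_calc[free_mask], eps)  # illustrative guard against log(0)
    io = i_obs[free_mask]
    return -np.mean(io * np.log(ic) - _log_factorial(io) - ic)


def select_best(solutions, i_obs, free_mask, n_best=10):
    """Return the n_best solutions with the lowest LLK_free, best first."""
    scores = np.array([llk_free(s, i_obs, free_mask) for s in solutions])
    order = np.argsort(scores)[:n_best]
    return [solutions[i] for i in order], scores[order]
```

Because the value depends on the counting statistics and on which pixels are free, scores obtained with different free masks or different datasets should not be compared with each other.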

Eigen- and average solutions

Once a set of solutions has been produced, usually a selection of the best solutions is averaged and then compared against the diffraction data, e.g. by plotting the Phase Retrieval Transfer Function (PRTF) [14, 24, 15], which is the ratio of the average calculated amplitude to the observed amplitude, as a function of the resolution ring (a fraction of the sampling frequency of the dataset). The relative frequency at which the PRTF falls below 50% can be used as an indication of the correlation between the chosen solutions, and of the resolution of the reconstruction.
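A possible way to compute such a PRTF curve is sketched below. Several conventions exist for the ring averaging; this example takes the ratio of the ring sums of $|\langle F^{calc} \rangle|$ and $F^{obs}$, which is an assumption of this sketch, as are the function name and the requirement that the solutions are already aligned and phase-matched.

```python
import numpy as np


def prtf(solutions, amp_obs, n_rings=32):
    """PRTF per resolution ring for a list of aligned 2D solutions.

    Returns the ring centres (as a fraction of the sampling frequency)
    and the ratio of averaged calculated to observed amplitudes.
    """
    # Complex average of the calculated structure factors
    f_mean = np.mean([np.fft.fftshift(np.fft.fft2(s)) for s in solutions], axis=0)
    ny, nx = amp_obs.shape
    y, x = np.indices(amp_obs.shape)
    r = np.hypot((y - ny // 2) / ny, (x - nx // 2) / nx)
    edges = np.linspace(0, 0.5, n_rings + 1)
    ring = np.digitize(r, edges) - 1
    # Ignore unobserved (zero) pixels and the corners beyond the last ring
    valid = (amp_obs > 0) & (ring < n_rings)
    num = np.bincount(ring[valid], weights=np.abs(f_mean)[valid], minlength=n_rings)
    den = np.bincount(ring[valid], weights=amp_obs[valid], minlength=n_rings)
    with np.errstate(invalid="ignore", divide="ignore"):
        return 0.5 * (edges[:-1] + edges[1:]), num / den
```

Identical solutions that reproduce the observed amplitudes give a PRTF of 1 in every ring, whereas solutions with uncorrelated phases average out and drive the curve towards 0.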

An alternative approach to averaging consists in computing eigenvectors for the selected solutions. This is done by arranging all solutions in a matrix (by flattening each solution as a one-dimensional row of the matrix), and then computing the singular value decomposition (SVD) of this matrix, which yields a set of orthonormal ’eigen-solutions’ [42]. One property of this decomposition is that it conserves the total squared amplitude of the solutions.

An example of application on the ESRF logo dataset is shown in Fig. 3. This method presents a few advantages compared to averaging. First, it is less sensitive to outlier solutions (a single outlier would contribute to secondary modes in the eigenvector decomposition). Second, it yields a weight (its overall squared amplitude) for each eigen-solution, which can also be used as an indicator of the correlation between solutions: ideally, the relative weight of the first (strongest) eigen-solution should be as close to 100% as possible.
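The decomposition can be sketched with `numpy.linalg.svd`. This is a minimal illustration under the assumption that the solutions have already been aligned and phase-matched; the function name is hypothetical, and scaling each orthonormal vector by its singular value is the choice made here so that the squared amplitude of each mode equals its weight.

```python
import numpy as np


def eigen_solutions(solutions):
    """Orthogonal modes of a set of aligned solutions, strongest first."""
    stack = np.array(solutions)
    a = stack.reshape(len(solutions), -1)  # one flattened solution per row
    u, s, vh = np.linalg.svd(a, full_matrices=False)
    # Rows of vh are orthonormal; scaling by s conserves the total squared amplitude
    modes = (s[:, None] * vh).reshape((-1,) + stack.shape[1:])
    weights = s ** 2 / np.sum(s ** 2)
    return modes, weights
```

Here `weights[0]` is the relative weight of the strongest eigen-solution. Note that each mode returned by an SVD is only defined up to a global phase factor, so a comparison with the average solution requires matching that phase first.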

Fig. 3 shows the result of combining either the 10 or 4 best (lowest $LLK_{free}$) solutions from 50 optimisations, obtained with the same procedure as those in Fig. 2, after sub-pixel alignment [43] and phase matching of the solutions. Note that the 10 best selected solutions had up to 16300 points in the support and an $LLK_{free}$ up to 255, while the 4 best had up to 5100 points with an $LLK_{free}$ lower than 148. The purpose of keeping all 10 solutions is to evaluate the efficiency of combining imperfect solutions, which is often the case in CDI. The figures shown were tested against several generated sets of solutions, with similar results.

The most intense mode (Fig. 3a)) represents 76% of the 10 solutions, and is similar to the average (Fig. 3b)) but with slightly reduced background noise outside the main object (also see the line cut in Fig. 3c)). The secondary modes (e.g. Fig. 3d) and e)) can be used as an indicator of where the computed solutions have more diversity. The PRTF (Fig. 3f)) shows that the first decomposed mode yields a higher resolution than the average solution, but remains inferior to the first mode of only the 4 best solutions (this mode then represents 99% of the selected 4). Note that when only a subset of close-to-optimal solutions is selected, there is little difference between the first mode and the average (identical PRTF).

Example application to single cyanobacteria cells

The reconstruction relies heavily on a tight support, which is easily obtained for binary objects like the ESRF logo already presented. However, in the case of biological specimens (e.g. cells) [44, 45, 46], the support update often requires some careful optimisation, including hand-picking of the support area [46], as it is difficult to choose a threshold value for the automatic update of the support region using a shrink-wrap-based algorithm [5]; otherwise, the optimisation will yield solutions with similar figures of merit, which must then be sorted out, e.g. using clustering analysis [47].

To test the usefulness of free log-likelihood for biological samples, we used the Coherent X-ray imaging Database [48] dataset #16[49], which includes the original diffraction data and the published experimental parameters[47].

We used the following procedure, with the goal of testing unsupervised support and object optimisation:

1. Run 100 independent reconstructions with the following parameters: start from a large support 200 pixels in diameter, with a random object. First perform 2000 cycles of HIO with a positivity constraint and a detwinning procedure after 1000 cycles. Then proceed without the positivity constraint, with 2000 HIO cycles and then 2000 RAAR cycles. Finish with 200 ER cycles. The support was updated every 20 cycles, with a fixed relative threshold randomly chosen between 0.1 and 0.4.

2. Select the best 10 reconstructions based on the free log-likelihood, and perform an eigen-solution analysis. Note that during this analysis, the mask of free pixels was kept fixed (contrary to what is shown in Fig. 2) for all optimisations, in order to ensure the consistency of the free log-likelihood figure of merit.

The result of this optimisation is shown in Fig. 4. The results can be compared to the original publication, where 400 optimisations were performed for each dataset, and clustering analysis was performed to detect outliers with similar figures of merit (Fig. 6 in [47]). Our results are similar but require a less intensive computational approach.

Conclusion

In this article we have shown that using a free log-likelihood figure of merit allows one to evaluate solutions from CDI optimisations in an unbiased manner, despite the lack of a priori knowledge of the object support size and shape. Moreover, using an orthonormal mode decomposition of the best solutions yields a better solution, less prone to outlier results than the usual averaging approach.

The main advantage of this method is that it is completely generic, as it does not rely on any a priori knowledge on the sample (complex-valued or real-valued object, homogeneity of density or phase, etc.), and can thus be used for unsupervised phasing. Moreover this approach has a very low computational cost and can easily be implemented. This should be particularly relevant for high data throughput approaches which are now being developed with X-ray free electron laser and brighter synchrotron sources.

Acknowledgements

The authors would like to acknowledge Federico Zontone for help with the CDI measurements, and Pierre Thibault for a discussion on the mode decomposition as implemented in Ptypy [50], on which the eigen-decomposition for CDI proposed in this article is based.

Author contributions statement

VFN proposed the algorithms, contributed the PyNX library for computations, and wrote the main manuscript text. SL and YC provided critical feedback on the procedure and reviewed the manuscript. YC collected the experimental data for figures 1-3.

Additional information

The authors declare no competing interests.

Supplemental information

Free log-likelihood curve vs free pixels island size

As indicated in the main text, the free log-likelihood is calculated by setting aside $\approx$ 5% of the observed diffraction data in a ’free’ set of pixels, which are grouped as islands of radius 3 pixels, to make sure that correlations between neighbouring pixels do not create a strong relationship between the working set and the ’free’ set.

In the following figure 5 we have performed the same calculations and plot as for figure 2 of the article, but by changing the island size with a radius varying from 0 (isolated pixels) to 6. The radius is indicated for each figure in the legend.

As can be seen in these figures, the free log-likelihood is not discriminating enough for the small island radii (0 and 1), as incorrect solutions with a large support can still yield small $LLK_{free}$ values. This is particularly true when the radius is equal to zero, as the $LLK_{free}$ then follows the same trend as the normal log-likelihood, i.e. it decreases with an increasing number of pixels in the object support. This confirms that using islands with a sufficient radius is necessary to yield a discriminating figure of merit, in order to obtain a sufficient

Free log-likelihood curve for the cyanobacteria data

The curve was generated similarly to figures 2 and (suppl) 5, but for the cyanobacteria dataset presented in figure 4. The overall behaviour is similar to figure 2, with the normal log-likelihood decreasing with increasing number of points in the support, whereas the free log-likelihood presents a minimum around the ideal support size.

In this particular case the minimum is less pronounced than in Fig.2, due to the faceted shape of the bacteria which allows relatively easy convergence of the algorithm towards a correct shape.

Bibliography

[1] Sayre, D., Chapman, H. N. & Miao, J. On the extendibility of X-ray crystallography to noncrystals. Acta Crystallographica Section A: Foundations of Crystallography 54, 232–239 (1998).
[2] Miao, J., Charalambous, P., Kirz, J. & Sayre, D. Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens. Nature 400, 342–344, DOI: 10.1038/22498 (1999).
[3] Miao, J. & Sayre, D. On possible extensions of X-ray crystallography through diffraction-pattern oversampling. Acta Crystallographica Section A: Foundations of Crystallography 56, 596–605, DOI: 10.1107/S010876730001031X (2000).
[4] Miao, J., Hodgson, K. O. & Sayre, D. An approach to three-dimensional structures of biomolecules by using single-molecule diffraction images. PNAS 98, 6641–6645, DOI: 10.1073/pnas.111083998 (2001).
[5] Marchesini, S. et al. X-ray image reconstruction from a diffraction pattern alone. Phys. Rev. B 68, 140101, DOI: 10.1103/PhysRevB.68.140101 (2003).
[6] Sandberg, R. L. et al. Lensless Diffractive Imaging Using Tabletop Coherent High-Harmonic Soft-X-Ray Beams. Physical Review Letters 99, 098103, DOI: 10.1103/PhysRevLett.99.098103 (2007).
[7] Zuo, J. M., Vartanyants, I., Gao, M., Zhang, R. & Nagahara, L. A. Atomic Resolution Imaging of a Carbon Nanotube from Diffraction Intensities. Science 300, 1419–1421, DOI: 10.1126/science.1083887 (2003).
[8] Huang, W. J. et al. Coordination-dependent surface atomic contraction in nanocrystals revealed by coherent diffraction. Nat Mater 7, 308–313, DOI: 10.1038/nmat2132 (2008).