Density Matrix Reconstructions in Ultrafast Transmission Electron Microscopy: Uniqueness, Stability, and Convergence Rates
Cong Shi, Claus Ropers, Thorsten Hohage

TL;DR
This paper provides a mathematical analysis of the inverse problem in ultrafast transmission electron microscopy, focusing on the uniqueness, stability, and convergence rates of density matrix reconstructions, complementing prior experimental work.
Contribution
It offers a theoretical framework analyzing the inverse problem's properties, including conditions for uniqueness and stability, and convergence rates, enhancing understanding of the reconstruction process.
Findings
Analysis of conditions for unique density matrix reconstruction
Stability estimates under various a-priori information
Convergence rates of reconstruction algorithms
Abstract
In the recent paper [17] the first experimental determination of the density matrix of a free electron beam has been reported. The employed method leads to a linear inverse problem with a positive semidefinite operator as unknown. The purpose of this paper is to complement the experimental and algorithmic results in the work mentioned above by a mathematical analysis of the inverse problem concerning uniqueness, stability, and rates of convergence under different types of a-priori information.
Cong Shi Institute of Numerical and Applied Mathematics, University of Göttingen
Claus Ropers IV. Physical Institute, University of Göttingen
Thorsten Hohage Institute of Numerical and Applied Mathematics, University of Göttingen
Keywords: Tikhonov regularization, variational source conditions, uniqueness, stability, electron microscopy, SQUIRRELS
1 Introduction
The density matrix is a fundamental notion in quantum statistics which describes the statistical state of an ensemble of identical single- or many-body quantum systems. It is a positive semidefinite operator of trace 1 on the Hilbert space describing the state of a single quantum system. In the area of quantum optics, there are well-established techniques for characterizing the quantum state of the electromagnetic field in terms of its density matrix [14, 19]. Such ‘quantum state tomography’ facilitates the discrimination of, for example, coherent states, squeezed states, thermal states or photon number (Fock) states. In contrast, the reconstruction of the quantum state of a beam of free electrons has only recently been established using inelastic electron-light scattering [17]. The reconstruction technique, termed ‘SQUIRRELS’ (for ‘Spectral Quantum Interference for the Regularized Reconstruction of free ELectron States’), is experimentally based on the principle of ‘Photon-Induced Near Field Electron Microscopy’ (PINEM) [2]. In PINEM, a beam of electrons is passed through the near field of laser-illuminated nanostructures or thin films, leading to the formation of sidebands in the electron energy spectrum, spaced by the photon energy [15, 6]. The spatially varying number of created sidebands yields the optical field strength with very high resolution on the nanometer scale [2, 16]. However, the quantum coherent nature of the electron-light interaction has also led to the observation of other fundamental quantum effects, such as multilevel Rabi oscillations [4, 7] or Ramsey-type phase interference in spatially separated fields [3]. Moreover, it has recently been shown that the interaction can be used to temporally structure electron beams into a train of attosecond pulses, the duration of which was determined by SQUIRRELS [17].
The various existing and future applications of inelastic electron-light scattering and the relevance of the specific electron state resulting from such interactions call for a solid mathematical basis underlying the quantum state reconstruction scheme. The mathematical aspects of SQUIRRELS involve a linear inverse problem with the density matrix as unknown. In [17] this inverse problem was solved by Tikhonov regularization with positive semidefiniteness and trace constraints using quadratic semidefinite programming. The purpose of the present paper is to provide mathematical foundations of the SQUIRRELS method.
In the experiment in [17], light reflection from a thin graphite film mediates the interaction of free electrons with laser photons of two frequencies and and a controllable relative phase . As the interactions of electrons with laser photons lead to a comb-type energy spectrum of the electrons separated by the photon energy, the Hilbert space describing the state of an electron may be chosen as . The effect of the interaction of an -photon with a single electron is described by a unitary operator given in matrix representation by
[TABLE]
Here denotes the Bessel function of the first kind of order , and is a coupling constant associated with the laser. The effect of the photon-electron interaction on the free-electron density matrix is then described by
[TABLE]
where and are the density matrices before and after the interaction, respectively. However, only the diagonal values are observable. On the other hand, since the phase parameter is experimentally controllable, we may observe a spectrogram for each value of . The inverse problem to find the electron density matrix from the measured data is then described by the operator equation
[TABLE]
with a bounded linear forward operator between Hilbert spaces and given by
[TABLE]
The aim of this paper is to analyze this inverse problem mathematically concerning uniqueness, stability, and rates of convergence.
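The structure of the forward map (conjugate the density matrix by a phase-dependent unitary, then keep only the diagonal) can be illustrated by a minimal Python sketch. The displayed formulas are not available in this copy, so the matrix entries used here, exp(i(n-m)theta) times a Bessel function of order n-m evaluated at twice the coupling constant, are an assumption for illustration and not necessarily the exact operator of the paper. The names `bessel_j`, `interaction_matrix` and `spectrogram_row` are hypothetical, and the infinite matrix is truncated to a finite block.

```python
import cmath
import math


def bessel_j(order, x, steps=400):
    """J_order(x) for integer order via Bessel's integral and the trapezoid rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def interaction_matrix(size, coupling, theta):
    """Truncated matrix with ASSUMED entries exp(i(n-m)theta) * J_{n-m}(2*coupling)."""
    table = {d: bessel_j(d, 2.0 * coupling) for d in range(-(size - 1), size)}
    return [[cmath.exp(1j * (n - m) * theta) * table[n - m] for m in range(size)]
            for n in range(size)]


def spectrogram_row(rho, coupling, theta):
    """Diagonal of U rho U^* for one phase value, i.e. the observable populations."""
    size = len(rho)
    u = interaction_matrix(size, coupling, theta)
    row = []
    for n in range(size):
        acc = 0j
        for j in range(size):
            for k in range(size):
                acc += u[n][j] * rho[j][k] * u[n][k].conjugate()
        row.append(acc.real)
    return row


# Pure state concentrated on the central energy level of a 21-level truncation.
size, center = 21, 10
rho = [[0j] * size for _ in range(size)]
rho[center][center] = 1.0
row = spectrogram_row(rho, coupling=0.5, theta=0.3)
```

For this input the populations are squared Bessel values centered at the initial level, and they sum to 1 up to truncation error, which reflects the unitarity of the interaction.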
The remainder of this paper is organized as follows: In section 2 we formulate our main results. Section 3 is devoted to the proof that is injective. It is based on a factorization of which is also fundamental for the rest of this paper. Our main tools for the proofs of stability estimates and convergence rates are variational source conditions, which will be treated in section 4.
2 Main results
Our first main result asserts that the unknown density matrix is in fact uniquely determined by the experimental data in the absence of noise and modelling errors:
Theorem 1 (uniqueness).
The operator defined in (3) is injective.
It follows from our analysis (see Corollary 3.2) and has been observed in numerical experiments that the inverse problem (2) is ill-posed. Therefore, a natural question concerns the degree of ill-posedness or the degree of stability that can be obtained under certain types of a-priori information on the true density matrix . This will be addressed in the following three theorems.
Moreover, to obtain stable reconstruction for noisy experimental data , some kind of regularization has to be employed. In [17] constrained Tikhonov regularization of the following form has been used:
[TABLE]
Here we minimize only over density matrices, i.e. positive semidefinite operators of trace 1. The regularization parameter is chosen by the discrepancy principle as follows: Let be the deterministic noise level, i.e.
[TABLE]
for the true density matrix . Then is chosen such that
[TABLE]
for some . We will also derive error bounds for Tikhonov regularization described by (4) and (5).
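To make the interplay of Tikhonov regularization and the discrepancy principle concrete, the following sketch treats a toy problem only: an unconstrained quadratic Tikhonov functional for a diagonal (multiplication-type) operator, not the constrained semidefinite program used in [17]. The parameter is found by decreasing it geometrically until the residual is at most tau times the noise level, in the spirit of a sequential discrepancy principle. All names, the toy multiplier and the noise model are assumptions made for illustration.

```python
import math


def tikhonov_diagonal(multiplier, data, alpha):
    """Minimizer of sum (m_k x_k - y_k)^2 + alpha * sum x_k^2 (no constraints)."""
    return [m * y / (m * m + alpha) for m, y in zip(multiplier, data)]


def residual_norm(multiplier, x, data):
    """Euclidean norm of the data misfit for a diagonal operator."""
    return math.sqrt(sum((m * xi - y) ** 2 for m, xi, y in zip(multiplier, x, data)))


def sequential_discrepancy(multiplier, data, delta, tau=1.5, alpha0=1.0, q=0.5, max_iter=200):
    """Decrease alpha geometrically until the residual is at most tau * delta."""
    alpha = alpha0
    for _ in range(max_iter):
        x = tikhonov_diagonal(multiplier, data, alpha)
        if residual_norm(multiplier, x, data) <= tau * delta:
            return alpha, x
        alpha *= q
    raise RuntimeError("no admissible regularization parameter found")


# Toy problem: decaying multiplier and a deterministic perturbation of norm delta.
count = 50
multiplier = [1.0 / (k + 1) ** 2 for k in range(count)]
x_true = [1.0 / (k + 1) for k in range(count)]
delta = 1e-3
noise = [delta / math.sqrt(count) * (-1) ** k for k in range(count)]
data = [m * x + e for m, x, e in zip(multiplier, x_true, noise)]

alpha, x_rec = sequential_discrepancy(multiplier, data, delta)
x_naive = [y / m for m, y in zip(multiplier, data)]  # unregularized inversion
```

Comparing `x_rec` and `x_naive` with `x_true` shows the expected effect in this toy setting: the regularized solution trades a small bias for a large reduction of the amplified noise.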
We first consider band-limited density matrices:
Theorem 2 (Hölder-type estimates for band limited ).
Suppose the density matrices , , and are band-limited, i.e.
[TABLE]
for and some . Then the stability estimate
[TABLE]
holds true for some constant depending only on and . Moreover, for the Tikhonov regularized solution given by (4) with parameter choice rule , or with chosen by the discrepancy principle (5), the error bound
[TABLE]
is satisfied for all with independent of .
If we relax band-limitation to a polynomial or exponential decay condition, we only obtain slower than Hölder stability estimates and convergence rates:
Theorem 3 (Sub-Hölder rates under decay conditions).
Suppose there exists such that the off-diagonal entries of the density matrices , , and satisfy either an exponential or a polynomial decay condition:
[TABLE]
and for all and . Then the stability estimate
[TABLE]
holds true for , and for the Tikhonov regularized solution given by (4) with chosen by the discrepancy principle (5), the error is bounded by
[TABLE]
where the function is given by
[TABLE]
for with some constant depending only on and , , and .
Note that the logarithmic rate for the polynomial decay condition (8a) is slower than as , but faster than for any . On the other hand, the rate for the exponential decay condition (8b) is slower than any Hölder rate for , but faster than any logarithmic rate. Such rates of convergence and stability estimates occur much less frequently than Hölder and logarithmic rates, but similar rates have been derived e.g. in scattering theory for the reconstruction of near field data from far field data (see [9, Lemma 4.2]).
3 Uniqueness
The aim of this section is to prove Theorem 1. Throughout this paper we use the following notations:
Let ,
[TABLE]
denote the periodic Fourier transform. Since the scaling factor is chosen such that is unitary, we have
[TABLE]
Recall the periodic Fourier convolution theorem
[TABLE]
for . Fourier transforms on spaces of multi-variate functions will be labeled by superscript(s) indicating the position of the variable(s) on which they act. For example,
[TABLE]
It is easy to see that these operators are again unitary. (This can either be proved directly or by noting that they are tensor products of unitary operators, and .) In particular,
[TABLE]
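The periodic Fourier convolution theorem recalled above can be checked numerically for finitely supported sequences. The sketch below assumes the unitary normalization with factor (2*pi)^(-1/2), under which the transform of a convolution equals sqrt(2*pi) times the product of the transforms. The sign convention in the exponent is an assumption and does not affect the identity. The function names are hypothetical.

```python
import cmath
import math


def fourier_series(coeffs, offset, t):
    """(F a)(t) = (2*pi)^(-1/2) * sum_n a_n exp(i n t), where a_n = coeffs[n - offset]."""
    total = sum(c * cmath.exp(1j * (i + offset) * t) for i, c in enumerate(coeffs))
    return total / math.sqrt(2.0 * math.pi)


def convolve(a, b):
    """Discrete convolution of two finitely supported sequences."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


a = [1.0, -2.0 + 1j, 0.5]        # supported on n = -1, 0, 1
b = [0.25j, 3.0, 1.0, -1.0]      # supported on n = 0, 1, 2, 3
c = convolve(a, b)               # supported on n = -1, ..., 4

# Largest deviation between F(a * b) and sqrt(2*pi) * F(a) * F(b) on a few sample points.
max_gap = max(
    abs(fourier_series(c, -1, t)
        - math.sqrt(2.0 * math.pi) * fourier_series(a, -1, t) * fourier_series(b, 0, t))
    for t in (0.0, 0.7, 2.5)
)
```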
Our basic tool for the uniqueness proof and also for the following sections is the following factorization of the operator :
Proposition 3.1.
The operator defined in (3) has the factorization
[TABLE]
Here and are defined in (LABEL:FourierTransformF1)–(12), is the multiplication operator
[TABLE]
and is a matrix shift operator defined by
[TABLE]
i.e. the -th column of is the -th diagonal of .
Proof.
By plugging the definition of into the definition of and using , we obtain
[TABLE]
Applying to both sides of this equation yields
[TABLE]
As , this simplifies to
[TABLE]
where we have used the substitution in the last line. Using the identity
[TABLE]
and setting
[TABLE]
leads to the formula
[TABLE]
In view of (12) it remains to show that . To this end we apply the Fourier convolution theorem (9) to obtain
[TABLE]
With the help of the identity for Bessel functions (see [20, eq. (9.20)]), we obtain
[TABLE]
Now the identity
[TABLE]
(see [20, eq. (9.19)]) shows that and completes the proof in view of (16). ∎
The factorization (13) leads us to a proof of the injectivity of .
Proof of Theorem 1.
Since the Fourier-type transforms and and the operator in the factorization (13) are all unitary, it suffices to prove the multiplication operator is injective. To this end, we notice that the multiplier is a holomorphic function with respect to , so it only has isolated zeros. Therefore, if for some , we are able to infer that vanishes almost everywhere. This means in , which concludes the proof. ∎
Corollary 3.2 (ill-posedness).
The inverse of is unbounded.
Proof.
It follows from the factorization (13) that
[TABLE]
All operators on the right hand side are unitary except the multiplication operator . A multiplication operator on is bounded if and only if the multiplier function is essentially bounded. Since the Bessel function has zeroes at 0 for , is not essentially bounded. ∎
4 Variational source conditions
In this section we review variational source conditions and verify conditions of this type, which then imply both the stability estimates and the convergence rates in Theorems 2 and 3.
4.1 Basic theory
We first recall some standard regularization theory for inverse problems. We consider a general linear ill-posed inverse problem where is a bounded linear operator between Hilbert spaces, which does not have a bounded inverse. Let be the noisy data with noise level . Tikhonov regularization with constraint set and regularization parameter is given by
[TABLE]
To obtain bounds on the reconstruction error , which tend to 0 as the noise level tends to 0, we need to impose conditions on . Such conditions are usually referred to as source conditions. Classically, the source condition is of the form
[TABLE]
where is some “source” and is some increasing function, which determines the convergence rate. The usefulness of such spectral source conditions is linked to the applicability of tools from spectral theory, which is mostly restricted to linear reconstruction procedures. Even for linear reconstruction procedures such as unconstrained Tikhonov regularization they are only sufficient, but not quite necessary for certain convergence rates. Both of these shortcomings can be overcome by the use of source conditions in the form of variational inequalities (see [10, 18, 5, 21]):
Definition 4.1 (Variational Source Condition).
A function is called an index function if it is continuous, strictly increasing, and . We say that satisfies a variational source condition with index function if
[TABLE]
Conditions of the form (18) lead to the following error bounds in terms of the index function (see [8, 5]):
Proposition 4.2 (convergence rates with a-priori choice of ).
Consider Tikhonov regularization given by eq. (17). If satisfies a variational source condition (18) with some concave, differentiable index function and if the regularization parameter is chosen by , then the following error bound holds true:
[TABLE]
Actually eq. (18) only needs to hold for all . The same convergence rate can also be achieved by the discrepancy principle, which does not require prior knowledge of the index function encoding properties of the unknown solution.
Proposition 4.3 (convergence rates with discrepancy principle).
Consider Tikhonov regularization given by eq. (17). If satisfies a variational source condition (18) with some concave, differentiable index function and if the regularization parameter is chosen according to the discrepancy principle (5), then the following error bound holds true:
[TABLE]
Proof.
The proof is adapted from [11, Theorem 4.3(iii)], see also [5]. We first notice that since is defined to be the minimizer of the Tikhonov functional (17), it follows from (5) that
[TABLE]
Combining this inequality with (18) yields
[TABLE]
As
[TABLE]
and is monotonically increasing, we obtain . Now the proof is completed by noting that as is concave with . ∎
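The concavity argument in the last step of this proof can be spelled out as follows. The symbols are assumed names, since the inline formulas are not available in this copy: \varphi denotes the concave index function with \varphi(0) = 0, c \ge 1 is a constant and t \ge 0. Writing t as a convex combination of ct and 0 gives the scaling inequality that is presumably used there.

```latex
\varphi(t)
  = \varphi\!\Big(\tfrac{1}{c}\,(ct) + \big(1-\tfrac{1}{c}\big)\cdot 0\Big)
  \;\ge\; \tfrac{1}{c}\,\varphi(ct) + \big(1-\tfrac{1}{c}\big)\,\varphi(0)
  = \tfrac{1}{c}\,\varphi(ct)
\qquad\Longrightarrow\qquad
\varphi(ct) \le c\,\varphi(t) \quad \text{for all } c \ge 1,\ t \ge 0 .
```

Hence enlarging the argument of the index function by a constant factor changes the error bound only by that same constant factor and leaves the rate unaffected.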
Here we do not address the question whether a parameter satisfying (5) exists. We refer to [1] for the so-called sequential discrepancy principle, which determines by an explicit algorithm for which similar error bounds can be shown.
Remark 4.4 ([12, Eq. 6]).
If the variational source condition (18) is satisfied for all in some subset , then the conditional stability estimate
[TABLE]
holds true for all .
To verify variational source conditions for our problem, we will check the sufficient conditions in the following lemma, which is a special case of [13, Theorem 2.1] (where a different scaling is used such that the index function in [13] is twice the Bessel function here):
Lemma 4.5 (Verification of VSCs).
Let and be Hilbert spaces and be a family of subspaces. Suppose that there exists a family of orthogonal projection operators for in some index set , and there exist families , of positive numbers, such that the following conditions hold true:
- for all ;
- ;
- For all and all we have
[TABLE]
Then the true solution satisfies the variational source condition (18) with the index function
[TABLE]
4.2 Tools for the verification of variational source conditions for
In order to verify the VSC for our forward operator , we first look at the decomposition in Proposition 3.1. The fact that and are all unitary operators implies that, in order to analyze the properties of the operator , it suffices to analyze the properties of the multiplication operator as defined in (14). We write the forward problem as
[TABLE]
where is defined as
[TABLE]
and will verify the three conditions in Lemma 4.5 for the multiplication operator and the true solution .
For any , we define the sublevel sets
[TABLE]
with such that . We choose as orthogonal projections from to . Obviously, can be written as a multiplication operator
[TABLE]
with the characteristic function of .
To bound the sizes of the sublevel sets of , we recall some properties of the Bessel functions which can be found in [20, Ch. 9]:
1. . Hence, it suffices to look at the case .
2. For any , has a zero of order at . Around this zero it has the asymptotic behavior as .
3. also possesses an infinite number of simple zeroes on the positive real axis, and for all .
4. The positions of the positive zeroes tend to infinity as , i.e. .
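Some of the Bessel function properties listed above can be observed numerically. The sketch below uses only standard facts stated here as assumptions about what the stripped formulas contain: the reflection rule J_{-n}(x) = (-1)^n J_n(x), the small-argument behavior J_n(x) ≈ (x/2)^n / n! near the zero at the origin, and the existence of simple positive zeros, located here by bisection. The helper names are hypothetical, and Bessel's integral with the trapezoid rule stands in for a library routine.

```python
import math


def bessel_j(order, x, steps=400):
    """J_order(x) for integer order via Bessel's integral and the trapezoid rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def small_argument_model(order, x):
    """Leading term (x/2)^order / order! of J_order near x = 0, for order >= 0."""
    return (x / 2.0) ** order / math.factorial(order)


def bisect_zero(order, left, right, tol=1e-12):
    """Locate a sign change of J_order in [left, right] by bisection."""
    f_left = bessel_j(order, left)
    while right - left > tol:
        mid = 0.5 * (left + right)
        f_mid = bessel_j(order, mid)
        if f_left * f_mid <= 0.0:
            right = mid
        else:
            left, f_left = mid, f_mid
    return 0.5 * (left + right)


reflection_gap = abs(bessel_j(-3, 1.5) + bessel_j(3, 1.5))       # J_{-3} = -J_3
ratio = bessel_j(3, 1e-2) / small_argument_model(3, 1e-2)        # close to 1
first_zero_j0 = bisect_zero(0, 2.0, 3.0)                          # first positive zero of J_0
```

The ratio close to 1 illustrates the zero of order n at the origin, which is what makes the sublevel sets of the multiplier around 0 comparatively large for high orders.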
These properties translate into the following facts on and its zero set :
Lemma 4.6.
Let be defined as in (14). Then we have the following:
1. and for all and .
2. for all , and as .
3. is a finite set of simple zeros of for all .
4. There exists such that for all .
Proof.
In view of the expression (14) for , the first three statements are immediate consequences of the first three properties of . For the last statement, note that . As , there exists such that for all . ∎
We further have the following properties of as :
Lemma 4.7.
1. For all there exists such that for all each connected component of contains exactly one point of .
2. For , the connected component of containing 0 has size as .
3. For all , the finitely many connected components of not containing 0 are of size as .
4. There exists a constant depending only on such that
[TABLE]
Proof.
1.) Since has only finitely many zeros of finite order, there exists such that is monotonic (or even and monotonic on in case of around ) on all connected components of viewed as subset of . Since is bounded, attains its infimum on the closure of this set, and the infimum is positive. Thus we obtain the claim by possibly reducing .
2., 3.) This follows from Lemma 4.6, parts 2 and 3, respectively. In part 2 we also use the Stirling approximation as .
4.) This follows from parts 1–3. ∎
Moreover, we need a bound on the supremum norm of :
Lemma 4.8.
Let be an arbitrary density matrix, i.e. is self-adjoint and positive semi-definite, with trace equal to 1. Let the operators and be defined as in (LABEL:FourierTransformF1) and (15), respectively. Then
[TABLE]
Proof.
Using the definitions of and , we can see that
[TABLE]
for all and all . The fact that is positive semi-definite implies that for all , the principal submatrix
[TABLE]
is also positive semi-definite. Therefore, calculating the determinant of this submatrix yields that
[TABLE]
so we conclude using Young’s inequality that
[TABLE]
holds for all . This in turn implies that
[TABLE]
which concludes the proof. ∎
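The key step of this proof, namely that positive semidefiniteness of every 2×2 principal submatrix bounds each off-diagonal entry by the average of the two corresponding diagonal entries, can be checked on a random example. The sketch assumes the usual chain |rho_jk|^2 ≤ rho_jj rho_kk (from the determinant) followed by sqrt(ab) ≤ (a+b)/2 (Young's inequality). The helper names are hypothetical.

```python
import random


def random_density_matrix(size, seed=0):
    """Random positive semidefinite Hermitian matrix A A^* normalized to trace 1."""
    rng = random.Random(seed)
    a = [[complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(size)]
         for _ in range(size)]
    gram = [[sum(a[j][l] * a[k][l].conjugate() for l in range(size)) for k in range(size)]
            for j in range(size)]
    trace = sum(gram[j][j].real for j in range(size))
    return [[entry / trace for entry in row] for row in gram]


def worst_violation(rho):
    """Largest value of |rho_jk| - (rho_jj + rho_kk) / 2 over all index pairs."""
    size = len(rho)
    return max(abs(rho[j][k]) - 0.5 * (rho[j][j].real + rho[k][k].real)
               for j in range(size) for k in range(size))


rho = random_density_matrix(6)
trace = sum(rho[j][j].real for j in range(6))
gap = worst_violation(rho)  # never positive beyond rounding; equals 0 on the diagonal
```

Summing this entrywise bound along a fixed diagonal is what ties the estimate to the trace of the density matrix.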
Note that the third condition in Lemma 4.5 reduces to for all . Using the Cauchy-Schwarz inequality and elementary estimates this condition can easily be verified with . However, the following more elaborate argument along the lines of [13, Theorem 3.1] gives a sharper bound, which is optimal in some sense (see *****):
Lemma 4.9.
Suppose the first condition in Lemma 4.5 holds true for the projection operators defined in (23) and that is decreasing for some . Then the third condition holds true with
[TABLE]
Proof.
Due to the inequality
[TABLE]
we have to show that
[TABLE]
Introducing the function
[TABLE]
which by assumption is bounded by , we obtain
[TABLE]
using a partial integration in the last line and the fact that for such that . Now we use the assumption that is decreasing to estimate
[TABLE]
This completes the proof of (25). ∎
4.3 Proofs of Theorems 2 and 3
Proposition 4.10 (Hölder VSC for Theorem 2).
Under the assumptions of Theorem 2 the matrix satisfies a variational source condition (18) with the index function and some depending only on and .
Proof.
From the band-limited assumption of , we know for . Therefore, it follows from Lemmas 4.8 and 4.7 that
[TABLE]
for all and some constant depending only on and .
Thus the first and second conditions of Lemma 4.5 are satisfied for . Note that the function is decreasing as . Using Lemma 4.9 with we deduce that the third condition holds true for . This implies a variational source condition with index function
[TABLE]
If we choose $\varepsilon=\min\big(\tau^{k_{0}/(1+2k_{0})},\varepsilon_{0}\big)$, we obtain for and else with positive constants , , depending only on and . This yields the assertion. ∎
Proof of Theorem 2.
The statement follows from the Hölder-type variational source condition in Proposition 4.10. In particular for the stability estimate (6) we note that and imply that all eigenvalues of lie in the interval , and hence . Hence, the stability estimate (6) follows from Remark 4.4 as only the behavior of for small is relevant. The Hölder-type convergence rate (7) follows from Propositions 4.2 and 4.3. ∎
Proposition 4.11 (VSC for polynomial decay).
Under the assumptions of Theorem 3, case (8a) the matrix satisfies a logarithmic variational source condition (18) with
[TABLE]
for some depending only on , and .
Proof.
The decay condition on implies corresponding bounds on :
[TABLE]
Therefore, considering that , we have
[TABLE]
for some .
We will utilize both upper bounds of from Lemma 4.7. As , we get the bound for large . For some cut-off index to be determined later, we bound the sum by
[TABLE]
Using the logarithmic derivative , it can be seen that is increasing on the interval with . Therefore, as long as , there exist constants and depending only on , , and such that
[TABLE]
for . We choose such that both terms on the right hand side are approximately equal, i.e. or equivalently . Solving for yields the asymptotic relation
[TABLE]
Therefore, we set . Note that for sufficiently small. This yields with
[TABLE]
and the first and second conditions of Lemma 4.5 are satisfied. By Lemma 4.9 the third condition holds true with . Therefore, Lemma 4.5 yields a VSC with . Choosing , the first term is asymptotically negligible against the second, and we obtain a VSC with . This completes the proof. ∎
Proposition 4.12 (VSC for exponential decay).
Under the assumptions of Theorem 3, case (8b) the matrix satisfies a variational source condition (18) with for some depending only on , and .
Proof.
The decay condition on implies corresponding bounds on :
[TABLE]
Therefore, again noting that , we have
[TABLE]
for some . We will utilize both upper bounds of from Lemma 4.7. As , we use the trivial bound for large . We choose a cut-off for and obtain
[TABLE]
for some generic constant that depends only on , and . Taking the logarithm of the first two terms shows that for our choice of both logarithms equal . This shows that
[TABLE]
Therefore, the first and second conditions of Lemma 4.5 are satisfied for . Since the function is decreasing, Lemma 4.9 with allows us to deduce that the third condition holds true for . This implies a variational source condition with index function
[TABLE]
If we choose for , we obtain
[TABLE]
as . This yields the assertion. ∎
Proof of Theorem 3.
We set with the functions in the variational source conditions of Propositions 4.11 and 4.12. Then the statement follows from Proposition 4.3 and Remark 4.4. ∎
5 Conclusions
We have shown that the data acquired in the SQUIRRELS method (without noise and modelling errors) are indeed sufficient to uniquely determine the unknown electron density matrix. Moreover, we have estimated the intrinsic difficulty (or degree of ill-posedness) of the inverse problem to reconstruct a density matrix from these data under noise. As expected, the answer strongly depends on the type of available a-priori information on the unknown density matrix. If this matrix is band-limited, we obtain Hölder rates, whereas under polynomial decay conditions only logarithmic rates can be shown. For the most realistic exponential decay conditions the rates are in between Hölder and logarithmic rates.
We conjecture that both the stability estimates and the convergence rates are of optimal order under the given a-priori information if is considered as an operator defined on all bounded, Hermitian matrices. However, it is possible that the positive semidefiniteness constraint, which has a strong regularizing effect in numerical experiments, may be further exploited to improve rates.
Another topic of further research in this direction may be to extend the analysis of this paper to a model involving a continuum of energy states.
References
[1] S. W. Anzengruber, B. Hofmann, and P. Mathé, Regularization properties of the sequential discrepancy principle for Tikhonov regularization in Banach spaces, Applicable Analysis, 93 (2014), pp. 1382–1400.
[2] B. Barwick, D. J. Flannigan, and A. H. Zewail, Photon-induced near-field electron microscopy, Nature, 462 (2009), pp. 902–906.
[3] K. E. Echternkamp, A. Feist, S. Schäfer, and C. Ropers, Ramsey-type phase control of free-electron beams, Nat. Phys., 12 (2016), pp. 1000–1004.
[4] A. Feist, K. E. Echternkamp, J. Schauss, S. V. Yalunin, S. Schäfer, and C. Ropers, Quantum coherent optical phase modulation in an ultrafast transmission electron microscope, Nature, 521 (2015), pp. 200–203.
[5] J. Flemming, Generalized Tikhonov regularization and modern convergence rate theory in Banach spaces, Shaker, 2012.
[6] F. J. García de Abajo, Optical excitations in electron microscopy, Rev. Mod. Phys., 82 (2010), p. 209.
[7] F. J. García de Abajo, A. Asenjo-Garcia, and M. Kociak, Multiphoton absorption and emission by interaction of swift electrons with evanescent light fields, Nano Lett., 10 (2010), pp. 1859–1863.
[8] M. Grasmair, Generalized Bregman distances and convergence rates for non-convex regularization methods, Inverse Problems, 26 (2010), p. 115014 (16pp).
