Relaxation theory for perturbed many-body quantum systems versus numerics and experiment
Lennart Dabelow, Peter Reimann

TL;DR
This paper develops an analytical framework to predict how isolated many-body quantum systems relax to thermal equilibrium under small perturbations, validated by numerical and experimental comparisons.
Contribution
It introduces a typicality-based analytical prediction method for quantum relaxation under perturbations, bridging theory with numerical and experimental results.
Findings
Analytical predictions match numerical simulations
Predictions agree with experimental observations
Most perturbations lead to similar relaxation behavior
Abstract
An analytical prediction is established of how an isolated many-body quantum system relaxes towards its thermal long-time limit under the action of a time-independent perturbation, but still remaining sufficiently close to a reference case whose temporal relaxation is known. This is achieved within the conceptual framework of a typicality approach by showing and exploiting that the time-dependent expectation values behave very similarly for most members of a suitably chosen ensemble of perturbations. The predictions are validated by comparison with various numerical and experimental results from the literature.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Relaxation theory for perturbed many-body
quantum systems
versus numerics and experiment
Lennart Dabelow
Peter Reimann
Fakultät für Physik, Universität Bielefeld, 33615 Bielefeld, Germany
Abstract
An analytical prediction is established of how an isolated many-body quantum system relaxes towards its thermal long-time limit under the action of a time-independent perturbation, but still remaining sufficiently close to a reference case whose temporal relaxation is known. This is achieved within the conceptual framework of a typicality approach by showing and exploiting that the time-dependent expectation values behave very similarly for most members of a suitably chosen ensemble of perturbations. The predictions are validated by comparison with various numerical and experimental results from the literature.
The long-standing task to adequately explain the time-dependent relaxation and ultimate equilibration of isolated many-body quantum systems has recently witnessed a veritable renaissance, driven, among others, by very impressive new experimental and numerical capabilities gog16 ; dal16 ; mor18 ; gem04 . On the other hand, quantitative analytical interpretations of the so-acquired data are still rather scarce. For instance, a paradigmatic setup for probing the temporal relaxation behavior of strongly correlated cold atoms was originally proposed in the numerical works cra08 ; fle08 and then experimentally explored in tro12 , concluding that “the exact origin of this enhanced relaxation in the presence of strong correlations constitutes one of the major open problems posed by the results presented here” f1 . The main objective of our Letter is to better understand several such experimental and numerical findings by considering model Hamiltonians of the form
[TABLE]
and asking how the temporal relaxation of the reference system is altered by a small perturbation .
Setting – Given a (pure or mixed) initial state , the system (1) evolves in time as (), resulting in expectation values of an observable (self-adjoint operator) . Of foremost interest to us are the deviations of those from the corresponding expectation values when the same initial state evolves according to the unperturbed Hamiltonian in (1).
Denoting by and the eigenvalues and eigenvectors of , we assume as usual gog16 ; dal16 ; mor18 ; gem04 that only energies within a macroscopically small but microscopically large interval entail non-negligible level populations . Put differently, the system must exhibit a macroscopically well-defined energy, while the number of is still exponentially large in the degrees of freedom lan70 ; gol10a , and the mean level spacing is approximately constant throughout gog16 ; dal16 ; mor18 ; gem04 . Very low system energies (close to the ground state) are incompatible with these requirements and are thus tacitly excluded. Finally, is assumed to be (practically) independent of the perturbation strengths considered in (1). Since the level density is – according to the textbook microcanonical formalism – directly connected to the system’s thermodynamics, we thus concentrate on sufficiently weak perturbations in the sense that phase transitions or other significant changes of the thermal equilibrium properties are ruled out.
Next we adopt the common idea of statistical mechanics that the system’s many-body character inevitably implies some uncertainties about the microscopic details. Hence, we temporarily consider an entire class of similar perturbations instead of one particular in (1). More precisely, denoting by the matrix elements of in the eigenbasis of , we choose an ensemble of matrices whose statistics still reflects the essential properties of the “true” perturbation in (1) as closely as possible. For instance, if models noninteracting particles and some few-body interactions, the true matrix is known to be sparse (most entries are zero) bro81 ; bor16 ; fyo96 ; fla97 . Similarly, the true perturbation may give rise to a so-called banded matrix deu91 ; fyo96 ; fei89 ; wig55 ; gen12 . Accordingly, it is appropriate to work with a possibly (but not necessarily) sparse and/or banded random matrix ensemble.
Denoting ensemble averages by , the ’s are considered as unbiased and – apart from – independent random variables, with second moments
[TABLE]
where the “band profile” approaches unity for small , and sets the overall scale. Moreover, is usually a slowly varying function of and upper bounded by a constant of order unity. In particular, the matrices may still be sparse, while characterizes the bandwidth if the matrix is banded, and otherwise. A more detailed discussion is provided in suppl .
Results – Our main result is the following analytical prediction of how the unperturbed behavior is modified by a typical perturbation, i.e., for the vast majority of ’s from a given ensemble,
[TABLE]
Here is the microcanonical expectation value corresponding to the energy window , and depends on the mean level spacing , the coupling , and the properties of the random matrix ensemble from (2). In particular, for sufficiently weak perturbations we find that
[TABLE]
where “sufficiently weak” means – besides the requirements in “Setting” – that . Likewise, we find that
[TABLE]
for (“sufficiently strong” perturbations), where are Bessel functions of the first kind. In between, exhibit as transition from (4) to (5) as detailed in suppl .
Before turning to their derivation, we further discuss and exemplify those results (3)–(5): So far, these are predictions regarding the vast majority of a given -ensemble. As usual in random matrix theory gol10a ; bro81 ; bor16 ; fyo96 ; wig55 ; deu91 , one thus expects that the “true” (non-random) perturbation in (1) belongs to that vast majority, provided its main properties are well captured by the considered ensemble. Note that the microscopic dynamics of a many-body system is commonly expected to be extremely sensitive against perturbations (chaotic haa10 ), so that it is virtually impossible to theoretically predict its response exactly or in terms of well-controlled approximations kam71 . Hence, some kind of “uncontrolled” approximation is practically unavoidable. One of them is random matrix theory, which in fact has been originally devised by Wigner for the very purpose of exploring chaotic quantum many-body systems, and is by now widely recognized as a remarkably effective tool in this context haa10 . Another common justification is by comparison with particular examples, to which we now turn. These will be state-of-the-art numerical and experimental results from previous publications, which, however, do not contain the corresponding (numerical) data for the variances required in (2). We will thus concentrate on the weak perturbations regime, where (4) applies. Again, the quantitative values of required in (4) can in principle be computed, but have not been provided in those publications, and will therefore be treated as fit parameters.
Examples – Our first example is the numerical exploration by Flesch et al. fle08 (see also cra08 ) of the bosonic Hubbard chain
[TABLE]
with periodic boundary conditions, creation (annihilation) operators (), and . For the initial state considered in Ref. fle08 and Fig. 1, the model (6) can be recast as an effective spin-1/2 chain by means of a mapping which becomes asymptotically exact for large interaction parameters giu13 . In the limit , the so-obtained “unperturbed” effective Hamiltonian amounts to an XX model. The leading finite- correction takes the form , where contains nearest neighbor, next-nearest neighbor, as well as three-spin terms giu13 . In other words, plays the role of the perturbation in (1), and that of . Since and are known (see fle08 and Fig. 1), the only missing quantity is in (4), which is, as mentioned above, not provided by fle08 , and hence treated as a fit parameter, yielding . The resulting agreement with the numerics in Fig. 1 speaks for itself.
An experimental realization of (6) by a strongly correlated Bose gas has been explored by Trotzky et al. in Ref. tro12 and is compared in Fig. 2 with our theory (3), (4). Adopting from before, this amounts to an entirely analytic prediction without any fit parameter. Incidentally, the agreement in Fig. 2 improves as increases. The same tendency is recovered when comparing the numerical results from Fig. 1(a) with the experimental data, suggesting that the model (6) itself may not capture all experimentally relevant details for large (small ).
Next we turn to the spin-1/2 XXZ chain with anisotropy parameter as specified in Fig. 3, exhibiting a gapless “Luttinger liquid” and a gapped, Ising-ordered antiferromagnetic phase for and , respectively bar09 . Similarly as before, in (4) is unknown and hence treated as a fit parameter. The analytics in Fig. 3 explains the numerics by Barmettler et al. from Ref. bar09 remarkably well all the way up to the critical point at .
Our last example in Fig. 4 is a model of hard-core bosons, numerically explored by Mallayya et al. in Ref. mal19 , and exhibiting so-called prethermalization ber04 ; moe08 ; mor18 for small . Treating again as a fit parameter, our theory also explains very well such a prethermalization scenario.
For lack of space, further examples have been moved to the Supplemental Material suppl , namely the fermionic systems from Refs. bal15 ; ber16 and a case in which exhibits a crossover from (4) to (5) without any fit parameters. Moreover, a scaling behavior as in (4) has also been reported, among others, in Refs. mor18 ; igl11 ; bie17 ; tan18 ; mal18 .
Derivation – Adopting the notation as introduced below (1), one readily confirms that
[TABLE]
where and . Given that in (7) only levels actually matter (see “Setting”), and that their spacings generically exhibit a Poisson- or Wigner-Dyson-like statistics haa10 of mean value , the random fluctuations of those will also be of order and (practically) uncorrelated. Hence, the exhibit typical deviations from their mean value of order . Recalling that is exponentially small lan70 ; gol10a and , it is reasonable to expect – and justified in more detail in suppl – that the deviations between and are negligible in (7) for all of later relevance. Employing this approximation and the unitary basis transformation
[TABLE]
we rewrite (7) in terms of the unperturbed basis as
[TABLE]
where , and .
The randomness of (see above (2)) is inherited via (1) by the eigenvector overlaps in (8). A particularly important role is played by their second moments, , which due to (2) only depend on ,
[TABLE]
Specifically, the function from (3) is defined as their Fourier transform,
[TABLE]
As a first basic result of our present work, we show in suppl that the function itself follows as from the ensemble-averaged (scalar) resolvent or Green’s function of , which in turn can be obtained as the solution of
[TABLE]
Along these lines one readily recovers suppl for weak perturbations as specified below (4) the previously known Breit-Wigner distribution wig55 ; deu91 ; fyo96
[TABLE]
with from (4), and provided that . Our next goal is to evaluate the ensemble-averaged time evolution (9), hence we need the average of four ’s in (10). The solution of this technically quite challenging problem by means of supersymmetry methods efe83 ; ber10 ; haa10 is provided in the Supplemental Material suppl . There, we also establish that is indeed exponentially small in the system’s degrees of freedom ,
[TABLE]
Introducing those averages of four ’s into (9) and (10) finally yields
[TABLE]
Here, is defined via with , essentially amounting to a “washed-out” descendant of the so-called diagonal ensemble (long-time average of ) gog16 ; dal16 ; mor18 . Following Deutsch deu91 , is thus commonly considered gog16 ; rei15 ; nat18 to closely approximate the microcanonical expectation value (see below (3)). Furthermore, we can infer from (12), (14), and (15) that , i.e., we recover (4). Finally, the last term in (16) is given by
[TABLE]
Due to similar arguments as in the above approximation , this term is usually negligibly small. For example, if the unperturbed system satisfies the eigenstate thermalization hypothesis (ETH), the are well approximated by deu91 ; sre94 ; rig08 ; dal16 . Observing that then immediately implies . However, similar cancellations so that can be shown to persist under much more general conditions dab20 (some ETH violating examples are also provided above and in suppl ). Altogether, we thus obtain
[TABLE]
see also Refs. mor18 ; bor16 ; sre99 ; bar08 ; bar09a ; gen13 ; san18 ; kni18 ; rei19 ; nat19 for somewhat similar findings.
The advantage of (19) over (16) is that and are often hard to determine in concrete examples. For the rest – especially if the above approximations leading to (19) might not apply – one also could continue to employ (16) (for an example, see Sec. I B in suppl ).
So far, we focused on sufficiently weak perturbations as specified below (4). Beyond this regime, the solution of (13) and hence from (12) will be different, yet we still recovered suppl the same final conclusion as in (19). In particular, for “sufficiently strong” perturbations as specified below (5) one finds wig55 ; fyo96 ; suppl that approaches a semicircular distribution with radius and hence one recovers (5), while in the intermediate regime, can still be readily obtained by solving (13) numerically.
Our final objective is to quantify the fluctuations about the average behavior in terms of the variance . In view of (9) and (10), averages over eight ’s are thus required. Referring to suppl for their quite tedious evaluation, we finally obtain the estimate
[TABLE]
where indicates the operator norm and is some positive real number, which does not depend on any further details of the considered system and is at most on the order of . Exploiting Chebyshev’s inequality, the probability that when randomly sampling ’s from the ensemble can thus be lower bounded by . In view of (15), the deviations from the average behavior (19) are thus negligibly small for most ’s and any preset . Moreover, one can show similarly as in rei16 that for most ’s the deviations must be negligibly small not only for any given , but even for the vast majority of all within any given time interval . Overall, we thus recover our main result from (3).
Conclusions – We derived an analytical prediction for the ensemble-averaged, time-dependent deviations of the perturbed from the unperturbed expectation values in isolated many-body quantum systems. Moreover, we showed that nearly all members of the ensemble behave very similarly to the average. Provided the ensemble has been chosen appropriately, the same behavior is thus typically expected to also apply when dealing with a specific physical model, resulting in our main analytical prediction (3). As a validation, we demonstrated good agreement with a variety of numerical and experimental findings from the literature. Technically speaking, substantial extensions of previously established, non-perturbative supersymmetry methods were indispensable to arrive at those results suppl . Related analytical fyo96 ; fyo95 and numerical rei19 studies suggest that such methods may in the future even be further extended to considerably more general ensembles than those we admitted here. On the conceptual side, a better understanding of when a given physical system is not a typical member of any permitted ensemble kni18 ; rei19a ; ham18 remains as yet another important issue for further studies.
Acknowledgments – Enlightening discussions with Jochen Gemmer and Charlie Nation are gratefully acknowledged. This work was supported by the Deutsche Forschungsgemeinschaft (DFG) within the Research Unit FOR 2692 under Grant No. 397303734.
I Further examples
In this section, we compare our main result (m3),
[TABLE]
with additional numerical examples, namely two fermionic systems in Secs. I.1 and I.2, and a sparse and banded random matrix model in Sec. I.3.
As already pointed out in the main paper, for the numerical results from the literature, with which we compare our analytical prediction (1), the necessary details to determine the function can in principle be calculated numerically, but they unfortunately have not been provided in the corresponding published works. Therefore, we employ the approximation , which is theoretically predicted to apply, at least, whenever the perturbation is sufficiently weak (see below (m4)). In the same vein, the quantitative value of the ratio in our theoretical prediction from (m4) has not been published in the pertinent numerical works and is therefore treated as a fit parameter.
While the above considerations apply to all the examples from the main paper and also those in Secs. I.1 and I.2 below, for the last example in Sec. I.3 all relevant quantities to determine in (1) will be exactly known.
I.1 Fermionic Hubbard model
We consider the fermionic Hubbard model as studied by Balzer et al. in Ref. bal15 , defined by the Hamiltonian with
[TABLE]
where () creates (annihilates) a fermion with spin at site , and . The notation indicates the sum over all pairs of connected sites in a Bethe lattice of infinite coordination number. This model was studied in bal15 using dynamical mean-field theory. The observable
[TABLE]
measures the correlation between conjugated momentum modes and (see bal15 for details).
Similarly as for the examples presented in the main paper, we compare the numerics from bal15 to our theoretical prediction (1) with . Moreover, since the proportionality constant between and [cf. Eq. (m4)] is not available from the original publication bal15 , it is again treated as a fit parameter. As Fig. 1 shows, we find good agreement between numerics and theory up to quite large values. The remaining deviations for large and small could hint at a banded structure of the perturbation matrix since the behavior in this regime is seemingly better described by a Bessel-like decay [see (m5) and the discussion between (m19) and (m20) in the main paper as well as the subsequent Sec. I.3 of this Supplemental Material].
I.2 Spinless fermions
Next, we turn to a model of spinless fermions in a periodic one-dimensional lattice, whose Hamiltonian consists of
[TABLE]
Here, () creates (annihilates) a spinless fermion at site , and . Bertini et al. ber16 investigated the effect of the next-nearest neighbor tunneling on the dynamics by means of equations-of-motion techniques for a lattice of sites. The observable is a tunneling correlation
[TABLE]
and the system is prepared in a thermal state with respect to from (4a) with dimerization parameter and nearest-neighbor interaction at inverse temperature . Quenching to and , the numerical result from ber16 for the unperturbed dynamics () is shown as a dashed black curve in Fig. 2. Since the system is nearly integrable for , this dashed black curve settles down to a nonthermal prethermalization plateau within the time scales shown ber16 . Upon activating and increasing , the system departs farther and farther from integrability, and the time-dependent expectation values approach the thermal value better and better for large times.
Yet, as already observed in Ref. ber16 , the numerically approached long-time limit still appears to notably deviate from the thermal value for small , presumably due to the finite system size in combination with the still rather mild integrability breaking, and possibly also due to non-negligible numerical inaccuracies at large times. For this reason, instead of comparing the numerics from ber16 with the analytic approximation from (1), we rather employed – as announced in the main paper below Eq. (m19) – our more general original result from (m16). In doing so, we utilized, as before, the approximations and , and we fitted the ratio , yielding . Moreover, we again used for the unperturbed behavior in (m16) the numerical results for (dashed black line in Fig. 2). But unlike before, we now also treated the long-time limit in Eq. (m16) as a fit parameter. The resulting agreement between numerics and analytics in Fig. 2 is very good for all numerically available -values. The actually employed values of were for , respectively. In particular, with increasing the long-time limit indeed seems to converge towards some constant (presumably thermal) value.
I.3 Sparse and banded random matrix model
In order to have detailed control of all relevant parameters and to obviate any sort of fitting, we finally consider a model of sparse and banded random matrices, which constitutes a direct implementation of the approach from the main paper (see also Sec. III.1 below): The unperturbed Hamiltonian consists of equally spaced eigenvalues with level spacing . The matrix elements of the perturbation in the eigenbasis of are independent random variables (apart from ), distributed as follows: With probability we set , so that the matrix is sparse with about of the matrix elements vanishing. The nonzero elements, in turn, have Gaussian distributed real and imaginary parts with mean zero and variance if , while those with are purely real and of unit variance. For the band profile , we choose a sharp cutoff at , i.e., with the Heaviside step function . Altogether, we thus obtain the first two moments and with , corresponding to (m2) in the main paper.
In the numerical example presented in Fig. 3, the total Hilbert space dimension is . For the initial state, we choose the eigenstate of the unperturbed system with , i.e., from the middle of the spectrum. As the observable, we choose , too, measuring the survival probability of the initially populated energy level . Consequently, the unperturbed time evolution is constant, , while the thermal expectation value is . The theory (1) thus predicts for the perturbed dynamics, where, as in (m12),
[TABLE]
is the Fourier transform of from (m11). As stated in the main paper and derived in Sec. IV.2 below, the function exhibits a crossover from the Breit-Wigner form (m14) for weak perturbations in comparison with the energy scale of the perturbation bandwidth ,
[TABLE]
to the semicircular shape [cf. above (m20)] for stronger perturbations,
[TABLE]
For the Fourier transform from (6), this implies a crossover from an exponential decay with rate ,
[TABLE]
to a Bessel-like decay with rate ,
[TABLE]
where is a Bessel function of the first kind.
Choosing a fixed bandwidth , we compare these two theoretical predictions with numerical results for a single realization of the above specified random matrix ensemble in Fig. 3. We find excellent agreement between the numerics and the exponential decay (9) for sufficiently small values of , and equally splendid agreement between the numerics and the Bessel-like decay (10) for sufficiently large values of . Moreover, the crossover between the two regimes around is nicely illustrated. Finally, we observe that upon increasing , deviations from the exponential behavior become pronounced at short times first. This can be traced back to the tails of , which begin to decay faster than in the Breit-Wigner distribution (7) when approaches the bandwidth fyo96 ; wig55 , i.e., for large , and hence the deviations of the Fourier transform (6) from the behavior in (9) will be most pronounced for small .
Remarkably enough, the deviations for small appearing at larger perturbations strengths in the above Fig. 1 as well as (less pronounced) in Figs. m1, m3, and m4 of the main paper seem to be of somewhat similar character as those in Fig. 3, suggesting that they might be caused by a finite bandwidth of the corresponding perturbation matrix .
II Justification of Eq. (m10)
In this section, additional details regarding the line of reasoning below Eq. (m7) and thus the justification for approximating in (m7) by , leading to Eq. (m10), are provided.
Our starting point is Eq. (m7), i.e.,
[TABLE]
As said below (m1), we assume that only energies within some macroscopically small but microscopically large interval exhibit non-negligible populations of the corresponding energy eigenstates . Accordingly, we approximate all with by zero, and by exploiting the Cauchy-Schwarz inequality , we can conclude that only summands with actually contribute in (11).
Temporarily, we thus restrict ourselves to labels and with . Moreover, we assume without loss of generality that the ’s are ordered by magnitude.
Again as said below (m1), we furthermore focus on intervals so that the (locally averaged) level spacings are approximately constant throughout , and we denote this constant by .
Our next assumption is that the level spacings exhibit some well-defined statistical distribution, for instance a Poisson or a Wigner-Dyson statistics, or some other (intermediate) statistics with qualitatively similar features. It is well-established haa10 ; bro81 that this assumption can be taken for granted for practically any Hamiltonian . We furthermore assume that the level spacings are statistically independent of each other. Again, it is well-known haa10 ; bro81 that this assumption is generically satisfied in very good approximation, although neighboring level spacings strictly speaking still exhibit some very weak correlations, which further decrease very rapidly with increasing distance between the level pairs. Such correlations could still be included in the subsequent considerations without changing the final conclusions, but for the sake of simplicity we henceforth disregard them.
In other words, the level spacings are considered as independent, identically distributed random variables with mean value . With respect to their variance, one readily sees that, independently of whether we are dealing with a Poisson, a Wigner-Dyson, or some other (reasonable) statistics, the variance is always on the order of . Since the are independent and identically distributed, it follows that is a random variable with mean value and variance of order for an arbitrary but fixed “reference index” (with ). It follows that , where is an -independent constant and is a random variable of mean zero and variance on the order of . Furthermore, since the expectation value in (11) remains unchanged if all energies are shifted by the same constant value, we can assume without loss of generality that and thus .
Next we exploit the following rigorous result, derived in Appendix B of Ref. rei19a : If each is changed into then the corresponding change of the expectation value in (11) is upper bounded (in modulus) by , where is the operator norm of . Combining this bound with the considerations in the previous paragraphs, one sees that will be of the same order as and that can be upper bounded in very good approximation by and thus by the width of the interval . It follows that is (approximately) upper bounded by .
Altogether, can thus be replaced by without any significant change of the expectation value in (11) as long as . At this point, abandoning our temporary restriction to is trivial. Finally, since is exponentially small in the systems’s degrees of freedom (see main paper), the latter condition is fulfilled for all of actual interest later on, i.e., before the expectation value (11) very closely approaches its equilibrium (long-time average) value.
III Perturbation ensembles
In this section, we specify and discuss in detail the ensembles of perturbation matrices considered in the main paper and in the following sections. We recall that we are dealing with isolated quantum many-body systems whose Hamiltonian takes the form
[TABLE]
with a fixed unperturbed Hamiltonian and a random perturbation . For the latter, we will define in Sec. III.1 an ensemble of operators in terms of their matrix elements in the basis of ,
[TABLE]
whose distribution is motivated by the generic form of “real” perturbations in physical systems, as well as by certain invariance properties of such systems (Sec. III.2). Finally, we will explain in Sec. III.3 that the parameter
[TABLE]
as already introduced below Eq. (m14) in the main paper, is generically exponentially small in the system’s degrees of freedom [see also Eq. (m15)], a crucial prerequisite for the applicability of our theory to many-body systems.
III.1 Definition
We consider statistical ensembles of perturbations , where each member of the ensemble is given by a formally infinite-dimensional random matrix (13). The distribution of the matrix elements should be chosen such that the main properties of the “true” perturbation are emulated reasonably well. As explained in the main paper, the distribution should allow for potentially (but not necessarily) sparse and/or banded matrices with real- or complex-valued entries bro81 ; bor16 ; fyo96 ; fla97 ; deu91 ; fei89 ; wig55 ; gen12 . As already emphasized, e.g., in Ref. fyo96 , this random matrix approach should be contrasted with the much more common modeling of a Hamiltonian such as in (12) in terms of a Gaussian orthogonal or unitary ensemble (GOE or GUE haa10 ; bro81 ), which would be entirely inappropriate in the present context.
From a technical point of view, the central objects are the overlaps
[TABLE]
[see (m8)] between the eigenvectors of and . More precisely, the essential quantities underlying our central results (m16) and (m20) are averages over four and eight factors of such , respectively, which in turn build on the average of two such overlaps. This raises the question which of the above exemplified specific features of the ensemble are relevant with respect to those averages over several matrix elements.
The simplest connection between the ’s and ’s is obtained by treating in (12) in terms of elementary (Rayleigh-Schrödinger) perturbation theory. Quantitatively, such an approach may only lead to useful analytical approximations for exceedingly small values due to the extremely dense energy eigenvalues of (the level spacing decreases exponentially with the system’s degrees of freedom gol10a ; lan70 ) and the concomitant small denominators. Hence, nonperturbative methods will be indispensable for our present purposes. However, qualitatively it is quite plausible on the basis of such a perturbative approach that beyond those extremely small values, a large number of ’s will appreciably contribute to any given or products thereof. It is thus reasonable to expect that some generalized kind of central limit theorem (CLT) may apply, so that only a few basic statistical properties of the ’s will be actually relevant for the ’s. We point out that all these abstract considerations will become concrete in Sec. IV, in particular Sec. IV.2.
Notably, quite significant correlations between the may still be admitted, but like in the CLT, they should be largely irrelevant with respect to the ’s. In this spirit, we take for granted that all matrix elements can be treated as statistically independent random variables, apart from the trivial constraint since is Hermitian. Furthermore, we recall that we are working within a macroscopically small but microscopically large energy interval , so that the spectra of and are approximately homogeneous throughout [see also the discussion below (m1)]. Consequently, for the dynamics, only states with are actually relevant. Therefore, we can simply assume an infinite homogeneous spectrum by suitably extending the levels beyond the energy interval . This in turn implies that the statistics of the can only depend on the difference , and similarly for the [see also (m11) and the discussions above (m2)].
Altogether, it is thus reasonable to expect (and will be further corroborated later) that the most important properties of the ensemble are its first two moments. Indicating ensemble averages as in the main paper by the symbol , we moreover expect them to be expressible as [cf. Eq. (m2)]
[TABLE]
Here, the “band profile” function admits the possibility of banded matrices. As discussed below Eq. (m2) in the main paper, sets the overall scale, while should be slowly varying with , normalized so as to approach unity for small , and upper bounded by a constant of order unity for all . Notably, however, is not required to approach zero for large , i.e., the matrices generally may, but need not be banded, nor is required to monotonically decrease.
For the rest, the distribution of the will indeed turn out (as anticipated above) to be rather arbitrary provided that the dimension of the pertinent energy interval is sufficiently large (see Sec. IV.2 in particular). We therefore stipulate the probability distributions of the matrix elements to be of the explicit general form
[TABLE]
with , , , and where are generic probability distributions with vanishing mean and unit variance. For , the argument in (17) and (18) is understood to be real. For , the argument may be complex or real, depending on whether we are dealing with an ensemble of complex or purely real matrices (see also Secs. V.2 and V.3 below for additional formal details). Moreover, we generally take the corresponding with to depend on the absolute value only, an assumption we will elaborate on in the subsequent Sec. III.2 in more detail. Finally, the restriction to indices with in (17) is justified by the fact that .
Note that from (17), (18), and the properties of one readily recovers the relations (16) for the first two moments. [This is one main reason behind the somewhat involved structure of (18).] While the first two moments (16) are thus directly built into the formal definition of the probability distributions (17) and (18), all their remaining statistical properties are still very general.
In particular, the possibility of sparse matrices is accounted for by the -peak in (18). In most cases, one actually expects that the diagonal elements are non-sparse, i.e., , and that all off-diagonal elements exhibit (approximately) the same sparsity, i.e., for all . In this context, we also recall the definition of the effective bandwidth
[TABLE]
introduced below (m2), implying if the matrices are not (or only very weakly) banded.
From the distributions of the individual entries (17), we then obtain the total distribution of the matrix in the eigenbasis,
[TABLE]
Averaging over the ensemble of perturbations means an integral over the probability measure , where
[TABLE]
is the Lebesgue measure of all independent entries of and is understood as a shorthand notation for , and where we tacitly focused on complex-valued for in (17). Symbolically, taking averages over the ensemble may thus be written in the two equivalent forms .
III.2 Physical motivation
In the previous section we specified the random matrix ensembles that will be admitted in our subsequent calculations. There, as well as in the main paper, we also argued why and how the investigation of such ensembles may provide insight with respect to our actual objective, namely to describe the effects of some given (non-random) perturbation of the “true” physical system of actual interest.
Here, we further elaborate on such considerations. Specifically, we are concerned with the physical motivation for our assumptions regarding the functions in (18). For this purpose, the true (non-random) perturbation of actual interest will be temporarily denoted as , while the symbol is reserved for the members of the considered random ensemble.
Our first observation is that if one takes for granted that the probability distributions in (17) indeed depend only on the differences , then those distributions can be estimated from the “empirical distribution” of the true matrix elements by considering subsets with identical values of within the energy interval . Similarly, the various parameters and functions appearing in (18) can be (approximately) determined. Along this line, the appropriate choice of the random matrix ensemble is thus uniquely fixed by the given . Analogously, also some of the further assumptions about this ensemble might be “empirically checked”, for instance the statistical independence of the matrix elements, or the features of . Next we specifically address the assumed properties of the functions .
According to textbook quantum mechanics, multiplying each eigenvector of the unperturbed Hamiltonian by a factor of the form with an arbitrary but fixed phase does not entail any physically observable changes. On the other hand, by doing so, each matrix element acquires an extra factor . If we now choose a basis with an arbitrary but fixed set of randomly generated phases , it is quite plausible that the corresponding “empirical distributions” from above will be well approximated for any given index pair by a probability distribution in (17) which does not depend separately on the real and imaginary parts of , but only on the absolute value (see also Sec. V.3 below). Hence, the same must apply to the functions with , appearing on the right-hand side of (18).
Note that in the above argument we tacitly assumed that after randomizing the phases of the basis vectors , a reasonable description of the so-obtained model within the general framework of our present random matrix approach is possible at all.
Once the property that only depends on is established for one (random) choice of the phases , the same property of (and thus of ) can be readily recovered for any other (random or non-random) choice of the . Moreover, the underlying ensemble of random matrices must exhibit the following invariance: all its statistical properties remain unchanged when each random matrix element of every member of the ensemble is multiplied by a factor for any arbitrary but fixed set of phases .
Essentially, we thus can conclude that if a random matrix approach is possible at all, then focusing on such -ensembles does not amount to any significant loss of generality (see also rei19 ; gen12 ; rei15 ).
Finally, the above invariance of the -ensemble can also be employed to once again recover the relation (16a) for any . For a direct numerical verification of this relation by means of a concrete physical model system, see also Ref. beu15 .
As an aside we note that even in case the matrix of the true perturbation happens to be purely real in some specific basis, one may conjecture that a suitably chosen ensemble with complex-valued off-diagonal elements might still do the job as well as a purely real-valued ensemble (this issue will be continued in Sec. V.2 below).
Turning to the so far excluded case , we first observe that we can always add a trivial constant to the true perturbation such that all the with the property [cf. below (m1)] are zero on average. A similar argument as before may then be invoked: If the true perturbation can be reasonably described within a random matrix approach at all, then only random ensembles which exhibit the property seem appropriate to faithfully emulate the actual system at hand. One thus can infer that in (18) must be a probability distribution with real valued argument and vanishing first moment. (Note that in the case it is not required that should only depend on .)
We finally remark that the statistics of the diagonal random matrix elements is in fact well-known to be of minor relevance anyway rei19 ; fyo95 .
III.3 Small parameter
A general feature of all the subsequent calculations is that they amount to approximations in which the quantity
[TABLE]
plays the role of a small parameter, i.e., we always assume that . Here is given by (m11), i.e.,
[TABLE]
Our first remark is that the small parameter from (14) [see also below (m14)], which measures the inverse full width at half maximum of in the weak-perturbation limit, is almost, but not exactly identical to . To be precise, we will find in Sec. IV.2 that the two small parameters are related via
[TABLE]
Our second remark is that in most cases is found to assume its maximum at and hence . However, exceptions with are still possible, as exemplified in Fig. 1 of Ref. fyo96 .
Our next objective is to quantify the “smallness” of and hence of somewhat more precisely. Quite obviously, the actual value of depends on many details of the specific model under consideration, hence somewhat more general statements are only possible in terms of nonrigorous arguments and rough estimates.
As mentioned earlier, the mean level spacing is exponentially small in the system’s degrees of freedom lan70 ; gol10a . For a typical macroscopic system with, say, degrees of freedom, is thus an unimaginably small number (on the corresponding natural energy scale, which usually will be very roughly speaking on the order of Joule with ). Also for “mesoscopic” systems with considerably less extreme values (arising for example in cold-atom experiments or in numerical simulations), the level spacing is usually still expected to be extremely small on the natural energy scale of the system at hand.
Since due to (23), and given (24), it seems reasonable to expect, and will be confirmed by all our particular examples below, that is generically a rather slowly varying function of its argument , decaying on a characteristic scale of the order of from its maximal value at towards its asymptotic value zero for . Moreover, we recall that its Fourier transform from (6) governs the modification of the temporal relaxation in (m7) due to the perturbation . We thus can infer from (6) that the characteristic time scale of this modification will be of the order of , where we are still working in time units with [cf. below (m1)]. Since is exponentially small in the degrees of freedom , also is expected to be exponentially small in , at least if one disregards extremely weak perturbations, which entail modifications of the relaxation which are themselves exponentially slow in . All these considerations may thus be symbolically summarized like in (m15) as
[TABLE]
IV Eigenvector correlations from supersymmetry
In this section, we derive the results (m13), (m14), and (8) [see also below (m19)] for the “second moments” (23), as well as the related results for the “fourth moments” , solving the pertinent random matrix problem by means of supersymmetry methods. We lay out the general procedure in Sec. IV.1 before presenting the explicit calculations for the second and fourth moments in Secs. IV.2 and IV.3, respectively.
Throughout this section we tacitly restrict ourselves to matrices with complex-valued off-diagonals [see also the discussion below Eq. (18)]. The case of purely real matrices is recovered by combining the results from fyo96 with the approach from Sec. V below.
IV.1 Outline of the method
In this subsection, we sketch the general methodology on which the more detailed calculations in the subsequent Secs. IV.2 and IV.3 are based.
During most of the actual calculation, we choose a Hilbert space of large, but finite dimension so that , , and in (12) amount to matrices (in the eigenbasis of ). Eventually, we will let while keeping both the perturbation strength and the average level spacing fixed. In the considered ensemble of these Hamiltonians (12), is fixed and is a Hermitian and possibly sparse and/or banded random matrix distributed according to (20).
Eigenvector overlaps from resolvents.
As pointed out in the main paper and in Sec. III.1 above, the key task is to evaluate the ensemble average over products of matrix elements (15). Such products can be expressed in terms of the resolvent or Green’s function of the Hamiltonian , defined as the operator
[TABLE]
To do so, we choose an arbitrary but fixed energy level of and consider the matrix element , yielding
[TABLE]
Next we adopt the assumption that the ensemble-averaged second moments are (for arbitrary but fixed and ) sufficiently slowly varying with , so that the sum over in (27) can be approximated by an integral, i.e.,
[TABLE]
and similarly for higher-order moments. Our justification of this assumption is threefold: (i) Exploiting the assumption, we will later determine an explicit (approximate) expression for , which indeed turns out to be slowly varying with . Taking for granted that the exact solution for is unique and that our approximation is not fundamentally flawed, it follows that we have found a close approximation to the unique exact . (ii) In numerical examples we always found that indeed satisfies the assumption. (iii) In random matrix theory, such assumptions are routinely taken for granted without any further comment, while to our knowledge no counterexample has ever been observed.
According to (28), we thus can express the ensemble-averaged product of two eigenvector overlaps in terms of the advanced and retarded resolvents and , respectively. Analogously, for the fourth-order moments, we will later combine two such second-order expressions (details will be given in Sec. IV.3).
The advantage of the resolvent formalism is that the matrix elements of can be expressed as a Gaussian integral with kernel . Introducing the abbreviation with and for the argument of the retarded and advanced resolvents, the resolvent matrix element can be written as
[TABLE]
where , and the choice of the sign in the exponent ensures convergence since . The normalization factor , and even more so its average, are in general hard to compute because is a high-dimensional matrix. Nevertheless, it can be expressed conveniently by extending the Gaussian integral to anticommuting numbers.
Supersymmetry method.
Our method of choice for the evaluation of average resolvents uses so-called supersymmetry techniques efe83 ; zuk96 ; mir00 ; ber10 ; haa10 . The underlying concept of graded algebras and vector spaces introduces a set of anticommuting or Grassmann numbers with the defining property that for any two such elements. The above cited references provide an introduction to the linear algebra and calculus on the resulting superspaces.
The crucial observation is that for a Gaussian integral similar to (29), if performed over Grassmannian, anticommuting numbers and , we obtain
[TABLE]
where our normalization for the Grassmann integral is . Combining (29) and (30), we can thus write
[TABLE]
To condense the notation, we introduce a supervector with
[TABLE]
In the following, we will refer to the commuting component as the bosonic part, and to the anticommuting component as the fermionic part of the supervector . Similarly, for a supermatrix , we use the notation to address the bosonic-bosonic sector, and consequently for the bosonic-fermionic, for the fermionic-bosonic, and for the fermionic-fermionic sectors. Using the above definition of , we can then express (31) as
[TABLE]
Here, and for short. The Kronecker product in the exponent thus consists of a “Hilbert space” factor and a “superspace” factor . In a slight abuse of notation, we will omit the Kronecker products in the following and identify and with their respectively lifted counterparts and on the combined Hilbert-superspace, and similarly for other operators acting trivially on either of the two factor spaces. To obtain the average over all perturbations, we then have to integrate over the distribution (20) of , i.e.
[TABLE]
The necessary modifications for the computation of the fourth moments , involving products of two resolvents, will be further discussed in Sec. IV.3.
Outline of the algorithm.
The general procedure to evaluate expressions like Eq. (34) involves the following steps: First, we integrate over the matrix elements of the perturbation , which can be carried out straightforwardly as the integrals are of Gaussian type due to a (generalized) central limit theorem. The remaining superintegral then involves an exponent of fourth order in the supervector . Therefore, second, we invoke a Hubbard-Stratonovich transformation str57 ; hub59 ; efe83 by introducing an auxiliary integral over a supermatrix. Thereby, we reduce the dependence on in the exponent to second order. This allows us, third, to perform the (now Gaussian) integral over the supervector . Fourth and last, we evaluate the remaining integral over the Hubbard-Stratonovich supermatrix by means of a saddle-point approximation, making use of the large dimensionality of the considered Hilbert space. In the end, we thus obtain explicit expressions for (products of) averaged resolvents, which give us access to the averaged eigenvector overlap products according to Eq. (28).
IV.2 Second moment
In this subsection, we derive the nonlinear integral equation (m13), whose solution gives access to the function from (23). In addition, we solve the equation explicitly in the limiting cases of weak and stronger perturbations [compared to the scale set by the perturbation bandwidth, cf. (19)], yielding expressions (7) and (8), respectively, for . To this end, we compute the average of the second moment (28) by calculating the average resolvents (34).
Ensemble average.
Evaluating the ensemble average in Eq. (34) amounts to computing the average
[TABLE]
Recalling the definition of from (16b), and introducing and , the exponent on the left-hand side of (35) can be rewritten as , where
[TABLE]
In other words, amounts to a weighted sum of independent random variables of zero mean and unit variance. According to the central limit theorem (cf. also the discussion in Sec. III.1), we can thus conclude that approaches, for large , a Gaussian distribution with mean zero and variance , and analogously for .
Since all random variables only occur in combinations of the form (36) in the average resolvent (34), only the limiting distribution of actually matters. Hence, we may approximate the distributions from Eq. (17) by any other distributions with the same values of the mean and variance (since they lead, for a given , to the same limiting distribution for ). In particular, we may use
[TABLE]
approximating each matrix element by a normal distribution (with real argument for and complex for ). Evaluating the integrals over all () in (34), which factorize because all matrix elements are independent, we then obtain
[TABLE]
where ‘’ denotes the supertrace, i.e., for a supermatrix .
For later use, we also note here that the integrand is invariant under a transformation , for any (pseudo)unitary which satisfies .
Hubbard-Stratonovich transformation.
Our next step is a supersymmetric Hubbard-Stratonovich transformation str57 ; hub59 ; efe83 to get rid of the fourth-order term in in the exponent of (38). To do so, we introduce a set of supermatrices ,
[TABLE]
with real numbers , and anticommuting , . The choice of an imaginary entry ensures convergence of the following integral. Namely, the Hubbard-Stratonovich transformation is then based on the identity haa10
[TABLE]
where is the inverse of the Hilbert space matrix with entries . Moreover, denotes the collective measure of all with . Substituting (40) into (38), we obtain
[TABLE]
Comparing (38) and (41), we observe that the Hubbard-Stratonovich transformation essentially identifies . The integral over the supervector in (41) now has a Gaussian structure. Therefore, we can evaluate it straightforwardly to find
[TABLE]
Saddle-point approximation.
Keeping in mind the huge dimension of the considered Hilbert space and the fact that we will eventually take the limit , the remaining integral over the supermatrices can be evaluated using a saddle-point approximation ben99 ; mir00 ; haa10 . The reason is that the exponent in (42) is a sum of very similar terms, so that the exponential is strongly peaked around the global maximum of the real part of its argument and/or highly oscillatory except at points where its imaginary part becomes stationary. Hence the dominant contributions arise from the steepest saddle points in the complex, multidimensional plane, where the first variation of the exponent vanishes. Computing this first variation and multiplying by the matrix , we obtain the saddle-point equation
[TABLE]
To find the saddle points, we first look for diagonal solutions of this equation and consider one component of , explicitly indicating the dependence of the solution on both the unperturbed and perturbed energy levels. Any further solutions can be generated from diagonal ones by means of the (pseudo)unitary symmetry transformation introduced below (38) haa10 . Next, we substitute the variances from (16b) with as specified there, and we make use of the assumption that the energy levels are very dense and approximately uniformly distributed, so that the density of states is essentially [cf. discussion below (m1)]. Provided that the summands on the left-hand side of (43) are slowly varying with , which – as will become clear below – is satisfied for [cf. Eq. (m15) and Sec. III.3], we can approximate the sum by an integral and arrive at
[TABLE]
In fact, given the assumed homogeneity of the spectrum, the solution will only depend on the energy difference , i.e., can be rewritten in the form . This simplifies the equation further, resulting in the convolution-type relation
[TABLE]
Note that the sign of the imaginary part of has to match the sign of the imaginary part of , , for otherwise the integrand in (45) develops a pole. Anticipating the final structure of from (42), we introduce the function
[TABLE]
for which, consequently, must hold. Substituting into (45), we find that it satisfies
[TABLE]
which is Eq. (m13) from the main paper. Assuming that solves (47), we immediately find a diagonal solution
[TABLE]
of the saddle-point equation (43). As mentioned below (43), any further solutions of the saddle-point equation can be generated from the diagonal one by means of symmetry transformations satisfying . From the Hubbard-Stratonovich transformation (40), we understand that the supermatrix transforms as under this symmetry. But since the solution from (48) is proportional to the unit matrix, all transformed solutions collapse back onto the diagonal one, so that this is indeed the only solution of (43). We remark that this will be manifestly different when computing the fourth moment in Sec. IV.3. Furthermore, one also verifies straightforwardly that the second variation of the exponent in (42) with respect to is proportional to the unit matrix upon substitution of . Therefore, its superdeterminant is unity and the saddle-point approximation of the integral (42) is obtained from its integrand by plugging in the solution (48) for , so that
[TABLE]
Hence the function is appropriately described as the scalar average resolvent. Using [see below (m1)] and (28), and remembering that , we then find
[TABLE]
with
[TABLE]
Here, the last equality follows from [see (26)], and hence, due to (49), . This establishes the results around Eq. (m13) from the main paper.
The remaining task is to solve the defining equation (47) for the scalar average resolvent . Unfortunately, no general solution to this nonlinear integral equation is known to us. However, analytic solutions are available in the limiting cases of weak and strong perturbations compared to the scale of its bandwidth (19).
Weak perturbations.
Provided that in Eq. (47) decays much faster than (which we will verify self-consistently in the end), we can neglect the band profile in the integral altogether and approximate (by normalization), leading to
[TABLE]
From the full equation (47), we then obtain
[TABLE]
Next, we substitute this functional form back into (52) to determine the constant . The resulting integral needs to be evaluated in the principal-value sense, indicated by the symbol “” in the following. For , this leads to
[TABLE]
With (53), we thus find that the average scalar resolvent in the weak-perturbation or large-bandwidth limit is given by
[TABLE]
with from (7). Together with (51), this proves the result (7) and hence (m14). The self-consistency condition for the weak-perturbation approximation, meaning that is decaying much faster than the band profile , becomes as stated below Eq. (m4), provided that remains of order unity for .
Finally, as a technical remark, we note that the solution (48) with from (55) does not lie on the original contour of integration in (40), see also (39). However, the contour can be adjusted accordingly by shifting , which is allowed because the poles in the integrand at all lie on the same side of the real axis, below it for ‘’ and above it for ‘’, respectively.
Stronger perturbations.
In the opposite limiting case, when the band profile decays much faster than the solution of (47), we can approximate in the integrand by its central value . Using the definition (19) of the effective bandwidth, we then find
[TABLE]
where was defined as in (8). Solving this algebraic equation, we immediately obtain
[TABLE]
where we already incorporated the requirement that the imaginary parts of and should have opposite signs [see below (46)]. Together with (51), we then recover (8) and hence the result stated below (m19) for in the limit of a dominating overlap scale compared to the bandwidth. More precisely, this latter assumption is verified self-consistently if , as stated below (m5) in the main paper.
IV.3 Fourth moment
In this subsection, we evaluate the fourth moments of eigenvector overlaps when the perturbation strength is weak compared to its bandwidth, which is the physically most relevant case for generic many-body systems. More precisely speaking, we will derive the following results:
[TABLE]
to leading order in the small parameter from (22). As some of the steps are straightforward generalizations of the calculation of the second moment in Sec. IV.2, the presentation will be tightened in these places. Moreover, we immediately assume a normal distribution (37) for the perturbation matrix elements because the general case of sparse and banded perturbations can be reduced to this effective model as sketched in Sec. IV.2. Focusing, as announced above, on the weak-perturbation/large-bandwidth limit, which essentially amounts to a constant band profile (cf. Sec. IV.2), we will assume right from the beginning that the variance for all and control the interaction strength by the parameter .
Before turning to its derivation, we demonstrate by means of Fig. 4 that the analytical leading-order approximation (58) captures the actual and quite nontrivial behavior with great accuracy. Concretely, Fig. 4 provides a detailed comparison of the result (58) with a numerical computation of for an ensemble of random matrices as specified in the figure caption. Notably, the approximation seems to work surprisingly well already for relatively moderate values of the small parameter or, equivalently, for relatively moderate matrix dimensions .
Properties of the fourth moment.
The averaged fourth-order eigenvector overlap
[TABLE]
involves the components , , , and of two perturbed eigenvectors , in the basis of the unperturbed system. We begin by collecting a few properties and symmetries that follow directly from its general structure and the unitarity of the overlap matrix . First, there are two symmetry properties that are obtained from interchanging indices in Eq. (59). Namely, the result should
- (i)
be invariant under the exchange of labels and , i.e. , , ; 2. (ii)
be complex conjugated upon and .
Second, the fourth moment must relate to the second moment (50) when tracing out eigenvectors of the unperturbed or perturbed Hamiltonians. Hence the result should
- (iii)
reduce to the second moment when summing over or , i.e.
[TABLE] 2. (iv)
reduce to the second moment when summing over or , i.e.
[TABLE]
Furthermore, the Gaussian distribution of the perturbation matrix elements together with the Isserlis (or Wick) theorem (see also Secs. V.1 and V.3 below) imply that we always need to pair up factors of and . This will become obvious in the explicit calculations below. In any case, it already affirms the general structure (58a) with and to be determined.
To compute the correlator (59), we proceed as for the second moment and aim at extracting it from the product of average resolvents similarly to (28). As we are dealing with a product of four overlap matrices now, we have to consider products of two resolvents. A crucial observation is that we need to distinguish the cases (single eigenvector) and (two distinct eigenvectors).
Resolvent approach: single eigenvector.
In the first case, , we perform similar steps as those leading to Eq. (28) to express the fourth-order overlap in terms of resolvents, yielding
[TABLE]
Here as in Sec. IV.2. The symmetrization in the Greek indices becomes necessary because such symmetrized combinations of resolvents are the only quantities that can be computed as Gaussian integrals if the argument of both factors is the same. Indeed, the corresponding integral is essentially the same as in Eq. (33) except that we have four factors of in the integrand now to account for the fourth moment. Therefore,
[TABLE]
and similarly for , where the right-hand side was obtained using the Isserlis/Wick theorem. Averaging Eq. (63) over the ensemble of perturbations works analogously to the calculation of the second moment in Sec. IV.2, except that we now have four factors in the integrand instead of two. When going from the equivalent of Eq. (41) to the equivalent of Eq. (42), we need to use the Isserlis/Wick theorem again and obtain two inverse matrix elements in the remaining superintegral over the Hubbard-Stratonovich matrix ,
[TABLE]
Note that due to the constant variance of the perturbation matrix elements assumed in this subsection, a single Hubbard-Stratonovich matrix of the form (39) suffices in the transformation (40). Computing the -integral by means of a saddle-point approximation works completely analogously, so that we eventually find
[TABLE]
In the remaining terms in Eq. (62), the resolvents are evaluated at distinct arguments and in the products, e.g., . In this case, the calculation parallels the one for distinct eigenvectors, to which we will turn next.
Resolvent approach: distinct eigenvectors.
If the perturbed eigenvectors and in Eq. (59) are distinct, , the product of overlap matrices can be expressed in terms of the retarded and advanced resolvents by multiplying two expressions of the form (28). Defining , we obtain
[TABLE]
There is a decisive difference between the terms in the second line and the ones in the third line of this relation. In the second line, the imaginary parts of the argument of the two factors in each resolvent product are the same, so that we have a product of two retarded or two advanced resolvents. In contrast, the terms in the third line each involve one retarded and one advanced resolvent.
We inspect the first kind with energies shifted to the same side of the real axis first. Since , we now need two supervectors and of the form (32) in order to write the product of resolvents as a Gaussian integral,
[TABLE]
Averaging this expression over the ensemble of perturbations, the entire calculation for the second moment from Sec. IV.2 carries over, except that we have essentially two copies of the same integral haa10 . Most importantly, the resulting saddle-point equation still has a single solution proportional to the unit matrix. The upshot is that the averaged product of two advanced or two retarded resolvents factorizes into the product of two averages, i.e.
[TABLE]
where the single averages were given in (49).
The situation is manifestly different for the average product of one retarded and one advanced resolvent, which is the form of the remaining terms both in Eq. (66) for two distinct eigenvectors and in Eq. (62) for a single eigenvector. There, the saddle-point equation obtained after averaging over the ensemble of ’s does not have a unique diagonal solution anymore, but instead exhibits a whole manifold of degenerate saddles.
Expressed as a Gaussian integral, the product of two different resolvents takes the form
[TABLE]
where the matrices were defined below Eq. (33). To compress notation, we introduce a collective supervector with
[TABLE]
In the following, we will refer to the first two components (superscript 1) as the retarded and the last two components (superscript 2) as the advanced sector. Furthermore, we define the abbreviations
[TABLE]
as well as the diagonal matrices
[TABLE]
Note that and . With these definitions, we can write Eq. (73) in the more compact form
[TABLE]
To compute an explicit expression for the average of this equation, we then follow the recipe outlined at the end of Sec. IV.1.
Ensemble average and symmetries.
In the first step, we integrate Eq. (73) over the ensemble of perturbations specified by the distribution from Eqs. (20) and (37). Similarly as in Sec. IV.2, this leads to
[TABLE]
Before invoking the Hubbard-Stratonovich transformation, we inspect the symmetries of the integrand in this equation. To this end, we need to consider the relative location of the considered eigenstates and with respect to each other, quantified by the parameter . On the one hand, if the energy difference is on the order of the mean level spacing, , the two eigenvectors belong to close-by levels in the spectrum, corresponding to the regime where significant correlations due to the orthonormality constraint are expected. As is small compared to the typical scale of eigenvector correlations in this case, we can neglect the term to leading order and the integrand in Eq. (74) becomes invariant under pseudounitary transformations , with . Hence the integrand has an approximate pseudounitary symmetry for small .
On the other hand, if , the term is not negligible and the approximate symmetry breaks down. In this case, the approximation collapses to the result that would be obtained if we treated the eigenvectors as independent random variables, see also Sec. V. In the intermediate regime, the situation is much more subtle and will be discussed in more detail below, after we have derived a result for small .
Hubbard-Stratonovich transformation.
The Hubbard-Stratonovich identity applicable to transform Eq. (74) looks similar to Eq. (40) for the second moment,
[TABLE]
except that a single supermatrix suffices due to the constant variance . It is conveniently parameterized as zuk96 ; mir00 ; haa10
[TABLE]
The supermatrices and are Hermitian, and the parameter will be adapted so that the integration contour passes through the saddle points mir00 . The block-diagonalizing transformation matrix satisfies and thus belongs to the (approximate) pseudounitary symmetry group of the integrand in (74). Applying the transformation (75) to the integral (74), we obtain
[TABLE]
Using the Isserlis/Wick theorem, we can then perform the Gaussian integration over , leading to
[TABLE]
As before, the indices and for the supermatrix elements refer to the retarded and advanced components (corresponding to and ), and and denote the bosonic and fermionic sectors of superspace, respectively.
Saddle-point approximation.
To evaluate the remaining superintegral over , we employ a saddle-point approximation again. Upon variation of the exponent in (78), the resulting saddle-point equation reads
[TABLE]
Following the calculation to solve Eq. (43) for the second moment, we readily conclude that a diagonal solution of (79) is given by
[TABLE]
with as before. The crucial difference to the calculation for the second moment is that this diagonal solution is no longer proportional to the unit matrix. As observed below Eq. (74), the integral and hence also the saddle-point equation (79) has a pseudounitary symmetry for , mediated by linear transformation matrices with . Given the solution from Eq. (80), all matrices with pseudounitary are also (approximate) saddle points, and they all contribute equally because the exponent of Eq. (78) is invariant when is negligible. Therefore, we have to sum the contributions from all these saddles, i.e., we need to integrate over the symmetry group of transformation matrices .
Evaluating the integrand at the saddle points is straightforward and works analogously to the calculation for the second moment. Substituting and introducing a new integration variable , the remaining integral over the manifold of degenerate saddles then reads
[TABLE]
where denotes the integration measure, which we will specify in a moment after fixing a convenient parameterization for . Note that we used and hence in the exponent.
Setting up the saddle-point integral.
We observe that can be split into a product of matrices that are block-diagonal in either the retarded-advanced or the boson-fermion decomposition zir86 ; fyo94 ; haa10 . To this end, define supermatrices and as
[TABLE]
with anticommuting , , , , so that is unitary (), while is pseudounitary ( with ). With these and the diagonal matrix , the pseudounitary transformation matrix can be expressed as haa10 ; zir86 ; mir00
[TABLE]
where the left and right matrices are block-diagonal in the boson-fermion decomposition, and the middle matrix is block-diagonal in the retarded-advanced decomposition. Writing the bosonic and fermionic eigenvalues as with , , , the measure takes the form haa10
[TABLE]
Substituting (83) into the definition of and introducing new integration variables and , we find
[TABLE]
and the measure takes the form
[TABLE]
This completes the parameterization of the integrand in Eq. (81). To evaluate the integral, we need explicit expressions for the -submatrix of ,
[TABLE]
where
[TABLE]
For the exponent of the integrand in Eq. (81), we recall that the degeneracy of saddles occurs for , so that we can expand it to first order in . Using the saddle-point equation (79), we obtain
[TABLE]
Substituting the measure (86), the rational part (87) of the integrand and the exponent (89), the integral over the saddle-point manifold thus reads
[TABLE]
with
[TABLE]
Hence we sorted the terms by their dependence on the Grassmann variables.
Evaluating the saddle-point integral.
To calculate the integral (90), we analyze the various terms (91) individually. Naively, the Grassmann integral over all terms that do not contain the full set of anticommuting generators vanishes. This would imply that the only contributions come from the and terms. However, we observe that the integration measure is singular in the commuting variables at the boundary of the domain of integration where , corresponding to the “origin” , see Eq. (85). This singularity potentially leads to diverging integrals over the commuting variables, and the total integral of “” type has to be dealt with using the Parisi-Sourlas-Efetov-Wegner (PSEW) theorem par79 ; mck80 ; efe83 ; weg83 ; zir86 ; con89 ; haa10 , telling us that the value of the integral is given by the value of the integrand at the origin, i.e. for the “superrotation” matrix replaced by the identity. For our current problem, this means that or, equivalently, .
Inspecting the terms (91), the conditions of the PSEW theorem are met by the integrals of and . Consequently, we obtain
[TABLE]
and . For the integrals of and , the singularity in the measure is lifted by an additional factor in the integrand, rendering the commuting integrals convergent. Due to the incomplete set of Grassmann variables in these terms, the total integral thus vanishes, . Similarly, the bosonic integral of converges, so that . For the integral of , we note that the terms proportional to and , respectively, come with opposite signs, but are otherwise symmetric. Hence this integral has to vanish, too, . Finally, we are left with the integrals of and . The fermionic integral is evaluated straightforwardly, giving . Furthermore, the angular integrals over and both yield factors of compensating the corresponding factors in the measure. For the remaining integrals over and , the following identities gra07 ; erd54 turn out useful:
[TABLE]
if , and
[TABLE]
if . Here denotes the exponential integral function gra07 ,
[TABLE]
The integrals of and can then be computed by decomposing the integrand into partial fractions in and and using identities (93) and (94) as well as derivatives thereof. Denoting and , we obtain
[TABLE]
and
[TABLE]
To simplify these expressions further, we observe that the arguments entering the exponential integral functions and in both expressions depend on the large, real parameter [cf. Sec. III.3]. Using the asymptotic expansion as (with a branch cut discontinuity of in the imaginary part along the negative real line), we can approximate the integrals as
[TABLE]
and
[TABLE]
Collecting terms.
The nonvanishing contributions to the averaged retarded-advanced resolvent as given in Eq. (90) are Eqs. (92), (98), and (99). For the fourth moment of eigenvector overlaps, we also need the averaged advanced-retarded resolvent [see Eqs. (62) and (66)]. The contributions to this term are obtained from those of the retarded-advanced resolvent by exchanging all indices 1 and 2, i.e., , , and . Combining the two and letting , we then find
[TABLE]
where . Using [cf. discussion below (m1)] and combining Eqs. (49), (68) and (100) according to Eq. (66), we obtain a first estimate for the fourth moment of the overlaps involving two distinct eigenvectors and ,
[TABLE]
Similarly, we combine Eqs. (65) and (100) according to (62) to obtain an estimate of the fourth overlap moment of a single perturbed eigenvector ,
[TABLE]
Next, we can combine Eqs. (101) and (102) and write . In the following, we restrict ourselves to the leading-order contribution in the parameter from (22) or, equivalently, from (14), meaning that we keep only terms up to order .
We consider the terms in Eq. (101) for first. The function is of order , see (7). The function acts as a Kronecker- and thus does not contribute for . Keeping only terms up to order , we then find
[TABLE]
with
[TABLE]
Note that is already the same symbol as defined in (58b) above. For the single eigenvector correlations, we observe that they carry a prefactor in the combined expression. This effectively decreases the order of these terms by a factor of because we eventually sum over all perturbed eigenvectors and [see, e.g., Eq. (m10)], but the double sum reduces to a single sum for the single eigenvector correlations. Hence we only keep terms of order in Eq. (102), so that
[TABLE]
Altogether, we then obtain
[TABLE]
Symmetry restoration.
Exchanging indices and/or approximating sums over the spectrum as integrals as before, one readily verifies that this result satisfies the Properties (i), (ii) and (iii) of the fourth moment collected in the beginning of this subsection below Eq. (59). However, Property (iv) is violated: Substituting (107) into (61), we get
[TABLE]
where was defined below (m16),
[TABLE]
The first term on the right-hand side is the expected reduction, but the additional, nonvanishing second term spoils the symmetry. Notably, this violation is not an artifact of the leading-order approximation. Rather, it can apparently be traced back to the approximate character of the saddle-point degeneracy discussed below Eq. (80). If the pseudounitary symmetry with were perfect, all saddles obtained from the diagonal solution by pseudounitary rotations would contribute equally. However, as mentioned above, the degeneracy becomes exact only in the limit of close-by eigenvectors (i.e. ), and the approximation is still good for small on the order of the mean level spacing , but breaks down when . This latter limit is correctly reflected in Eq. (107), too, because then the term becomes negligible compared to . However, in the intermediate regime with , where the symmetry is neither perfect nor completely broken, the situation is much more subtle and, unfortunately, hardly analytically tractable. Fortunately, though, the symmetry property (iv) can be restored a posteriori.
To this end, we look for a correction term such that Property (iv) is retrieved when replacing . At the same time, the corrected term should still possess Properties (i) through (iii). Properties (i) and (ii) limit the possible form of the correction to a function of the invariants
[TABLE]
In addition, the correction should be of the same order in . Anticipating a structural similarity to the preliminary result (IV.3), we thus make an ansatz of the form
[TABLE]
and being constants, independent of all variables in (110). Property (iii) requires that . Solving for and using the constraints that all should be constants, we find that and . Hence the only free variable is, e.g., the coefficient . Substituting this ansatz into the defining equation (61) of Property (iv), we find that solves the equation. The correction term thus reads
[TABLE]
Setting , we finally recover our main result for the fourth moments from Eq. (58). We remark that this result also generalizes previous findings from Refs. nat18 ; ith18 , which were obtained by means of completely different approximations and under quite substantial additional restrictions. Yet another, and in fact more general, such approach will be elaborated in the subsequent Sec. V.
V Generalized Approximation
The central result (m16) from the main paper was derived by averaging (m7) over some suitably defined ensemble of perturbations . In doing so, the main task was to evaluate the average over the products of four matrix elements on the right-hand side of (m10) [cf. (59)], henceforth denoted as
[TABLE]
This task was accomplished in the previous section by means of rather demanding and lengthy supersymmetry methods. In the present section, an alternative way of evaluating such averages will be worked out. It is based on an approximation which considerably simplifies the actual calculations and which will lead to practically the same final result as the supersymmetric approach.
The general idea is to approximate the “fourth moments” on the right hand side of (113) solely in terms of the “second moments” from (23) and then to exploit the previously known results for . In this way, we will be able to treat, with relatively little effort, also more general cases than those which were explicitly worked out in the previous section.
V.1 Symmetry consideration
Our starting point is the following observation: If a given random perturbation with matrix elements results, for the perturbed Hamiltonian in (12), in a certain set of eigenvectors and corresponding matrix elements according to (15), then a perturbation with modified matrix elements leads to a modified basis transformation of the form , where the are arbitrary but fixed phases. Likewise, the are in principle once again arbitrary but fixed phases, which however are actually irrelevant, since they pairwise cancel in the fourfold products on the right hand side of (113). Explicitly, the corresponding modification of (113) takes the form
[TABLE]
According to Secs. III.1 and III.2, all statistical properties of the ensemble are independent of the phases . Therefore, the same must apply, in particular, to the average in (114). It follows that the average in (114), and hence also the average in (113), must vanish except if the phases pairwise cancel each other. In other words, each of the first two matrix elements (without the “star” symbol) must have a “partner” among the last two matrix elements (with “stars”), where being partners means that the second indices are equal. The latter is the case if and only if one of the following two conditions is fulfilled: (i) and ; (ii) , , and (the extra condition prevents double counting of the case ). We thus can infer that
[TABLE]
where is the Kronecker delta.
For later convenience, we can also conclude along similar lines as below (114) that
[TABLE]
for arbitrary , where is defined in (23). Furthermore, taking into account (113) and (115) we can infer from (m9) and (m10) that
[TABLE]
V.2 Real and complex random matrices
So far, it was always tacitly understood that the off-diagonal matrix elements are in general complex-valued. On the other hand, it is well known from textbook random matrix theory haa10 ; bro81 that if the system exhibits a certain symmetry (related to time inversion) then the eigenvectors can be chosen so that and for all , and . Whether the considered system exhibits this symmetry or not, and thus the pertinent random matrix ensemble is purely real or not, is known to be of great importance for instance with respect to the level statistics haa10 ; bro81 .
With respect to the quantities which are at the focus of our present work [namely, the perturbed expectation values in (m9), the function in (23), etc.], we found that ensembles with real-valued matrix elements behave practically indistinguishable from their complex-valued counterparts. For instance, in Sec. IV.2 we derived the result (7) for certain complex-valued ensembles, while the same result was obtained in fyo96 for the corresponding real-valued ensembles, and likewise for the result in (8). More precisely, this similarity between real- and complex-valued ensembles is found to only hold true under our usual assumption that in (22) is a small parameter (see (25)) and thus subleading terms in can be neglected. Put differently, with respect to the higher-order corrections, differences between the real and complex ensembles may still be possible.
Throughout this Supplemental Material, we confine ourselves to the complex case. Though omitted here, we also worked out the real case and always found identical final results in (120) apart from the higher-order corrections in . Unfortunately, we are not aware of a simple general argument why this is so. (The argument from Sec. III.2 may be considered as a first hint.) We also remark that in contrast to the final results, the intermediate steps often differ quite considerably. For instance, in the real case only phases would be admitted in (114), leading still to the same conclusion as in (115)–(119a) and (119c), whereas (119b) becomes invalid, and also the evaluation of and in the following subsections yields quantitatively different results.
V.3 Complex random variables
Complex random variables are well-established in probability theory, yet it may be worthwhile to collect some issues of particular interest in our present context.
A complex random variable is defined as a pair of real random variables via . Accordingly, its probability distribution amounts to the joint distribution of the two real variables. The corresponding probability density may thus be written either in the form or . Given the probability density , the expectation value of an arbitrary function can be readily determined as usual, in particular arbitrary “moments” of the form .
The distribution (or the random variable itself) is called circular symmetric if the probability density does not depend on and separately, but only on , i.e., it can be written in yet another alternative form, namely as . It is called Gaussian if for some , and non-Gaussian otherwise.
Similarly as in the discussion above (114), one sees that is a complex random variable in the above specified sense, which must exhibit the same statistical properties as for an arbitrary but fixed phase . Hence, the distribution of must be circular symmetric for any index pair . Moreover, one readily sees, similarly as above (23), that the distribution of does not depend on and separately, but only on the difference , and likewise for all its moments .
If the distribution is furthermore Gaussian, then one readily verifies that
[TABLE]
Since the unitary must satisfy the relation for any , we can conclude that , hence the random variable cannot be strictly Gaussian distributed. In the following, we will therefore not assume that the distribution is Gaussian. However, it will be taken for granted that the deviations from a Gaussian distribution are not too extreme, so that the fourth moment can be written in the form
[TABLE]
with some real-valued, non-negative function which remains of order unity for all . In other words, there exists a constant with the property that
[TABLE]
for all . (In principle, this requirement could still be considerably weakened). For instance, probability densities with extremely slowly decaying tails are thus excluded. Cases which are of interest to us but disobey (123) have to our knowledge never been observed so far.
Analogously, two complex random variables and give rise to a joint probability density , from which arbitrary expectation values can be deduced. They are called uncorrelated if and (analogous relations for and then automatically follow). A forteriori, they are called independent if can be written in the form .
Similar generalizations for more than two random variables are straightforward. In particular, if are multivariate Gaussian variables with zero mean, then the Isserlis (or Wick) theorem reads
[TABLE]
where the notation means summing over all distinct ways of partitioning into pairs and each summand is the product of the pairs. (The complementary relation is of less interest to us.) In particular, one readily recovers (121) as a special case.
Considering two arbitrary matrix elements and which are not identical, i.e., the index pairs and are different, it follows with (119a)–(119c) that they are uncorrelated in the above defined sense. On the other hand, they cannot be independent for the following reason: From the definition (15) of the unitary , one readily infers the usual orthonormality relations
[TABLE]
In particular, choosing arbitrary but fixed and focusing on , it follows that the value of for one specific is determined by the values of for all . Hence the random variable cannot be independent from all the other ’s with .
In some previous analytical investigations deu91 ; sre94 ; nat18 , it was taken for granted that the are, at least in sufficiently good approximation, independent and/or Gaussian random variables, and also in numerical examples we observed that these approximations seem to be fulfilled very well. However, we have seen above that neither of the two properties can be strictly true. Moreover, it has been observed for instance in gen12 that the deviations from a Gaussian distribution may not necessarily be negligibly small. Accordingly, one main point of our subsequent considerations is to still admit (small) deviations from strict Gaussianity and/or independence, and to carefully keep track of their effects on the final results in (120).
V.4 Simplest approximation
As seen above, the complex random variables appearing in (116) and (117) are uncorrelated but not strictly independent. In this section, we investigate the consequences of treating them as (approximately) independent nevertheless (but still admitting deviations from Gaussianity).
Assuming that the matrix elements in (116) and (117) are independent, and observing (119a)–(119c), one can infer that (116) must be zero unless each of the two matrix elements without a “star” symbol has a complex conjugate counterpart among the two remaining matrix elements (with “stars”). This prerequisite can be fulfilled in two ways: (i) ; (ii) and . It follows that
[TABLE]
Exploiting the above independence assumption once more, we can further conclude that unless , and likewise for the last term in (126), yielding with (23)
[TABLE]
Similarly, (117) takes the form
[TABLE]
By means of (127) and (128) one thus can rewrite (115) as
[TABLE]
where we introduced the function
[TABLE]
and where we exploited (122) and (23) in the last step. If the were Gaussian random variables, then the last term in (129) would vanish (see (121)), and the remaining first two terms could be readily recovered by exploiting the Isserlis theorem (124).
Introducing (129) into (120) yields, after a straightforward calculation, the result
[TABLE]
where is the state introduced below (m16) with matrix elements
[TABLE]
[TABLE]
Note that the three terms on the right-hand side of (131) are the direct descendants of the three terms in (129).
If the are approximated as Gaussian random variables then the quantity in (133) is zero, as can be concluded from the discussion below (130) . More generally, one can readily infer from (133) and that
[TABLE]
where denotes the operator norm of (largest eigenvalue in modulus). Taking into account (123) and (130), we can conclude from (133) that
[TABLE]
Estimating from above by and observing , it follows that , and with (22), (134) that
[TABLE]
Recalling the discussion of above (123) and that is a small parameter (see (25)), we can conclude that remains negligibly small in (131) even if the are not Gaussian distributed.
Altogether, we thus arrive at the general approximation
[TABLE]
Introducing the explicit result (7) for derived in the previous section, one readily recovers the approximation , while (109) becomes identical to
[TABLE]
Hence (137) is seen to be already somewhat similar to the more rigorous result (m16), which followed from the supersymmetry calculations of the previous section. Yet, the remaining differences are in fact quite serious, for instance: (i) At time the system is always in the same initial state , independent of the perturbation . It follows that both the first and the last terms in (137) must be equal to . Moreover, and (6) imply . Altogether, for the approximation (137) amounts to . However, is in general not a small quantity. (ii) If one chooses (identity operator), then one readily sees that the left-hand side of (137) as well as and on the right-hand side must be unity, yielding . As seen above, is unity for and usually is found to remain non-negligible also within a notable interval of times (recall our standard example ). Even for the particularly simple observable under consideration, (137) is thus far from being a satisfying approximation. For the same reason, the result (137) does not exhibit the correct transformation behavior if is replaced by with .
In conclusion, the idea at the beginning of this subsection, namely to approximate the matrix elements in terms of independent (not necessarily Gaussian) random variables, does not work sufficiently well for our present purposes.
V.5 Improved approximation
As seen in the previous subsections, the matrix elements are known to be pairwise uncorrelated, but treating them as statistically independent is a too strong simplification. The most immediate reason which comes to mind is that such an assumption leads to a violation of the orthonormality conditions, see the discussion below Eq. (125).
In order to quantify the violation of the orthonormality relations (125), we note that those conditions are satisfied at least on average, i.e., , as can be readily concluded from (23) and (119c). Assuming that the are independent, one furthermore can show, similarly as in Secs. V.3 and V.4, that
[TABLE]
where was defined in (109). Finally, one can show analogously as below (135) that , where is our small parameter from (22) and (25).
In other words, assuming that the are independent leads to violations of the orthonormality relations (125) which, however, are with very high probability very weak. [A quantification of this qualitative statement can be worked out along similar lines as below Eq. (m20).] Though treating the as independent might thus seem to yield a very good approximation, the so-obtained final result (137) actually exhibits unacceptably large inaccuracies. Intuitively, this may be understood as follows: Though the committed error is with very high probability very small for every single summand on the right hand side of (120), there are so many summands that the resulting total error turns out to be non-negligible.
For the same reason, though the are expected to be nearly Gaussian distributed, admitting small deviations – as it is done in our present approach – seems a potentially relevant issue: Even if each summand on the right-hand side of (120) entails with very high probability a very small error when approximating the as Gaussian distributed, the resulting total error in the final result may potentially be non-negligible.
In the following, our main idea is to improve the approximation from the previous subsection so that the orthogonality relations (125) are even better fulfilled.
In order to work this out in detail, let us focus as a first example on the evaluation of the specific average in (116). Furthermore, we choose two arbitrary indices and with , but then keep them fixed. Generalizing the setup from the previous subsection, we consider two sets (“vectors” with components labeled by ) of complex random variables and . The distribution of each given random variable is assumed to be identical to the distribution of the corresponding random matrix element , and analogously for and . But unlike the “true” random variables and , all the “auxiliary” random variables and are now assumed to be strictly independent of each other. Indicating the corresponding averages (or expectation values) by the symbol , we thus can infer from (119a)–(119c) that
[TABLE]
and analogously for the higher moments (cf. (122)). Similarly as before, the random variables and thus satisfy on average the orthonormalization conditions , , , but the single realizations do not strictly fulfill the corresponding (non-averaged) relations. In order to fix this problem, we next define yet another set of random variables, namely
[TABLE]
It readily follows that the two vectors and are now properly orthonormalized, i.e.,
[TABLE]
The central idea of the present section is to approximate the “true” random variables and in (116) for two arbitrary but fixed indices by the above specified “auxiliary” random variables and , i.e., we adopt the approximation
[TABLE]
The evaluation of (147) turns out to be still an extremely tedious task. Moreover, most of the effort is required to find out that one could have adopted right from the beginning the following additional approximations in (141)–(145) with only negligibly small differences in the final results in (120): (i) It is sufficient to consider Gaussian random variables and . (ii) The factors and can be approximated by unity. In particular, one finds by exploiting (140a)–(140d) that the relations
[TABLE]
are satisfied in very good approximation, i.e., up to higher order corrections in the small parameter from (22), (25). (Note that by adding suitable corrections of higher order in on the right-hand side of (140c) and (140d), it is even possible to turn (148a)–(148c) into exact equalities.) The basic reason for (i) is similar as in the (much simpler) example from Sec. V.4: A possible non-Gaussianity may only show up in terms which exhibit (at least) one “extra Kronecker delta” compared to the other terms, as exemplified by the last summand in (129). Therefore, the number of such terms which contribute to the multiple sums in (120) is relatively small compared to the number of the other terms. Their total contribution to the final result (120) thus amounts to a higher-order correction, as exemplified around (135). An intuitive explanation of (ii) is as follows: Similarly as in (139) one finds that the mean values of and are very close to unity and that the variances are very small. Hence, approximating them by unity amounts to very small corrections on the right-hand side of (141)–(143). Since each of those terms is already very small in itself, these corrections are even much smaller, hence their contribution to the final result in (120) is negligible. The same applies to the factor in (143). A better understanding of why approximating and by unity is fundamentally different from approximating by zero will only be possible after having seen the way in which acts in the following calculations.
Taking for granted the above simplifications (i) and (ii), we are thus left with the approximations
[TABLE]
where and can be considered as independent Gaussian random variables with mean values and variances as specified in (140a)–(140d).
Introducing (149) and (150) into (147) yields
[TABLE]
Due to our assumption that the random variables appearing in (151b)–(151e) are multivariate Gaussian variables with zero mean, the expectation values can be conveniently evaluated by means of the Isserlis theorem (124).
The term in (151b) amounts to those contributions which one would obtain in the absence of the last summand in (150), corresponding to the simple approximation adopted in Sec. V.4. Accordingly, must be identical to the previous finding (127) in the case , i.e.,
[TABLE]
Moreover, the remaining contributions , , must be the descendants of the last term in (150), i.e., they are solely due to our improved approximation.
Focusing on the right hand side of (151c), one readily sees that only partitions in the Isserlis theorem (124) may possibly result in non-zero contributions for which is paired with , and for which agrees with , i.e.,
[TABLE]
The last factor is given by (140d). The remaining factor can be readily evaluated along similar lines, yielding
[TABLE]
Due to the extra Kronecker delta, the second term on the right-hand side of (154) turns out to entail a correction in the final result (120) which is of higher order in the small parameter from (22), (25) compared to the contributions of the first term. In turn, the first term might seem to be a higher order correction compared to the contributions of from (152), since the former is of third and the latter of second order in , see also (22) and (25). However, it will turn out later that the “missing Kronecker delta” effectively compensates for this extra factor . Heuristically, this can be seen by summing over , , and both in (152) and in the first line in (154): In both cases, the result is unity, suggesting that both terms will also comparably contribute to the more complicated sums in (120). This “effect” pinpoints the above announced fundamental difference between approximating and in (141)–(143) by unity, and approximating in (143) by zero: entails new leading order terms, while and only give rise to new subleading order terms and to small corrections of the already existing leading order terms. Technically speaking, the basic mechanism is that the extra sum in (150) enables a pairing of the four factors in (153) without an extra Kronecker delta, while the same was not possible in the term from (151b) and (152), to which the extra sum in (153) did not contribute.
After an analogous evaluation of and , one can rewrite (151a) in the form
[TABLE]
The first term on the right-hand side of (155) derives from and thus agrees with the previous finding (127) for . The second line in (155) represents the crucial modification due to our present improved approximation. The symbol “h.o.” in the last line of (155) refers to a multitude of higher-order terms, which later turn out (by similar calculations as around (134)) to entail subleading corrections in the final result (120). One of them is the last summand in (154), and they all have the property that they exhibit (at least) four factors of Kronecker deltas or functions, while all leading order terms exhibit three such factors.
Turning to the case in (116), the question whether the two column vectors and ( and fixed, variable) are strictly orthogonal or not is obviously irrelevant. Hence one recovers the same result as previously in (127) if . The same applies to in (117) if , while in the case the previous result in (128) will be modified by additional terms, all of which however turn out to entail subleading corrections in the final result (120).
Introducing all these findings into (120) one obtains similarly as in (131)–(137) after a straightforward but somewhat lengthy calculation the approximation
[TABLE]
with , , and as in (131). The last summand in (156) originates from the higher-order terms in (155) as well as the corresponding higher-order terms arising in (see above). Similarly as in (136), one can show that
[TABLE]
implying with (22) and (25) that the last term in (156) is a negligible correction of higher order in .
Until now, we approximated the coefficients and in (141)–(143) by unity and the random variables and in (149)–(150) as Gaussian distributed. As pointed out below (147), going beyond these approximations gives rise to a flurry of additional subleading-order corrections, analogously to the correction encountered in (133)–(136). As a consequence, the factor on the right-hand side of (160) will be replaced by some larger factor, which, however, will usually still remain on the order of to .
Exploiting the definitions (6), (109), (132), and (159), one readily verifies that , , and . It follows with (157), (158) that and that if is independent of then for all . Though there does not seem to be a reason why this must always be so cas96 , in all concrete examples known to us the function from (23) exhibits the symmetry
[TABLE]
It then readily follows from (6) that and from (159) that .
On the one hand, (156) still shares some similarities with our previous approximation in (137), but it no longer exhibits the shortcomings discussed below (137). On the other hand, (156) generalizes (m16), which we previously derived in the main paper using the supersymmetry results from Sec. IV, when takes the form (7) and hence in (6) is given (in very good approximation) by . We note that in (m17) and in (157) exhibit the same essential properties (they vanish for , for , and if is independent of ), but quantitatively they are found to be not exactly identical in all details. On the one hand, this corroborates a posteriori the validity of our present ad hoc approximation, on the other hand, it also indicates the limits which such an approximation still may comprise at least in principle. However, in practice both and can be considered as negligible according to the line of reasoning below Eq. (m18) in the main paper. In the same vein, we can also adopt the approximation from the main paper, yielding
[TABLE]
From this result together with (6) and the considerations below (m20) in the main paper and Sec. VI below, one finally recovers (m3) with from (m12).
VI Variance of the time evolution
The objective of this section is to show that the deviation
[TABLE]
from the average is negligibly small for the vast majority of all members of the ensemble. In doing so, we will consider the same ensembles for which the average expectation values have been explored in the previous sections. As outlined in the main paper, this essentially amounts to establishing the upper bound
[TABLE]
[cf. (m20)] for the variance
[TABLE]
where is our small parameter from (22) and (25), is the operator norm of , and is some positive real number which does not depend on any further details of the considered system [in particular, it is independent of , , , and of the perturbation strength in (12)].
VI.1 Basic considerations
In this subsection, we further elaborate on the considerations in Sec. III, in particular on the heuristically expected (and numerically confirmed) similarities between our present random matrix problem and the central limit theorem (CLT) for random variables.
The basic observation is that a large number of perturbation matrix elements [see (13)] is intuitively expected to be of comparable relevance for the perturbed expectation value in (m9). (For simplicity, we may imagine in (163)–(165) as arbitrary but fixed.) In other words, on the left-hand side of (13) is a real-valued function of many, roughly speaking “equally important” arguments . If these arguments are furthermore randomly sampled according to the specific ensemble at hand, then it seems reasonable to expect – similarly as in the CLT – that the concomitant probability distribution of the real-valued random variable will be sharply peaked about the mean value . Or, in the words of Talagrand tal96 , “[a] random variable that depends (in a ‘smooth’ way) on the influence of many independent variables (but not too much on any of them) is essentially constant”. Basically (up to the “quantitative details”), this expectation is tantamount to the main result (164) of the present section.
CLT-like phenomena of the above kind are often referred to as “concentration of measure”, “typicality”, “self-averaging”, or “ergodicity” properties. Specifically in the context of random matrix theory, they are mostly taken for granted without any further comment. In other words, the main emphasis is put on evaluating ensemble-averaged properties, while the fluctuations about the average are tacitly assumed to be negligible, but are hardly ever explicitly considered.
In our present work, we do not take such a property for granted, but rather we derive it.
VI.2 Derivation of the upper bound (164)
To begin with, we rewrite by means of (m9) and (m10) in the compact form
[TABLE]
where we introduced the abbreviations
[TABLE]
and where the previously employed Greek letters have been replaced by Roman letters for notational convenience. Note that and are identical to the previous quantities and , and that
[TABLE]
according to (113).
As discussed in detail below (114), the average in (170) must vanish unless each of the matrix elements without a “star” has a “partner” among those with “stars”, where being partners means that the second indices are equal. Since there are two possibilities of pairing the ’s along these lines, (170) can be rewritten as the sum of two terms, namely (see also (115)–(118) and (172))
[TABLE]
where the extra factor in (173c) is needed to avoid double counting of the case . Finally, (166) can be rewritten as
[TABLE]
Similarly as in (167)–(171) one finds that
[TABLE]
Analogously as above (173a), one can conclude that the average in (176) must vanish unless the matrix elements with and without stars can be arranged into pairs with identical second indices. Since there are 24 possibilities of pairing the ’s along these lines, (176) must be of the form
[TABLE]
The explicit determination of the is a straightforward but somewhat tedious task, yielding
[TABLE]
Note that the appearance of the -factors (to avoid double countings) depends on the order in which the 24 cases are executed. In the above list, the first four Kronecker deltas in every given uniquely determine with which among the 24 possible cases we are dealing, and the sequence of the indices fixes the specific ordering of the 24 cases which we have chosen. Finally, (176) can be rewritten as
[TABLE]
Taking into account in (179b) the explicit expressions for from (178a)–(178x), one can infer that
[TABLE]
Effectively, we are thus left with 16 summands in (179a).
The main remaining task is to evaluate the ensemble averages over eight matrix elements on the right-hand side of (178a)–(178x). In principle, this might be doable by a corresponding generalization of the supersymmetric approach from Sec. IV. In practice, already for the averages over four matrix elements in Sec. IV, the explicit evaluation of the pertinent saddle-point approximation turned out to be at the limit of what seems analytically doable. Therefore, such an extension of the supersymmetric approach is beyond the scope of our present work. Instead, we will adopt the alternative approach from Sec. V.
We begin with recalling the evaluation of (173)–(174) by means of the approach from Sec. V.5, but employing a modified notation which is more suitable for the purpose of the subsequent generalizations. Focusing first on the cases with in (173b) and (173c), we introduce independent Gaussian random variables and with (see also (140a)–(140c))
[TABLE]
and . In a next step, we define the random variables (see also (149), (150))
[TABLE]
and we adopt the approximations (see also (147))
[TABLE]
for the two averages appearing on the right-hand side of (173b) and (173c). Note that in both cases the four indices actually amount to coinciding pairs. While we assumed so far, the corresponding approximations for are
[TABLE]
After introducing (182) and (183) into (184) and (185), those averages can be explicitly evaluated by means of the Isserlis theorem (124) together with (181a) and (181b). These calculations and the subsequent evaluation of (173)–(174) have been explained in detail in Sec. V.5.
Next we turn to the averages over eight matrix elements in (178a)–(178x). As before, we first focus on the cases where the four indices are pairwise distinct and we introduce four independent Gaussian random variables () with
[TABLE]
Next, the random variables are defined according to
[TABLE]
and we adopt the approximations
[TABLE]
for the averages appearing on the right-hand side of (178a)-(178x). Again, in all those averages the 8 indices actually amount to coinciding pairs. In the remaining cases where the four indices are not pairwise distinct, only a smaller set of indices with is actually needed.
After introducing (187) into (188) (or its counterparts if are not pairwise distinct), the averages in (178a)–(178x) can be explicitly evaluated by means of the Isserlis theorem (124) together with (186a) and (186b). (Due to (180a)–(180h) not all of them are actually needed.) Finally, one is left with evaluating (179a) and (179b). These calculations are in principle not very difficult, but in practice they amount to a daunting task due to the huge number of different terms. Omitting the details of those very lengthy calculations, the main results are as follows:
The second moment in (179a) is dominated by the contributions of the summands with indices . To leading order, they exactly cancel the corresponding dominating contributions to the last term in (165) which one obtains via (174a). All the remaining contributions by the two terms on the right hand side of (165) are found to be of subleading order in the small parameter from (22) and (25). Yet another flurry of subleading order terms arises for the same reasons as discussed below (160). Explicitly, one finally arrives at the upper bound (164). As a first, rather conservative estimate for the constant in (164) we furthermore obtained . It seems likely that this rough upper bound could still be substantially reduced by painstakingly searching among the very numerous contributions to (174a) and (179a) for terms which cancel each other partially or even completely. Here, we have confined ourselves to separately bounding every single term relatively generously and without taking into account possible cancellations. Even in this case, our detailed calculations extended over a very large number of pages.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1(1) C. Gogolin and J. Eisert, Equilibration, thermalization, and the emergence of statistical mechanics in closed quantum systems, Rep. Prog. Phys. 79 , 056001 (2016).
- 2(2) L. D’Alessio, Y. Kafri, A. Polkovnikov, and M. Rigol, From Quantum Chaos and Eigenstate Thermalization to Statistical Mechanics and Thermodynamics, Adv. Phys. 65 , 239 (2016).
- 3(3) T. Mori, T. N. Ikeda, E. Kaminishi, and M. Ueda, Thermalization and prethermalization in isolated quantum systems: a theoretical overview, J. Phys. B 51 , 112001 (2018).
- 4(4) J. Gemmer, M. Michel, and G. Mahler, Quantum Thermodynamics , Springer, Berlin (2004).
- 5(5) M. Cramer, C. M. Dawson, J. Eisert, and T. J. Osborne, Exact Relaxation in a Class of Nonequilibrium Quantum Lattice Systems, Phys. Rev. Lett. 100 , 030602 (2008).
- 6(6) A. Flesch, M. Cramer, I. P. Mc Culloch, U. Schollwöck, and J. Eisert, Probing local relaxation of cold atoms in optical superlattices, Phys. Rev. A 78 , 033608 (2008).
- 7(7) S. Trotzky, Y.-A. Chen, A. Flesch, I. P. Mc Culloch, U. Schollwöck, J. Eisert, and I. Bloch, Probing the relaxation towards equilibrium in an isolated strongly correlated one-dimensional Bose gas, Nat. Phys. 8 , 325 (2012).
- 8(8) See end of Section 2 in Ref. tro 12 .
