Typicality of Prethermalization
Peter Reimann, Lennart Dabelow

TL;DR
This paper analytically demonstrates that prethermalization, a prolonged non-thermal state before eventual thermal equilibrium, is a common feature in a broad class of weakly perturbed integrable many-body systems.
Contribution
It extends Deutsch's framework to analytically predict the typicality of prethermalization in general weakly perturbed integrable systems.
Findings
Prethermalization occurs in a wide range of weakly perturbed integrable systems.
Analytical predictions confirm the typicality of prethermalization.
The framework generalizes previous understanding of relaxation dynamics.
Abstract
Prethermalization refers to the remarkable relaxation behavior which an integrable many-body system in the presence of a weak integrability-breaking perturbation may exhibit: After initial transients have died out, it stays for a long time close to some non-thermal steady state, but on even much larger time scales it ultimately switches over to the proper thermal equilibrium behavior. By extending Deutsch's conceptual framework from Phys. Rev. A 43, 2046 (1991), we analytically predict that prethermalization is a typical feature for a very general class of such weakly perturbed systems.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Typicality of Prethermalization
Peter Reimann and Lennart Dabelow
Fakultät für Physik, Universität Bielefeld, 33615 Bielefeld, Germany
Abstract
Prethermalization refers to the remarkable relaxation behavior which an integrable many-body system in the presence of a weak integrability-breaking perturbation may exhibit: After initial transients have died out, it stays for a long time close to some non-thermal steady state, but on even much larger time scales it ultimately switches over to the proper thermal equilibrium behavior. By extending Deutsch’s conceptual framework from Phys. Rev. A 43, 2046 (1991), we analytically predict that prethermalization is a typical feature for a very general class of such weakly perturbed systems.
Isolated many-body quantum systems are known to equilibrate, i.e., expectation values exhibit an initial relaxation and then spend most of their time close to a constant value, provided some rather weak preconditions are fulfilled. Furthermore, thermalization is expected for so-called non-integrable systems, i.e., the long-time behavior is well approximated by a microcanonical ensemble. (Possible exceptions, e.g., due to many-body localization, are tacitly ignored here.) In contrast, integrable systems usually exhibit quite significant deviations from such a thermal long-time behavior. All these issues have been extensively explored in the literature, as reviewed, among others, in Refs. gog16 ; ale16 ; bor16 ; mor18 . They are not the subject of our present work but rather will be taken for granted.
Our main issue is the question of how the temporal relaxation of an integrable system changes in response to a weak integrability-breaking perturbation. More specifically, we will derive a rigorous bound for the difference between unperturbed and perturbed expectation values, implying that those changes remain over a long period of time negligibly small for a very large class of weak perturbations. Our approach is conceptually akin to Deutsch’s seminal work on thermalization, treating the perturbations along the lines of random matrix theory deu91 . In particular, we will exploit Deutsch’s result concerning the ultimate thermalization of the perturbed systems. With respect to the unperturbed (integrable) system, we will moreover take for granted that its initial relaxation is not extremely slow, and that it exhibits clearly observable deviations from a thermal long-time behavior. Altogether, we are thus left with a very large class of perturbations with the following quite remarkable property, henceforth named prethermalization: Initially, the perturbed system closely follows the unperturbed relaxation towards a non-thermal steady state, but on even much larger time scales, there must be a clearly visible transition to the ultimate thermal behavior.
Originally, the term prethermalization was introduced by Berges, Borsányi, and Wetterich ber04 for matter under extreme conditions in a quasi-steady state far from equilibrium, which nevertheless exhibits some genuine thermal properties, however without any reference to the concept of integrability. Our present, somewhat different notion of prethermalization has been independently established by Moeckel and Kehrein in Ref. moe08 . During recent years, these and further slightly differing guises of prethermalization have been explored in numerous theoretical div1 ; div2a ; div2b ; ber16 as well as experimental div3 investigations, see also the recent reviews lan16 ; mor18 and further references therein.
Incidentally, the particular examples in Moeckel and Kehrein’s original work moe08 and also in some subsequent studies div2a are beyond the above mentioned realm of our present approach: If the unperturbed system is initially at thermal equilibrium or in the energy ground state, as it is the case in moe08 ; div2a , then the unperturbed dynamics is trivial, and also the signatures of prethermalization after adding a weak perturbation remain too small for our purposes.
Against our treating the perturbations as random matrices (in the unperturbed energy basis), one might object that the “true” perturbation in any concrete physical model is not a random matrix. In particular, the true matrix is often banded deu91 ; fei89 ; fyo96 ; gen12 , i.e., the typical magnitude of its entries decreases with increasing distance from the matrix diagonal. Furthermore, for non-interacting systems perturbed by few-body interactions, the matrix will be very sparse, i.e., only a small fraction of its entries is non-zero bor16 ; bro81 ; fyo96 ; fla97 .
To overcome these concerns, we will consider ensembles of random matrices which can be tailored to emulate the basic features of many concrete models, such as sparsity, bandedness, and other statistical characteristics bor16 ; deu91 ; fei89 ; fyo96 ; gen12 ; bro81 ; fla97 . The true perturbation is thus expected to be contained as one specific matrix in such a properly tailored ensemble as well. (For simplicity, one may imagine matrices of large but finite dimension, whose entries can assume only a finite number of different possible values, as it is the case in any numerical investigation. If each possible value has non-zero probability, there is a finite chance to sample the true matrix from the ensemble.) Hence, if one could prove that some property applies to all members of the ensemble, the property would also apply to the true model. Our main result consists in a slightly weaker statement, namely that the property “prethermalization” at least applies with overwhelming probability when randomly sampling perturbations from the ensemble (“typicality of prethermalization”). It is therefore still very reasonable to expect that the true model is not one of the extremely unlikely exceptions. An illustrative example (spin chain model) is provided in the Supplemental Material sup . Analogous arguments are routinely adopted in random matrix theory, which is well known to be extremely successful in practice bor16 ; bro81 , though its applicability has to our knowledge not been rigorously justified in any concrete physical example. Similar considerations also apply to many other “non-systematic” but practically very well established approximations, such as density functional theory or Boltzmann equations beyond the validity limits of their derivation.
We will demonstrate typicality of prethermalization for a great variety of different ensembles. The resulting total set of all admitted perturbations is therefore extremely large. This seems to us a quite noteworthy finding in itself, independent of the question whether some particular model is covered or not. Moreover, to actually exclude some particular model, it would have to be untypical with respect to every one of those various ensembles. We finally remark that most applications of random matrix theory focus on the ensemble-averaged behavior and take for granted that most individual matrices behave very similarly to the average bor16 ; bro81 . In our present approach, no such extra assumption will be needed.
The unperturbed system is described by a Hamiltonian with eigenvalues and eigenvectors . The unperturbed evolution of an arbitrary initial state can thus be written as and the expectation value of any given observable as
[TABLE]
where, depending on the specific system under consideration, the indices and run from to infinity or to some finite upper limit.
Likewise, the perturbed system
[TABLE]
exhibits eigenvalues and eigenvectors . Focusing on the same initial state as before, the expectation value under the perturbed dynamics is then given by the same formulas as in (1) and (2), except that all indices “[math]” must be omitted. In terms of the unitary basis transformation matrix
[TABLE]
this expectation value can be further rewritten as
[TABLE]
The quantity of foremost interest is the difference
[TABLE]
between the perturbed and the unperturbed expectation values. Taking into account (see above (1)), it follows with (1) and (5) that
[TABLE]
where is the Kronecker delta.
Finally, instead of one particular perturbation in (3), we consider a statistical ensemble of different ’s, and we indicate averages over the ensemble by an overline. This randomization of is inherited by the Hamiltonian in (3) and thus by the eigenvalues , the eigenvectors , the in (4), and the in (8). On the other hand, , , and are considered as arbitrary but fixed (non-random), hence the same must apply to , , , and to the matrix elements in (2).
The first main result of our Letter consists in the general rigorous bound
[TABLE]
where is the measurement range of (largest minus smallest eigenvalue). The quite tedious derivation has been relegated to the Supplemental Material sup .
Applying Markov’s inequality to (9), it follows for any that
[TABLE]
where the left hand side denotes the probability that when randomly sampling perturbations . For sufficiently small , the difference in (6) will thus be negligible for the vast majority of all ’s.
Our first assumption regarding the so far arbitrary ensemble of ’s is as follows: Multiplying the unperturbed energy eigenvectors by arbitrary factors leaves the ensemble invariant. Hence also the statistical properties of in (18) remain unchanged if all the matrix elements in (4) are multiplied by arbitrary factors . As a consequence (see also sup ), the ensemble average of (8) must vanish unless ,
[TABLE]
To justify this assumption we note that randomly flipping the signs of the leaves all physical properties unchanged but randomizes the signs of the true perturbation matrix elements . Hence, it is appropriate to adopt a random matrix model with the above invariance property.
As stated in the introduction, the unperturbed system is assumed to exhibit equilibration but not thermalization. Implicitly, this requires a macroscopically well defined system energy; i.e., there must exist a microcanonical energy interval so that only energies exhibit non-negligible level populations . The number of energies contained in is denoted by and, without loss of generality, we assume that for all those ’s. Furthermore, whenever , we adopt the idealization that is strictly zero f1 . The Cauchy-Schwarz inequality then implies that in (1) only summands with actually contribute. As usual, we take for granted that is huge (exponentially large in the system’s degrees of freedom gol10a ), while the local level density remains close to throughout the interval .
Given that only indices actually matter in (1), we can and will assume that their range is extended to arbitrary integer values and that the energies and the matrix elements are (re-)defined for arbitrary integers by way of “extrapolating” in a physically natural way their properties for .
As a first example, we consider the particularly simple case that for all , and that the statistical properties of the matrix elements do not depend separately on and , but only on the difference . As a consequence (see also sup ), the statistical properties of (16) remain invariant when simultaneously adding an arbitrary integer to all indices on the right hand side (but not on the left hand side). Upon averaging, one can thus infer that , hence
[TABLE]
is a well-defined (-independent) function.
Under the additional assumption that all statistical properties of the diagonal matrix elements are identical to those of , one can finally show sup that
[TABLE]
To justify this assumption we note that the diagonal elements of the true in (3) can always be re-adjusted to vanish on the average. A symmetrization procedure for the remaining distribution will be provided later.
Introducing (15)-(18) into (11)-(13), and taking into account that yields , , and , hence (10) takes the form
[TABLE]
One readily infers from (4), (16), and (17) that and that for all . Furthermore, it is convenient to rewrite (16) as
[TABLE]
The quantity plays a key role in random matrix theory under the name strength function or local spectral density of states fyo96 ; bor16 . Specifically, one finds that the ensemble average is very well approximated by the Breit-Wigner distribution
[TABLE]
under conditions which, together with the concomitant definition of , will be discussed in more detail shortly. Introducing this result into (20) yields , and with (17), (19) we obtain
[TABLE]
Eqs. (14) and (23) represent our main results. In the remainder of the Letter we focus – as usual in random matrix theory fyo96 ; bor16 ; bro81 – on the case that all with are statistically independent of each other (those with follow from ), that the statistics only depends on (see above (17)), and that and are equally likely (see above Eqs. (15), (18) and sup ).
If all are furthermore real and Gaussian distributed with variance , the result (22) with
[TABLE]
was obtained by Deutsch deu91 .e been worked out by Fyodorov et al. in Refs. fyo96 , including distributions with a pronounced delta peak at zero, corresponding to sparse random matrices . In addition, they also admitted the possibility of banded matrices f5 . We have further extended their analytical supersymmetry approach, and moreover performed extensive numerical explorations, showing that the key results (22)-(24) remain valid also for complex ’s and under still considerably weaker assumptions regarding their statistics. A few illustrative examples are provided in the Supplemental Material sup .
Multiplying in (3) by an extra factor (coupling strength) entails a factor in (24), hence the characteristic time scale in (23) decreases as , in agreement with previous findings for the persistence of the prethermalized state mor18 ; ber16 . However, we note that our inequality (9) admits strictly speaking no conclusions regarding the actual appearance of non-small differences in (6).
In order to abandon the requirement of equally spaced energies (see above Eq. (17)), let us consider an unperturbed Hamiltonian with the same eigenvectors as the original , but with modified energies . In view of (1) one anticipates that the corresponding expectation value still remains close to for sufficiently small and not too large . Indeed, it can be rigorously shown sup that
[TABLE]
Taking for granted that the unperturbed Hamiltonian exhibits equilibration but not thermalization (see beginning of the Letter), we denote by its relaxation time; i.e., remains very close to some (non-thermal) equilibrium value for (almost) all . It follows with (25) that also exhibits practically the same initial relaxation behavior and then remains close to for quite some time, provided for all . Recalling that the energy level density is exponentially large in the degrees of freedom gol10a , these conclusions must actually apply to rather general non-equidistant energies .
Returning to our perturbed systems of the form (3), where the considered ensemble of ’s satisfies the rather weak assumptions mentioned above, we can thus conclude from (6), (14), (23), and (24) that also the perturbed expectation values exhibit an initial relaxation and then remain close to for quite some time f3 , at least for the vast majority of perturbations , and provided they are sufficiently weak so that
[TABLE]
On the other hand, ultimate thermalization for most such ’s in (3) has been established in Refs. deu91 ; rei15b . Recalling the considerations at the beginning of our paper, all those “typical” ’s thus exhibit prethermalization.
Finally, upon defining modified perturbations via , we can conclude with Eq. (3) that . Hence, the vast majority of those perturbations of must, again, entail prethermalization. In doing so, the ’s are often expected to be so small that the resulting ensemble of ’s is almost identical to the original ensemble of ’s (see below). More generally, since the modified energies need no longer be ordered by magnitude, even substantially more general ensembles of ’s than of ’s are actually admitted, see also sup . Along similar lines, also a possible asymmetry of the distribution can be removed, as announced below (18).
Next we turn to the question of how far perturbations which satisfy (26) are “weak” in some physically meaningful sense. Quite obviously, such considerations are only possible in terms of non-rigorous arguments and rough estimates.
First of all, typicality of thermalization, as invoked below (26), trivially fails for vanishing perturbations and hence may possibly still fail for extremely weak perturbations ale16 ; bra15 . Yet, a closer inspection of the non-perturbative approach from Refs. deu91 ; rei15b suggests f4 that typicality of thermalization generally does apply provided (cf. Eq. (24)) and thus
[TABLE]
In particular, the diagonal matrix elements are then typically much larger than the level spacing , thus corroborating the claim below (26) that the ensembles of ’s and of ’s are often quite similar.
Second, while the unperturbed system is assumed not to thermalize for the given initial condition , one still expects that it exhibits the usual thermodynamic properties when the system state happens to be the microcanonical ensemble corresponding to the energy window introduced below Eq. (16). Denoting by the number of energy levels below , by Boltzmann’s constant, and by the entropy, the temperature is thus given by . Moreover, must not exceed , otherwise the level density would no longer be (approximately) constant throughout (as assumed below Eq. (16)). It now seems reasonable to say that a perturbation is weak if it does not notably change the thermal equilibrium properties (, , heat capacity, state of matter, etc.) of the unperturbed system. Closer inspection of the approach from Refs. deu91 ; rei15b implies that the perturbations are weak in this sense as long as (otherwise, regions with different level densities start to “interact” via the perturbation). According to (24) this amounts to . Focusing on the special (largest possible) choice and exploiting , we arrive at and . The first relation complements the lower bound from (27). Since is exponentially large in the degrees of freedom, the range of admitted values is thus still very large. The second relation agrees with (26) if is comparable to . As shown in Ref. bal17 , this is indeed the case for a quite large class of Hamiltonians , observables , and initial conditions .
Alternatively, a perturbation may be considered as weak if the perturbed expectation value remains for (almost) all sufficiently large times close to the expectation value , which the unperturbed system *would * assume in thermal equilibrium. By similar arguments as above, one can see that this alternative weak perturbation criterion is essentially equivalent to the one from the previous paragraph *and * the condition (27).
Altogether, Eq. (26) thus seems to be a physically very natural weak perturbation condition, and it appears reasonable to conjecture that prethermalization will in general be ruled out if (26) is violated. We plan to further pursue this issue in our future work.
In summary, prethermalization has been established for a very large class of integrable (non-thermalizing) Hamiltonians and weak perturbations , which closely imitate the essential features of many particular examples of interest in this context. Adopting the common lore of random matrix theory fyo96 ; bor16 ; bro81 ; deu91 , the same conclusion is thus expected to apply “typically” or “with overwhelming likelihood” also to any given such example, unless there is some a priori reason (inappropriate choice of the ensemble, another non-thermalizing system very near-by etc.) why the specific example at hand must be one of the very rare exceptions with respect to every admitted ensemble gol10a ; gol10b ; rei15b ; rei15 ; bal17 . Remarkably, the same predictions also apply to any other which exhibits equilibration but not thermalization, for instance due to many-body localization effects gog16 ; nan15 ; gol17 .
Acknowledgements.
This work was supported by the Deutsche Forschungsgemeinschaft (DFG) under Grant No. RE 1344/10-1 and within the Research Unit FOR 2692 under Grant No. 397303734.
I Derivation of Eqs. (m9)-(m13)
In this section, Eqs. (m9)-(m13) will be derived. Moreover, it will be shown that and , hence both roots on the right hand side of (m10) are non-negative real numbers.
To begin with, we recall the definitions in Eqs. (m1)-(m8). We also recall that the average over the random perturbations in Eq. (m3) is indicated by an overline. Accordingly, , , and are considered as arbitrary but fixed (non-random), hence the same follows for , , , , and . In contrast, , , , , and are random quantities.
Next, we rewrite (m7) as
[TABLE]
where, for notational convenience, all arguments “” and indices “[math]” are temporarily omitted. Note that all factors on the right hand side of (3) and hence on the left hand side are non-random quantities. In contrast, in (2) is a random quantity due to the random ’s on the right hand side. Moreover, its average gives rise to a correlation-type quantity on the right hand side of (2). Roughly speaking, the main idea behind (1) is to later split things into such a correlation-type part and a “rest”.
Denoting by and the largest and smallest eigenvalues of , and considering that the possible outcomes of a measurement are the eigenvalues of the observable , it follows that
[TABLE]
represents the measurement range of . Since all possible outcomes of any realistic measurement must be finite, we can take for granted that , , and are finite. Our next observation is that we can add an arbitrary constant to without changing the main quantity of interest, namely the difference in (m6). Without loss of generality we thus can and will assume that , and hence
[TABLE]
where denotes the operator norm of (largest eigenvalue in modulus). While on the left hand side of (1) thus remains unchanged when adding some constant to , the same does not apply separately to the two terms on the right hand side, as can be seen by closer inspection of (2) and (3). Hence, while our special choice in (5) is irrelevant with respect to , it does matter (and actually is optimized) with respect to the splitting of into and .
We recall that the in (2) are – according to Eq. (m2) – the matrix elements of in the unperturbed basis , and likewise for . In the same vein, in (m8) may be viewed as the matrix elements of an operator (put differently, is uniquely defined via those matrix elements). Note that – in view of (m8) – this operator may in general be non-Hermitian. Likewise, the Kronecker delta in (3) may be viewed as the matrix elements of the identity operator. As a consequence, one can rewrite (2) and (3) as
[TABLE]
I.1 Evaluation of
In order to further evaluate in (6), we define the auxiliary (non-Hermitian, random) operator
[TABLE]
Replacing in (6) by , a straightforward calculation yields
[TABLE]
By rewriting the right hand side of (10) as , observing that is a non-negative Hermitian operator, and evaluating the trace by means of the eigenbasis of , one finds that . With (5) we thus arrive at
[TABLE]
where
[TABLE]
is a real-valued, non-negative random variable.
Turning to in (11), we note that is a non-negative Hermitian operator, hence there exists a Hermitian operator, which we denote as , with the property that . Exploiting the cyclic invariance of the trace, the right hand side of (11) can thus be rewritten as . Considering as a scalar product between two (not necessarily Hermitian) operators and , the Cauchy-Schwarz inequality takes the form . Choosing and we can thus infer from (11) that
[TABLE]
Exploiting the cyclic invariance of the trace once more, the second trace in (14) can be identified with in (13). Likewise, the first trace can be identified with . Similarly as above (12), the latter trace can be upper bounded by , where
[TABLE]
Together with (5), we thus can conclude that
[TABLE]
Next, we introduce from (8) into (13) to obtain
[TABLE]
The last step follows along similar lines as above (6). It follows that
[TABLE]
From (m8), we can infer that
[TABLE]
hence (20) can be rewritten as
[TABLE]
Taking into account the definition (m4), it follows that , hence (22) takes the form
[TABLE]
Again, (m4) implies that the last sum equals , hence , and (19) yields
[TABLE]
Technically speaking, the possibility to exactly evaluate the products of four -matrix elements appearing in (19) via (21) is one of the key points of our present approach. In particular, this is the only place where such products (without averaging over the ensemble) actually appear.
Introducing (25) into (17) and averaging on both sides implies
[TABLE]
Upon comparison with (15), we can conclude that
[TABLE]
Since both in (13) and in (15) must be non-negative real numbers, it follows that
[TABLE]
Observing that (9) implies , we thus can infer from (12), (16), and (28) that
[TABLE]
I.2 Evaluation of
By means of the definition
[TABLE]
one readily verifies that
[TABLE]
and with (7) that
[TABLE]
Rewriting (30) as
[TABLE]
it follows, similarly as above (14), that
[TABLE]
and hence
[TABLE]
Similarly as above (15), the first factor on the right hand side of (35) can be upper bounded by , yielding with (5) the result
[TABLE]
From (32) we can conclude that and hence
[TABLE]
Next, we rewrite (36) as
[TABLE]
In the second step in (39), we exploited , , and Eq. (15). It follows that , where
[TABLE]
and hence with (27) and (38) that
[TABLE]
We finally remark that is a real and non-negative number according to (36). Due to (28) it follows that also is a real number and satisfies .
I.3 Evaluation of
Eq. (1) implies and hence . Recalling that is a non-random quantity (see below (3)) yields . With (29) and (42) it then follows that
[TABLE]
Note that since is a real-valued, non-negative random variable (see below (13)), the same applies to . Observing that is a concave function for all , we can exploit Jensen’s inequality to obtain . Moreover, we can conclude from (28) that . Altogether, (43) thus implies
[TABLE]
Finally, we can recast from (15) by means of similar arguments as above (6) into the form
[TABLE]
Likewise, from (40) takes the form
[TABLE]
Reintroducing the omitted indices “[math]” and arguments “” (see below (3)), we recover from (27), (40), (44)-(46) our final result
[TABLE]
These results are identical to Eqs. (m9)-(m13). We also recall that and are known to be real numbers with (see (28)) and (see below (42) and above (44)). Altogether, we thus recover the findings announced at the beginning of this section.
II Random matrix description of a spin chain model
In this section, we illustrate the approximation of a concrete physical model system by the random matrix approach from the main paper. As a particularly simple and common example we consider an (integrable) Heisenberg spin-1/2 chain in the presence of a weak integrability breaking perturbation.
Similarly as in the main paper, the perturbed Hamiltonian is written in two alternative forms,
[TABLE]
where and are the “bare” operators, while and are their “dressed” counterparts. The detailed definition of those operators will be provided below, and the essential idea behind those definitions will be discussed in Sec. II.3.
II.1 The model
As announced, we focus on a perturbed Heisenberg spin-1/2 chain with “bare” operators (cf. Eq. (52))
[TABLE]
where (with and ) are Pauli matrices acting on site and . The unperturbed Hamiltonian (53) amounts to spins with nearest neighbor interactions and is well known to be integrable, while the perturbation (54) consists of integrability breaking next-to-nearest neighbor interactions. The units are chosen so that
[TABLE]
and the (in these units) dimensionless parameter quantifies the perturbation strength.
Following the main paper, the eigenvalues and eigenvectors of in (53) are denoted as and , and the matrix elements of the perturbation (54) are abbreviated as . For a spin chain of length and perturbation strength , the unperturbed eigenvalues and the perturbation matrix are illustrated in Fig. 1. In this example, the indices have been chosen so that the ’s are ordered by magnitude, where and is the Hilbert space dimension of the model (52). Closer inspections shows that it is always possible to choose the eigenvectors so that all matrix elements are real numbers. Without loss of generality, the eigenvectors have been chosen in this way in all numerical results of this section.
Fig. 1 depicts the so obtained matrix , the diagonal matrix elements , and the energy levels in the first, second, and third rows, respectively. In each case, we show a view of the full Hilbert space in the left panels and a close-up of the middle of the spectrum in the right panels.
The coarse structure in Fig. 1(a) already appears to be qualitatively “random”. Looking at the individual matrix elements in detail in Fig. 1(b), however, reveals correlations that manifest themselves very roughly speaking as blocks of vanishing and non-vanishing entries. Comparing off-diagonal and diagonal matrix elements, we observe that the former are typically one to two orders of magnitude smaller than the latter. For the rest, at least in the middle of the spectrum shown in Fig. 1(d), the diagonal matrix elements essentially look like random numbers.
In Fig. 1(e) and (f), we display the distribution of energy levels by means of the function from the main paper, which counts the number of ’s with the property . Put differently,
[TABLE]
where is the Heaviside step function. The example in Fig. 1(e) nicely illustrates a property, which is generic for many-body systems, and which is taken for granted in the main paper, namely that the energy spectrum gives rise to a well-defined local level density, which may change notably only on scales much larger than the corresponding mean level spacing. (Otherwise, the Boltzmann entropy would not lead to reasonable thermodynamic properties of the system, see end of the main paper.) Here and in the following, we focus on the middle of the spectrum, where the level density is almost constant over a particularly large energy interval, but similar results would be obtained for any other energy interval which is not too large and not too close to the upper and lower ends of the spectrum. (With increasing , those restrictions are expected to become weaker and weaker).
We numerically explored the above model along the same lines as for the dressed model in Sec. II.2 and found that it can not be satisfactorily approximated by our present random matrix approach. The main reason seems to be (see also Sec. III.1 below) that the diagonal matrix elements in Fig. 1(c) exhibit rather large fluctuations (upon variation of ), which are not reflected in the considered random matrix model fyo96 .
II.2 Dressed operators
The main purpose of the “dressed” operators and in (52) is to overcome the above mentioned mismatch between the bare operators and our present random matrix approach. To this end, we first combine the unperturbed levels with the diagonal matrix elements , defining
[TABLE]
Very roughly speaking, the idea behind this definition and the following considerations is to effectively absorb into the unperturbed Hamiltonian those parts of the perturbation that do not break integrability, thus isolating the integrability-breaking non-diagonal contributions.
Given that the fluctuations of the ’s are still much smaller than the energy scale over which the level density exhibits notable variations, and that those fluctuations are essentially unbiased random numbers (see above), one expects that also the ’s in (57) exhibit some well-defined local density, which is moreover practically equal to the local density of the ’s. On the other hand, since the fluctuations of the are significantly larger than the mean level spacing , the so-defined ’s are no longer ordered by magnitude. To restore this order, we relabel the eigenvectors so that for all . In all that follows, this modified ordering of the labels is implicitly understood. (Obviously, the physical properties of the problem at hand do not depend on how we order the ’s.)
As in the main paper, we finally require that in (52) should exhibit the same eigenvectors as , whereas the eigenvalues must be equally spaced, with a level spacing which reproduces the mean level spacing of the ’s (or, equivalently, the ’s) within the microcanonical energy window of actual interest. Recalling that denotes the (large but finite) number of energies within , it follows that . Accordingly, we set
[TABLE]
with a still arbitrary constant . In order to satisfy the second identity in Eq. (52), the diagonal matrix elements of the dressed perturbation must thus be defined as
[TABLE]
while the off-diagonal matrix elements must remain unchanged, , apart from the above mentioned reordering of the labels .
By way of their above construction, we expect that the dressed operators and now meet the characteristics of generic random matrix ensembles such as those from Ref. fyo96 . This is verified numerically in Fig. 2 for the same model and data as in Fig. 1, but presented in terms of the dressed matrix instead of the bare . (The distribution of energies as defined in Eq. (58) is rather boring to look at and therefore not plotted.) As the microcanonical energy window, we selected the central of states, and the constant in Eq. (58) was chosen such that the average of the ’s vanishes within this window.
While the coarse view in Fig. 2(a) looks similar to the one from Fig. 1(a), the close-up in Fig. 2(b) shows that also the fine structure of the perturbation now indeed looks very much like a “true” (possibly sparse) random matrix. The diagonal elements are separately reproduced in Fig. 2(c) and (d). In particular, Fig. 2(d) confirms that the local fluctuations are now much smaller than those of in Fig. 1(e).
Similarly as when comparing “true” random numbers with numerically generated pseudo-random numbers, it is in general very difficult to quantitatively determine how closely the ’s in Fig. 2 emulate a “true” random matrix. We therefore focus on the specific quantity which is at the heart of our present work, namely the (shifted) local density of states (see also (m21))
[TABLE]
where and denote (as in the main paper) the eigenvalues and eigenvectors of the perturbed system in (52).
As detailed in the main paper, the ensemble average of is theoretically predicted to be given by
[TABLE]
where is the ensemble averaged variance of the ’s (with and not too large). In our numerics, this prediction is compared with
[TABLE]
where represents the set of indices with and is their number. Furthermore, in (64) is now the numerically determined variance of the ’s. Finally, the delta function appearing in (62) is replaced by a pre-delta function
[TABLE]
essentially amounting to a binning of the numerical data with bin width .
Estimating the ensemble average of from (60) by the column average of a single realization in Eq. (64) is common practice in random matrix theory (sometimes referred to as self-averaging or ergodicity property), and is justified by the large number of states within the microcanonical energy window and the translational invariance of the ensemble of perturbations, meaning that and exhibit similar statistical properties for all and thus may be viewed as different samples of the random matrix ensemble. In particular, the column average in Eq. (64) is thus expected to converge for large towards the ensemble average of .
While the theoretical predictions strictly speaking apply to infinitely large matrices, the numerically considered matrices are of large but finite dimension . In order to minimize finite size artifacts, which are known to be particularly pronounced near the borders of the matrices, we choose the interval to comprise the central of states as before, such that .
In Fig. 3, we compare the so obtained numerical results for the function with the theory from Eqs. (62) and (63) for three different perturbation strengths in (54). We emphasize that there are no fit parameters: The width of the Breit-Wigner distribution (62) is obtained from the values of (see figure) and computed from the actual matrix elements and energy levels . The plots reveal good agreement between numerics and theory, even for the moderate perturbation strength , where .
II.3 Concluding remarks
Finally, it may be worthwhile to recall the following key ideas from the main paper regarding the connection of the above considerations with the main objectives of the paper: In general, the two alternative unperturbed Hamiltonians and in (52) give rise to different time-dependent expectation values. However, the difference of those expectation values can be bounded with the help of our inequality (m25). Closer inspection of how the right hand sides of (m24) and of (m25) numerically scale with the number of spins in (53) indicates that for sufficiently large the left hand side of (m25) in fact becomes arbitrarily small (compared to ) for all times with . Since we are mainly interested in large and in our present work, the two unperturbed Hamiltonians and thus give rise to expectation values which can be considered as practically indistinguishable. Therefore it is justified to work with the dressed perturbation in (52) instead of the bare in our above considerations.
Besides those times with , also the long-time limit plays an essential role. Specifically, the long-time average of (m1) takes the form
[TABLE]
where indicates the long-time average of , and where we exploited that the spectrum of is non-degenerate (see (58)). In particular, this long-time average refers to the dynamics governed by . But since the eigenvectors of and are identical, the above result must also be equal to the long-time average when the dynamics is governed by . (In contrast to , the spectrum of may exhibit degeneracies; in such a case, the eigenvectors are tacitly chosen so that is diagonal within every energy eigenspace.) One thus can conclude that also with respect to the long-time properties it is justified to work with the dressed perturbation in (52) instead of the bare .
III Numerical illustration of generalized
random matrix ensembles
In this section we provide three illustrative numerical examples, indicating that the analytical predictions from Ref. fyo96 remain valid for considerably more general random matrix ensembles than those actually admitted therein (see also main paper below (m24)). In particular, while in random matrix theory it is usually assumed that all matrix elements of some given ensemble are statistically independent of each other, we will demonstrate that many results, e.g. Eqs. (62) and (63), are still applicable in the presence of various, quite notable correlations. Since such correlations naturally arise in real systems (see also Sec. II), this is an important observation with regard to their modeling by random matrices.
The general framework and the notation is as in the previous section (and in the main paper). In particular, we again compare the analytical prediction (62) with the numerically obtained counterpart (64), i.e., we will numerically estimate the ensemble average of from (60) by the column average of a single realization in Eq. (64). While the theory strictly speaking applies to infinitely large matrices, our numerical examples will be of large but finite dimension . In order to minimize the corresponding finite size artifacts, which are known to be particularly pronounced near the borders of the matrices, the interval in (64) is chosen so that it remains sufficiently far away from the lower and upper limits at and , respectively. As before, the delta function, which enters into (64) via (60), is approximated by the pre-delta function (67), effectively amounting to a binning of the numerical data.
III.1 First example
As our first example, we again start out from the spin model (53)–(54) with , hence . But now, we modify the actual perturbation by manually setting all diagonal elements equal to zero, or equivalently, we consider dressed operators which no longer satisfy (52) but rather are defined via
[TABLE]
While the resulting Hamiltonian is no longer expected to faithfully model some “natural” physical system, it still amounts to an instructive “artificial” example: On the one hand, the large fluctuations of the diagonal elements (see Fig. 1(c)) are eliminated. On the other hand, the remaining off-diagonal elements still exhibit a considerable amount of “structures” or “correlations” (see Figs. 1(b) and 4(a)). Such a perturbation matrix may thus be considered either as an extremely “unlikely/exceptional/untypical” member of one of the random matrix ensembles admitted in the analytical explorations from fyo96 , or as a “typical” member of an ensemble which is not admitted in those analytical explorations.
Focusing, as in the previous section, on a microcanonical energy window consisting of the central of states, we numerically evaluated from (64), the local density of states , as well as the sample variance of perturbation elements to estimate . Hence there are again no free parameters in the theory. Comparing to the Breit-Wigner formula (62) with width (63) in Fig. 5, we observe that the theory still describes the numerical estimate quite well. In particular, this substantiates the assertion at the end of Sec. II.1.
III.2 Further examples
In the following two examples, the total Hilbert space dimension is and the unperturbed system is chosen to exhibit Poissonian level statistics with a mean spacing , reflecting the distribution of gaps of typical integrable systems. (Equally spaced levels as in (60) or Wigner-Dyson distributed spacings would lead to practically the same results.) As before, the considered perturbations are quite artificial and are not meant to reflect actual physical systems. Rather, their purpose is to illustrate the generality of the result (62), (63).
In our first artificial ensemble of perturbations, only every th entry starting from the first minor diagonal is nonvanishing. Hence the resulting matrices are sparse with a very regular entry pattern. Moreover, the nonvanishing entries can only take values and , each with probability . An example of such a perturbation matrix is shown in Fig. 4(b). The resulting numerical estimate along with the theory prediction is shown in Fig. 6 for and two different choices of . As before, there are no fit parameters involved, since was fixed a priori and by construction. Even though the dimension of the considered matrices is rather small, at least vastly smaller than the size of the Hilbert space for typical many-body systems, theory and numerics agree very well.
For our second artificial example, we uniformly scatter circular islands of diameter across the perturbation matrix. Within each island, we alternate entries and in the radial direction. More precisely, if denotes the (randomly sampled) center of the th island and is the distance of entry from that center, then we define the contribution to from the th island as
[TABLE]
where ‘’ denotes the modulo operation and is the greatest integer such that (“floor function”). Islands may overlap, in which case the contributions are simply added, i.e.,
[TABLE]
Hence in this ensemble, the sparsity is irregular (random), but the distribution of nonvanishing entries shows a regular pattern. For a visualization of a single realization of this ensemble, see Fig. 4(c).
In Fig. 7, we display the numerical estimate (64) of the local density of states for a perturbation with islands of size and for two different perturbation strengths and . The resulting average sparsity is similar as in the first artificial example, i.e., approximately every fourth matrix element is nonvanishing. The parameter is computed as the sample variance of all perturbation matrix elements, so that as before there is no free parameter in the theory (62). Again, numerics and theory show good agreement.
We also explored several further examples of artificially “structured” or “correlated” perturbations along the same lines as for those depicted in Fig. 4; for instance, patterns generated by cellular automata or by concentric rings (“corner arcs”) around the upper left corner of the matrix, etc. We also investigated separately the impact of various “patterns” along the matrix diagonal. Usually, the results were similar to those in Figs. 5-7 and are therefore not reproduced here. Notably different results were only obtained for examples with extremely “little” randomness or extremely strong correlations, for instance, when choosing deterministically alternating signs along the off-diagonals in Fig. 4(b). (In this case, the only random ingredient is the Poissonian level statistics of the unperturbed spectrum.)
IV Derivation of Eqs. (m15)-(m18)
In this section, we deduce Eqs. (m15)-(m18) from the three assumptions above Eqs. (m15), (m17), and (m18).
In doing so, our principal tool is the Taylor expansion formula for functions of noncommuting operators derived in Ref. kum65 . It states that for two arbitrary linear (but not necessarily Hermitian) operators and any complex analytic function ,
[TABLE]
where is defined via the power series representation of , and where denotes the th derivative of . Moreover, the operators are given by the recurrence relation
[TABLE]
with the square brackets denoting the commutator.
A second key point is the observation that from (m8) can be rewritten in terms of the function as
[TABLE]
where is considered as arbitrary but fixed. Using the Taylor expansion formula (72) with and , and noting that the above defined is an analytical function with , we obtain
[TABLE]
Furthermore, (73) now takes the form
[TABLE]
More explicitly, the first few ’s are
[TABLE]
As exemplified by and , the operators are thus not necessarily Hermitian. Finally, the definitions (m8) and (m16) imply that , and with (75) we thus obtain
[TABLE]
IV.1 Derivation of Eqs. (m15) and (m16)
The goal of this subsection is to show that the assumption above Eq. (m15) implies
[TABLE]
whenever . From this result and Eq. (m8), one readily recovers Eqs. (m15) and (m16).
In order to verify (79), it is according to (75) sufficient to show that
[TABLE]
for all and any given . To this end, we take for granted the assumption above Eq. (m15). In particular, the two unperturbed basis vectors , appearing in the average may thus be multiplied by arbitrary factors without changing the value of that average.
Due to the trivial fact that the operators in (76), (77) are independent of the basis vectors , and thus of the factors , it follows that
[TABLE]
for arbitrary . If , then , hence (81) is always fulfilled. If , the factor may assume both values , hence (81) implies (80).
IV.2 Justification of Eq. (m17)
The objective of this subsection is to show that the function is independent of and hence from Eq. (m17) is well-defined, provided the assumption above Eq. (m17) is fulfilled.
The assumption above Eq. (m17) consist of two parts: The first part requires that for all and thus
[TABLE]
for all . The second part requires that all statistical properties of the matrix elements do not depend separately on and , but only on the difference . In particular, this implies that
[TABLE]
for arbitrary indices and any , where the right hand side indicates that the left hand side is given by some function which only depends on the differences .
In view of Eq. (78), the objective from the beginning of this subsection will be achieved if we can infer from (82) and (83) that is independent of for all . The latter will be shown in what follows.
From (76) or (77) it can be seen that the are composed of powers of and commutators of these powers with . Working in the eigenbasis of and inserting complete sets of states between factors of , we find that any diagonal matrix element can be written as a sum of terms of the general form
[TABLE]
with and with the property that
[TABLE]
The explicit expression of in terms of the functions (84) which satisfy (85) is given in Ref. kum65 , but does not matter for the following arguments.
From Eqs. (83) and (84) we can conclude that
[TABLE]
As usual, the value of the sum on the right hand side does not change when a summation index is shifted by an arbitrary integer. In particular, we thus may substitute each index by (). Exploiting (82) it follows that
[TABLE]
where the right hand side is manifestly independent of . Since all are sums of such terms (see above), it follows that they must be independent of .
IV.3 Derivation of Eq. (m18)
The goal of this subsection is to show that the assumptions above Eqs. (m15), (m17), and (m18) imply
[TABLE]
Since also (m17) is known to apply under those assumptions, Eq. (m18) then readily follows.
In passing we note that (m16) implies , and with Eq. (m17) it follows that . Together with (88), we thus obtain
[TABLE]
which is a quite interesting result in itself.
In order to verify (88), it is according to (78) sufficient to show that the averaged diagonal matrix elements are purely real for even and purely imaginary for odd . To this end, we recall that is given by a sum of terms of the form (84). Hence, in order to verify (88), it is sufficient to show that
[TABLE]
where is determined by and according to (85).
Taking for granted the assumption above (m17), it follows, as in the previous subsection, that Eqs. (82), (83), and (87) apply.
Without loss of generality we can and will choose the energy scale so that and hence according to (82). Together with (85) we thus can conclude that
[TABLE]
Furthermore, the assumptions above (m15) and (m18) guarantee that the statistical properties of are identical to those of for all , implying that the functions from (83) satisfy
[TABLE]
Since is Hermitian, , it follows from (83) that
[TABLE]
and with (92) that
[TABLE]
Similarly as below (86) one sees that all summation indices on the right hand side of Eq. (87) may be replaced by , yielding
[TABLE]
Upon introducing (91) and exploiting (94) with , it follows that
[TABLE]
Together with (87) we thus recover Eq. (90).
V Statistics of the random matrix elements
In this section, the statistical properties of the the random matrix elements are discussed in more detail, especially those mentioned in the main text below Eq. (m23).
To begin with, we observe that the random distribution of any given matrix element is captured by the probability density
[TABLE]
Here, it is understood that for random matrix ensembles with purely real elements , only real arguments are admitted in (97). On the other hand, for ensembles with complex (which is only possible for ), the arguments in (97) are understood to be complex for and real for . We also recall that
[TABLE]
for complex arguments on the right hand side of (97). It follows that
[TABLE]
both for real and complex arguments .
Next, the following two conditions
[TABLE]
will be deduced from the three conditions above Eqs. (m15), (m17), and (m18). These results (100), (101) readily imply the statements below (m23), namely that the statistics of any given matrix element only depends on , and that and are equally likely.
The assumption above Eq. (m15) (see also Sec. IV.1) together with the remarks below (97) readily imply
[TABLE]
for arbitrary . If , then , hence (102) is always fulfilled. If , the factor in (102) may assume both values . With (97) and (99) it thus follows that (102) is equivalent to
[TABLE]
for all . (Unlike in (100), the case is still missing.)
Similarly, from the condition above Eq. (m17) (see also Sec. IV.2) one can readily deduce (101) for all .
Finally, the condition above Eq. (m18) (see also Sec. IV.3) in combination with (97) and (99) implies
[TABLE]
for all . Hence, (104) together with (103) yields (100).
In conclusion, the three conditions on above Eqs. (m15), (m17), and (m18) imply the two conditions (100) and (101) on .
From now on we focus, as in the main text, on the simplest and most common case that all with are statistically independent of each other. More precisely, since , we tacitly assume that in the above and the following statements.
As a consequence, all statistical properties of the random matrix are (by definition) fully captured by the distributions of the single elements in (97). If those distributions satisfy (100) and (101), it is straightforward to invert the above line of reasoning with the final conclusion that the two conditions (100) and (101) imply the three conditions above Eqs. (m15), (m17), and (m18).
In other words, under the additional assumption that the matrix elements are statistically independent for all , the conditions (100) and (101) are actually equivalent to the three conditions above Eqs. (m15), (m17), and (m18).
VI Derivation of Eq. (m25)
Throughout this section, we adopt the same notation as in the main paper, except that all indices “[math]” are omitted for notational convenience.
Accordingly, we consider two arbitrary density operators and with identical initial conditions,
[TABLE]
but whose time evolution is governed by two different Hamiltonians, namely
[TABLE]
In other words, the eigenvectors of and must be identical, while the eigenvalues and may be different.
In case the dynamics is governed by , the expectation value of any given observable at time can be written, similarly as in Eq. (m1), in the form
[TABLE]
The right hand side of (110) is understood as usual:
[TABLE]
Likewise, if governs the dynamics, the expectation value of at time can be written as
[TABLE]
The last identity in (114) relies on the fact that from (106) and from (107) commute. Together with (108)-(111) it follows that and due to the cyclic invariance of the trace that
[TABLE]
where the dependence of on has been omitted for the sake of simplicity.
The main goal of this section is to show that
[TABLE]
for arbitrary and , where is the difference between the largest and smallest eigenvalues of . Eq. (m25) then follows upon taking into account: (i) As mentioned at the beginning of this section, all missing indices “[math]” have to be restored. (ii) As said below Eq. (m16), only actually count in (118).
In order to verify (118), we first evaluate the trace in (116) by means of the eigenbasis of , yielding
[TABLE]
where the maximization in (119) is over all normalized vectors .
For an arbitrary but fixed vector of unit norm we can rewrite (120) with (117) as
[TABLE]
With the definition
[TABLE]
we can conclude that
[TABLE]
Eqs. (121), (124), and (126) imply
[TABLE]
From the definition (125) and the Cauchy-Schwarz inequality it follows that
[TABLE]
Since we assumed that is normalized, also in (122) will be normalized and the last factor in (129) can be upper bounded by , where is the operator norm of (largest eigenvalue in modulus). Exactly the same upper bound can be obtained for in (127). With (128) we thus arrive at
[TABLE]
Obviously, in (116) remains unchanged when adding an arbitrary real constant to . Hence, the inequality (130) with instead of on the right hand side remains valid for arbitrary . The minimum over all is assumed when the largest and smallest eigenvalues of are of opposite sign and equal modulus, yielding
[TABLE]
where is the difference between the largest and smallest eigenvalues of .
Rewriting as with , the normalization of takes the form . Furthermore, we can infer from (106), (107), (115), and (122) that
[TABLE]
and from (123) that
[TABLE]
One readily verifies that for arbitrary , yielding
[TABLE]
By introducing (135) into (131) we can conclude
[TABLE]
Observing that this bound is independent of , and taking into account Eqs. (119) and (133), the announced final result (118) is recovered.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1(1) C. Gogolin and J. Eisert, Equilibration, thermalisation, and the emergence of statistical mechanics in closed quantum systems , Rep. Prog. Phys. 79 , 056001 (2016).
- 2(2) L. D’Alessio, Y. Kafri, A. Polkovnikov, and M. Rigol, From quantum chaos and eigenstate thermalization to statistical mechanics and thermodynamics , Adv. Phys. 65 , 239 (2016).
- 3(3) F. Borgonovi, F. M. Izrailev, L. F. Santos, and V. G. Zelevinsky, Quantum chaos and thermalization in isolated systems of interacting particles , Phys. Rep. 626 , 1 (2016).
- 4(4) T. Mori, T. N. Ikeda, E. Kaminishi, and M. Ueda, Thermalization and prethermalization in isolated quantum systems: a theoretical overview , J. Phys. B 51 , 112001 (2018).
- 5(5) J. M. Deutsch, Quantum statistical mechanics in a closed system , Phys. Rev. A 43 , 2046 (1991); J. M. Deutsch, Thermodynamic entropy of a many-body energy eigenstate , New J. Phys. 12 , 075021 (2010); J. M. Deutsch, A closed quantum system giving ergodicity , deutsch.physics.ucsc.edu/pdf/quantumstat.pdf (unpublished).
- 6(6) J. Berges, Sz. Borsányi, and C. Wetterich, Prethermalization , Phys. Rev. Lett. 93 , 142002 (2004).
- 7(7) M. Moeckel and S. Kehrein, Interaction quench in the Hubbard model , Phys. Rev. Lett. 100 , 175702 (2008); M. Moeckel and S. Kehrein, Real-time evolution for weak interaction quenches in quantum systems , Ann. Phys. 324 , 2146 (2009); M. Moeckel and S. Kehrein, Crossover from adiabatic to sudden interaction quenches in the Hubbard model: prethermalization and non-equilibrium dynamics , New J. Phys. 12 , 055016 (2010).
- 8(8) A. Arrizabalaga, J. Smit, and A. Tranberg, Equilibration in ϕ 4 superscript italic-ϕ 4 \phi^{4} theory in 3+1 dimensions , Phys. Rev. D 72 , 025014 (2005); D. Podolsky, G. N. Felder, L. Kofman, and M. Peloso, Equation of state and beginning of thermalization after preheating , Phys. Rev. D. 73 , 023501 (2006); B. Nowak, D. Sexty, and T. Gasenzer, Superfluid turbulence: nonthermal fixed point in an ultracold Bose gas , Phys. Rev B 84 , 020506 (2011); B. Nowak, J. Schole, and T. Gas
