Asymptotic Reversibility of Thermal Operations for Interacting Quantum Spin Systems via Generalized Quantum Stein's Lemma
Takahiro Sagawa, Philippe Faist, Kohtaro Kato, Keiji Matsumoto, Hiroshi Nagaoka, Fernando G. S. L. Brandão

TL;DR
This paper demonstrates that for translation-invariant ergodic quantum spin systems, the asymptotic convertibility of states under thermal operations is fully characterized by the KL divergence rate, extending quantum Stein's lemma.
Contribution
It generalizes quantum Stein's lemma to non-i.i.d. ergodic states, establishing KL divergence rate as a thermodynamic potential for quantum many-body systems.
Findings
KL divergence rate characterizes state convertibility
Extension of quantum Stein's lemma beyond i.i.d. cases
Reversible conversion of states with quantum coherence
Asymptotic Reversibility of Thermal Operations for Interacting Quantum Spin Systems via Generalized Quantum Stein’s Lemma
Takahiro Sagawa
Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan
Philippe Faist
Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA
Institute for Theoretical Physics, ETH Zurich, 8093 Switzerland
Dahlem Center for Complex Quantum Systems, Freie Universität Berlin, 14195 Berlin, Germany
Kohtaro Kato
Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA
Keiji Matsumoto
National Institute of Informatics, Tokyo 101-8430, Japan
Hiroshi Nagaoka
The University of Electro-Communications, Tokyo, 182-8585, Japan
Fernando G. S. L. Brandão
Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA
Abstract
For quantum spin systems in any spatial dimension with a local, translation-invariant Hamiltonian, we prove that asymptotic state convertibility from a quantum state to another one by a thermodynamically feasible class of quantum dynamics, called thermal operations, is completely characterized by the Kullback-Leibler (KL) divergence rate, if the state is translation-invariant and spatially ergodic. Our proof consists of two parts and is phrased in terms of a branch of the quantum information theory called the resource theory. First, we prove that any states, for which the min and max Rényi divergences collapse approximately to a single value, can be approximately reversibly converted into one another by thermal operations with the aid of a small source of quantum coherence. Second, we prove that these divergences collapse asymptotically to the KL divergence rate for any translation-invariant ergodic state. We show this via a generalization of the quantum Stein’s lemma for quantum hypothesis testing beyond independent and identically distributed (i.i.d.) situations. Our result implies that the KL divergence rate serves as a thermodynamic potential that provides a complete characterization of thermodynamic convertibility of ergodic states of quantum many-body systems in the thermodynamic limit, including out-of-equilibrium and fully quantum situations.
Contents
- III.3.3 Collapse of the min and max divergences suppresses coherence
- IV Collapse of the min and max divergences for ergodic states relative to local Gibbs states
- IV.3 Generalized Stein’s lemma for ergodic states relative to local Gibbs states
- IV.4.3 The role of the KL divergence for the thermodynamic potential
- Appendix B Properties of our thermodynamic framework and convertibility proof for Gibbs-preserving maps
I Introduction
Reversibility and irreversibility of dynamics in classical and quantum physics, especially in thermodynamics, are characterized by the concept of entropy. It is a salient feature of macroscopic equilibrium thermodynamics that entropy is not only non-decreasing but also provides a complete characterization of convertibility between thermal equilibrium states Callen (1985), which is represented by the second law of thermodynamics. Lieb and Yngvason constructed an axiomatic formulation of this phenomenology, and within their mathematical framework, rigorously proved that entropy provides a necessary and sufficient condition for state conversion, and furthermore, that such an entropy function is essentially unique Lieb and Yngvason (1999).
The connection between microscopic information entropy and thermodynamic entropy has been extensively studied both in terms of statistical mechanics Sagawa (2012); Parrondo et al. (2015) and the thermodynamic resource theory Goold et al. (2016); Chitambar and Gour (2019); Sagawa (2021). In the latter formalism which we adopt in this article, so-called one-shot entropy measures have provided tools to quantify resource costs of physical operations in quantum information settings including quantum thermodynamics Goold et al. (2016); Chitambar and Gour (2019); Brandão et al. (2013); Åberg (2013); Horodecki and Oppenheim (2013); Brandão et al. (2015); Gour et al. (2018); Faist et al. (2015a); Faist and Renner (2018); Weilenmann et al. (2016); Weilenmann (2017); Weilenmann et al. (2018); Sagawa (2021).
Our understanding of the macroscopic behavior of the entropy has been sharpened by fundamental theorems proving asymptotic equipartition properties (AEP). Roughly speaking, an AEP states that in the long sequence limit of a stochastic process, some relevant quantities concentrate to definite values. For instance, the Shannon-McMillan theorem states that an ergodic process satisfies an AEP with the Shannon entropy rate Shannon (1948); Cover and Thomas (2006). This has been generalized to a stronger form known as the Shannon-McMillan-Breiman theorem as well as to a relative version for an ergodic process with respect to a Markov process Algoet and Cover (1988). A quantum version of the Shannon-McMillan theorem proves a similar AEP for quantum ergodic processes with the von Neumann entropy rate Bjelaković et al. (2004); Bjelaković and Szkola (2005); Ogata (2013).
Closely related to AEP theorems is Stein’s lemma, which relates the asymptotic error rate of hypothesis testing for distinguishing two quantum states to the KL divergence rate. Classically, Stein’s lemma is a straightforward consequence of the relative AEP. However, its quantum counterpart is more involved Hiai and Petz (1991); Nagaoka and Ogawa (2000); Bjelakovic and Siegmund-Schultze (2004); Brandão and Plenio (2010); Bjelakovic and Siegmund-Schultze (2003). Hiai and Petz Hiai and Petz (1991) first addressed the quantum Stein’s lemma and provided a partial proof for a completely ergodic quantum state with respect to an i.i.d. state. The proof of the quantum Stein’s lemma was completed for the case where both states are i.i.d. by Ogawa and Nagaoka Nagaoka and Ogawa (2000), by proving the strong converse of the Hiai-Petz theorem for that case. A more general form of the quantum Stein’s lemma for an ergodic state with respect to an i.i.d. state was proved in Ref. Bjelakovic and Siegmund-Schultze (2004), which is regarded as a quantum analog of the relative AEP.
In this work, we go beyond the non-interacting or i.i.d. regime, and investigate an entropy function that provides a thermodynamic characterization of physically relevant, interacting many-body quantum systems. We consider quantum spin systems on a lattice with an arbitrary number of spatial dimensions. Under certain general conditions, we rigorously prove that the necessary and sufficient condition for asymptotic state conversion from one ergodic state to another state by thermodynamically feasible quantum dynamics, called thermal operations Brandão et al. (2013), is characterized by the Kullback-Leibler (KL) divergence rate of the state relative to the Gibbs state. The KL divergence rate is shown to determine the work cost for state transformations, and thus plays the role of a proper thermodynamic potential. Our central assumptions are that (i) the quantum state is translation-invariant and spatially ergodic and (ii) the Hamiltonian is translation-invariant and local. Physically, assumption (i) implies that a quantum state does not exhibit any macroscopic fluctuations if one looks at translation-invariant observables Cover and Thomas (2006); Bratteli and Robinson (1987, 1981); Israel (2015); Ruelle (1999), and assumption (ii) guarantees a well-defined thermodynamic limit of the Gibbs state. Importantly, a spatially ergodic state, in contrast to a temporally ergodic state, is not necessarily a thermal equilibrium state, and thus our result is applicable to out-of-equilibrium situations.
To achieve an operationally robust notion of a thermodynamic potential, we resort to the resource theory of thermal operations. The resource theory of thermal operations is an established model for thermodynamics in the quantum regime Brandão et al. (2013, 2015); Goold et al. (2016); Binder et al. (2018). This approach allows us to study the thermodynamic behavior of arbitrary quantum states in a way that inherently accounts for the fluctuations in the work requirement of state transformations. This model for thermodynamics is tightly related to measures of information introduced in quantum information theory based on the quantum Rényi divergences Rényi (1960). Two quantities in particular, the Rényi-0 divergence (or min-divergence) and Rényi-∞ divergence (or max-divergence), play a special role in determining the work requirement of state transformations Lieb and Yngvason (2013); Weilenmann et al. (2016). For instance, the work that can be extracted from any state that is block-diagonal in the energy eigenspaces is given by the Rényi-0 divergence. For our main result, we consider the asymptotic version of these quantities for large system sizes, which corresponds to the thermodynamic limit. The asymptotic min and max Rényi divergences are also called the lower and upper spectral divergence rates, respectively, in the theory of information spectrum, and we will use both terms interchangeably in this paper Han (2003, 2000); Nagaoka and Hayashi (2007); Datta and Renner (2009); Datta (2009); Bowen and Datta (2006a, b); Schoenmakers et al. (2007).
Main result
Our main result is that ergodic states can be reversibly interconverted into one another in the resource theory of thermal operations in the thermodynamic limit. Roughly speaking, if the Hamiltonian is local and translation invariant, then there exists a thermodynamic potential $F$ that is defined for all translation invariant and ergodic states on a lattice of any spatial dimension with the following property: For any two translation invariant and ergodic states $\Psi, \Psi'$, there exists a (generalized) thermal operation that can carry out the transformation $\Psi \to \Psi'$ by investing work at a rate of $F(\Psi') - F(\Psi)$ per subsystem and that uses a negligible amount of coherence per subsystem. Furthermore, $F$ is given by the KL divergence rate between $\Psi$ and the Gibbs state of the Hamiltonian, multiplied by the temperature of the heat bath.
Our main result is proved in the following two steps. They are discussed in Section III and Section IV, where the main theorems are Theorem 2 and Theorem 3, respectively. Both of them can be of independent interest.
First, we prove that any state for which the min and max Rényi divergences coincide approximately Renner (2005); Tomamichel (2016) can approximately be converted reversibly to and from the Gibbs state by thermal operations, using a small source of quantum coherence Lostaglio et al. (2015a). In this case, the resource theory becomes reversible, i.e., the work required for a state transformation is equal to the negative work required for the reverse transformation. In consequence, if these divergences coincide to a single value in the asymptotic limit, then it defines a thermodynamic potential that completely characterizes the possible state transformations in the fully quantum regime. This is a result that applies broadly to the resource theory of thermal operations in general settings, even for states that are non-classical, i.e., that are not block-diagonal in the energy basis. This intermediate result, which is independent of the assumptions (i) and (ii), can be of independent interest.
Second, we prove that the min and max Rényi divergences indeed collapse to the KL divergence rate under the assumptions (i) and (ii). To this end, we prove a generalization of the quantum Stein’s lemma to the setting with (i) and (ii). The main idea of our proof, inspired by Refs. Bjelakovic and Siegmund-Schultze (2004, 2003), is to construct typical projectors that are adapted to the assumptions (i) and (ii). Our formulation uses semidefinite programming to simplify some parts of the proof.
Structure of the paper
In Section II, we introduce preliminary definitions and notation, including the relevant divergences and entropy measures. In Section III, we introduce our thermodynamic framework of thermal operations, giving a rigorous meaning to the work cost of a transformation from one state to another, and prove our first main theorem on asymptotic thermal operations (Theorem 2). In Section IV, we rigorously formulate ergodicity, and prove our second main theorem on the generalized quantum Stein’s lemma (Theorem 3). We conclude with remarks and an outlook in Section V. In the appendices, we remark on some technical lemmas, Gibbs-preserving maps, a more rigorous approach to ergodicity formulated using $C^*$-algebras, an alternative proof of our second main theorem for the one dimensional case, and purely classical implications of our results.
II Preliminaries
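Consider a Hilbert space $\mathscr{H}$ of finite dimension $d$, and let $S(\mathscr{H})$ be the set of density operators (quantum states) on $\mathscr{H}$, satisfying $\hat\rho \geqslant 0$ and $\mathrm{tr}[\hat\rho] = 1$ for $\hat\rho \in S(\mathscr{H})$. We also define the set of subnormalized states, which we denote by $S_{\leqslant}(\mathscr{H})$, and which is the set of all operators $\hat\rho \geqslant 0$ that satisfy $\mathrm{tr}[\hat\rho] \leqslant 1$. For two Hilbert spaces $\mathscr{H}_A$ and $\mathscr{H}_B$ representing systems $A$ and $B$, we write $A \simeq B$ when the Hilbert spaces are isomorphic; by convention, the identity mapping $\mathrm{id}_{A\to B}$ maps the canonical basis of $\mathscr{H}_A$ onto the canonical basis of $\mathscr{H}_B$.
The set of quantum states carries a natural metric given by the trace distance Nielsen and Chuang (2000), defined as $D(\hat\rho,\hat\sigma) := \frac{1}{2}\|\hat\rho-\hat\sigma\|_1$ for any $\hat\rho,\hat\sigma \in S(\mathscr{H})$, where $\|\cdot\|_1$ is the Schatten 1-norm. This metric can be extended to subnormalized states as the generalized trace distance Tomamichel et al. (2010); Tomamichel (2012), defined as
$$D(\hat\rho,\hat\sigma) := \frac{1}{2}\|\hat\rho-\hat\sigma\|_1 + \frac{1}{2}\bigl|\mathrm{tr}[\hat\rho]-\mathrm{tr}[\hat\sigma]\bigr| . \tag{1}$$
We also define the fidelity Nielsen and Chuang (2000) as $F(\hat\rho,\hat\sigma) := \|\sqrt{\hat\rho}\sqrt{\hat\sigma}\|_1$ for any $\hat\rho,\hat\sigma \in S_{\leqslant}(\mathscr{H})$.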
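The distance measures above can be checked numerically on small examples. The following is a minimal sketch, assuming the standard conventions (generalized trace distance as half the 1-norm of the difference plus half the absolute trace difference, and fidelity as the 1-norm of the product of square roots); the function names and the example states are illustrative choices, not taken from the paper.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_norm(a):
    """Schatten 1-norm: the sum of the singular values."""
    return np.linalg.svd(a, compute_uv=False).sum()

def generalized_trace_distance(rho, sigma):
    """Assumed convention: (1/2)||rho - sigma||_1 + (1/2)|tr rho - tr sigma|."""
    return 0.5 * trace_norm(rho - sigma) + 0.5 * abs(np.trace(rho - sigma).real)

def fidelity(rho, sigma):
    """Assumed convention: ||sqrt(rho) sqrt(sigma)||_1 (not squared)."""
    return trace_norm(psd_sqrt(rho) @ psd_sqrt(sigma))

rho = np.diag([0.5, 0.5])    # maximally mixed qubit
plus = np.full((2, 2), 0.5)  # pure state |+><+|
sub = 0.8 * plus             # a subnormalized state

print(generalized_trace_distance(rho, plus))  # ordinary trace distance
print(fidelity(rho, plus))
print(generalized_trace_distance(plus, sub))  # picks up the trace mismatch
```

For normalized inputs the second term vanishes, so the function reduces to the ordinary trace distance.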
II.1 Entropy and divergence
Thermodynamic properties of microscopic quantum systems can be described using entropy measures that generalize the usual Shannon or von Neumann entropy to the so-called “one-shot” regime Renner (2005); Tomamichel (2012, 2016). More specifically, in the presence of thermodynamic reservoirs, we need to consider a family of relative entropies, or divergences. For $\hat\rho \in S(\mathscr{H})$ and $\hat\sigma \geqslant 0$, the KL divergence (Rényi-1 divergence) is defined as:
$$S_1(\hat\rho\|\hat\sigma) := \mathrm{tr}\bigl[\hat\rho\ln\hat\rho - \hat\rho\ln\hat\sigma\bigr] . \tag{2}$$
Throughout this paper, we assume that the first argument of the divergences considered (here $\hat\rho$) lies within the support of the second argument (here $\hat\sigma$). This assumption is physically justified when $\hat\sigma$ is a Gibbs state, which necessarily has full rank. The min divergence (Rényi-0 divergence), or the min relative entropy, is defined as
$$S_0(\hat\rho\|\hat\sigma) := -\ln \mathrm{tr}\bigl[\hat P_{\hat\rho}\,\hat\sigma\bigr] , \tag{3}$$
where $\hat P_{\hat\rho}$ is the projection onto the support of $\hat\rho$. We also define an alternative measure of the min divergence (Rényi-1/2 divergence) as
$$S_{1/2}(\hat\rho\|\hat\sigma) := -2\ln \bigl\|\sqrt{\hat\rho}\sqrt{\hat\sigma}\bigr\|_1 . \tag{4}$$
Finally, the max divergence (Rényi-∞ divergence), or the max relative entropy, is defined as
$$S_\infty(\hat\rho\|\hat\sigma) := \ln \bigl\|\hat\sigma^{-1/2}\hat\rho\,\hat\sigma^{-1/2}\bigr\|_\infty , \tag{5}$$
where $\|\cdot\|_\infty$ is the operator norm.
These quantities are special cases of the Rényi-$\alpha$ divergences. Here, we avoid technicalities and issues in the general definitions of the quantum Rényi divergences caused by the noncommutativity of the arguments Hiai et al. (2011); Wilde et al. (2014); Tomamichel (2016), by focusing on the quantities above which are sufficient for our purposes. These divergences satisfy
$$S_0(\hat\rho\|\hat\sigma) \leqslant S_{1/2}(\hat\rho\|\hat\sigma) \leqslant S_1(\hat\rho\|\hat\sigma) \leqslant S_\infty(\hat\rho\|\hat\sigma) . \tag{6}$$
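The ordering of the four divergences can be probed numerically. The sketch below assumes the standard forms (Rényi-0 through the support projector, Rényi-1/2 through the fidelity, the KL divergence, and Rényi-∞ through the operator norm) with natural logarithms; all function names are illustrative, and the random states are an arbitrary test ensemble rather than anything used in the paper.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function to a Hermitian matrix through its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * f(vals)) @ vecs.conj().T

def random_state(d, rank, rng):
    """Random density matrix of the given rank (illustrative sampling only)."""
    g = rng.normal(size=(d, rank)) + 1j * rng.normal(size=(d, rank))
    m = g @ g.conj().T
    return m / np.trace(m).real

def s0(rho, sigma, tol=1e-10):
    """-ln tr[P_rho sigma], with P_rho the projector onto the support of rho."""
    vals, vecs = np.linalg.eigh(rho)
    v = vecs[:, vals > tol]
    return -np.log(np.trace(v @ v.conj().T @ sigma).real)

def s_half(rho, sigma):
    """-2 ln || sqrt(rho) sqrt(sigma) ||_1."""
    root = lambda x: np.sqrt(np.clip(x, 0.0, None))
    prod = herm_fun(rho, root) @ herm_fun(sigma, root)
    return -2.0 * np.log(np.linalg.svd(prod, compute_uv=False).sum())

def s1(rho, sigma, tol=1e-12):
    """tr[rho ln rho - rho ln sigma]; sigma must be full rank."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > tol]  # convention 0 ln 0 = 0
    cross = np.trace(rho @ herm_fun(sigma, np.log)).real
    return float(np.sum(p * np.log(p)) - cross)

def s_inf(rho, sigma):
    """ln || sigma^{-1/2} rho sigma^{-1/2} ||_inf; sigma must be full rank."""
    w = herm_fun(sigma, lambda x: x ** -0.5)
    return np.log(np.linalg.norm(w @ rho @ w, 2))

rng = np.random.default_rng(0)
ordered = []
for _ in range(20):
    rho = random_state(4, 2, rng)    # rank-deficient, so S_0 is nontrivial
    sigma = random_state(4, 4, rng)  # full rank, like a Gibbs state
    vals = [s0(rho, sigma), s_half(rho, sigma), s1(rho, sigma), s_inf(rho, sigma)]
    ordered.append(all(a <= b + 1e-9 for a, b in zip(vals, vals[1:])))
print(all(ordered))
```

Using a rank-deficient first argument matters here: for a full-rank state and a normalized second argument the Rényi-0 divergence is identically zero.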
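From these divergences we can define corresponding entropy measures as the divergence with respect to the identity operator $\hat I$: For $\alpha = 0, 1/2, 1, \infty$ we define
$$S_\alpha(\hat\rho) := -S_\alpha(\hat\rho\|\hat I) . \tag{7}$$
We note the following explicit forms of the von Neumann entropy (Rényi-1 entropy) $S_1(\hat\rho)$, the max entropy (Rényi-0 entropy) $S_0(\hat\rho)$, and the min entropy (Rényi-∞ entropy) $S_\infty(\hat\rho)$,
$$S_1(\hat\rho) = -\mathrm{tr}[\hat\rho\ln\hat\rho] , \qquad S_0(\hat\rho) = \ln \mathrm{rank}[\hat\rho] , \qquad S_\infty(\hat\rho) = -\ln\|\hat\rho\|_\infty . \tag{8}$$
The entropies are ordered as
$$S_\infty(\hat\rho) \leqslant S_1(\hat\rho) \leqslant S_{1/2}(\hat\rho) \leqslant S_0(\hat\rho) . \tag{9}$$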
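These divergences satisfy the data processing inequality, i.e., they are monotonic under the action of a completely-positive (CP) and trace-preserving (TP) map $\mathcal{E}$:
$$S_\alpha(\hat\rho\|\hat\sigma) \geqslant S_\alpha\bigl(\mathcal{E}(\hat\rho)\,\big\|\,\mathcal{E}(\hat\sigma)\bigr) . \tag{10}$$
For $\alpha = 0, \infty$, see for example Lemma 7 of Ref. Datta (2009). The case of $\alpha = 1$ is equivalent to the strong subadditivity of the von Neumann entropy Nielsen and Chuang (2000); Lieb and Ruskai (1973). Consequently, the entropies do not decrease under the action of a CPTP map $\mathcal{E}$ that is unital, i.e., $\mathcal{E}(\hat I) = \hat I$,
$$S_\alpha(\hat\rho) \leqslant S_\alpha\bigl(\mathcal{E}(\hat\rho)\bigr) . \tag{11}$$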
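A useful property of these divergences is a monotonicity property for the semidefinite ordering of the second argument: If $\hat\sigma \leqslant \hat\sigma'$, then for each $\alpha$,
$$S_\alpha(\hat\rho\|\hat\sigma) \geqslant S_\alpha(\hat\rho\|\hat\sigma') . \tag{12}$$
The divergences obey a scaling property in the second argument. For $\hat\rho \in S(\mathscr{H})$ and $\hat\sigma \geqslant 0$, we have for any $c > 0$,
$$S_\alpha(\hat\rho\|c\,\hat\sigma) = S_\alpha(\hat\rho\|\hat\sigma) - \ln c . \tag{13}$$
Under tensor product states, the divergences become additive. For states $\hat\rho, \hat\rho'$ and positive operators $\hat\sigma, \hat\sigma'$, we have for any $\alpha$,
$$S_\alpha(\hat\rho\otimes\hat\rho'\,\|\,\hat\sigma\otimes\hat\sigma') = S_\alpha(\hat\rho\|\hat\sigma) + S_\alpha(\hat\rho'\|\hat\sigma') . \tag{14}$$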
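The data processing, scaling, and additivity properties listed above lend themselves to a quick numerical sanity check. The sketch below does this for the KL and max divergences only, assuming their standard definitions with natural logarithms; the partial trace over one qubit stands in for a generic CPTP map, and all names and the scaling constant are illustrative choices.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function to a Hermitian matrix through its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * f(vals)) @ vecs.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (illustrative sampling only)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def s1(rho, sigma):
    """KL divergence tr[rho ln rho - rho ln sigma] for full-rank inputs."""
    return np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real

def s_inf(rho, sigma):
    """Max divergence ln || sigma^{-1/2} rho sigma^{-1/2} ||_inf."""
    w = herm_fun(sigma, lambda x: x ** -0.5)
    return np.log(np.linalg.norm(w @ rho @ w, 2))

def trace_out_second_qubit(m):
    """Partial trace over the second qubit of a two-qubit operator (a CPTP map)."""
    return np.einsum("abcb->ac", m.reshape(2, 2, 2, 2))

rng = np.random.default_rng(1)
rho, sigma = random_state(4, rng), random_state(4, rng)
rho2, sigma2 = random_state(2, rng), random_state(2, rng)
c = 3.0  # arbitrary positive scaling factor

checks = {}
for name, div in [("S_1", s1), ("S_inf", s_inf)]:
    reduced = div(trace_out_second_qubit(rho), trace_out_second_qubit(sigma))
    checks[name] = {
        "data_processing": bool(div(rho, sigma) >= reduced - 1e-9),
        "scaling": bool(np.isclose(div(rho, c * sigma), div(rho, sigma) - np.log(c))),
        "additivity": bool(np.isclose(div(np.kron(rho, rho2), np.kron(sigma, sigma2)),
                                      div(rho, sigma) + div(rho2, sigma2))),
    }
print(checks)
```

A passing check on random instances is of course not a proof; it only guards against sign and convention slips when implementing these quantities.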
To ensure that the operational quantities represented by these entropies and divergences do not significantly depend on events that only appear with vanishingly small probability, we “smooth” these entropies and divergences over a ball of states that are close to the original state Renner (2005); Datta (2009). First, we define the $\varepsilon$-ball of states around a subnormalized state $\hat\rho$ as
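$$B^{\varepsilon}(\hat\rho) := \bigl\{\hat\tau \in S_{\leqslant}(\mathscr{H}) : D(\hat\tau,\hat\rho) \leqslant \varepsilon\bigr\} . \tag{15}$$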
Definition 1** (Smooth divergences Datta (2009)).**
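The smooth divergences are defined as follows,
$$S_0^{\varepsilon}(\hat\rho\|\hat\sigma) := \max_{\hat\tau\in B^{\varepsilon}(\hat\rho)} S_0(\hat\tau\|\hat\sigma) , \qquad S_{1/2}^{\varepsilon}(\hat\rho\|\hat\sigma) := \max_{\hat\tau\in B^{\varepsilon}(\hat\rho)} S_{1/2}(\hat\tau\|\hat\sigma) , \qquad S_\infty^{\varepsilon}(\hat\rho\|\hat\sigma) := \min_{\hat\tau\in B^{\varepsilon}(\hat\rho)} S_\infty(\hat\tau\|\hat\sigma) . \tag{16}$$
The smooth entropies are defined correspondingly as
$$S_\alpha^{\varepsilon}(\hat\rho) := -S_\alpha^{\varepsilon}(\hat\rho\|\hat I) , \qquad \alpha = 0, 1/2, \infty . \tag{17}$$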
We introduce a further convenient divergence (relative entropy) that is based on hypothesis testing Tomamichel and Hayashi (2013); Dupuis et al. (2013); Faist and Renner (2018). This divergence allows one to interpolate between the min- and max-divergences in a different fashion than the Rényi entropies, and it has a simple formulation and a collection of useful properties. For a subnormalized state and , we define for any ,
[TABLE]
The hypothesis testing divergence owes its name to the fact that if are two quantum states, represents the probability of mistakenly reporting in a hypothesis test between the two states, if we carry out a strategy that mistakenly reports with probability at most .
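For commuting (classical) arguments, the optimization behind a hypothesis testing divergence reduces to a small linear program that a greedy rule solves exactly. The sketch below assumes one common convention, $-\ln\bigl(\eta^{-1}\min \sum_i t_i q_i\bigr)$ over tests $0 \leqslant t_i \leqslant 1$ with $\sum_i t_i p_i \geqslant \eta$; since the defining equation is not reproduced in this text, this convention, the function name, and the example distributions are all assumptions made for illustration.

```python
import numpy as np

def s_h_classical(p, q, eta):
    """Assumed convention: -ln( (1/eta) * min sum_i t_i q_i ),
    minimized over 0 <= t_i <= 1 subject to sum_i t_i p_i >= eta."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    keep = p > 0               # outcomes with p_i = 0 never help meet the constraint
    p, q = p[keep], q[keep]
    order = np.argsort(q / p)  # cheapest q-cost per unit of p-mass first
    need, cost = eta, 0.0
    for i in order:
        t = min(1.0, need / p[i])  # accept outcome i fully, or fractionally at the end
        cost += t * q[i]
        need -= t * p[i]
        if need <= 1e-15:
            break
    return -np.log(cost / eta)

p = [0.5, 0.3, 0.2, 0.0]
q = [0.1, 0.2, 0.3, 0.4]

s_max = np.log(max(0.5 / 0.1, 0.3 / 0.2, 0.2 / 0.3))  # classical max divergence
s_min = -np.log(0.1 + 0.2 + 0.3)                      # classical Rényi-0 divergence

etas = [1e-6, 0.2, 0.5, 0.8, 1.0]
values = [s_h_classical(p, q, eta) for eta in etas]
print(values, s_max, s_min)
```

Under this convention the value at small $\eta$ coincides with the classical max divergence and the value at $\eta = 1$ with the classical Rényi-0 divergence, which matches the interpolation role the text attributes to the hypothesis testing divergence.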
The hypothesis testing divergence satisfies the data processing inequality Wang and Renner (2012): For any subnormalized state , for any , for any CP and trace-nonincreasing map , and for any , the hypothesis testing divergence is monotonic,
[TABLE]
The hypothesis testing entropy also obeys a scaling property in the second argument: For any subnormalized state , for any , and for any ,
[TABLE]
as can be directly seen from (18). Also, for any for which , the hypothesis testing entropy satisfies
[TABLE]
for any subnormalized state and for any . Furthermore, if , then for some with and hence for any ,
[TABLE]
A useful property of the hypothesis testing divergence is that it interpolates between the min and max divergences, which are approximately recovered in the regimes and , respectively Dupuis et al. (2013):
Proposition 1**.**
Let be a (normalized) quantum state and let . For any ,
[TABLE]
Proof.
The proof of (Faist and Renner, 2018, Lemma 40) carries through even for the slightly different smoothing of and , except for the upper bound on . There, we may apply (Dupuis et al., 2013, Proposition 4.1) directly. ∎
Finally, we note a pair of inequalities which establishes the approximate equivalence of the two kinds of min-divergences Tomamichel et al. (2011); Dupuis et al. (2013); Tomamichel (2016).
Proposition 2**.**
Let be a normalized state and let . For any ,
[TABLE]
Proof.
The first inequality follows because of (6). For the second inequality, let such that . Then from (Dupuis et al., 2013, Proposition 4.2), we have for any ; choosing and using Proposition 1, we find . The claim follows by noting that along with . ∎
II.2 Asymptotic spectral divergence rates
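In statistical mechanics one is often interested in the thermodynamic limit, where the behavior of the system as it becomes arbitrarily large often no longer depends on microscopic details. The action of taking the thermodynamic limit is formalized by considering a sequence of states $\hat P := \{\hat\rho_n\}_{n=1}^{\infty}$, where $\hat\rho_n$ is a quantum state on $\mathscr{H}^{\otimes n}$.
The von Neumann entropy rate is defined as
$$S_1(\hat P) := \lim_{n\to\infty} \frac{1}{n} S_1(\hat\rho_n) , \tag{25}$$
and the KL divergence rate with respect to the sequence of positive operators $\hat\Sigma := \{\hat\sigma_n\}_{n=1}^{\infty}$ is defined as
$$S_1(\hat P\|\hat\Sigma) := \lim_{n\to\infty} \frac{1}{n} S_1(\hat\rho_n\|\hat\sigma_n) . \tag{26}$$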
We note that these limits do not necessarily exist in general.
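A simple case in which both limits do exist is the i.i.d. setting. The short derivation below is an illustration only; the symbols $\hat\rho$, $\hat\sigma$, $\hat\rho_n$, $\hat\sigma_n$ and $S_1$ are notation assumed here for a single-site state, a single-site positive operator, their $n$-fold tensor powers, and the KL divergence.

```latex
\begin{align*}
\hat\rho_n = \hat\rho^{\otimes n},\quad
\hat\sigma_n = \hat\sigma^{\otimes n}
\;\Longrightarrow\;
\frac{1}{n}\, S_1(\hat\rho_n \,\|\, \hat\sigma_n)
  = \frac{1}{n}\sum_{k=1}^{n} S_1(\hat\rho \,\|\, \hat\sigma)
  = S_1(\hat\rho \,\|\, \hat\sigma)
\quad \text{for every } n .
\end{align*}
```

The first equality uses additivity of the KL divergence under tensor products, so the rate equals the single-copy divergence and the limit is trivial. For correlated sequences no such term-by-term decomposition is available, which is why existence of the limit has to be established separately.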
We now introduce the spectral divergence rates, which are natural extensions of the min and max divergences to the thermodynamic limit.
Definition 2** (Spectral divergence rates).**
Let be a sequence of states and let be a sequence of positive operators. We define the upper spectral divergence rate,
[TABLE]
and the lower spectral divergence rate,
[TABLE]
These quantities have been introduced in Ref. Nagaoka and Hayashi (2007) in an equivalent but different expression:
[TABLE]
where $\operatorname{Proj}\{\hat{X}\geqslant 0\}$ represents the projector onto the eigenspaces of $\hat{X}$ corresponding to nonnegative eigenvalues. The equivalence of these two definitions has been proved in Theorems 2 and 3 of Ref. Datta (2009). We note that
[TABLE]
As a special case, we introduce the lower and the upper spectral entropy rates, which are respectively given by
[TABLE]
where is the sequence consisting of identity operators on .
We can also define the hypothesis testing divergence rate
[TABLE]
noting that the limit does not necessarily exist. From Proposition 1, in general, and respectively give the same lower and upper spectral divergence rates as those given by and :
[TABLE]
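In the classical i.i.d. case the hypothesis testing divergence per copy can be computed exactly, which gives a concrete picture of a rate converging to the KL divergence as in Stein's lemma. The sketch below is a hedged illustration: it assumes the convention $-\ln(\eta^{-1}\min \mathrm{tr}[\hat Q\hat\sigma])$ subject to $\mathrm{tr}[\hat Q\hat\rho] \geqslant \eta$, restricts to Bernoulli distributions, and uses illustrative names and parameters throughout.

```python
import math
import numpy as np

def log_binom_pmf(n, k, p):
    """Natural log of the Binomial(n, p) probability of k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def hypothesis_testing_rate(n, p, q, eta):
    """(1/n) * S_H^eta for n i.i.d. Bernoulli(p) copies against Bernoulli(q) copies.

    Sequences with the same number k of ones share one likelihood ratio,
    so the optimal test can be built greedily over the n + 1 groups.
    """
    log_p = np.array([log_binom_pmf(n, k, p) for k in range(n + 1)])
    log_q = np.array([log_binom_pmf(n, k, q) for k in range(n + 1)])
    order = np.argsort(log_q - log_p)  # smallest q/p ratio first
    need, log_cost = eta, -np.inf
    for k in order:
        mass = math.exp(log_p[k])
        t = 1.0 if mass <= need else need / mass  # fractional acceptance at the end
        if t > 0.0:
            log_cost = np.logaddexp(log_cost, math.log(t) + log_q[k])
        need -= t * mass
        if need <= 1e-12:
            break
    return -(log_cost - math.log(eta)) / n

p, q, eta = 0.3, 0.6, 0.5
kl = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
rates = {n: hypothesis_testing_rate(n, p, q, eta) for n in (50, 2000)}
print(kl, rates)
```

The accumulation is done in the log domain because the type-II error decays exponentially in $n$. The ergodic quantum setting treated in the paper is far more general than this toy model, which only illustrates what convergence of the rate means.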
III Asymptotic state convertibility by thermal operations
In this section, we formulate thermal operations and prove our first main theorem on asymptotic state convertibility (Theorem 2). Importantly, in the microscopic regime, state transformations are not reversible in general, not even approximately. For general states $\hat\rho, \hat\rho'$, it might happen that $\hat\rho$ can be approximately converted to $\hat\rho'$ with work extraction $w$, but that an approximate transformation from $\hat\rho'$ to $\hat\rho$ requires much more work than $w$ Horodecki and Oppenheim (2013).
Then we can ask the question, under which conditions is reversibility restored? This is an important question, because reversibility implies that the optimal work cost derives from a potential, which in turn means that macroscopic thermodynamic behavior is restored. Here, we consider in fact a marginally stronger property. Under which conditions is a state reversibly convertible to the thermal state? Clearly, any two states that have this property can reversibly be converted into one another. This slightly stronger statement ensures that the thermodynamic potential is well defined for the thermal state itself, a desirable feature that allows the thermal state to take on the role of a “reference state.”
III.1 Thermodynamic operations
We now introduce our thermodynamic framework. The simple model we introduce captures the relevant features of thermodynamics at the microscopic scale, while providing a simple, abstract, and general formalism for analyzing the resource cost of transforming one quantum state into another Goold et al. (2016).
The goal is the following. Given a system , and two states , we would like to quantify the resources required in order to convert to in some reasonable thermodynamic model. The resource theory of thermal operations is an established model that is particularly useful in such a context. It specifies the set of transformations that can be carried out for free, without the involvement of external resources such as thermodynamic work. In the model of thermal operations, one is allowed to carry out for free any unitary on the system and a heat bath at fixed background temperature, as long as the unitary commutes with the overall noninteracting Hamiltonian of the system and the bath. Here we introduce a slightly generalized notion of thermal operations, where different input and output systems are allowed.
Definition 3** ((Generalized) thermal Operation).**
Consider systems with corresponding Hamiltonians . Then a CP and trace-nonincreasing map is a thermal operation at inverse temperature if it can be written as
[TABLE]
for some ancilla system of finite dimension with some corresponding Hamiltonian , and for some partial isometry such that .
If there exists a thermal operation that maps to , we write . We may omit the Hamiltonians if they are clear from context.
Furthermore, a process that is achieved in the limit of processes of the form (34) with arbitrarily large but finite bath systems, is also called a thermal operation.
The last condition is required to enable processes that decrease the rank of the input state, for instance, a process consisting of Landauer erasure of a single bit compensated by a suitable energy shift Horodecki and Oppenheim (2013).
An operator $\hat V$ is a partial isometry if it is an isometry on its support, or equivalently if $\hat V^\dagger \hat V$ and $\hat V \hat V^\dagger$ are projectors. We allow $\hat V$ in the definition above to be a partial isometry instead of a unitary as considered in Refs. Brandão et al. (2013); Horodecki and Oppenheim (2013); Brandão et al. (2015) because this is more convenient when considering input and output systems of different dimension. Physically, this corresponds to specifying only a part of the process happening on an input subspace. Importantly, any partial isometry that conserves energy can be dilated to a full unitary that conserves energy on a larger system Faist et al. (2021), as illustrated in Fig. 1. We prove a corresponding general statement as Proposition 13 in Appendix B.
There are no known general conditions under which state transformations are possible with thermal operations in the quantum regime. For semiclassical states, i.e. states that are block-diagonal in energy, such conditions are provided in the form of thermomajorization, a generalization of matrix majorization Horodecki and Oppenheim (2013).
Now we introduce an alternative model known as Gibbs-preserving maps. This model has a simple technical formulation which makes it more convenient to prove some properties. Because any thermal operation is in particular a Gibbs-preserving map, all properties obeyed by Gibbs-preserving maps are inherited by thermal operations. As for thermal operations, it is technically more convenient to consider trace-nonincreasing maps; furthermore we allow these maps to be Gibbs-sub-preserving in the sense of the following definition.
Definition 4** (Gibbs-sub-preserving map).**
Consider systems with corresponding Hamiltonians . Then a CP and trace-nonincreasing map is said to be a Gibbs-sub-preserving map for some fixed inverse temperature if
[TABLE]
When there exists a Gibbs-sub-preserving map that maps to , we write . We may omit the Hamiltonians if they are clear from context.
We note that any Gibbs-sub-preserving map can be dilated into a fully trace-preserving map on a larger system which furthermore has the thermal state as a fixed point (Faist and Renner, 2018, Proposition 2).
Lemma 1**.**
Any thermal operation is also a Gibbs-sub-preserving map.
Proof.
A thermal operation can be written in the form (34). We abbreviate as . Then with , we have
[TABLE]
where we have invoked Proposition 12 to see that commutes with (for the second equality) and that commutes with (for the final inequality). ∎
While any thermal operation is a Gibbs-sub-preserving map as shown in Lemma 1, the converse is not true Faist et al. (2015b). A notable difference between thermal operations and Gibbs-preserving maps is the way the two models handle coherent superpositions of energy states. Thermal operations cannot create any coherent superpositions of energy levels because they commute with time evolution. However, there exist Gibbs-preserving maps that can generate coherent superpositions of energy levels Faist et al. (2015b).
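A minimal numerical example of a thermal operation, in the special case of a unitary with identical input and output systems, is a partial swap between a qubit and a single bath qubit with the same energy gap. The sketch below is an illustration under these assumptions (the gap, inverse temperature, swap angle, and all names are arbitrary choices); it checks energy conservation, preservation of the Gibbs state, and that no coherence is created.

```python
import numpy as np

beta, gap = 1.0, 1.0
H = np.diag([0.0, gap])  # same qubit Hamiltonian for system and bath
gibbs = np.diag(np.exp(-beta * np.diag(H)))
gibbs /= np.trace(gibbs)

I2 = np.eye(2)
H_tot = np.kron(H, I2) + np.kron(I2, H)  # non-interacting total Hamiltonian
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
theta = 0.7
# exp(-i * theta * SWAP), using SWAP @ SWAP = identity
U = np.cos(theta) * np.eye(4) - 1j * np.sin(theta) * SWAP

def thermal_op(rho):
    """tr_B[ U (rho x gibbs) U^dagger ] with the bath qubit traced out."""
    joint = U @ np.kron(rho, gibbs) @ U.conj().T
    return np.einsum("abcb->ac", joint.reshape(2, 2, 2, 2))

commutator_norm = np.linalg.norm(U @ H_tot - H_tot @ U)
plus = np.full((2, 2), 0.5)          # coherent input |+><+|
incoherent = np.diag([0.9, 0.1])     # block-diagonal input

print(commutator_norm)                       # energy conservation
print(np.allclose(thermal_op(gibbs), gibbs)) # Gibbs state is a fixed point
print(abs(thermal_op(plus)[0, 1]), abs(plus[0, 1]))
print(abs(thermal_op(incoherent)[0, 1]))
```

The off-diagonal element of the coherent input shrinks, and the incoherent input stays diagonal, in line with the statement that thermal operations cannot generate superpositions of energy levels.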
The divergences defined above play an important role in our thermodynamic framework as they are monotones under thermodynamic transformations. In the following, we exploit the scaling property (13) of the divergences to write the expression more compactly by absorbing the system free energy into the divergence term.
Proposition 3** (Monotonicity of divergences Horodecki and Oppenheim (2013); Brandão et al. (2015); Faist and Renner (2018); Tomamichel (2016)).**
Consider systems with corresponding Hamiltonians . If are (normalized) quantum states that satisfy , where stands for either TO or GPM, then
[TABLE]
where may be any of [math], , , or and where .
The proof of Proposition 3 is essentially an application of the data processing inequality (10). The full proof requires a dilation of the trace-nonincreasing map into a trace-preserving one, and it is presented in Appendix B.
Now that we have specified the free operations, we need to specify how we can provide resources for thermodynamic operations that are not free, or how we can extract such resources from states.
Thermodynamic work can be provided with the help of an external work storage system, often called a “battery.” This can be any system which starts in a definite energy level and finishes in a different energy level; the difference in energy is then the amount of work furnished or extracted. In fact, a large collection of different battery models are equivalent Brandão et al. (2015); Faist and Renner (2018).
Thermal operations necessarily commute with the free time evolution, as can be seen from (34). This means that it is impossible to create any state that has a coherent superposition of energy levels, even with an arbitrary amount of work, without access to another resource that provides coherence Lostaglio et al. (2015a). Coherence is thus a valuable resource that should be accounted for Åberg (2014); Lostaglio et al. (2015a); Korzekwa et al. (2016); Winter and Yang (2016); Marvian (2020). Here, we adopt a rudimentary, ad hoc model. We suppose that we have access to an additional system initialized in a pure state of our choosing. Crucially, we assume that the range of energy values that can be stored in the system is bounded by some parameter , i.e., where is the Hamiltonian of . The system must be restored to a state that is close to a pure state. The bound on the norm of the Hamiltonian forbids any embezzlement of work of more than of the order of Brandão et al. (2015). The requirement that the final state on is close to a pure state is necessary because there is no constraint on the dimensionality of ; with a suitable highly degenerate system, starting from a pure state and finishing in the maximally mixed state would allow one to extract an arbitrary amount of work that is not controlled by .
This crude model for accounting for coherence suffices for our purposes, as the protocols we construct only require an ancilla system with a parameter that is negligibly small compared to the overall work cost of the transformation. Note that this scheme differs from catalysis Brandão et al. (2015); Ng et al. (2015); Lostaglio et al. (2015b) as we do not require the final state to be related in any way to the initial state.
Definition 5 (Work/coherence-assisted process).
Consider systems with corresponding Hamiltonians and let stand for TO or GPM. We say that a CP and trace-nonincreasing map is a -work/coherence-assisted operation, if there exist systems with respective Hamiltonians satisfying , , and if there exist two energy eigenstates of respectively whose energies and satisfy , and if there exist two pure states , and if there exists a operation , such that
[TABLE]
Here, we allow infinite-dimensional Hilbert spaces for and for technical reasons related to how to construct states.
A -work/coherence-assisted thermal operation is thus simply a free process that is assisted by ancillas that provide an amount of work and an “amount of coherence” that is at most . If is negative, then this measures the amount of work that is extracted by the process.
Definition 6 (Approximate thermodynamic process using work and coherence).
Consider systems with Hamiltonians and let stand for TO or GPM. We say that the state is -transformable into by a process, which we denote by , if there exists a -work/coherence-assisted process such that . We may omit the Hamiltonians if they are clear from context.
The hypothesis testing divergence is a relatively good (quasi) monotone under assisted thermodynamic operations: It can only decrease, except for correction terms that depend on . Because the proof is not particularly insightful, we defer it to Appendix 0.B.
Proposition 4 (Quasi-monotonicity of the hypothesis testing divergence under resource-assisted transformations).
Consider systems with respective Hamiltonians . For a quantum state and a subnormalized state , suppose , where stands for TO or GPM. Then for any ,
[TABLE]
Finally, we define asymptotic transformations. These are transformations in the thermodynamic limit for which we are interested in the work cost rate, and which use only a sublinear amount of coherence.
Definition 7 (Asymptotic thermodynamic process).
Consider two sequences of states and and two sequences of Hamiltonians , . Let stand for TO or GPM. We say that can be asymptotically transformed into by an asymptotic process at a work rate , which we denote by , if there exist sequences such that for all and such that
[TABLE]
The spectral rates are monotones under asymptotic transformations:
Proposition 5 (Monotonicity of spectral rates Bowen and Datta (2006a)).
Consider two sequences of states and and two sequences of Hamiltonians , . Define the sequences of Gibbs weight operators and . Let be such that where may stand for either TO or GPM. Then
[TABLE]
Proof.
This follows by applying Proposition 4 and taking the asymptotic limit using the expressions (33) of the asymptotic divergences. ∎
The monotonicity of the spectral rates implies that if a transformation is reversible at a given work cost rate, then that rate is necessarily optimal:
Proposition 6.
Consider two sequences of states and and two sequences of Gibbs weight operators and . Then if is such that and , then for all , .
This is an expression of the second law of thermodynamics, or Kelvin’s principle, which states that one cannot extract a positive amount of work from a single heat bath by a cyclic protocol.
III.2 State convertibility by thermal operations
We now describe our main theorem for state convertibility by thermal operations. We first derive a sufficient condition for state conversion which is applicable to non-asymptotic cases. We then take the asymptotic limit and obtain a necessary and sufficient condition for asymptotic state conversion. The proofs of these theorems will be provided in the next subsection because of their technical nature.
First, we provide a new sufficient criterion for when a general non-semiclassical state can be approximately reversibly converted to the thermal state using thermal operations. Because thermal operations cannot create superpositions of energy eigenstates, arbitrary state transformations generally require a source of coherence. Here, we show that for any state whose min- and max-divergences are close, only a small source of coherence is needed to carry out a transformation to the Gibbs state.
Theorem 1.
Let be any quantum state on a system with Hamiltonian , and denote by the spectral range of , i.e., the difference between the maximum and minimum eigenvalue of . Let be the trivial thermal state on a trivial system with Hilbert space with Hamiltonian . Let . Suppose that there exists and such that
[TABLE]
Let , , and . Then we have
[TABLE]
and
[TABLE]
Theorem 1 allows us to prove the emergence of a thermodynamic potential in the macroscopic regime. That is, there is a single quantity that characterizes exactly when a transformation by an asymptotic thermal operation is possible.
Theorem 2.
Consider sequences of states , and sequences of Hamiltonians , . Suppose that the spectral rates collapse for these states into a single monotone, i.e.,
[TABLE]
with the sequences and . Then
[TABLE]
Equivalently, if and only if .
Crucially, these theorems are applicable even if the state is fully quantum. On the other hand, if the state is semiclassical, i.e., if it is block-diagonal in the energy basis, then the condition for state convertibility in Theorem 1 reduces to the known conditions of Refs. Åberg (2013); Horodecki and Oppenheim (2013) in terms of state preparation and work distillation as characterized, e.g., by thermo-majorization. In such cases, no source of coherence is required.
Indeed, for semiclassical states, the min-divergence quantifies the amount of work that can be extracted from a state when transforming it to the thermal state and the max-divergence quantifies the amount of work that is required to prepare the state out of the thermal state. If these divergences collapse, the state is reversibly convertible to and from the thermal state. For quantum states that are not semiclassical, the proof cannot proceed in the same way: Preparing a general state starting from the thermal state requires an external source of coherence, and thus the work requirement of state preparation cannot be given by the max-divergence in the same way as for semiclassical states. For the proof of Theorem 1 we need the fact that the min and the max divergences collapse approximately in order to conclude that the state can be approximately reversibly transformed to and from the thermal state.
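For semiclassical states both divergences reduce to simple expressions in the energy populations. The following is a minimal Python sketch, assuming the common unsmoothed definitions $D_{\min}(\rho\Vert\gamma) = -\log\operatorname{tr}[\Pi_\rho\gamma]$ and $D_{\max}(\rho\Vert\gamma) = \log\min\{\lambda : \rho\leqslant\lambda\gamma\}$ relative to a normalized Gibbs state. These conventions, the helper names, and the example spectrum are assumptions made for illustration; they may differ from the definitions used elsewhere in the paper (for instance through Gibbs weight operators or smoothing). The sketch exhibits a state for which the two values coincide.

```python
import numpy as np

def gibbs_probs(energies, beta):
    """Populations of the normalized Gibbs state for the given energy levels."""
    w = np.exp(-beta * np.asarray(energies, dtype=float))
    return w / w.sum()

def d_min(p, g):
    """-log of the Gibbs weight carried by the support of p (diagonal states only)."""
    return -np.log(g[p > 0].sum())

def d_max(p, g):
    """log of the largest ratio p_i / g_i on the support of p (diagonal states only)."""
    support = p > 0
    return np.log(np.max(p[support] / g[support]))

# Hypothetical four-level system at inverse temperature beta = 1.
g = gibbs_probs([0.0, 1.0, 1.0, 2.0], beta=1.0)

# Gibbs state truncated to the two lowest levels: the divergences collapse.
p_trunc = np.where(np.arange(4) < 2, g, 0.0)
p_trunc = p_trunc / p_trunc.sum()

# A generic full-rank diagonal state: D_min = 0 while D_max > 0.
p_generic = np.array([0.7, 0.1, 0.1, 0.1])

print(d_min(p_trunc, g), d_max(p_trunc, g))
print(d_min(p_generic, g), d_max(p_generic, g))
```

A Gibbs state truncated to a subset of levels has a constant ratio of populations to Gibbs weights on its support, which is why the two values agree in the first case.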
Theorem 2 generalizes and unifies several known situations. For i.i.d. states and Gibbs-preserving maps, our theorem reproduces the results of Ref. Matsumoto (2010). In the case of a trivial Hamiltonian, we recover the results of Ref. Jiao et al. (2018). Our theorem also provides a concrete application of the general results provided in Refs. Weilenmann et al. (2016); Weilenmann (2017); Weilenmann et al. (2018), in the context of the axiomatic thermodynamic framework of Lieb and Yngvason Lieb and Yngvason (1999, 2013).
We note that reversibility only applies to the leading order of the work cost rate and coherence rate. Consider two sequences of states that satisfy , which are asymptotically reversibly interconvertible thanks to Theorem 2. It is still in general necessary to invest a sublinear amount of work and coherence in the transformation which cannot be recovered in general in the reverse transformation . In our definition of an asymptotic transformation (Definition 7) we deliberately allow sublinear work and coherence costs for this reason, noting that these quantities are negligible with respect to the overall work cost of the transformation.
III.3 Proof of Theorems 1 and 2
Here we provide the proof of Theorem 1 and its asymptotic counterpart, Theorem 2. We proceed in sequential steps through several lemmas: Theorem 1 is proved through Section III.3.1 to Section III.3.4, and Theorem 2 is proved in Section III.3.5.
In order to simplify the notation and ease readability, we omit the hat symbols on operators in this subsection.
III.3.1 Discretizing the Hamiltonian
The first simplification we make is to change the Hamiltonian from to a slightly different Hamiltonian where the eigenvalues are “coarse-grained” into blocks. That is, given , we subdivide the spectrum of into bins of width , where is the spectral range of , and we then clamp all eigenvalues in the bin to a single value which is a multiple of . This yields a Hamiltonian with and . Furthermore, only has distinct eigenvalues, which we denote by ; let also denote the projectors onto the corresponding eigenspaces. We may thus write
[TABLE]
with for some fixed .
Physically, the transformation can be done by turning on a perturbation of magnitude at most . Furthermore, the perturbation commutes with the original Hamiltonian.
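The coarse-graining step can be illustrated numerically. The sketch below is a hedged example rather than the construction used in the proof: the function name `coarse_grain`, the choice of rounding each eigenvalue down, the zero energy offset, and the bin width `delta` passed directly as a parameter are all assumptions. It checks that the coarse-grained Hamiltonian commutes with the original one, differs from it by at most the bin width in operator norm, and has a spectrum made of multiples of the bin width.

```python
import numpy as np

def coarse_grain(H, delta):
    """Clamp every eigenvalue of H down to the nearest multiple of delta,
    keeping the eigenvectors of H unchanged (assumed convention)."""
    evals, vecs = np.linalg.eigh(H)
    clamped = np.floor(evals / delta) * delta
    # Rebuild sum_j clamped_j |v_j><v_j|.
    return (vecs * clamped) @ vecs.conj().T

# Hypothetical random Hermitian matrix standing in for the Hamiltonian.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
H = (A + A.conj().T) / 2
delta = 0.25

H_cg = coarse_grain(H, delta)
deviation = np.linalg.norm(H - H_cg, ord=2)               # operator norm of the perturbation
commutator = np.linalg.norm(H @ H_cg - H_cg @ H, ord=2)   # vanishes up to rounding
spectrum_cg = np.linalg.eigvalsh(H_cg)

print(deviation, commutator)
```

Because the perturbation is built in the eigenbasis of the original Hamiltonian, commutation holds by construction.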
We note that and , where the operator inequalities hold because both sides commute with each other. This implies that, for any and for any , we have
[TABLE]
We also define the dephasing operation for any Hermitian operator as a pinching in the energy blocks:
[TABLE]
The following proposition asserts that this perturbation can be carried out with a -work/coherence-assisted thermal operation, for any value of which impacts the accuracy of the process as .
Proposition 7.
Consider a system with Hamiltonian and a copy with a Hamiltonian . Suppose that and let such that $\bigl\lVert \mathrm{id}_{S\to S'}(H_S) - H'_{S'} \bigr\rVert_\infty \leqslant \delta$. Then for any there exists a -work/coherence-assisted transformation such that for any state (with any reference system ), we have
[TABLE]
Proof.
Let be a simultaneous eigenbasis of and of , and write . Then and for corresponding eigenvalues and including multiplicities, i.e., the (resp. ) need not be all different. The condition implies that .
Let . Let , be a particle on the intervals , in , respectively, which are described by the Hilbert spaces , . There are natural embeddings .
Let be the indicator function for a closed interval . We define the Hamiltonians of and by and , which are regarded as self-adjoint operators acting on and , respectively. Obviously, , .
We also define the initial state of by . We can also regard as an element of , for which we use the same notation.
For with , we define the translation operator by . This is an isometry, where its adjoint is defined on by for , because
[TABLE]
Now we define the isometry
[TABLE]
We can show that by acting with on for any . Then, we define the CP and trace-nonincreasing map
[TABLE]
By construction, is a -work/coherence-assisted thermal operation.
Let be any state with any reference system. Without loss of generality, assume that is in fact a pure state (or consider a larger reference system ; the statement will still hold because trace distance can only decrease under partial trace). We remark that the fidelity and the trace distance can be defined for infinite-dimensional and Hilbert spaces, and satisfy the same fundamental properties as in finite dimensions Belavkin et al. (2005); Furrer et al. (2011). Then, with ,
[TABLE]
where the term on is real because is real. We can calculate for
[TABLE]
Hence, since ,
[TABLE]
Recalling that , and that the fidelity can only increase under partial trace, we have
[TABLE]
∎
III.3.2 Manipulating coherence in the state
For any state on any system with any Hamiltonian , we can decompose into modes of coherence Lostaglio et al. (2015a) as
[TABLE]
where are general operators satisfying
[TABLE]
for all . The are simply the off-diagonal elements of that connect two energy levels that differ by . For the Hamiltonian constructed in (47), with only energies that are multiples of , we have that the in (58) range over all possible differences of energies in , i.e., over all multiples of .
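The mode decomposition can be made concrete for a Hamiltonian whose energies are integer multiples of a spacing. The following sketch assumes the state is given as a matrix in the energy eigenbasis and that mode $k$ collects the entries $\rho_{ij}$ with $E_i - E_j = k\delta$; the sign convention, the function name `coherence_modes`, and the example level structure are assumptions and may be opposite to the convention of (58). It verifies that the modes sum back to the state and that each mode is an eigenoperator of the commutator with the Hamiltonian.

```python
import numpy as np

def coherence_modes(rho, levels):
    """Split rho into modes labelled by integers k, where mode k keeps the
    entries rho[i, j] with levels[i] - levels[j] == k and zeros elsewhere."""
    levels = np.asarray(levels)
    diff = levels[:, None] - levels[None, :]
    return {int(k): np.where(diff == k, rho, 0) for k in np.unique(diff)}

# Hypothetical level structure: energies are levels * delta, with degeneracies.
levels = np.array([0, 0, 1, 2, 3, 3])
delta = 0.5
H = delta * np.diag(levels).astype(complex)

# Random density matrix in the energy eigenbasis.
rng = np.random.default_rng(2)
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
rho = A @ A.conj().T
rho = rho / np.trace(rho).real

modes = coherence_modes(rho, levels)
reconstructed = sum(modes.values())

for k, M in modes.items():
    # [H, M] = k * delta * M for every mode.
    print(k, np.allclose(H @ M - M @ H, k * delta * M))
```

The mode with label zero is the block-diagonal (dephased) part of the state.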
The following lemma states that if the large coherence modes in the state are suppressed, then it is possible to carry out the dephasing operation by mixing only a few differently time-evolved versions of .
Lemma 2.
Let be any state on any system with a Hamiltonian whose energies are multiples of as in (47). Let denote the coherence modes in the decomposition of as above. Let . Suppose that there exists such that for all with we have
[TABLE]
Define
[TABLE]
Then, if denotes the number of distinct eigenvalues of , we have that
[TABLE]
Proof.
For any , we write
[TABLE]
such that
[TABLE]
Recall that in the modes decomposition of is a multiple of and ranges over all off-diagonals of ; i.e., for . Furthermore, we may split the sum over the modes as a sum over modes in and a separate sum over the higher order modes. We can thus calculate:
[TABLE]
where we recall that and where we have defined as the second sum in the second-to-last line. We can bound the norm of as follows:
[TABLE]
where is a crude upper bound for the total number of terms in the first sum, and where each term is individually bounded thanks to the assumption (60). We may conclude that and are close in trace distance:
[TABLE]
∎
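The idea behind Lemma 2 can be checked numerically: averaging a state over a few equally spaced evolution times removes every coherence mode whose label is not a multiple of the number of times used. The sketch below is an illustration under stated assumptions, not the exact averaged state defined in the lemma: the times $t_j = 2\pi j/(M\delta)$, the helper names, and the example levels are hypothetical choices.

```python
import numpy as np

def time_average(rho, levels, delta, M):
    """Uniform mixture of rho evolved for times t_j = 2*pi*j/(M*delta), j = 0..M-1,
    under the diagonal Hamiltonian with energies levels * delta."""
    levels = np.asarray(levels)
    out = np.zeros_like(rho, dtype=complex)
    for j in range(M):
        t = 2 * np.pi * j / (M * delta)
        phases = np.exp(-1j * levels * delta * t)
        out += (phases[:, None] * rho) * phases.conj()[None, :]   # U rho U^dagger
    return out / M

def pinch(rho, levels):
    """Dephasing in the energy blocks: keep only entries between equal energies."""
    levels = np.asarray(levels)
    return np.where(levels[:, None] == levels[None, :], rho, 0)

levels = np.array([0, 0, 1, 2, 3, 3])     # four distinct energies
delta = 0.5
gap = levels[:, None] - levels[None, :]   # mode label of each matrix entry

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
rho = A @ A.conj().T
rho = rho / np.trace(rho).real

exact = time_average(rho, levels, delta, M=4)    # M >= number of distinct energies
coarse = time_average(rho, levels, delta, M=2)   # only odd modes are removed
residual = coarse - pinch(rho, levels)

print(np.allclose(exact, pinch(rho, levels)))
```

With two times only, the residual consists of the modes with label plus or minus two, so the average is close to the dephased state exactly when those higher modes are small, which is the situation the lemma assumes.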
Importantly, the min- and max-divergences are only known to quantify the extractable work and the work cost of formation for semiclassical states, i.e., those that commute with the Hamiltonian. For states that are not semiclassical, we need a more general statement. Here, we prove a lemma showing that the min- and max-divergences also accurately quantify the extractable work and the work cost of formation for general quantum states, as long as their large coherence modes are suppressed.
Lemma 3.
Let be any quantum state on a system with a Hamiltonian whose energies are multiples of as in (47), and let . Let be the thermal state of a trivial system with Hamiltonian as in Theorem 1. Suppose that there exists such that for any with we have
[TABLE]
Then, for any , we have
[TABLE]
Conversely, for any integer , we have
[TABLE]
Proof.
First, note that (67) asserts that the coherence modes of are small for large . More precisely: Let , such that . Then for all such that , we have
[TABLE]
because the coherence modes are simply the combination of all the blocks in the -th off-diagonal of , whose individual norm is bounded by our assumption (67). We may invoke Lemma 2 to deduce that
[TABLE]
where is defined in Lemma 2 with and .
Work extraction from . Now we construct a strategy to transform into the trivial thermal state . First, we decohere the state in the energy blocks, effecting the transformation at no work or coherence cost (this can be done by averaging over time, which is a thermal operation). Then we apply the incoherent work extraction protocol (Proposition 15 in Appendix 0.B) to transform with an error parameter , while extracting an amount of work equal to , and at no coherence cost. Hence, we have . Using Proposition 2, observe that
[TABLE]
since is a candidate in the optimization that defines the smooth min-divergence. Then we invoke the property of the fidelity that (cf. (Audenaert and Mosonyi, 2014, Lemma 4.9)), to see that
[TABLE]
With the crude bound we finally see that
[TABLE]
which shows (68).
Formation of the state . We now devise a procedure to construct the state starting from the trivial thermal state . In the following, we refer to the system as , and write and as and .
The full protocol consists of three steps. The strategy will be to prepare a completely incoherent state on the system along with an ancilla system in such a way that the system serves as a reference frame that can be used to induce coherence in . Then, in the second and third steps, we “externalize” the reference frame by using to “induce” the necessary coherence modes in Bartlett et al. (2007).
Let be an integer. Let be an ancilla system of dimension and with a Hamiltonian consisting of evenly -spaced levels, i.e., . Define the state by
[TABLE]
By we will denote the joint dephasing operation on and , i.e., the dephasing in the common global energy eigenspaces of .
In the first step of the protocol, starting from the trivial thermal state on , we prepare the state at a cost given by the max-divergence
[TABLE]
We can bound this as follows. The max-divergence can only decrease under the dephasing operation; we have because with being the identity operator of ; finally, the max-divergence is additive for tensor product states. This gives us
[TABLE]
noting that because is a pure state. Therefore:
[TABLE]
The next steps are to “consume” in order to induce on the system (we need to externalize the reference frame). This is done as follows.
In preparation for the further steps, we first note that if we post-select the reference frame in being in the state , then we induce the correct state on , approximately. This is shown as follows:
[TABLE]
where we used the fact that unless , and that $\operatorname{tr}\bigl(\eta_{C}^{(-k\delta)}\eta_{C}^{(k\delta)}\bigr) = (d_{C}-\lvert k\rvert)/d_{C}^{2}$ since is the matrix of all zeros except for the -th off-diagonal in which all entries are equal to . Then
[TABLE]
where in the last line we used (70). Let be the matrix in which the -th off-diagonal is filled with the entries equal to , up to the -th off-diagonal, and the remaining matrix elements are zero. Then we note that
[TABLE]
where denotes the Hadamard (entry-wise) product. We note that , and that (Suppl. Lemmas 3 and 4 of Åberg (2014), originally from Horn and Johnson (1985)). Hence, and we finally have
[TABLE]
We also note that passes through orthogonal states for each time step . Actually, for , the set $\bigl\{\lvert n\rangle_{C}\bigr\}_{n}$ forms an orthonormal basis of , where . Indeed,
[TABLE]
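The orthogonality of the time-evolved reference-frame states can be checked with a short numerical sketch. It assumes that the ancilla state is the uniform superposition of the evenly spaced energy levels (consistent with coherence modes whose off-diagonals are filled with entries $1/d_C$) and that the relevant times are $t_n = 2\pi n/(d_C\delta)$; both are assumptions made for this illustration, as are the variable names and the values of `d` and `delta`. The sketch also verifies the trace identity $\operatorname{tr}\bigl(\eta_C^{(-k\delta)}\eta_C^{(k\delta)}\bigr) = (d_C - \lvert k\rvert)/d_C^2$ used above.

```python
import numpy as np

d, delta = 5, 0.3                        # hypothetical ancilla dimension and level spacing
energies = delta * np.arange(d)          # evenly delta-spaced levels
eta_vec = np.ones(d) / np.sqrt(d)        # uniform superposition (assumed ancilla state)
eta = np.outer(eta_vec, eta_vec)         # density matrix, all entries equal to 1/d

# States reached after evolving for t_n = 2*pi*n/(d*delta), n = 0..d-1 (rows of `basis`).
times = 2 * np.pi * np.arange(d) / (d * delta)
basis = np.array([np.exp(-1j * energies * t) * eta_vec for t in times])
gram = basis.conj() @ basis.T            # gram[n, m] = <n|m>

def mode(k):
    """Coherence mode of eta with label k: the entries eta[i, j] with i - j = k."""
    return np.diag(np.diag(eta, -k), -k)

trace_identity = {k: np.trace(mode(-k) @ mode(k)) for k in range(-(d - 1), d)}

print(np.allclose(gram, np.eye(d)))
```

The Gram matrix equals the identity because the overlaps are sums over roots of unity, i.e., the evolved states form a discrete Fourier basis.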
Step 2 of our protocol consists in flattening the Hamiltonian of so that we can perform nontrivial unitaries without worrying about coherences. From the state with Hamiltonian , we “flatten” the Hamiltonian of the ancilla system using (Faist et al., 2021, Lemma 8.1) and consuming an additional ancilla of dimension , with the Hamiltonian being bounded as and with the original state surviving up to precision . That is, we achieve the following Hamiltonian transformation
[TABLE]
Finally, in Step 3 we carry out the following energy-conserving unitary controlled on the system :
[TABLE]
and we then use Landauer erasure to reset to a pure state and to trace it out. Note that because the dephased state is invariant under time evolution. Then, the application of the unitary to , and tracing out , yields
[TABLE]
Recalling (81), we know that this state is close to the required . Noting that we need work to reset to a pure state, we find:
[TABLE]
Note that the final uniform Hamiltonian on the system can be restored to the original Hamiltonian at no work or coherence cost, by keeping the state of at a pure state of constant energy and changing the other levels to match those of the original Hamiltonian .
Combining together these three steps, we see that
[TABLE]
Recalling while assuming , we obtain (69). ∎
III.3.3 Collapse of the min and max divergences suppresses coherence
Here we show that the difference between (alternative) min-divergence and the max-divergence is a quantity that provides a characterization of how much coherence there is in the state. Namely, if the divergences do not differ by more than , then the one-norm of off-diagonal energy blocks is exponentially suppressed in as long as .
Lemma 4.
Let be a quantum state. Suppose there are and such that
[TABLE]
Then for any , we have
[TABLE]
Proof.
Using Hölder’s inequality, we have
[TABLE]
By definition of the Rényi-1/2 divergence, we have for any ,
[TABLE]
and hence
[TABLE]
On the other hand, we have
[TABLE]
recalling that the square of the largest singular value of a matrix is the maximum eigenvalue of . Putting these together, and noting that the same argument holds if we exchange and , we obtain
[TABLE]
as claimed. ∎
III.3.4 Proof of Theorem 1
Finally, we can prove Theorem 1. If the smooth min and max Rényi divergences coincide approximately, we use the above lemmas to conclude that there exist protocols for work distillation and state formation with approximately matching work costs. The difficult part of the proof is to show that there is a single state that is a good enough smoothing candidate simultaneously in both (16a) and (16b).
Proof.
First, we need to connect the assumption on the smoothed entropy measures to a specific state which has a small gap between its non-smoothed min and max-divergences. Our specific goal below is to construct a state that satisfies the conditions of Lemma 4 and is sufficiently close to .
Because and , we have
[TABLE]
Both protocols, work extraction and state formation, start by shifting the Hamiltonian , and at the end shifting the Hamiltonian back . Thanks to Proposition 7, this can be done at a cost in the total coherence parameter of and at a precision cost in each way.
Let be the optimal subnormalized quantum state for , satisfying and .
Let and write
[TABLE]
Let , denote optimal choices in the last optimization. Let
[TABLE]
where denotes the dephasing operation in the eigenspaces of . Then, using the pinching inequality, and because commutes with time evolution,
[TABLE]
and thus . Shifting back the normalization of the second argument gives
[TABLE]
because the optimal state in the last max-divergence is a candidate in the optimization for . Also, taking the trace of the constraint we obtain , and then using (Dupuis et al., 2013, Lemma A.4), we have (using ), where is the purified distance for . Hence, and thus .
On the other hand, we have
[TABLE]
using Hölder’s inequality. Conveniently, by construction, and thus also and , and $\bigl\lVert \gamma^{\prime -1/2} G^{\dagger} \gamma^{\prime 1/2} \bigr\rVert_{\infty} = \lVert G \rVert_{\infty} \leqslant 1$, since is a contraction (because ). Hence
[TABLE]
and thus
[TABLE]
Finally, we define
[TABLE]
We have , and thus
[TABLE]
and by a chain of triangle inequalities
[TABLE]
We can define , while noting that as . Then, the state satisfies
[TABLE]
We then have and .
Then, the conditions of Lemma 4 are fulfilled, and for any , we have that
[TABLE]
Now, for any we set . For any with , Equation 112 tells us that $\bigl\lVert P_{k}\tilde{\rho}P_{k'} \bigr\rVert_{1} \leqslant e^{-(r-1)\Delta'} =: \xi'$. We set in the following for convenience.
The conclusions of Lemma 3 apply to the interconversion of to and from the thermal state.
Distilling work from . Work can be distilled, i.e., the transition is possible, with the parameters (we have set in Lemma 3)
[TABLE]
Preparing the state . The state can be prepared, i.e., the transition is possible, with the parameters
[TABLE]
Finally, letting , we obtain the slightly simplified parameters in Theorem 1.
∎
III.3.5 Proof of Theorem 2
We now present the proof of Theorem 2, the main theorem of the first part of our main result. The proof proceeds by applying Theorem 1 in the thermodynamic limit.
Proof.
We use Theorem 1 to show asymptotic convertibility of (relative to ) to and from the Gibbs state on a trivial system at zero energy. We write the trivial sequence of trivial Gibbs states. For , let
[TABLE]
and let . We have
[TABLE]
For and for each , we apply Theorem 1 with the choices , , and . Then . Observe that and that increases at least as fast as by definition; thus . Let be the parameters of the work extraction process given by Theorem 1 for these choices. Then
[TABLE]
and we can apply Lemma 13 in Appendix 0.A to conclude that .
For the work extraction process, we define , , , and similarly. Then the parameters given by Theorem 1 satisfy
[TABLE]
where we used the fact that sublinear terms are suppressed, that for any , and that because grows at least as fast as and the exponential takes over the polynomial. Thus from Lemma 13 in Appendix 0.A we see that . Combining these two processes for different states immediately yields .
It is clear that if , then from the above, using Property (e) of Proposition 14 in Appendix 0.A. Also, if , then monotonicity of the spectral rates implies that . ∎
IV Collapse of the min and max divergences for ergodic states relative to local Gibbs states
In this section, we prove the second main theorem of our main result (Theorem 3): For any that is translation invariant and ergodic and for any local translation-invariant Gibbs state , we have . Combined with Theorem 2, this implies that all such states can be reversibly converted into one another with thermal operations and a negligible amount of coherence.
We prove this assertion in two steps. First, we formulate a generalized version of Stein’s lemma Nagaoka and Ogawa (2000); Bjelakovic and Siegmund-Schultze (2004); Nagaoka and Hayashi (2007); Bjelakovic and Siegmund-Schultze (2003); Brandão and Plenio (2010). We derive a sufficient condition for the min and max divergences to converge to the same value, which is heavily inspired by these references. The condition is the existence of an operator obeying a simple set of properties that plays the role of a typical projector. In a second step, we prove that for ergodic states and local translation-invariant Gibbs states, this condition is fulfilled.
IV.1 A sufficient condition for quantum Stein’s lemma
The quantum Stein’s lemma relates to a hypothesis test between two states and using a single measurement. If we employ the optimal strategy that correctly reports with probability at least , then the probability of erroneously reporting decreases exponentially as , with the rate being given by the KL divergence. This statement holds in several known cases, such as for i.i.d. states, or if is ergodic and is i.i.d. Bjelakovic and Siegmund-Schultze (2004).
Quantum Stein’s lemma can be formulated in terms of the hypothesis testing divergence. For sequences , , a quantum Stein’s lemma would state that for all ,
[TABLE]
Because the hypothesis testing divergence is monotonic in , and because it interpolates between the min and max divergences [cf. Eq. 33], we see that the hypothesis testing divergence converges to the KL divergence as per (125), if and only if the min and max divergences converge to the KL divergence,
[TABLE]
Therefore, to prove (126) for a class of states it suffices to prove (125).
The simplest situation where the quantum Stein’s lemma holds is the i.i.d. setting, i.e., and . In this situation, for any ,
[TABLE]
and consequently,
[TABLE]
as was proved in (Nagaoka and Hayashi, 2007, Theorem 2).
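The i.i.d. statement can be illustrated in the commuting (classical) special case, where the optimal test is the randomized Neyman-Pearson test and everything reduces to binomial sums. The following sketch is an illustration under assumptions rather than a reproduction of the cited result: the function names, the Bernoulli parameters, and the convention "minimize the error under the second distribution subject to accepting the first with probability at least `eta`" are hypothetical choices, and any additional normalization of the hypothesis testing divergence by `eta` is omitted since it does not affect the rate.

```python
import math

def log_binom_pmf(n, k, p):
    """Logarithm of the Binomial(n, p) probability of k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def kl_bernoulli(p, q):
    """KL divergence D(p || q) between two Bernoulli distributions, in nats."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def stein_rate(n, p, q, eta):
    """-(1/n) log of the smallest probability under q^n of accepting, over all
    tests that accept with probability at least eta under p^n. Assumes p > q,
    so the likelihood ratio increases with the number of successes k."""
    mass_p = 0.0
    log_terms = []
    for k in range(n, -1, -1):                 # largest likelihood ratio first
        pk = math.exp(log_binom_pmf(n, k, p))
        lq = log_binom_pmf(n, k, q)
        if mass_p + pk >= eta:
            frac = (eta - mass_p) / pk         # randomize on the boundary count
            log_terms.append(lq + math.log(frac))
            break
        mass_p += pk
        log_terms.append(lq)
    m = max(log_terms)                         # log-sum-exp to avoid underflow
    log_beta = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return -log_beta / n

p, q, eta = 0.6, 0.3, 0.5
target = kl_bernoulli(p, q)
rates = {n: stein_rate(n, p, q, eta) for n in (50, 500, 2000)}
print(target, rates)
```

For increasing numbers of copies the computed rate approaches the KL divergence, with corrections that vanish as the number of copies grows.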
We now derive a sufficient condition for the convergence (125), providing a generalization of the quantum Stein’s lemma beyond i.i.d. states.
Lemma 5.
Let and be any sequences of states. Suppose that there exists such that for any , there exists a sequence of operators that satisfy, for sufficiently large ,
[TABLE]
Then, for any , we have
[TABLE]
Our proof is based on tools from semidefinite programming Watrous (2009); Dupuis et al. (2013); Faist and Renner (2018), which imply that the hypothesis testing divergence is equivalently expressed using two different optimizations:
[TABLE]
The optimizations are called the primal problem and dual problem respectively. We note that our proof below only requires the so-called weak duality between the minimization and the maximization, which states that the optimal value of the minimization problem is an upper bound to the optimal value of the maximization problem.
The reason that we have equality in (131) is that for the hypothesis testing divergence, the stronger notion of strong duality holds, which states that both optimization problems have the same optimal value. We note that the reason we write a supremum for the dual problem is that for , even as strong duality holds, we are not guaranteed that the supremum is achieved by a specific choice of and . In the primal problem the minimum is always achieved. This can be seen using Slater’s conditions Watrous (2009), noting that we can restrict the optimization to the support of .
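Weak and strong duality can be checked numerically on a small example. The sketch below assumes the standard semidefinite-programming form of hypothesis testing, with primal problem $\min\{\operatorname{tr}[Q\sigma] : 0\leqslant Q\leqslant I,\ \operatorname{tr}[Q\rho]\geqslant\eta\}$ and dual problem $\sup\{\mu\eta - \operatorname{tr}[X] : \mu\rho\leqslant\sigma + X,\ X\geqslant 0,\ \mu\geqslant 0\}$. This form, the function names, and the random test states are assumptions; the normalization in (131) may differ. It uses the Neyman-Pearson projector as a primal candidate and the positive part of $\mu\rho-\sigma$ as a dual candidate.

```python
import numpy as np

def random_state(dim, rng):
    """Random full-rank density matrix (illustrative test data)."""
    A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def neyman_pearson(rho, sigma, mu):
    """Projector Q onto the positive eigenspace of mu*rho - sigma, together with
    X, the positive part of mu*rho - sigma (so that mu*rho <= sigma + X)."""
    evals, vecs = np.linalg.eigh(mu * rho - sigma)
    pos = evals > 0
    Q = vecs[:, pos] @ vecs[:, pos].conj().T
    X = (vecs[:, pos] * evals[pos]) @ vecs[:, pos].conj().T
    return Q, X

rng = np.random.default_rng(1)
rho, sigma = random_state(4, rng), random_state(4, rng)

# Primal candidate at the threshold eta := tr[Q rho] realized by the projector.
mu = 1.0
Q, X = neyman_pearson(rho, sigma, mu)
eta = np.trace(Q @ rho).real
primal_value = np.trace(Q @ sigma).real
dual_value = mu * eta - np.trace(X).real

# A different, generally suboptimal dual candidate for the same eta.
mu_other = 0.5
_, X_other = neyman_pearson(rho, sigma, mu_other)
dual_other = mu_other * eta - np.trace(X_other).real

print(primal_value, dual_value, dual_other)
```

The matching pair attains equal values, which certifies optimality of both candidates at that threshold, while the second dual candidate only gives a lower bound, as weak duality guarantees.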
Proof of Lemma 5.
Our proof proceeds by exhibiting explicit candidates in both optimizations in (131), yielding upper and lower bounds that both converge to as .
Let . From condition (129d) and Lemma 9 (a) in Appendix 0.A, we have
[TABLE]
which implies that for any , we have $\operatorname{tr}\bigl[\hat{Q}_{n}^{\varepsilon}\hat{\rho}_{n}\bigr] > \eta$ for sufficiently large , and is a valid optimization candidate in (131). Using (129b), the value attained by this candidate is
[TABLE]
and thus
[TABLE]
By taking and then , we conclude that .
Now we consider the second optimization in (131). First, we note that, using a generalization of the pinching inequality (Lemma B.1 of Ref. Faist et al. (2021)),
[TABLE]
Let and . From inequality (135) and condition (129c), we have , and hence are valid optimization candidates in the maximization in (131). From Lemma 9 (b) in Appendix 0.A, we have $\operatorname{tr}\bigl[(\hat{I}-\hat{W}_{n}^{\varepsilon\dagger})\hat{\rho}_{n}(\hat{I}-\hat{W}_{n}^{\varepsilon})\bigr] \to 0$ as , and therefore, for sufficiently large , we have $\operatorname{tr}\bigl[(\hat{I}-\hat{W}_{n}^{\varepsilon\dagger})\hat{\rho}_{n}(\hat{I}-\hat{W}_{n}^{\varepsilon})\bigr] \geqslant \eta/4$. Therefore, for sufficiently large ,
[TABLE]
The value attained by the maximization is then
[TABLE]
Dividing by , taking and then , we deduce that . ∎
In fact, one can see that the product of two typical projectors constructed in Ref. Bjelakovic and Siegmund-Schultze (2003) for the i.i.d. case satisfies the conditions (129a)–(129d) above, with .
IV.2 Formulation of ergodic states and local Gibbs states
In a second step of our main result, we consider ergodic states and local Gibbs states. Here we show that for these states, it is possible to construct an operator that satisfies the conditions in Lemma 5, in turn proving the collapse of the min and max divergences to the KL divergence.
The standard way to rigorously formulate ergodicity invokes infinite-dimensional -algebras Bratteli and Robinson (1987, 1981); Ruelle (1999). Here, for the sake of broad readability, we introduce the relevant concepts directly in an equivalent — albeit perhaps less elegant — formulation that does not require the use of algebras. For completeness, we provide the construction based on algebras in Appendix 0.C.
We consider a spatially -dimensional system on the lattice . To each site , we assign a copy of a finite-dimensional Hilbert space, such that the Hilbert spaces for all sites are isomorphic. We denote the set of operators acting on by . For a bounded region , we define and . We note that these are finite-dimensional spaces because is bounded.
For a bounded region , we consider a density operator whose support is , i.e., . We assume that we are given a collection for all bounded subregions of the lattice, which furthermore obey the consistency condition, namely,
[TABLE]
This condition is necessary to ensure that all are obtained from a common global state defined on the entire infinite lattice (see Appendix 0.C).
Consider now a sequence of bounded regions of the lattice defined as follows. For any , let and . We define the sequence of quantum states by , where we set . While with does not run over all of the elements of , it does not affect our following argument; indeed, it is straightforward to complete the sequence with intermediate states for all such that the limits that we derive are unaffected.
Before we can formulate ergodicity, we consider the shift superoperator. The shift superoperator is defined such that for any local operator whose support is , it is mapped by to the same operator at site , i.e., , where we regard as a -dimensional vector with the standard addition for such vectors.
Definition 8 (Translation invariance).
A sequence of the form above is translation invariant, if it satisfies the consistency condition (138), and for all , all with being bounded, and all satisfying , we have
[TABLE]
We note that “translation invariant” is often referred to as “stationary” in the context of ergodic theory. In our setup, we interpret as a coordinate of the spatial position instead of time, and therefore we prefer the denomination “translation invariant.”
Translation invariance is a central ingredient for the definition of ergodicity:
Definition 9 (Ergodicity).
A sequence is translation-invariant and ergodic, if it is translation invariant, and for all self-adjoint for a bounded region we have
[TABLE]
where on the left-hand side is taken such that for all .
The limit on the right-hand side of (140) is not actually necessary, because the consistency condition (138) implies that does not depend on for large satisfying . The equivalence of this definition and the standard definition is proved in Appendix 0.C.
This definition implies that the variance of the shift average (i.e., the spatial average) of any local observable vanishes in the thermodynamic limit. We emphasize that an ergodic state can be out of equilibrium, because ergodicity is defined with respect to the spatial shift instead of time evolution.
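The vanishing variance of the shift average can be illustrated numerically in the simplest setting. The following is a minimal sketch, not part of the original argument, under the following assumptions: the chain is one-dimensional, the states are diagonal in the computational basis, the local observable is the single-site Pauli Z, and all function names are hypothetical. It compares an i.i.d. product state (which is ergodic) with an equal mixture of two different product states (which is translation invariant but not ergodic).

```python
import numpy as np

def diag_product_state(q, n):
    """Diagonal of the n-site product state with single-site populations (q, 1 - q)."""
    d = np.array([1.0])
    for _ in range(n):
        d = np.kron(d, np.array([q, 1.0 - q]))
    return d

def diag_shift_average_z(n):
    """Diagonal of the spatial average (1/n) * sum_i Z_i in the computational basis."""
    z = np.array([1.0, -1.0])
    total = np.zeros(2 ** n)
    for i in range(n):
        left = np.ones(2 ** i)
        right = np.ones(2 ** (n - i - 1))
        total += np.kron(np.kron(left, z), right)
    return total / n

def variance_of_average(rho_diag, n):
    """Variance of the spatial average of Z in a state that is diagonal in the Z basis."""
    a = diag_shift_average_z(n)
    mean = np.dot(rho_diag, a)
    return np.dot(rho_diag, a ** 2) - mean ** 2

sizes = [2, 4, 8, 12]

# Ergodic case: i.i.d. product state, variance decays like 1/n.
var_product = [variance_of_average(diag_product_state(0.8, n), n) for n in sizes]

# Non-ergodic case: equal mixture of two product states with different magnetization.
var_mixture = [
    variance_of_average(
        0.5 * diag_product_state(0.8, n) + 0.5 * diag_product_state(0.2, n), n
    )
    for n in sizes
]

for n, vp, vm in zip(sizes, var_product, var_mixture):
    print(f"n={n:2d}  product: {vp:.4f}  mixture: {vm:.4f}")
```

In this toy setting the variance for the product state decays as 1/n, whereas for the mixture it saturates at a nonzero constant, reflecting the macroscopic fluctuation between the two components.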
We now define the Hamiltonian of the system which determines the Gibbs state. Let be a local operator describing interaction, whose support is a bounded region around site . More precisely, we assume that the support of is in , where is an integer and , describe the -th components of (). We note that represents the interaction length, where describes non-interacting cases.
Then, for a bounded region , the truncated Hamiltonian is given by
[TABLE]
A Hamiltonian of this form is referred to as a local Hamiltonian. The Hamiltonian is translation invariant, if it can be written in the form
[TABLE]
for some fixed operator .
Let be the inverse temperature. The truncated Gibbs state on a bounded region is given by the density operator
[TABLE]
where is the truncated free energy. We note that does not satisfy the consistency condition (138), because of the effects on the edges of the region where we have truncated the Hamiltonian.
We consider a sequence of the truncated Gibbs states: We define with , where and . We note that, with this definition, the supports of and are the same. In the following we use the shorthands and .
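As a concrete illustration of the truncated Hamiltonian and the truncated Gibbs state, the following sketch builds both for a short open chain and checks numerically that the truncated Gibbs states violate the consistency condition. The model (a transverse-field Ising chain with coupling 1 and field `g`), the parameter values, and all function names are assumptions made for this example only; the form exp(β(F − H)) of the Gibbs state is the standard one and is assumed to match the definition above.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def site_op(op, i, n):
    """Embed a single-site operator at site i of an n-site chain."""
    mats = [I2] * n
    mats[i] = op
    out = np.array([[1.0]])
    for m in mats:
        out = np.kron(out, m)
    return out

def truncated_hamiltonian(n, g=0.7):
    """Sum of the interaction terms that are fully supported on the n-site region."""
    H = np.zeros((2 ** n, 2 ** n))
    for i in range(n - 1):
        H -= site_op(Z, i, n) @ site_op(Z, i + 1, n)
    for i in range(n):
        H -= g * site_op(X, i, n)
    return H

def gibbs_state(H, beta):
    """Return (sigma, F) with sigma = exp(beta * (F - H)) and F the free energy."""
    evals, evecs = np.linalg.eigh(H)
    shift = evals.min()  # subtract the ground energy for numerical stability
    w = np.exp(-beta * (evals - shift))
    partition = w.sum()
    F = shift - np.log(partition) / beta
    sigma = (evecs * (w / partition)) @ evecs.conj().T
    return sigma, F

def trace_out_last_site(rho, n):
    """Partial trace over the last site of an n-site chain."""
    r = rho.reshape(2 ** (n - 1), 2, 2 ** (n - 1), 2)
    return np.einsum("iaja->ij", r)

beta, n = 1.0, 3
sigma_n, F_n = gibbs_state(truncated_hamiltonian(n), beta)
sigma_np1, _ = gibbs_state(truncated_hamiltonian(n + 1), beta)
reduced = trace_out_last_site(sigma_np1, n + 1)

# Trace norm of the difference; a nonzero value signals the boundary effect.
mismatch = np.abs(np.linalg.eigvalsh(reduced - sigma_n)).sum()
print("free energy on n sites:", F_n)
print("trace-norm mismatch:", mismatch)
```

The mismatch is due to the interaction term that straddles the boundary of the smaller region: it is present in the Hamiltonian of the larger region but absent from the truncated Hamiltonian of the smaller one.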
IV.3 Generalized Stein’s lemma for ergodic states relative to local Gibbs states
We now consider a proof of a generalization of the quantum Stein’s lemma for ergodic states relative to local Gibbs states. We begin by proving that the limiting KL divergence is well defined:
Lemma 6.
Suppose that is translation invariant and is the truncated Gibbs state of a local and translation-invariant Hamiltonian in any dimension. Then exists.
Proof.
This follows from the following well-known facts. From Eq. 143,
[TABLE]
The first term on the right-hand side converges to because is translation invariant (Proposition 6.2.38 of Ref. Bratteli and Robinson (1981)). It is also known that the second term converges to the free energy density (Theorem 6.2.40 of Ref. Bratteli and Robinson (1981)). The third term also converges, because is local and translation invariant, and is translation invariant. ∎
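The three-term decomposition used in this proof can be checked numerically at finite size. The sketch below assumes that the Gibbs state has the standard form σ = exp(β(F − H)), so that the KL divergence splits as D(ρ‖σ) = −S(ρ) − βF + β tr[ρH]; the random Hamiltonian, the random state, and the helper names are purely illustrative.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative; not a physically motivated ensemble)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def herm_log(m):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log(evals)) @ evecs.conj().T

rng = np.random.default_rng(0)
dim, beta = 8, 0.9

# Random real symmetric Hamiltonian and its Gibbs state sigma = exp(beta * (F - H)).
h = rng.normal(size=(dim, dim))
H = (h + h.T) / 2
evals, evecs = np.linalg.eigh(H)
F = -np.log(np.exp(-beta * evals).sum()) / beta
sigma = (evecs * np.exp(beta * (F - evals))) @ evecs.T

rho = random_density_matrix(dim, rng)

kl = np.trace(rho @ (herm_log(rho) - herm_log(sigma))).real
entropy = -np.trace(rho @ herm_log(rho)).real
energy = np.trace(rho @ H).real
decomposition = -entropy - beta * F + beta * energy

print("KL divergence:          ", kl)
print("-S - beta*F + beta*<H>: ", decomposition)
```

Dividing each of the three terms by the system size gives the entropy rate, the free energy density, and the energy density whose limits are discussed in the proof.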
One important ingredient in the proof of our generalization of the quantum Stein’s lemma is the following typical projector for ergodic states (Theorem 2.1 of Ref. Bjelaković et al. (2004); see also Theorem 5.1 of Ref. Bjelaković and Szkola (2005) and Theorem 1.4 of Ref. Ogata (2013)).
Proposition 8 (Quantum Shannon-McMillan Theorem).
Suppose that is ergodic. Then for any there exists a sequence of projectors (called typical projectors) that satisfy, for sufficiently large ,
[TABLE]
where .
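For intuition, the typical projector can be computed explicitly in the i.i.d. special case, where the state is a tensor power of a single-qubit state with eigenvalues (q, 1 − q). The sketch below assumes the usual definition of typicality, namely that the projector selects the eigenvalues λ with |−(1/n) ln λ − s| ≤ ε, where s is the single-site von Neumann entropy; the function name and parameter values are hypothetical.

```python
import math

def typical_weight_and_rank(q, n, eps):
    """Weight and rank of the typical projector for the n-fold product of diag(q, 1 - q).

    Eigenvalues are q**k * (1 - q)**(n - k) with multiplicity C(n, k); an eigenvalue
    is typical if its normalized negative logarithm is within eps of the entropy s.
    """
    s = -(q * math.log(q) + (1 - q) * math.log(1 - q))
    weight, rank = 0.0, 0
    for k in range(n + 1):
        log_eig = k * math.log(q) + (n - k) * math.log(1 - q)
        if abs(-log_eig / n - s) <= eps:
            mult = math.comb(n, k)
            # Work in log space to avoid underflow of the individual eigenvalues.
            weight += math.exp(math.log(mult) + log_eig)
            rank += mult
    return weight, rank, s

q, eps = 0.8, 0.05
for n in (100, 500, 2000):
    weight, rank, s = typical_weight_and_rank(q, n, eps)
    print(f"n={n:5d}  weight={weight:.5f}  ln(rank)/n={math.log(rank) / n:.4f}  s={s:.4f}")
```

The weight captured by the projector approaches one as n grows, while the logarithm of its rank per site stays within ε of the entropy s, which is the qualitative content of the proposition in this special case.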
We now consider our main theorem for ergodic states and for the truncated Gibbs state.
Theorem 3 (Collapse of the spectral rates for the truncated Gibbs state).
Consider a lattice of spatial dimension and suppose that is translation invariant and ergodic, as in Section IV.2. Let be the sequence of truncated Gibbs states of a local and translation invariant Hamiltonian on the lattice. Then, for any ,
[TABLE]
and as a consequence,
[TABLE]
Proof.
From the proof of Lemma 6, the following limit exists,
[TABLE]
Let . We define relative typical projectors (as inspired by Refs. Bjelakovic and Siegmund-Schultze (2004, 2003)) as
[TABLE]
which satisfy by definition
[TABLE]
We then define
[TABLE]
The remainder of the proof is devoted to showing that the operator satisfies the four conditions (129a)–(129d) in Lemma 5 with
[TABLE]
These conditions then immediately imply Eq. 148, as discussed in Section IV.1.
The condition (129a) is clear by definition. Condition (129b) is obtained from inequalities (146) and (152) as
[TABLE]
The third condition (129c) is obtained from inequalities (145) and (152) as
[TABLE]
The final condition (129d) follows from Lemma 8 in Appendix 0.A, Eq. 147 in Proposition 8, and from
[TABLE]
To show Eq. 157, we use the assumption of ergodicity of . Since the Hamiltonian is local and translation invariant, we have
[TABLE]
where is the shift operator. Then, denoting by the projection operator onto a subspace satisfying the corresponding condition, we have
[TABLE]
where $h := \lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\bigl[\hat{\rho}_{n}\hat{H}_{n}\bigr]$ and . For sufficiently large , we have , and therefore,
[TABLE]
By definition of ergodicity, observables of the form (158) converge in probability; we have proven Eq. 157. ∎
The above proof reduces to the main theorem of Ref. Bjelakovic and Siegmund-Schultze (2004) in the special case where is i.i.d., i.e., if the system has a strictly local Hamiltonian with no interaction terms ().
Finally, we can ask whether the same theorem holds also for the sequence of reduced states of the full Gibbs state on the infinite lattice. We show that this is indeed the case. Because this theorem requires a rigorous formulation in terms of C*-algebras, we defer the precise claim and proof to Theorem 4 in Appendix 0.C.
IV.4 Remarks on ergodicity, mixtures, and the KL divergence
IV.4.1 The mixing property
A local Gibbs state with a mixing (or clustering) property is ergodic. However, we emphasize that the converse is false; ergodicity does not necessarily imply that the state can be written as a Gibbs state of a local Hamiltonian.
Definition 10 (Mixing).
Let be the shift operator corresponding to the one-step shift to the -th direction (). A sequence has the mixing (or clustering) property, if it satisfies the consistency condition (138), and if for all with being bounded and if for all , we have
[TABLE]
where on the left-hand side is taken such that the supports of and are included in .
The equivalence of this definition and the standard definition is proven in Appendix 0.C. It is well-known that mixing implies ergodicity (cf. Ref. Ruelle (1999)):
Proposition 9.
Any translation-invariant and mixing state is ergodic.
For local operators and the Gibbs state of a local and translation-invariant Hamiltonian, a stronger property called the exponential clustering property has been proven for any in one dimension Araki (1969) and in higher dimensions for sufficiently high temperature (see, for example, Ref. Tasaki (2018) and references therein). Therefore, the quantum Stein’s lemma is proved for two local Gibbs states and at least for sufficiently high temperature.
IV.4.2 Mixtures of ergodic states
Consider now the situation in which the state is a mixture of different ergodic states. In this setting, ergodicity is broken, and the existence of a thermodynamic potential is no longer guaranteed.
Let be ergodic states (), and consider their mixture with , where and . We continue to suppose that is given by the Gibbs state of a local and translation-invariant Hamiltonian. In this setting, we can show that the min and max divergences are given by the minimal and maximal value of the KL divergence of the states in the mixture, respectively:
Lemma 7.
The spectral divergence rates are split as
[TABLE]
while the KL divergence rate is given by
[TABLE]
Proof.
Equation 162 immediately follows from Proposition 11 in Appendix 0.A. To prove (163), we note that $-\operatorname{tr}\bigl[\hat{\rho}^{(k)}_{n}\ln\hat{\sigma}_{n}\bigr]$ is additive with respect to , and thus we only need to show . This in turn follows from the fact that the von Neumann entropy satisfies the following inequalities, . ∎
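The entropy bounds invoked in this proof are presumably the standard ones for mixtures: the von Neumann entropy of a mixture lies between the average entropy of its components and that average plus the Shannon entropy of the mixing weights. Since the Shannon entropy of the weights does not grow with the system size, both bounds give the same rate. The following sketch checks these standard bounds on random states; the dimension, weights, and helper names are illustrative assumptions.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in nats, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-14]
    return float(-(evals * np.log(evals)).sum())

def random_state(dim, rng):
    """Random full-rank density matrix (illustrative only)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
weights = np.array([0.5, 0.3, 0.2])
states = [random_state(16, rng) for _ in weights]

mixture = sum(r * s for r, s in zip(weights, states))
average_entropy = sum(r * entropy(s) for r, s in zip(weights, states))
shannon = float(-(weights * np.log(weights)).sum())

lower = average_entropy            # concavity of the entropy
upper = average_entropy + shannon  # mixing adds at most the Shannon entropy of the weights
print(lower, entropy(mixture), upper)
```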
IV.4.3 The role of the KL divergence for the thermodynamic potential
Usually, we have that if the min and max divergences coincide, then the limiting values coincide with the limiting value of the KL divergence. This is because in usual cases, the asymptotic divergences obey
[TABLE]
Indeed, this inequality follows in usual cases from the fact that combined with a continuity argument of the KL divergence in which ensures the inequality persists after smoothing with . Indeed, for , we have , where the first term can be bounded using the Fannes-Audenaert inequality Fannes (1973); Audenaert (2007) and where the second term behaves as as long as is at most linear in . In this case, and , which ensures that (164) holds. Notably, while this is the case in most usual settings such as the one considered in the present paper, this continuity argument does not hold in general for arbitrary sequences of states and operators.
As a simple toy example, consider a two-level system with states , fix an inverse temperature , and let be a sequence of small positive nonzero reals with . We consider the sequence of states with and a sequence of Hamiltonians with . (The sequence is defined on a single copy of the Hilbert space; it is straightforward to embed these operators in , though perhaps not in a local and translation-invariant way.) The corresponding sequence of Gibbs weights is . We can calculate
[TABLE]
For the min divergence and for any we have
[TABLE]
and hence . On the other hand, for any we have that for large enough; then for large enough, $D\bigl(\lvert 0\rangle\langle 0\rvert,\hat{\rho}_{n}\bigr)\leqslant\varepsilon$ and
[TABLE]
and hence . Finally, recalling (30), we find
[TABLE]
Crucially, the operator has an eigenvalue that is at least exponentially small in , and is superlinear in . This invalidates the usual continuity argument described above. Having with such a behavior amounts to having a Hamiltonian (such as in our example) with an energy level that scales superlinearly in . Physically, this means that the system does not have a sound thermodynamic limit; in practice, for instance in the case of all-to-all coupling, one prefers to normalize the full Hamiltonian to ensure a good behavior in the thermodynamic limit. Nevertheless, our toy example shows that in full generality, the min- and max-divergences can collapse to a single value and define a thermodynamic potential which does not coincide with the KL divergence in the thermodynamic limit.
We emphasize that this issue does not appear in usual settings such as the one considered in the present paper, where the energy is extensive. Also, this issue cannot appear with the spectral entropy rates (i.e., if ), because of the argument above, or alternatively, thanks to Lemma 3 of Ref. Bowen and Datta (2006b). In those cases, the Kullback-Leibler divergence (or the von Neumann entropy rate) is the relevant thermodynamic potential that emerges from the reversibility of the resource theory.
V Discussion
Our results provide new insight on the role of ergodicity and typicality in many-body systems Anshu (2016); Wilming et al. (2019). Our two main theorems on one hand advance our understanding of the possible interconversion of states with thermal operations and a limited source of coherence, and on the other hand establish a generalized quantum Stein’s lemma for lattice systems with local and translation-invariant Hamiltonians. Together, these theorems prove our main result, namely, that a thermodynamic potential emerges in the resource theory of thermal operations for all ergodic states in lattices with a translation-invariant local Hamiltonian.
Thermal operations involving nonsemiclassical states.
While the possible state transformations under thermal operations are well understood for semiclassical states thanks to the notion of thermomajorization Horodecki and Oppenheim (2013), the picture becomes significantly more involved if we consider states that present coherences between energy eigenspaces Gour et al. (2018); Lostaglio et al. (2015a). The min- and max-divergence no longer represent the distillable work and the work cost of formation of a state, because in general one requires a suitable reference frame to accurately carry out those transformations Bartlett et al. (2007); Korzekwa et al. (2016); Gour et al. (2018); Popescu et al. (2018). Our Theorem 2 shows, however, that if the two divergences coincide approximately, then the coherences that are present in the state are necessarily small in a suitable sense, such that these transformations become approximately possible after all with only a small reference frame. In the thermodynamic limit, the size of the reference frame becomes negligible.
Our theorem provides a conceptually clear characterization of which states can be reversibly converted to the thermal state, and hence, for which class of states the thermodynamic potential emerges. Namely, approximately reversible conversion to the thermal state is possible if and only if the min and max divergences coincide approximately (although the error terms have to be adjusted in each direction of the proof).
We resort to a crude metric for the amount of coherence that was used in a process: We allow the use of an ancilla whose Hamiltonian is suitably bounded. Recently, more refined methods of accounting for coherence have been introduced, such as via coherent work Mingo and Jennings (2019) or with a more traditional resource-theoretic approach Marvian (2020). Using an improved measure of coherence would allow one to clarify the amount of coherence used in the processes of Theorem 2.
One could ask for a characterization of which classes of states can be reversibly converted into one another, without being necessarily reversibly convertible to the thermal state. Consider for instance two states with the same spectrum that is not uniform, both living within a fixed energy subspace: They can be related by an energy-conserving unitary, but they cannot be reversibly converted to the thermal state. In this paper, we have adopted the convention that a thermodynamic potential should be well defined for the thermal state itself. Curiously however, it is also possible to define some kind of “alternative thermodynamic potentials” for such classes of states which cannot include the thermal state. It is not clear to us what the physical relevance of such classes of states would be.
We also note that ergodic states have off-diagonal elements that vanish exponentially, similarly to the behavior encountered in states obeying the eigenstate thermalization hypothesis (Lemma 4 combined with Theorem 3). It is then natural to ask whether there are properties of states that obey the eigenstate thermalization hypothesis (such as error-correcting properties Brandão et al. (2019)) that can be carried over to ergodic states.
Asymptotic Equipartition, the Shannon-McMillan theorem, and Stein’s lemma.
The classical Shannon-McMillan theorem along with its quantum counterparts provide a collection of AEP statements that play an important role in information theory, statistics, and statistical physics, where ergodic processes are naturally encountered. Because of the stark formal differences between the quantum and the classical definitions of Markovianity, the quantum versions of these AEP theorems do not follow directly from their classical counterparts. Building on earlier proofs of the quantum Shannon-McMillan theorem Bjelaković et al. (2004); Bjelaković and Szkola (2005); Ogata (2013) and a relative AEP theorem with respect to product states Bjelakovic and Siegmund-Schultze (2004), we finally provide the full quantum version of the classical relative AEP theorem mentioned above, which applies to ergodic states relative to Gibbs states of a local Hamiltonian.
A main component of the proof of our main result is a generalized version of Stein’s lemma which is tightly related to the proof techniques of Ref. Bjelakovic and Siegmund-Schultze (2003). Namely, it suffices to find an operator obeying a set of simple conditions to conclude that the min and max divergences collapse, which can be seen partly thanks to ideas from semidefinite programming Dupuis et al. (2013); Tomamichel and Hayashi (2013). By constructing suitable typical projectors using the ergodicity property of the state, our Theorem 3 exploits this characterization and provides a new version of Stein’s lemma. The latter applies to situations beyond i.i.d. states, since we may consider any ergodic state with respect to any Gibbs state that arises from a local Hamiltonian.
Crucially, the states we consider are spatially ergodic, rather than ergodic with respect to time evolution. Spatially ergodic states can have a nontrivial time evolution, even producing significant changes of macroscopic quantities in time Faist et al. (2019a). Importantly, this shows that one can define a thermodynamic potential that has an operational interpretation even for certain states that are not in thermodynamic equilibrium.
By endowing a new class of states with a rigorous, well-justified thermodynamic potential, one may ask whether or not it is possible to find even larger classes of states that can be reversibly converted into one another. Thanks to Lemma 7, the thermodynamic potential also emerges for all finite mixtures of ergodic states with the same thermodynamic potential. Whether there are more translation-invariant states that have a well-defined thermodynamic potential in the sense of the present paper is an open question.
One may ask whether or not our results could be generalized to systems that violate translation-invariance. It might be possible to treat a weak violation by adapting the present argument with a suitable control of the relevant error terms. For systems that are fundamentally not translation-invariant, one could instead ask whether ideas from entropy accumulation could be leveraged to prove bounds on the min and max spectral rates in the thermodynamic limit, using local properties of the state (or of the local process that generates the state) rather than symmetry considerations Dupuis et al. (2020); Dupuis and Fawzi (2019). Conversely, insights gained from the behavior of the spectral rates in statistical mechanical systems might provide new ways of proving more general entropy accumulation theorems which might involve the divergence, the mutual information, or a channel capacity.
A further natural extension of our work would be to lift our results from transformations of quantum states to transformations of quantum channels, in line with the results of Ref. Faist et al. (2019b). Can non-i.i.d. quantum channels that have a suitable ergodic property be reversibly converted into one another?
The quantum Shannon-McMillan theorem moreover holds in a more general and abstract operator algebra context Ogata (2013). We might expect that additional AEP results in such settings can be shown using ideas put forward in the present paper.
Finally, one could attempt to further characterize the min and max divergence rates in natural situations where they do not coincide. These quantities are known to bound any extension of the thermodynamic potential outside of the set of reversibly interconvertible states Lieb and Yngvason (2013), and as such, the interval provides a “best possible characterization” of the thermodynamic behavior of such states that takes into account the fluctuations in thermodynamic quantities that persist in the thermodynamic limit. We expect this to be the case, for instance, for many-body-localized states, or for states at critical points immediately before spontaneous symmetry breaking. The techniques put forward in the present paper might help derive bounds in such cases, which, while falling short of a collapse of the min and max divergences, would still provide a useful characterization for a greater class of states that are far out of equilibrium.
Acknowledgements.
The authors are grateful to Hiroyasu Tajima, Yoshiko Ogata and Matteo Lostaglio for valuable discussions. TS is supported by JSPS KAKENHI Grant Number JP16H02211 and JP19H05796. PhF is supported by the Institute for Quantum Information and Matter (IQIM) at Caltech which is a National Science Foundation (NSF) Physics Frontiers Center (NSF Grant PHY-1733907), from the Department of Energy Award DE-SC0018407, from the Swiss National Science Foundation (SNSF) via the NCCR QSIT and project No. 200020_165843, and from the Deutsche Forschungsgemeinschaft (DFG) Research Unit FOR 2724. KK is supported by the Institute for Quantum Information and Matter (IQIM) at Caltech which is a National Science Foundation (NSF) Physics Frontiers Center (NSF Grant PHY-1733907). FB is supported by the NSF.
Appendix 0.A General technical lemmas
The following gentle measurement lemma states that a measurement effect that is almost certain to appear does not disturb the state much Winter (1999); Ogawa and Nagaoka (2007).
Proposition 10.
For a state and any operator with , if , then
[TABLE]
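A quick numerical sanity check of the gentle measurement lemma may be helpful. The sketch below tests the commonly quoted form of the bound, namely that if a projector P satisfies tr[Pρ] ≥ 1 − ε, then the trace norm of ρ − PρP is at most 2√ε; we assume this form here, and the constants may differ from those in the statement above. The state, the projector, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, kept = 6, 3

# Projector onto the first `kept` basis vectors.
P = np.diag([1.0] * kept + [0.0] * (dim - kept))

# A state mostly supported on the range of P, with small coherent leakage outside.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi[kept:] *= 0.05
psi /= np.linalg.norm(psi)
noise = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
noise = noise @ noise.conj().T
noise /= np.trace(noise).real
rho = 0.99 * np.outer(psi, psi.conj()) + 0.01 * noise

# Failure probability of the measurement effect P.
eps = 1.0 - np.trace(P @ rho).real

# Trace norm of the disturbance caused by the measurement.
disturbance = np.abs(np.linalg.eigvalsh(rho - P @ rho @ P)).sum()
bound = 2.0 * np.sqrt(eps)
print(f"eps={eps:.4f}  disturbance={disturbance:.4f}  bound={bound:.4f}")
```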
The following technical lemmas provide a few variations around the gentle measurement lemma, dealing with operators that capture most of the weight of a state.
Lemma 8.
Let and be projectors. Suppose that a state satisfies $\operatorname{tr}\bigl[\hat{Q}\hat{\rho}\bigr]\geqslant 1-\varepsilon$ and $\operatorname{tr}\bigl[\hat{Q}^{\prime}\hat{\rho}\bigr]\geqslant 1-\varepsilon^{\prime}$ for , . Then,
[TABLE]
Proof.
We first note that
[TABLE]
From the Schwarz inequality, we have
[TABLE]
where we used that and are projectors. Therefore, we obtain Eq. 170. ∎
Lemma 9.
Let be an operator with . Suppose that a subnormalized state satisfies $\operatorname{Re}\bigl(\operatorname{tr}\bigl[\hat{W}\hat{\rho}\bigr]\bigr)\geqslant 1-\varepsilon$ with . Then, both of the following statements are true:
- (a) $\operatorname{tr}\bigl[\hat{W}^{\dagger}\hat{W}\hat{\rho}\bigr]\geqslant 1-2\varepsilon$ and $\operatorname{tr}\bigl[\hat{W}\hat{W}^{\dagger}\hat{\rho}\bigr]\geqslant 1-2\varepsilon$;
- (b) $\operatorname{tr}\bigl[(\hat{I}-\hat{W})(\hat{I}-\hat{W}^{\dagger})\hat{\rho}\bigr]\leqslant 2\varepsilon$.
Proof.
- (a) From the Cauchy-Schwarz inequality,
[TABLE]
We can show the second inequality in the same manner.
- (b) This follows from
[TABLE]
where we used . ∎
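Both statements of Lemma 9 can be verified numerically on a random instance. The sketch below assumes that the condition on the operator is that its operator norm is at most one, and it sets ε to the smallest value allowed by the hypothesis, i.e., ε = 1 − Re tr[Wρ]; the construction of the contraction and of the subnormalized state is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 5

# A contraction close to the identity (operator norm at most one).
g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
W = np.eye(dim) + 0.05 * g
W = W / max(1.0, np.linalg.norm(W, 2))

# A subnormalized state with trace 0.98.
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho = a @ a.conj().T
rho *= 0.98 / np.trace(rho).real

identity = np.eye(dim)
Wd = W.conj().T

# Smallest epsilon compatible with the hypothesis Re tr[W rho] >= 1 - epsilon.
eps = 1.0 - np.trace(W @ rho).real

a_first = np.trace(Wd @ W @ rho).real      # statement (a), first inequality
a_second = np.trace(W @ Wd @ rho).real     # statement (a), second inequality
b_value = np.trace((identity - W) @ (identity - Wd) @ rho).real  # statement (b)

print(f"eps={eps:.4f}")
print(f"(a): {a_first:.4f}, {a_second:.4f} >= {1 - 2 * eps:.4f}")
print(f"(b): {b_value:.4f} <= {2 * eps:.4f}")
```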
Next we show that for a mixture of states, the min and max spectral rates are given by the smallest or largest spectral rate in the mixture, respectively.
Proposition 11.
Consider a sequence of states where each state is given by a mixture for a given probability distribution independent of , and consider the individual sequences . Then, the lower and the upper divergence rates of relative to a sequence of positive operators are given by
[TABLE]
This proposition immediately follows from the following three lemmas.
Lemma 10.
Consider a mixture of states with a probability distribution . Let be a quantum state such that . Then there exists a probability distribution and a collection of states such that
[TABLE]
Proof.
Call our system of interest , and consider a copy . Let be orthonormal bases of and , respectively, and let be the reference unnormalized maximally entangled state. Consider the following purification of ,
[TABLE]
Let be a register with an orthonormal basis and consider the following purification of ,
[TABLE]
From Uhlmann’s theorem, there exists a purification of such that
[TABLE]
Invoking the Fuchs-van de Graaf relations between the fidelity and the trace distance Fuchs and van de Graaf (1999); Nielsen and Chuang (2000), , we find that . Now, define
[TABLE]
From the monotonicity of the trace norm under CPTP maps, we have , where here the trace distance is calculated between the two classical probability distributions, which is known as the total variation distance. Furthermore, the trace norm cannot increase under any CP and trace-nonincreasing map, and hence,
[TABLE]
This implies
[TABLE]
which completes the proof. ∎
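The Fuchs-van de Graaf relations used in this proof can be checked numerically. The sketch below assumes the conventions F(ρ, σ) = ‖√ρ √σ‖₁ for the fidelity and T(ρ, σ) = ½‖ρ − σ‖₁ for the trace distance, for which the relations read 1 − F ≤ T ≤ √(1 − F²); the normalization of the distance used in the main text may differ, and the helper names are hypothetical.

```python
import numpy as np

def random_state(dim, rng):
    """Random full-rank density matrix (illustrative only)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(m)
    evals = np.clip(evals, 0.0, None)
    return (evecs * np.sqrt(evals)) @ evecs.conj().T

rng = np.random.default_rng(4)
rho = random_state(6, rng)
sigma = random_state(6, rng)

# Fidelity as the trace norm of sqrt(rho) sqrt(sigma), i.e., the sum of singular values.
fidelity = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False).sum()

# Trace distance as half the sum of absolute eigenvalues of the difference.
trace_dist = 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

lower = 1.0 - fidelity
upper = np.sqrt(max(0.0, 1.0 - fidelity ** 2))
print(f"{lower:.4f} <= {trace_dist:.4f} <= {upper:.4f}")
```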
Lemma 11.
Consider a mixture of states with a probability distribution . Let be such that for all . Then
[TABLE]
Proof.
We first show inequality (183). For each , there exists such that . Let , which is a candidate for minimization in , because , using the joint convexity of the trace distance. Then,
[TABLE]
We next show inequality (184). There exists such that . By the Fuchs-van de Graaf inequalities Fuchs and van de Graaf (1999); Nielsen and Chuang (2000), we have and thus . Let be quantum states and be a probability distribution that are given by Lemma 10, such that and . Noting that , we have
[TABLE]
which implies inequality (184). ∎
Lemma 12.
Consider a mixture of states with a probability distribution , and let . Then
[TABLE]
Proof.
We first show inequality (187). For each , there exists such that . Let , which is a candidate for maximization in . We note that , because the kernel of is larger than the intersection of the kernels of ’s. Therefore,
[TABLE]
We next show inequality (188). There exists such that . By the Fuchs-van de Graaf inequalities, we have as above. Let be states and be a probability distribution given by Lemma 10. For all ,
[TABLE]
and therefore
[TABLE]
which implies inequality (188). ∎
Appendix 0.B Properties of our thermodynamic framework and convertibility proof for Gibbs-preserving maps
In this section we derive a collection of useful properties of thermodynamic transformations that were introduced in Section III.1, and provide a simplified version of Theorem 1 that is specialized to Gibbs-preserving maps.
The partial isometry in the definition of a thermal operation commutes with the system-and-bath Hamiltonian in the following sense.
Proposition 12.
Let be systems with Hamiltonians and let be a partial isometry such that . Then
[TABLE]
In consequence, is a mapping of a subset of initial energy eigenstates on to some final energy eigenstates on .
Proof.
We compute directly and similarly . ∎
Now we show that any partial isometry that is compatible with the system Hamiltonian (i.e., one that maps the input Hamiltonian to the output Hamiltonian on the range of the partial isometry) can be dilated into a full energy-conserving unitary on a larger system from which the partial isometry is recovered by preparing an ancilla in a pure state and post-selecting on a specific measurement outcome of an ancilla on the output of the unitary. The present proof is partly adapted from (Faist et al., 2021, Proposition C.2).
Proposition 13 (Dilation of a partial energy-conserving isometry).
Consider systems with Hamiltonians . Let be a partial isometry such that . Let be a system with Hamiltonian , and suppose that there exist nontrivial systems and with respective Hamiltonians , along with unitaries , such that and . Let be two eigenstates of and of the same energy . Then there exists a unitary such that and
[TABLE]
where is the projector onto the support of .
Furthermore, we can remove from (193) under the following additional assumption. Let be the energy eigenvalues with the corresponding multiplicities of all energy eigenstates of that are in the kernel of . Let be the energy eigenvalues with the corresponding multiplicities of all energy eigenstates of that have no overlap with , i.e., for which . Suppose that for each (for ), there exists a corresponding with and . Then there exists a unitary operator with and such that
[TABLE]
Before delving into the proof of Proposition 13 we issue a few remarks to provide a better picture of the consequences of this general proposition and to identify a few interesting special cases.
- (a) The operator in (193) can be replaced by an operator acting after the unitaries, where is the projector onto the range of .
- (b) If and , we can choose with trivial systems . With this choice of , the projector in (193) is always necessary unless is already unitary. (The projector can be removed by choosing a larger system , see below.)
- (c) The additional assumption in the second part of the proposition amounts to requiring that, for the given , it is possible to map the support of (i.e., the space spanned by all eigenstates outside of the support of and tensored with ), into a space of the global output system such that the mapping is energy conserving and such that the resulting space has no overlap with . As long as the input state on is initialized in the state , then projecting the output onto automatically ensures that the input state already lies within the projector . (Equivalently, the projector on the input becomes redundant.)
- (d) For any and for a general choice of , , with corresponding Hamiltonians along with energy-preserving embedding unitaries , , there always exists a qubit system with some Hamiltonian such that there exist and with the same eigenenergy.
  This statement is shown as follows. We first pick any two energy eigenstates and of respective energies and . We then introduce a qubit with the Hamiltonian , with and for any chosen constant . Define , , , etc., along with and , observing that and are both energy eigenstates with energy .
- (e) For any satisfying the first part of the proposition, we can always introduce a qubit system with a degenerate Hamiltonian for some arbitrary constant , and define , , , etc., along with and , such that the additional condition of the second part of the proposition is satisfied. Indeed, from the unitary given by the proposition without the extra qubit, we can define , i.e., conditionally flips the bit if the input on is in the support of , before applying . The effect of is to map all states of the form onto states with the system remaining in the state , ensuring that there is no overlap with .
- (f) The qubits introduced in Points (d) and (e) may evidently be chosen to be larger systems that contain such qubits as subspaces.
- (g) For any and for a general choice of , , with corresponding Hamiltonians along with energy-preserving embedding unitaries , , there might not always exist and with the same eigenenergy, even if . As a counterexample, consider systems where the system has energy levels , the system has levels , the system has levels , and the system is trivial with the single level . In both cases, the joint energy levels are , and can be nonzero by mapping the [math] level of to the [math] level of . Yet, and do not share an energy level of the same energy.
- (h) For arbitrary , a simple choice for the system is with , , , , along with the trivial identity embedding maps , . There always exist and with the same eigenenergy (as long as ), by picking an eigenstate in the support of along with its associated image under .
  Furthermore, with this choice it is always possible to satisfy our additional condition leading to (194). This can be seen as follows. Let . We choose energy eigenbases of and of , with spanning the support of and with for those . Then we choose and (assuming ), noting that they must have the same energy. We see that all states of the form for can be mapped onto themselves, with clearly because .
- (i) In the case of the generalized thermal operation depicted in Fig. 1, we have and , with a given energy-conserving partial isometry . In this case, we may choose , with and . If necessary, we can enlarge and to include qubit systems and/or as per Points (d) and (e) to ensure that all the conditions of Proposition 13 are satisfied. Then there exists , , along with an energy-conserving unitary , such that
[TABLE]
We now turn to the proof of the proposition.
Proof of Proposition 13.
First we compute as in Proposition 12 the commutators and , as well as and .
Because and are unitary we must have . Also, the operator is an energy-conserving unitary operator from to ; therefore, the Hamiltonians and must have the same eigenvalues and with the same multiplicity.
Let , noting that is a partial isometry. Furthermore, , recalling that and have the same eigenvalue with respect to and , respectively; therefore .
We can complete into a fully energy-conserving unitary by assigning to each input energy eigenstate an energy eigenstate of same energy at the output; this association is possible since the eigenvalues of the input and output systems coincide including with multiplicity. Then (193) is satisfied by construction, as can be checked by verifying the action of both sides of the equation on an energy eigenbasis spanning the support of .
Now we assume that the additional condition stated in the claim holds, in order to prove (194).
Let be a basis of that is a simultaneous eigenbasis of , , and , and furthermore chosen such that (i) the states span the support of , and (ii) the set spans the subspace supported by $\hat{U}'_{K\bar{K}\to M}\,\bigl((\hat{I}_{K}-\hat{\Pi}_{K})\otimes\lvert\mathrm{i}\rangle\langle\mathrm{i}\rvert_{\bar{K}}\bigr)\,\hat{U}'^{\dagger}_{K\bar{K}\leftarrow M}$.
Let be another basis of that is a simultaneous eigenbasis of and , and furthermore chosen such that (i) we have for all , (ii) we have that the set is orthogonal to and (iii) we also have that for is an energy eigenstate with the same energy as , which we can ensure thanks to our additional assumption stated in the claim.
Then we define as
[TABLE]
The operator is unitary and commutes with , since it maps an energy eigenbasis onto an energy eigenbasis. If the state is in the support of , we have that for suitable complex coefficients . Then
[TABLE]
If the state lies outside the support of , we have that for suitable complex coefficients , and
[TABLE]
We have therefore proven (194). ∎
Now we present some general properties of the thermodynamic operations introduced in Section III.1.
Proposition 14 (Elementary properties of thermodynamic operations).
Consider systems with corresponding Hamiltonians . Let denote either TO or GPM. The following hold:
- (a) If and for some , the identity process is a -work/coherence-assisted process in either model TO or GPM;
- (b) For two energy eigenstates , we have if and only if ;
- (c) For any , we have for energy eigenstates on ancillas if and only if ;
- (d) We have , where , with , ;
- (e) implies for any , and ;
- (f) If and , then .
Proof.
Property (a) for is obvious because the identity process is itself both a thermal operation and a Gibbs preserving map. For with we use a two-level battery with energy eigenstates and ; then is an energy-conserving partial isometry, and thus a thermal operation, on the system and the battery with work expended. The statement in the GPM model follows from Lemma 1. Property (b) is clear; the only nontrivial aspect is that we may have strict inequality. That a thermal operation can perform this transformation can be seen using thermo-majorization Horodecki and Oppenheim (2013). The statement for GPM follows because a thermal operation is also Gibbs-preserving. Property (c) holds by definition of a -work/coherence-assisted process; the systems may be combined together with the battery system in the transformation. Property (d) holds because the thermo-majorization curve of the thermal state is the line connecting to Horodecki and Oppenheim (2013). Property (e) follows from (b). To show Property (f), let (respectively ) be a work/coherence-assisted-process with parameters (respectively ). Then is a -work/coherence-assisted process, and we have . ∎
Now we present the proofs of Propositions 3 and 4 stated in Section III.1 regarding the monotonicity of the various divergences under thermodynamic operations.
Proof of Proposition 3.
We have (invoking Lemma 1 if necessary); let be the corresponding Gibbs-sub-preserving map. The monotonicity of the hypothesis testing divergence follows directly from the properties (19) and (21).
The monotonicity of the Rényi divergences is trickier to prove because the corresponding data processing inequality only holds for trace-preserving mappings. Using (Faist and Renner, 2018, Proposition 2), there exists a qubit system with a basis and with a Hamiltonian , as well as eigenstates of , and a trace-preserving map such that
[TABLE]
Since , we can invoke (Faist and Renner, 2018, Corollary 3(b)) to see that
[TABLE]
Also, using (Faist and Renner, 2018, Proposition 17) and (199c), we have that
[TABLE]
Then using the property (14) of the Rényi -entropies and the above identities, we have
[TABLE]
where the inequality holds by the data processing inequality (10). ∎
Proof of Proposition 4.
We prove the statement for the GPM model, invoking Lemma 1 if necessary. Let be systems with Hamiltonians from Definition 5 and let be the GPM operation in (38). Let , with $D\bigl(\langle E',\zeta'\rvert_{W'C'}\,\hat{\tilde{\rho}}_{S'C'W'}\,\lvert E',\zeta'\rangle_{W'C'}\,,\,\hat{\rho}'_{S'}\bigr)\leqslant\varepsilon$. Using property (22) we have
[TABLE]
Now compute
[TABLE]
because where denotes the smallest eigenvalue of its argument. Observe that the operation $\operatorname{tr}_{C'W'}\bigl[\lvert E',\zeta'\rangle\langle E',\zeta'\rvert_{W'C'}\,(\cdot)\bigr]$ is a completely positive, trace-nonincreasing map. Then thanks to (19) and (204) along with the scaling property (20),
[TABLE]
where the last two inequalities hold using, respectively, (21), noting that is Gibbs-sub-preserving, and the data processing inequality (19).
Let with be an optimal choice for the last divergence term in (205), such that . Let , noting that . Then we have $\operatorname{tr}(\hat{Q}'_{S}\hat{\rho}_{S})=\operatorname{tr}\bigl(\hat{Q}_{SCW}\,(\hat{\rho}_{S}\otimes\lvert E,\zeta\rangle\langle E,\zeta\rvert_{WC})\bigr)\geqslant\xi$, and thus
[TABLE]
where in the last inequality we used and which imply together that . Rewriting (206), we have
[TABLE]
and finally,
[TABLE]
Following the chain of inequalities proves the claim. ∎
We present a convenient lemma that can ensure asymptotic convertibility if good enough asymptotic convertibility can be achieved for any fixed . We first note that, thanks to Property (e) of Proposition 14, we may equivalently replace all limits “” in Definition 7 by “”.
Lemma 13.
For sequences of states , and sequences of Hamiltonians , , suppose that for all there exists such that for all , where denotes TO or GPM. If is such that
[TABLE]
then .
Proof.
Let , , and . Define
[TABLE]
Now let and observe that because is finite for any small thanks to the existence of the limit superior defining , and . Then let , , and , such that for all . We have by definition of and hence . Similarly, and thus . Also, and thus . ∎
An important known result is the fact that the min and max divergences quantify the amount of work that is necessary to convert a semiclassical state to and from the thermal state.
Proposition 15 (Work distillation and state formation for semiclassical states Åberg (2013); Horodecki and Oppenheim (2013)).
Let be a quantum state on a system with Hamiltonian , and suppose that . Let denote the trivial thermal state on the trivial system with the trivial Hamiltonian . Then
[TABLE]
We now present a central proposition of this appendix, namely a simplified form of Theorem 1 that is specific to Gibbs-preserving maps. The error terms as well as the proof itself are significantly simpler than the full result for thermal operations.
Proposition 16 (Work distillation and state formation Horodecki and Oppenheim (2013); Faist and Renner (2018)).
Let be a quantum state on a system with a Hamiltonian . Let denote the trivial thermal state on the trivial system with the trivial Hamiltonian . Then for any we have
[TABLE]
Consequently, for any , and for any Hamiltonians ,
[TABLE]
For asymptotic sequences of states , and sequences of Hamiltonians , , we have
[TABLE]
where we denote by (respectively ) the sequence (respectively ).
Proof.
The statements (212) are proven in Ref. Faist and Renner (2018). The result for semiclassical states and thermal operations was shown in the earlier Ref. Horodecki and Oppenheim (2013). The statement (213) follows directly by combining the processes in (212). To prove (214), observe that for any , we have for sufficiently large that where is some function of with as . Then (214) follows from (213) and Lemma 13.∎
For completeness, we prove (213) directly with an explicit transformation (see also Theorem 6.3 of Sagawa (2021)).
Alternative direct proof of (213).
We prove the following equivalent statement: Assuming that , we explicitly construct a Gibbs-preserving operation that performs the given transformation using a hypothesis test. The equivalence with (213) follows from Proposition 14 (c), the scaling property (13) of the divergences, and their additivity under tensor products (14). Without loss of generality we may assume that ; otherwise, shift the Hamiltonians by suitable constants and apply Proposition 14 (a) whose cost cancels the shift (13). Let and , which are now quantum states.
First, consider the case of . We explicitly construct a CPTP map that maps to , by using a “measure-and-prepare” method. Let , and let be the projection onto the support of . If , the situation becomes trivial, because and . If , we can construct the desired CPTP map as
[TABLE]
where the condition is used to guarantee that .
We next consider the case of . By definition of the smooth entropies, there exist such that and , with , . From the case we have that with respect to the thermal states . By triangle inequality and because quantum operations can only decrease the trace distance, we have that . Hence . ∎
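The measure-and-prepare idea used in the first case of this proof can be sketched for classical (diagonal) states. The following is a minimal illustration, not the paper's construction verbatim: it assumes the standard form in which one tests whether the input lies in the support of the initial state and then prepares one of two fixed output distributions. The function name `measure_and_prepare` and the example distributions are hypothetical.

```python
def measure_and_prepare(p, q, p_out, q_out):
    """Build a classical channel sending p -> p_out and q -> q_out.

    Assumed construction: measure "is the outcome in supp(p)?".
    If yes, prepare p_out; if no, prepare a fixed remainder distribution
    chosen so that q is mapped exactly onto q_out.
    """
    support = [1.0 if pi > 0 else 0.0 for pi in p]
    # Weight of q on supp(p); equals exp(-D_min(p||q)) for diagonal states.
    c = sum(s * qi for s, qi in zip(support, q))
    if c >= 1.0:
        raise ValueError("trivial case: q is supported inside supp(p)")
    remainder = [(qo - c * po) / (1.0 - c) for po, qo in zip(p_out, q_out)]
    # Non-negativity holds exactly when max_j p_out[j]/q_out[j] <= 1/c,
    # i.e. when D_max(p_out||q_out) <= D_min(p||q).
    if min(remainder) < -1e-12:
        raise ValueError("needs D_min(p||q) >= D_max(p_out||q_out)")

    def channel(x):
        inside = sum(s * xi for s, xi in zip(support, x))
        outside = sum(x) - inside
        return [inside * po + outside * r for po, r in zip(p_out, remainder)]

    return channel


# Hypothetical example data: q and q_out play the role of the Gibbs states.
p = [0.5, 0.5, 0.0, 0.0]
q = [0.1, 0.1, 0.4, 0.4]
p_out = [0.6, 0.4]
q_out = [0.5, 0.5]

channel = measure_and_prepare(p, q, p_out, q_out)
print(channel(p))  # approximately p_out
print(channel(q))  # approximately q_out
```

The remainder distribution is a valid probability vector only under the stated divergence condition, which is where that condition enters the argument.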
As an immediate consequence, any state that satisfies can be reversibly converted to and from the thermal state with Gibbs-preserving operations. The same holds for thermal operations if the state is semiclassical. Consequently, the common value of the divergences, which we can denote as , is the thermodynamic potential: It characterizes exactly which state transformations are possible within this class of states.
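As a small numerical illustration of this collapse for semiclassical states, the sketch below evaluates the min divergence, the KL divergence, and the max divergence of a diagonal state relative to a Gibbs distribution. It assumes the usual Rényi-0 and Rényi-infinity forms of the min and max divergences; the energies, inverse temperature, and example states are hypothetical.

```python
import math


def gibbs_distribution(energies, beta):
    """Gibbs weights exp(-beta * E) / Z for a list of energy levels."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def min_divergence(p, q):
    """Renyi-0 divergence: -log of the q-weight on the support of p."""
    return -math.log(sum(qi for pi, qi in zip(p, q) if pi > 0))


def kl_divergence(p, q):
    """Kullback-Leibler divergence with natural logarithm."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def max_divergence(p, q):
    """Renyi-infinity divergence: log of the largest ratio p_i / q_i."""
    return math.log(max(pi / qi for pi, qi in zip(p, q) if pi > 0))


gibbs = gibbs_distribution([0.0, 1.0, 2.0], beta=1.0)

# A generic diagonal state: the three quantities differ.
generic = [0.7, 0.3, 0.0]

# The Gibbs distribution restricted to two levels and renormalized:
# the ratio p_i / q_i is constant on the support, so all three coincide.
weight = gibbs[0] + gibbs[1]
collapsed = [gibbs[0] / weight, gibbs[1] / weight, 0.0]

for name, state in (("generic", generic), ("collapsed", collapsed)):
    print(name,
          min_divergence(state, gibbs),
          kl_divergence(state, gibbs),
          max_divergence(state, gibbs))
```

For the generic state the three values are strictly ordered, whereas for the restricted Gibbs state they agree, which is the single-shot analogue of the situation described above.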
Appendix 0.C C*-algebra formulation
In this appendix, we provide an overview of the standard formulation of ergodicity with C*-algebras Bratteli and Robinson (1987, 1981); Ruelle (1999), and prove that it is equivalent to our formulation in Section IV. Furthermore, we prove Theorem 3 in the alternative setting where we consider a sequence of reduced states of the infinite Gibbs state, rather than a sequence of finite Gibbs states corresponding to Hamiltonians truncated to finite regions. In the following, we use the notation of Section IV.
The set of local operators is given by for a bounded lattice region . Then, the C*-algebra is defined as the C*-inductive limit of , which is often written as .
We consider a (normal) state , where is interpreted as the expectation value of observable . We consider a reduced state to a bounded region . By definition, the reduced density operator on this region, written as , satisfies
[TABLE]
for all . We note that the consistency condition (138) is automatically satisfied for this .
By using the shift superoperator introduced in Section IV.2, we first define translation invariance.
Definition 11 (Translation invariance).
A state is translation invariant, if for all and for all ,
[TABLE]
The above definition of translation invariance is equivalent to the definition in Section IV.2; this is guaranteed by the following lemma, which states that it is sufficient to take above to be local.
Lemma 14.
If Eq. 217 is satisfied for all and all , then is translation invariant.
Proof.
Suppose that Eq. 217 is satisfied for all . For any , there exists a sequence such that and . Let . Then we have
[TABLE]
The first term on the right-hand side vanishes. The second term is bounded as
[TABLE]
which goes to zero as . ∎
We now define ergodicity in a more standard and mathematically elegant way Bratteli and Robinson (1987); Ruelle (1999) (see also Refs. Bjelaković et al. (2004); Bjelaković and Szkola (2005)).
Definition 12 (Ergodicity).
A state is translation-invariant and ergodic, if it is an extremal point of the set of translation-invariant states.
Physically, an ergodic state corresponds to a “pure thermodynamic phase” without phase mixture, which is consistent with this mathematical definition.
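A classical toy example may help make the extremality condition concrete. The sketch below, which is an illustration and not part of the paper's argument, samples from an equal mixture of two i.i.d. bit processes. The mixture is translation invariant but not extremal, and the spatial average of a single sample converges to the mean of one component rather than to the ensemble mean. The biases and sample sizes are hypothetical choices.

```python
import random


def sample_mixture(n, rng, biases=(0.1, 0.9)):
    """Draw one configuration of n bits from an equal mixture of two
    i.i.d. processes: pick a bias once, then sample every site with it."""
    bias = rng.choice(biases)
    return [1 if rng.random() < bias else 0 for _ in range(n)]


def spatial_average(bits):
    """Average of the single-site observable over all sites."""
    return sum(bits) / len(bits)


rng = random.Random(0)
averages = [spatial_average(sample_mixture(5000, rng)) for _ in range(20)]

# Ensemble expectation of a single bit under the mixture.
ensemble_mean = 0.5 * 0.1 + 0.5 * 0.9

print("ensemble mean:", ensemble_mean)
print("spatial averages:", [round(a, 3) for a in averages])
```

Each spatial average lands close to one of the two component means and stays away from the ensemble mean, which is the phase-mixture behaviour that ergodicity excludes. Each component on its own is an i.i.d. process and is ergodic.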
The following theorem establishes the equivalence of the definition above with the definition presented in Section IV.2. This is a reformulation of Theorem 6.3.3, Proposition 6.3.5, and Lemma 6.5.1 of Ref. Ruelle (1999); see also Ref. Bjelaković and Szkola (2005).
Lemma 15.
Using the notation of Section IV, the following are equivalent for any translation-invariant state :
- (a) is ergodic;
- (b) For all self-adjoint ,
[TABLE]
- (c) For all ,
[TABLE]
- (d) Equation 220 is satisfied for all self-adjoint .
For completeness, we prove the equivalence of (d) with the other points.
Proof.
It suffices to check that (d) ⇒ (b). The proof is similar to that of Lemma 14, and we use the same notation: For any , there exists a sequence such that and ; let . Now suppose that Eq. 220 is satisfied for all self-adjoint . We first note that
[TABLE]
We then have
[TABLE]
From Eq. 220 for , we have, for a fixed ,
[TABLE]
Since can be taken arbitrarily large, the right-hand side above can be arbitrarily small. Therefore, Eq. 220 is satisfied for all . ∎
We now provide a definition of mixing that is suited to the formalism in this section.
Definition 13 (Mixing).
Let be the shift operator in Definition 10 in Section IV. A state has the mixing property, if for all and all ,
[TABLE]
Definition 14 (Weak mixing).
A state has the weak mixing property, if for all ,
[TABLE]
Mixing implies weak mixing, and weak mixing implies ergodicity. However, the converse implications do not hold in general. In particular, the weak mixing in the above sense should not be confused with Eq. 221.
The following lemma guarantees that the above definition of mixing is equivalent to Definition 10 in Section IV.
Lemma 16.
In the definitions of mixing and weak mixing above, it is sufficient to take .
Proof.
The proof of (d) ⇒ (b) in Lemma 15 provided above can be straightforwardly adapted to prove this lemma. ∎
We next consider the concept of local Gibbs states for the infinite-dimensional setup Bratteli and Robinson (1981). We here assume that the Kubo-Martin-Schwinger (KMS) state is unique at , which physically implies no phase coexistence. This is provable for any in one dimension Araki (1975), but is true at a sufficiently high temperature in higher dimensions Bratteli and Robinson (1981).
Let be the Gibbs state corresponding to the truncated Hamiltonian associated with the region , and represented by the density operator in Eq. 143 of Section IV.2. Then, it is known that a state
[TABLE]
exists, where the limit is given by the weak-* (or ultraweak) topology of the dual of (cf. Proposition 6.2.15 of Ref. Bratteli and Robinson (1981)). We can then define the global Gibbs state on the entire lattice by . This global state satisfies the following condition for any ,
[TABLE]
Then, we define the reduced state of on a bounded region , which is written as . Let be the corresponding reduced density operator. For any observable , we have
[TABLE]
In the following, let be the sequence of the reduced Gibbs states, where and . We note that the reduced state and the truncated state are different in general, where only satisfies the consistency condition (138).
We now prove another version of Theorem 3 in Section IV, where is the sequence of reduced states of the full Gibbs state on the infinite lattice, instead of the sequence of Gibbs states corresponding to truncated Hamiltonians associated with a sequence of finite regions.
Our proof strategy is to show that the asymptotic min divergence rate, the max divergence rate and the KL divergence rate remain unchanged if we substitute by . For this, we invoke the following result, given as Theorem 3.11 in Ref. Lenci and Rey-Bellet (2005) (see in particular the second proof provided in that reference, which holds for observables that are not necessarily positive and proves the uniformity of the convergence).
Proposition 17 (Lenci and Rey-Bellet (Lenci and Rey-Bellet, 2005, Theorem 3.11)).
Suppose that the KMS state is unique. For any observable for a bounded region , we have
[TABLE]
where the convergence is uniform in .
The above result allows us to prove that the KL divergence rate does not change if we replace the Gibbs state of the truncated Hamiltonian by the reduced state of the infinite Gibbs state.
Lemma 17.
Suppose that the KMS state is unique and that exists. Then exists and equals .
Proof.
Proposition 17 implies that
[TABLE]
which implies . ∎
Similarly, we may use Proposition 17 to show that the min and max divergence rates (via the hypothesis testing divergence rate) remain unchanged if we replace by .
Lemma 18.
Suppose that the KMS state is unique and that exists for any . Then, for any , the rate exists and equals .
Proof.
From Eq. 230 in Proposition 17, there exists satisfying such that for any ,
[TABLE]
Combined with Eq. 18, this implies that
[TABLE]
The claim follows by dividing this equation by and taking the limit . ∎
It is now straightforward to combine Lemmas 17 and 18 to prove another version of Theorem 3 for the infinite Gibbs state, rather than the limit of Gibbs states of the truncated Hamiltonian of increasingly large finite regions.
Theorem 4 (Collapse of the spectral rates for the reduced Gibbs state).
Suppose that is translation invariant and ergodic, and is the reduced Gibbs state of a local and translation-invariant Hamiltonian in any dimension, where the KMS state is unique. Then, for any ,
[TABLE]
and as a consequence,
[TABLE]
Appendix 0.D An alternative proof of Theorem 4
Here we provide an alternative proof of Theorem 4 presented above, in the case of a one-dimensional chain, by combining a known result by Hiai, Mosonyi, and Ogawa Hiai et al. (2007) with the ergodic theorem of Bjelaković Bjelakovic and Siegmund-Schultze (2004). We state these results here:
Proposition 18 (Hiai, Mosonyi, and Ogawa (Hiai et al., 2007, Lemma 4.2)).
Let be the reduced local Gibbs state on sites in one dimension. There exist and such that for all and we have
[TABLE]
Proposition 19 (Bjelaković and Siegmund-Schultze (Bjelakovic and Siegmund-Schultze, 2004, Theorem 2.1)).
Suppose that is translation-invariant and ergodic, and is i.i.d. Then, for any ,
[TABLE]
The proof strategy is thus to use Proposition 18 to reduce the problem for a local Gibbs state to a problem with a tensor product Gibbs state, by coarse-graining the -site chain into blocks of sites. The problem then falls in the scope of Proposition 19 which gives the desired result.
Alternative proof of Theorem 4 in one dimension.
We fix , and let with . First we argue that we can essentially ignore the remaining sites and focus on the sites. From the monotonicity of the hypothesis testing divergence under CPTP maps, and therefore under the partial trace, we have for any ,
[TABLE]
Fix and let denote an optimal operator in (18) such that $\eta^{-1}\operatorname{tr}\bigl(\hat{Q}_{km}\hat{\sigma}_{km}\bigr)=\exp\bigl(-S_{\mathrm{H}}^{\eta}(\hat{\rho}_{km}\,\|\,\hat{\sigma}_{m}^{\otimes k})\bigr)$. Then, from Proposition 18,
[TABLE]
Therefore,
[TABLE]
From Proposition 19, we have for large and at fixed ,
[TABLE]
where . Using the fact that the logarithm is operator monotone, together with (236),
[TABLE]
Hence, we obtain
[TABLE]
Taking while fixing , we obtain
[TABLE]
where we used Lemma 17 to get the first term on the right-hand side. Since can be taken arbitrarily large, we obtain
[TABLE]
We next show the opposite direction. Again from the monotonicity of the hypothesis testing divergence under partial trace,
[TABLE]
Fix and let denote an optimal operator in (18) such that $\eta^{-1}\operatorname{tr}\bigl(\hat{Q}_{(k+1)m}\hat{\sigma}_{(k+1)m}\bigr)=\exp\bigl(-S_{\mathrm{H}}^{\eta}(\hat{\rho}_{(k+1)m}\,\|\,\hat{\sigma}_{(k+1)m})\bigr)$. Then, using Proposition 18,
[TABLE]
Therefore,
[TABLE]
From Proposition 19, we have for large and for fixed ,
[TABLE]
where . Since the logarithm is operator monotone, we have from inequality (236),
[TABLE]
Therefore, we obtain
[TABLE]
By taking while fixing , we obtain
[TABLE]
where we again used Lemma 17. Since can be taken arbitrarily large, we obtain
[TABLE]
Equation 234 then follows from inequalities (245) and (253). ∎
Appendix 0.E The classical case
If we restrict the C*-algebra in one dimension to a commutative subalgebra, we obtain a classical stochastic process. Here, we flesh out explicitly the classical ergodic theorem that our argument in Section IV and Appendix 0.C reduces to in the classical case.
The classical counterpart of the setup in these sections is a two-sided stochastic process over with a finite alphabet. Let be the stochastic process, where with being a finite alphabet, and let with . We consider sequences of probability distributions and .
First of all, we briefly comment on mathematical details about the correspondence between the classical case and the quantum case (see also Refs. Bjelaković et al. (2004); Bjelakovic and Siegmund-Schultze (2004)). Let be the C*-algebra of an infinite spin chain. We consider a unital Abelian C*-subalgebra , which is interpreted as a set of classical observables. Let be a quantum state on , and be its restriction to . From the Gelfand-Naimark theorem, is identified with the Banach space , which is the space of complex-valued continuous functions on a compact Hausdorff space . In our setup, , which is compact by Tychonoff’s theorem. From the Riesz-Markov-Kakutani representation theorem, the dual of is the space of regular Borel measures on . Thus is identified with a probability measure on (i.e., a stochastic process over ).
Classical ergodicity can be defined in the same manner as in the quantum case (Definition 9), i.e., as a commutative case of quantum ergodicity. On the other hand, the standard definition of classical ergodicity is that any subset of trajectories in a stochastic process that is invariant under has measure 0 or 1. These definitions are equivalent for the finite-alphabet case. In fact, a classical stochastic process is translation-invariant ergodic if and only if it is an extremal point of the set of translation-invariant processes. Also, as mentioned before, Definition 9 is equivalent to the definition by extremality for quantum spin systems Ruelle (1999); Bjelaković and Szkola (2005).
All the quantum divergences introduced in Section II can be computed using as arguments a probability distribution and a vector of positive entries of same length, by embedding both classical vectors into the diagonal entries of an operator in a Hilbert space whose dimension is the same as the number of entries in the vectors.
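For instance, the hypothesis testing divergence becomes a small linear program in the classical case, solvable greedily. The sketch below assumes the definition has the form S_H^η(p‖q) = −log(η⁻¹ min Σ_i q_i Q_i), minimized over tests 0 ≤ Q_i ≤ 1 with Σ_i p_i Q_i ≥ η, which is consistent with how the optimal operator in (18) is used in the proofs above. The function name and example vectors are hypothetical.

```python
import math


def hypothesis_testing_divergence(p, q, eta):
    """Classical hypothesis testing divergence under the assumed definition
    -log( (1/eta) * min sum_i q_i Q_i ) with sum_i p_i Q_i >= eta.

    The minimum is reached by accepting outcomes in increasing order of
    q_i / p_i (a fractional-knapsack, Neyman-Pearson type test).
    Assumes 0 < eta <= 1 and q_i > 0 wherever p_i > 0.
    """
    items = sorted((qi / pi, pi, qi) for pi, qi in zip(p, q) if pi > 0)
    remaining = eta  # p-weight that the test still has to accept
    cost = 0.0       # accumulated q-weight of the test
    for _, pi, qi in items:
        if remaining <= 0:
            break
        fraction = min(1.0, remaining / pi)
        cost += fraction * qi
        remaining -= fraction * pi
    return -math.log(cost / eta)


p = [0.6, 0.4, 0.0]
q = [0.2, 0.3, 0.5]

for eta in (0.1, 0.5, 1.0):
    print(eta, hypothesis_testing_divergence(p, q, eta))
```

With this definition, η = 1 recovers −log of the q-weight on the support of p, and small η recovers the logarithm of the largest ratio p_i/q_i, so the quantity interpolates between min-type and max-type divergences.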
In the following, we argue that, explicitly for the classical case, if and satisfy a relative asymptotic equipartition property (relative AEP), then the lower and the upper divergence rates coincide, and they must equal the KL divergence. We first define the relative AEP in the form of a convergence in probability, a classical counterpart of our quantum formulation in Section IV.
Definition 15 (Relative asymptotic equipartition property (relative AEP)).
We say that and satisfy the relative AEP if the KL divergence rate exists and if converges to in probability by sampling according to .
This is equivalently formulated as follows (see, for example, Theorem 11.8.2 of Ref. Cover and Thomas (2006)):
Proposition 20.
Suppose that exists. The sequences and satisfy the relative AEP if and only if for any , there exists a set (the relative typical set) such that for sufficiently large :
- (a) For any ,
[TABLE]
- (b) ; and
- (c) .
Here, and represent the probability of according to distributions and , respectively.
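The relative AEP can be observed numerically in the simplest setting, where both processes are i.i.d. (a special case of a translation-invariant ergodic process and a product reference). The following sketch samples strings from one distribution and compares the normalized log-likelihood ratio with the KL divergence, which is the KL divergence rate in the i.i.d. case. The distributions, string length, and seed are hypothetical choices.

```python
import math
import random


def kl_divergence(p, q):
    """Kullback-Leibler divergence with natural logarithm."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def log_likelihood_ratio_rate(p, q, n, rng):
    """Sample x^n i.i.d. from p and return (1/n) log p(x^n) / q(x^n)."""
    symbols = rng.choices(range(len(p)), weights=p, k=n)
    return sum(math.log(p[s] / q[s]) for s in symbols) / n


p = [0.7, 0.2, 0.1]
q = [0.3, 0.3, 0.4]

rng = random.Random(1)
rates = [log_likelihood_ratio_rate(p, q, 20000, rng) for _ in range(10)]

print("KL divergence:", kl_divergence(p, q))
print("sampled rates:", [round(r, 4) for r in rates])
```

The sampled rates cluster around the KL divergence, and the spread shrinks as the string length grows, which is the convergence in probability required by Definition 15.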
The relative AEP ensures that the min and max divergence rates converge to the KL divergence rate:
Proposition 21.
If and satisfy the relative AEP, we have
[TABLE]
Proof.
Although this proposition follows easily from Eqs. 29a and 29b, we here note an alternative proof based on Definition 2 with a slightly different intuition. We consider a subnormalized probability distribution defined by for and for . From and with Proposition 10, we see that is a candidate for the maximization in . Therefore,
[TABLE]
where we used that cannot be smaller than the support of to obtain the right inequality. From the right inequality of (c) in Proposition 20, we have
[TABLE]
By taking the limit , we obtain
[TABLE]
Similarly, we have
[TABLE]
From the right hand side of Proposition 20 (a), we have
[TABLE]
By taking the limit, we obtain
[TABLE]
By combining Eqs. 258 and 261, we obtain (255). ∎
In the following, we assume that is translation-invariant (i.e., stationary) and ergodic. In this case the non-relative AEP (i.e., the classical counterpart of Proposition 8) is satisfied, as a consequence of the Shannon-McMillan theorem.
As in the quantum case, we define the reduced state of the global Gibbs state of a local and translation-invariant Hamiltonian in one dimension, where (i.e., is a marginal distribution of ). We can also define the truncated Gibbs state . The global Gibbs state is obtained as the limit of the truncated Gibbs states Ruelle (1968):
[TABLE]
where convergence is given by the weak-* topology (or the vague topology) of the dual of the Banach space .
We remark that the case of the reduced Gibbs state can also be obtained from the well-known fact that the relative AEP is satisfied for a translation-invariant ergodic process with respect to a translation-invariant Markov process. (The relative AEP has also been proved in a stronger sense, i.e., almost sure convergence; see Ref. Algoet and Cover (1988) and references therein. For our purpose here, however, convergence in probability is enough.) In fact, we have the following lemma.
Lemma 19.
The global Gibbs state of a local and translation-invariant Hamiltonian in one dimension is translation-invariant Markovian.
Proof.
From the Hammersley-Clifford theorem Hammersley and Clifford (1971) (see also Ref. Kato and Brandão (2019)), it is known that the Gibbs state of a local Hamiltonian on an arbitrary finite graph is Markovian. On the other hand, here we directly prove this lemma by explicitly calculating the global Gibbs distribution , without using the Hammersley-Clifford theorem.
For simplicity, we assume that the local interaction is given in the form of and satisfies . We introduce the transfer matrix , whose -element is given by
[TABLE]
Here, we used the bra-ket notation to represent the classical probability vectors. We denote the spectral decomposition of as
[TABLE]
We also assume that has a non-degenerate maximum eigenvalue .
For the truncated Hamiltonian , the truncated Gibbs distribution is given by
[TABLE]
where is the column vector whose entries are all unity. Its marginal distribution for an interval with , is given by
[TABLE]
The conditional probability is then given by
[TABLE]
which depends only on and — as expected from the Hammersley-Clifford theorem — with also an explicit dependency on . From (264),
[TABLE]
By taking the limit of while fixing and , we obtain
[TABLE]
where the right-hand side depends only on and and no longer explicitly depends on . Therefore, the global Gibbs distribution satisfies
[TABLE]
We note that it is straightforward to remove the assumption that has a non-degenerate maximum eigenvalue. In fact, we can just replace the right-hand side of (269) by multiple eigenvectors with the maximum eigenvalue of .
In general, a stochastic process is defined as Markovian, if for any
[TABLE]
holds almost surely (see, for example, Chapter 2 of Ref. Doob (1990)). Also, from Lévy’s martingale convergence theorem, holds almost surely. The claim then follows from Eq. 270. ∎
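The transfer-matrix argument in this proof can be checked numerically on a small alphabet. The sketch below assumes a nearest-neighbour interaction h(x, y) with transfer matrix entries exp(−β h(x, y)) and a non-degenerate maximum eigenvalue, and uses the standard form of the limiting transition probability, T[x][y] v[y] / (λ v[x]), with v the right Perron eigenvector. The interaction table, β, and function names are hypothetical, and the paper's own expressions may be arranged differently.

```python
import math


def transfer_matrix(h, beta):
    """Entries exp(-beta * h[x][y]) for a nearest-neighbour interaction."""
    return [[math.exp(-beta * e) for e in row] for row in h]


def apply(t, u):
    """Matrix-vector product t @ u for nested lists."""
    return [sum(t[x][y] * u[y] for y in range(len(u))) for x in range(len(t))]


def perron(t, iterations=500):
    """Power iteration for the largest eigenvalue and right eigenvector."""
    v = [1.0 / len(t)] * len(t)
    lam = 0.0
    for _ in range(iterations):
        w = apply(t, v)
        lam = sum(w)          # v is normalized to sum 1
        v = [wi / lam for wi in w]
    return lam, v


def limiting_transition(t):
    """Transition probabilities of the infinite-chain Gibbs measure."""
    lam, v = perron(t)
    n = len(t)
    return [[t[x][y] * v[y] / (lam * v[x]) for y in range(n)] for x in range(n)]


def truncated_conditional(t, x, y, steps):
    """Conditional probability of y given x when `steps` further sites lie
    between y and the free right boundary of the truncated chain."""
    u = [1.0] * len(t)
    for _ in range(steps):
        u = apply(t, u)
    return t[x][y] * u[y] / apply(t, u)[x]


h = [[0.0, 1.0], [0.5, 0.2]]   # hypothetical interaction energies
t = transfer_matrix(h, beta=1.0)
markov = limiting_transition(t)

print(markov)
print([truncated_conditional(t, 0, 1, k) for k in (0, 1, 5, 50)])
```

The rows of the limiting matrix sum to one, and the truncated conditional probabilities approach the limiting values as the boundary moves away, reflecting the loss of explicit dependence on the truncation length.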
- [1] Callen (1985): H. B. Callen, Thermodynamics and an Introduction to Thermostatistics (Wiley, 1985).
- [2] Lieb and Yngvason (1999): E. H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics, Physics Reports 310, 1 (1999), arXiv:cond-mat/9708200.
- [3] Sagawa (2012): T. Sagawa, Second law-like inequalities with quantum relative entropy: An introduction, in Lectures on Quantum Computing, Thermodynamics and Statistical Physics, edited by M. Nakahara and S. Tanaka (World Scientific, 2012) pp. 125–190, arXiv:1202.0983.
- [4] Parrondo et al. (2015): J. M. R. Parrondo, J. M. Horowitz, and T. Sagawa, Thermodynamics of information, Nature Physics 11, 131 (2015), arXiv:0903.2792.
- [5] Goold et al. (2016): J. Goold, M. Huber, A. Riera, L. d. Rio, and P. Skrzypczyk, The role of quantum information in thermodynamics—a topical review, Journal of Physics A: Mathematical and Theoretical 49, 143001 (2016), arXiv:1505.07835.
- [6] Chitambar and Gour (2019): E. Chitambar and G. Gour, Quantum resource theories, Reviews of Modern Physics 91, 025001 (2019), arXiv:1806.06107.
- [7] Sagawa (2021): T. Sagawa, Entropy, Divergence, and Majorization in Classical and Quantum Thermodynamics (Springer Briefs in Mathematical Physics, 2021) in press, arXiv:2007.09974.
- [8] Brandão et al. (2013): F. G. S. L. Brandão, M. Horodecki, J. Oppenheim, J. M. Renes, and R. W. Spekkens, Resource theory of quantum states out of thermal equilibrium, Physical Review Letters 111, 250404 (2013), arXiv:1111.3882.
