Asymptotic Reversibility of Thermal Operations for Interacting Quantum Spin Systems via Generalized Quantum Stein's Lemma

Takahiro Sagawa; Philippe Faist; Kohtaro Kato; Keiji Matsumoto; Hiroshi Nagaoka; Fernando G. S. L. Brandão

arXiv:1907.05650·quant-ph·November 23, 2021

TL;DR

This paper demonstrates that for translation-invariant ergodic quantum spin systems, the asymptotic convertibility of states under thermal operations is fully characterized by the KL divergence rate, extending quantum Stein's lemma.

Contribution

It generalizes quantum Stein's lemma to non-i.i.d. ergodic states, establishing KL divergence rate as a thermodynamic potential for quantum many-body systems.

Findings

1. KL divergence rate characterizes state convertibility
2. Extension of quantum Stein's lemma beyond i.i.d. cases
3. Reversible conversion of states with quantum coherence

Equations

$$D(\hat{\rho},\hat{\rho}^{\prime})=\frac{1}{2}\lVert{\hat{\rho}-\hat{\rho}^{\prime}}\rVert_{1}+\frac{1}{2}\bigl\lvert\operatorname{tr}(\hat{\rho})-\operatorname{tr}(\hat{\rho}^{\prime})\bigr\rvert\ .$$

$${S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})=\operatorname{tr}\bigl[\hat{\rho}\ln\hat{\rho}-\hat{\rho}\ln\hat{\sigma}\bigr]\ .$$

$${S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})=-\ln\operatorname{tr}\bigl[\hat{P}_{\rho}\hat{\sigma}\bigr]\ ,$$

$${S}_{1/2}(\hat{\rho}\,\|\,\hat{\sigma})=-\ln\,\bigl\lVert{\hat{\rho}^{1/2}\hat{\sigma}^{1/2}}\bigr\rVert_{1}^{2}\ ,$$

$${S}_{\infty}(\hat{\rho}\,\|\,\hat{\sigma})=\ln\,\bigl\lVert{\hat{\sigma}^{-1/2}\,\hat{\rho}\,\hat{\sigma}^{-1/2}}\bigr\rVert_{\infty}=\ln\min_{\hat{\rho}\leqslant\lambda\hat{\sigma}}\lambda\ ,$$

$$-\ln\operatorname{tr}(\hat{\sigma})\leqslant{S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{1/2}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\infty}(\hat{\rho}\,\|\,\hat{\sigma})\ .$$

$${S}_{\alpha}(\hat{\rho}):=-{S}_{\alpha}(\hat{\rho}\,\|\,\hat{I})\ .$$

$${S}_{1}(\hat{\rho})$$

$$0\leqslant{S}_{\infty}(\hat{\rho})\leqslant{S}_{1}(\hat{\rho})\leqslant{S}_{0}(\hat{\rho})\leqslant\ln(D)\ .$$

$${S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{\alpha}(E(\hat{\rho})\,\|\,E(\hat{\sigma}))\ .$$

$${S}_{\alpha}(\hat{\rho})\leqslant{S}_{\alpha}(E(\hat{\rho}))\ .$$

$${S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma}^{\prime})\leqslant{S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})\ .$$

$${S}_{\alpha}(\hat{\rho}\,\|\,a\hat{\sigma})={S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(a)\ .$$

$${S}_{\alpha}(\hat{\rho}\otimes\hat{\rho}^{\prime}\,\|\,\hat{\sigma}\otimes\hat{\sigma}^{\prime})={S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})+{S}_{\alpha}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma}^{\prime})\ .$$

$$B^{\varepsilon}(\hat{\rho}):=\bigl\{\hat{\tau}\in{\mathcal{S}_{\leq}}(\mathscr{H}):D(\hat{\tau},\hat{\rho})\leqslant\varepsilon\bigr\}\ .$$

$${S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$$

$${S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$$

$${S}_{1/2}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$$

$${S}_{0}^{\varepsilon}(\hat{\rho})$$

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma}):=-\ln\Bigl(\eta^{-1}\min_{\substack{0\leqslant\hat{Q}\leqslant\hat{I}\\ \operatorname{tr}[\hat{\rho}\hat{Q}]\geqslant\eta}}\operatorname{tr}\bigl[\hat{\sigma}\hat{Q}\bigr]\Bigr)\ .$$

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{\mathrm{H}}^{\eta}(E(\hat{\rho})\,\|\,E(\hat{\sigma}))\ .$$

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,a\hat{\sigma})={S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(a)\ ,$$

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma}^{\prime})\leqslant{S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma})\ ,$$

$${S}_{\mathrm{H}}^{\eta+\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{\eta}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})+\ln\Bigl(\frac{\eta+\varepsilon}{\eta}\Bigr)\ .$$

$${S}_{\mathrm{H}}^{1-\varepsilon^{2}/6}(\hat{\rho}\,\|\,\hat{\sigma})-\ln\Bigl(\frac{1-\varepsilon^{2}/6}{\varepsilon^{2}/6}\Bigr)\leqslant{S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{1-\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(1-\varepsilon)\ ;$$

$${S}_{\mathrm{H}}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(2)\leqslant{S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{\varepsilon^{2}/2}(\hat{\rho}\,\|\,\hat{\sigma})\ .$$

$${S}_{1/2}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{0}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{1/2}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})-6\ln\Bigl(\frac{3}{\varepsilon}\Bigr)\ .$$

$${S}_{1}(\widehat{P}):=\lim_{n\to\infty}\frac{1}{n}{S}_{1}(\hat{\rho}_{n})\ ,$$

$${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma}):=\lim_{n\to\infty}\frac{1}{n}{S}_{1}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})\ .$$

$$\overline{S}(\widehat{P}\,\|\,\widehat{\Sigma}):=\lim_{\varepsilon\to+0}\limsup_{n\to\infty}\frac{1}{n}{S}_{\infty}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})\ ,$$

$$\underline{S}(\widehat{P}\,\|\,\widehat{\Sigma}):=\lim_{\varepsilon\to+0}\liminf_{n\to\infty}\frac{1}{n}{S}_{0}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})\ .$$

$$\overline{S}(\widehat{P}\,\|\,\widehat{\Sigma})$$

Full text

Asymptotic Reversibility of Thermal Operations for Interacting Quantum Spin Systems via Generalized Quantum Stein’s Lemma

Takahiro Sagawa

Department of Applied Physics, The University of Tokyo, Tokyo 113-8656, Japan

Philippe Faist

Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA

Institute for Theoretical Physics, ETH Zurich, 8093 Switzerland

Dahlem Center for Complex Quantum Systems, Freie Universität Berlin, 14195 Berlin, Germany

Kohtaro Kato

Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA

Keiji Matsumoto

National Institute of Informatics, Tokyo 101-8430, Japan

Hiroshi Nagaoka

The University of Electro-Communications, Tokyo, 182-8585, Japan

Fernando G. S. L. Brandão

Institute for Quantum Information and Matter, California Institute of Technology, Pasadena, CA 91125, USA

Abstract

For quantum spin systems in any spatial dimension with a local, translation-invariant Hamiltonian, we prove that asymptotic state convertibility from a quantum state to another one by a thermodynamically feasible class of quantum dynamics, called thermal operations, is completely characterized by the Kullback-Leibler (KL) divergence rate, if the state is translation-invariant and spatially ergodic. Our proof consists of two parts and is phrased in terms of a branch of the quantum information theory called the resource theory. First, we prove that any states, for which the min and max Rényi divergences collapse approximately to a single value, can be approximately reversibly converted into one another by thermal operations with the aid of a small source of quantum coherence. Second, we prove that these divergences collapse asymptotically to the KL divergence rate for any translation-invariant ergodic state. We show this via a generalization of the quantum Stein’s lemma for quantum hypothesis testing beyond independent and identically distributed (i.i.d.) situations. Our result implies that the KL divergence rate serves as a thermodynamic potential that provides a complete characterization of thermodynamic convertibility of ergodic states of quantum many-body systems in the thermodynamic limit, including out-of-equilibrium and fully quantum situations.

I Introduction
II Preliminaries
II.1 Entropy and divergence
II.2 Asymptotic spectral divergence rates
III Asymptotic state convertibility by thermal operations
III.1 Thermodynamic operations
III.2 State convertibility by thermal operations
III.3 Proof of Theorems 1 and 2
III.3.1 Discretizing the Hamiltonian
III.3.2 Manipulating coherence in the state
III.3.3 Collapse of the min and max divergences suppresses coherence
III.3.4 Proof of Theorem 1
III.3.5 Proof of Theorem 2
IV Collapse of the min and max divergences for ergodic states relative to local Gibbs states
IV.1 A sufficient condition for quantum Stein’s lemma
IV.2 Formulation of ergodic states and local Gibbs states
IV.3 Generalized Stein’s lemma for ergodic states relative to local Gibbs states
IV.4 Remarks on ergodicity, mixtures, and the KL divergence
IV.4.1 The mixing property
IV.4.2 Mixtures of ergodic states
IV.4.3 The role of the KL divergence for the thermodynamic potential
V Discussion
Appendix A General technical lemmas
Appendix B Properties of our thermodynamic framework and convertibility proof for Gibbs-preserving maps
Appendix C $C^{\ast}$-algebra formulation
Appendix D An alternative proof of Theorem 4
Appendix E The classical case

I Introduction

Reversibility and irreversibility of dynamics in classical and quantum physics, especially in thermodynamics, are characterized by the concept of entropy. It is a salient feature of macroscopic equilibrium thermodynamics that entropy is not only non-decreasing but also provides a complete characterization of convertibility between thermal equilibrium states Callen (1985), which is represented by the second law of thermodynamics. Lieb and Yngvason constructed an axiomatic formulation of this phenomenology and, within their mathematical framework, rigorously proved that entropy provides a necessary and sufficient condition for state conversion and, furthermore, that such an entropy function is essentially unique Lieb and Yngvason (1999).

The connection between microscopic information entropy and thermodynamic entropy has been extensively studied both in terms of statistical mechanics Sagawa (2012); Parrondo et al. (2015) and the thermodynamic resource theory Goold et al. (2016); Chitambar and Gour (2019); Sagawa (2021). In the latter formalism which we adopt in this article, so-called one-shot entropy measures have provided tools to quantify resource costs of physical operations in quantum information settings including quantum thermodynamics Goold et al. (2016); Chitambar and Gour (2019); Brandão et al. (2013); Åberg (2013); Horodecki and Oppenheim (2013); Brandão et al. (2015); Gour et al. (2018); Faist et al. (2015a); Faist and Renner (2018); Weilenmann et al. (2016); Weilenmann (2017); Weilenmann et al. (2018); Sagawa (2021).

Our understanding of the macroscopic behavior of the entropy has been sharpened by fundamental theorems proving asymptotic equipartition properties (AEP). Roughly speaking, an AEP states that in the long sequence limit of a stochastic process, some relevant quantities concentrate around definite values. For instance, the Shannon-McMillan theorem states that an ergodic process satisfies an AEP with the Shannon entropy rate Shannon (1948); Cover and Thomas (2006). This has been generalized to a stronger form known as the Shannon-McMillan-Breiman theorem as well as to a relative version for an ergodic process with respect to a Markov process Algoet and Cover (1988). A quantum version of the Shannon-McMillan theorem proves a similar AEP for quantum ergodic processes with the von Neumann entropy rate Bjelaković et al. (2004); Bjelaković and Szkola (2005); Ogata (2013).

Closely related to AEP theorems is Stein’s lemma, which relates the asymptotic error rate of hypothesis testing for distinguishing two quantum states to the KL divergence rate. Classically, Stein’s lemma is a straightforward consequence of the relative AEP. However, its quantum counterpart is more involved Hiai and Petz (1991); Nagaoka and Ogawa (2000); Bjelakovic and Siegmund-Schultze (2004); Brandão and Plenio (2010); Bjelakovic and Siegmund-Schultze (2003). Hiai and Petz Hiai and Petz (1991) first addressed the quantum Stein’s lemma and provided a partial proof for a completely ergodic quantum state with respect to an i.i.d. state. The proof of the quantum Stein’s lemma was completed for the case where both states are i.i.d. by Ogawa and Nagaoka Nagaoka and Ogawa (2000), by proving the strong converse of the Hiai-Petz theorem for that case. A more general form of the quantum Stein’s lemma for an ergodic state with respect to an i.i.d. state was proved in Ref. Bjelakovic and Siegmund-Schultze (2004), which is regarded as a quantum analog of the relative AEP.

In this work, we go beyond the non-interacting or i.i.d. regime, and investigate an entropy function that provides a thermodynamic characterization of physically relevant, interacting many-body quantum systems. We consider quantum spin systems on the lattice $\mathbb{Z}^{d}$ with an arbitrary number $d$ of spatial dimensions. Under certain general conditions, we rigorously prove that the necessary and sufficient condition for asymptotic state conversion from one ergodic state to another state by thermodynamically feasible quantum dynamics, called thermal operations Brandão et al. (2013), is characterized by the Kullback-Leibler (KL) divergence rate of the state relative to the Gibbs state. The KL divergence rate is shown to determine the work cost for state transformations, and thus plays the role of the proper thermodynamic potential. Our central assumptions are that (i) the quantum state is translation-invariant and spatially ergodic and (ii) the Hamiltonian is translation-invariant and local. Physically, assumption (i) implies that a quantum state does not exhibit any macroscopic fluctuations if one looks at translation-invariant observables Cover and Thomas (2006); Bratteli and Robinson (1987, 1981); Israel (2015); Ruelle (1999), and assumption (ii) guarantees a well-defined thermodynamic limit of the Gibbs state. Importantly, a spatially ergodic state (in contrast to a temporally ergodic state) is not necessarily a thermal equilibrium state, and thus our result is applicable to out-of-equilibrium situations.

To achieve an operationally robust notion of a thermodynamic potential, we resort to the resource theory of thermal operations. The resource theory of thermal operations is an established model for thermodynamics in the quantum regime Brandão et al. (2013, 2015); Goold et al. (2016); Binder et al. (2018). This approach allows us to study the thermodynamic behavior of arbitrary quantum states in a way that inherently accounts for the fluctuations in the work requirement of state transformations. This model for thermodynamics is tightly related to measures of information introduced in quantum information theory based on the quantum Rényi divergences Rényi (1960). Two quantities in particular, the Rényi- $0$ divergence (or min-divergence) and Rényi- $\infty$ divergence (or max-divergence), play a special role in determining the work requirement of state transformations Lieb and Yngvason (2013); Weilenmann et al. (2016). For instance, the work that can be extracted from any state that is block-diagonal in the energy eigenspaces is given by the Rényi-0 divergence. For our main result, we consider the asymptotic version of these quantities for large system sizes, which corresponds to the thermodynamic limit. The asymptotic min and max Rényi divergences are also called the lower and upper spectral divergence rates, respectively, in the theory of information spectrum, and we will use both terms interchangeably in this paper Han (2003, 2000); Nagaoka and Hayashi (2007); Datta and Renner (2009); Datta (2009); Bowen and Datta (2006a, b); Schoenmakers et al. (2007).

Main result

Our main result is that ergodic states can be reversibly interconverted into one another in the resource theory of thermal operations in the thermodynamic limit. Roughly speaking, if the Hamiltonian is local and translation-invariant, then there exists a thermodynamic potential $F(\rho)$ that is defined for all translation-invariant and ergodic states $\rho$ on a lattice of $d$ spatial dimensions with the following property: For any two translation-invariant and ergodic states $\rho,\rho^{\prime}$ , there exists a (generalized) thermal operation that can carry out the transformation $\rho\to\rho^{\prime}$ by investing work at a rate of $F(\rho^{\prime})-F(\rho)$ per subsystem and that uses a negligible amount of coherence per subsystem. Furthermore, $F(\rho)$ is given by the KL divergence rate between $\rho$ and the Gibbs state $\sigma$ of the Hamiltonian, divided by the inverse temperature of the heat bath.

Our main result is proved in the following two steps. They are discussed in Section III and Section IV, where the main theorems are Theorem 2 and Theorem 3, respectively. Both of them can be of independent interest.

First, we prove that any state for which the min and max Rényi divergences coincide approximately Renner (2005); Tomamichel (2016) can approximately be converted reversibly to and from the Gibbs state by thermal operations, using a small source of quantum coherence Lostaglio et al. (2015a). In this case, the resource theory becomes reversible, i.e., the work required for a state transformation is equal to the negative of the work required for the reverse transformation. In consequence, if these divergences collapse to a single value in the asymptotic limit, then this value defines a thermodynamic potential that completely characterizes the possible state transformations in the fully quantum regime. This is a result that applies broadly to the resource theory of thermal operations in general settings, even for states that are non-classical, i.e., that are not block-diagonal in the energy basis. This intermediate result does not rely on the assumptions (i) and (ii).

Second, we prove that the min and max Rényi divergences indeed collapse to the KL divergence rate under the assumptions (i) and (ii). To this end, we prove a generalization of the quantum Stein’s lemma to the setting with (i) and (ii). The main idea of our proof, inspired by Refs. Bjelakovic and Siegmund-Schultze (2004, 2003), is to construct typical projectors that are adapted to the assumptions (i) and (ii). Our formulation uses semidefinite programming to simplify some parts of the proof.

Structure of the paper

In Section II, we introduce preliminary definitions and notation, including the relevant divergences and entropy measures. In Section III, we introduce our thermodynamic framework of thermal operations, giving a rigorous meaning to the work cost of a transformation from one state to another, and prove our first main theorem on asymptotic thermal operations (Theorem 2). In Section IV, we rigorously formulate ergodicity, and prove our second main theorem on the generalized quantum Stein’s lemma (Theorem 3). We conclude with remarks and an outlook in Section V. In the appendices, we remark on some technical lemmas, Gibbs-preserving maps, a more rigorous approach to ergodicity formulated using $C^{\ast}$ -algebras, an alternative proof of our second main theorem for the one dimensional case, and purely classical implications of our results.

II Preliminaries

Consider a Hilbert space $\mathscr{H}$ of finite dimension $D$ , and let ${\mathcal{S}}(\mathscr{H})$ be the set of density operators (quantum states) on $\mathscr{H}$ , satisfying $\hat{\rho}\geqslant 0$ and $\operatorname{tr}[\hat{\rho}]=1$ for $\hat{\rho}\in{\mathcal{S}}(\mathscr{H})$ . We also define the set of subnormalized states, which we denote by ${\mathcal{S}_{\leq}}(\mathscr{H})$ , and which is the set of all operators $\hat{\rho}\geqslant 0$ that satisfy $\operatorname{tr}[\hat{\rho}]\leqslant 1$ . For two Hilbert spaces $\mathscr{H}_{A}$ and $\mathscr{H}_{B}$ representing systems $A$ and $B$ , we write $A\simeq B$ when the Hilbert spaces are isomorphic; by convention, the identity mapping $A\to B$ maps the canonical basis of $A$ onto the canonical basis of $B$ .

The set of quantum states carries a natural metric given by the trace distance Nielsen and Chuang (2000), defined as $D(\hat{\rho},\hat{\rho}^{\prime})=(1/2)\lVert{\hat{\rho}-\hat{\rho}^{\prime}}\rVert_{1}$ for any $\hat{\rho},\hat{\rho}^{\prime}\in{\mathcal{S}}(\mathscr{H})$ , where $\lVert{\cdot}\rVert_{1}$ is the Schatten 1-norm. This metric can be extended to subnormalized states $\hat{\rho},\hat{\rho}^{\prime}\in{\mathcal{S}_{\leq}}(\mathscr{H})$ as the generalized trace distance Tomamichel et al. (2010); Tomamichel (2012), defined as

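$$D(\hat{\rho},\hat{\rho}^{\prime})=\frac{1}{2}\lVert{\hat{\rho}-\hat{\rho}^{\prime}}\rVert_{1}+\frac{1}{2}\bigl\lvert\operatorname{tr}(\hat{\rho})-\operatorname{tr}(\hat{\rho}^{\prime})\bigr\rvert\ . \tag{1}$$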

We also define the fidelity Nielsen and Chuang (2000) as $F(\hat{X},\hat{Y})=\lVert{\hat{X}^{1/2}\hat{Y}^{1/2}}\rVert_{1}$ for any $\hat{X},\hat{Y}\geqslant 0$ .
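As a minimal numerical sketch of the two quantities just defined, the following NumPy code evaluates the generalized trace distance and the fidelity for small matrices. The function names (`trace_norm`, `psd_sqrt`, `generalized_trace_distance`, `fidelity`) and the example states are illustrative choices, not notation from the paper.

```python
import numpy as np

def trace_norm(x):
    """Schatten 1-norm: the sum of the singular values of x."""
    return float(np.linalg.svd(x, compute_uv=False).sum())

def psd_sqrt(x):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(x)
    w = np.clip(w, 0.0, None)          # remove tiny negative round-off
    return (v * np.sqrt(w)) @ v.conj().T

def generalized_trace_distance(rho, rho_p):
    """(1/2)||rho - rho'||_1 + (1/2)|tr(rho) - tr(rho')| for subnormalized states."""
    trace_gap = abs(np.trace(rho).real - np.trace(rho_p).real)
    return 0.5 * trace_norm(rho - rho_p) + 0.5 * trace_gap

def fidelity(x, y):
    """F(X, Y) = ||X^{1/2} Y^{1/2}||_1 for X, Y >= 0."""
    return trace_norm(psd_sqrt(x) @ psd_sqrt(y))

# Illustrative inputs: a maximally mixed qubit, a subnormalized state, and |+><+|.
rho = np.diag([0.5, 0.5])
rho_sub = np.diag([0.5, 0.0])
plus = np.full((2, 2), 0.5)

d = generalized_trace_distance(rho, rho_sub)
f = fidelity(rho, plus)
```

For two normalized states the trace-difference term vanishes and the function reduces to the ordinary trace distance.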

II.1 Entropy and divergence

Thermodynamic properties of microscopic quantum systems can be described using entropy measures that generalize the usual Shannon or von Neumann entropy to the so-called “one-shot” regime Renner (2005); Tomamichel (2012, 2016). More specifically, in the presence of thermodynamic reservoirs, we need to consider a family of relative entropies, or divergences. For $\hat{\rho}\in{\mathcal{S}_{\leq}}(\mathscr{H})$ and $\hat{\sigma}\geqslant 0$ , the KL divergence (Rényi- $1$ divergence) is defined as:

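$${S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})=\operatorname{tr}\bigl[\hat{\rho}\ln\hat{\rho}-\hat{\rho}\ln\hat{\sigma}\bigr]\ . \tag{2}$$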

Throughout this paper, we assume that the first argument of the divergences considered (here $\hat{\rho}$ ) lies within the support of the second argument (here $\hat{\sigma}$ ). This assumption is physically justified when $\hat{\sigma}$ is a Gibbs state, which necessarily has full rank. The min divergence (Rényi- $0$ divergence), or the min relative entropy, is defined as

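$${S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})=-\ln\operatorname{tr}\bigl[\hat{P}_{\rho}\hat{\sigma}\bigr]\ , \tag{3}$$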

where $\hat{P}_{\rho}$ is the projection onto the support of $\hat{\rho}$ . We also define an alternative measure of the min divergence (Rényi- $1/2$ divergence) as

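$${S}_{1/2}(\hat{\rho}\,\|\,\hat{\sigma})=-\ln\,\bigl\lVert{\hat{\rho}^{1/2}\hat{\sigma}^{1/2}}\bigr\rVert_{1}^{2}\ . \tag{4}$$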

Finally, the max divergence (Rényi- $\infty$ divergence), or the max relative entropy, is defined as

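$${S}_{\infty}(\hat{\rho}\,\|\,\hat{\sigma})=\ln\,\bigl\lVert{\hat{\sigma}^{-1/2}\,\hat{\rho}\,\hat{\sigma}^{-1/2}}\bigr\rVert_{\infty}=\ln\min_{\hat{\rho}\leqslant\lambda\hat{\sigma}}\lambda\ , \tag{5}$$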

where $\lVert{\cdot}\rVert_{\infty}$ is the operator norm.

These quantities are special cases of the Rényi- $\alpha$ divergences. Here, we avoid technicalities and issues in the general definitions of the quantum Rényi divergences caused by the noncommutativity of the arguments Hiai et al. (2011); Wilde et al. (2014); Tomamichel (2016), by focusing on the quantities above which are sufficient for our purposes. These divergences satisfy

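$$-\ln\operatorname{tr}(\hat{\sigma})\leqslant{S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{1/2}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\infty}(\hat{\rho}\,\|\,\hat{\sigma})\ . \tag{6}$$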
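The chain $-\ln\operatorname{tr}(\hat{\sigma})\leqslant{S}_{0}\leqslant{S}_{1/2}\leqslant{S}_{1}\leqslant{S}_{\infty}$ can be checked numerically on a small example. The sketch below assumes that $\hat{\sigma}$ has full rank and that $\hat{\rho}$ is Hermitian and positive semidefinite; the helper `spectral_apply` and the function names are hypothetical, chosen only for this illustration.

```python
import numpy as np

def spectral_apply(x, f, tol=1e-12):
    """Apply f to the eigenvalues of Hermitian x above tol; map the rest to 0."""
    w, v = np.linalg.eigh(x)
    fw = np.array([f(val) if val > tol else 0.0 for val in w])
    return (v * fw) @ v.conj().T

def s0(rho, sigma):
    """Min divergence: -ln tr[P_rho sigma], with P_rho the support projector."""
    p_rho = spectral_apply(rho, lambda w: 1.0)
    return float(-np.log(np.trace(p_rho @ sigma).real))

def s_half(rho, sigma):
    """Renyi-1/2 divergence: -ln ||rho^{1/2} sigma^{1/2}||_1^2."""
    a = spectral_apply(rho, np.sqrt) @ spectral_apply(sigma, np.sqrt)
    return float(-2.0 * np.log(np.linalg.svd(a, compute_uv=False).sum()))

def s1(rho, sigma):
    """KL divergence: tr[rho ln rho - rho ln sigma] (logs taken on the support)."""
    diff = spectral_apply(rho, np.log) - spectral_apply(sigma, np.log)
    return float(np.trace(rho @ diff).real)

def s_inf(rho, sigma):
    """Max divergence: ln of the largest eigenvalue of sigma^{-1/2} rho sigma^{-1/2}."""
    isq = spectral_apply(sigma, lambda w: w ** -0.5)
    return float(np.log(np.linalg.eigvalsh(isq @ rho @ isq).max()))

# Illustrative inputs: the pure state |+><+| against a full-rank diagonal sigma.
rho = np.full((2, 2), 0.5)
sigma = np.diag([0.6, 0.4])
values = [s0(rho, sigma), s_half(rho, sigma), s1(rho, sigma), s_inf(rho, sigma)]
```

For a pure first argument the two min-type divergences coincide, while the KL and max divergences are strictly larger in this example.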

From these divergences we can define corresponding entropy measures as the divergence with respect to the identity operator $\hat{I}$ : For $\alpha=0,1/2,1,\infty$ we define

$${S}_{\alpha}(\hat{\rho}):=-{S}_{\alpha}(\hat{\rho}\,\|\,\hat{I})\ . \tag{7}$$

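We note the following explicit forms of the von Neumann entropy (Rényi- $1$ entropy) ${S}_{1}(\hat{\rho})$ , the max entropy (Rényi- $0$ entropy) ${S}_{0}({\hat{\rho}})$ , and the min entropy (Rényi- $\infty$ entropy) ${S}_{\infty}({\hat{\rho}})$ ,

$${S}_{1}(\hat{\rho})=-\operatorname{tr}\bigl[\hat{\rho}\ln\hat{\rho}\bigr]\ ,\qquad{S}_{0}(\hat{\rho})=\ln\operatorname{rank}(\hat{\rho})\ ,\qquad{S}_{\infty}(\hat{\rho})=-\ln\lVert{\hat{\rho}}\rVert_{\infty}\ . \tag{8}$$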

The entropies are ordered as

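$$0\leqslant{S}_{\infty}(\hat{\rho})\leqslant{S}_{1}(\hat{\rho})\leqslant{S}_{0}(\hat{\rho})\leqslant\ln(D)\ . \tag{9}$$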
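Setting $\hat{\sigma}=\hat{I}$ in the divergences and flipping the sign gives entropies that depend only on the spectrum of $\hat{\rho}$: $-\sum_i\lambda_i\ln\lambda_i$ for ${S}_{1}$, $\ln\operatorname{rank}(\hat{\rho})$ for ${S}_{0}$, and $-\ln\lambda_{\max}$ for ${S}_{\infty}$. The short sketch below evaluates them for a normalized state and lets one check the ordering of the entropies; the function name `entropies` and the example state are illustrative assumptions.

```python
import numpy as np

def entropies(rho, tol=1e-12):
    """Return (S_inf, S_1, S_0) of a density matrix from its eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > tol]                         # keep the support only
    s_inf = float(-np.log(w.max()))        # min entropy: -ln(largest eigenvalue)
    s_1 = float(-(w * np.log(w)).sum())    # von Neumann entropy
    s_0 = float(np.log(len(w)))            # max entropy: ln(rank)
    return s_inf, s_1, s_0

# Illustrative state on a D = 4 dimensional space with rank 3.
rho = np.diag([0.5, 0.25, 0.25, 0.0])
s_inf, s_1, s_0 = entropies(rho)
dim = rho.shape[0]
```

The three values coincide exactly when the state is flat on its support, which is the single-copy analogue of the collapse discussed in the paper.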

These divergences satisfy the data processing inequality, i.e., they are monotonic (non-increasing) under the action of a completely-positive (CP) and trace-preserving (TP) map $E$ :

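$${S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{\alpha}(E(\hat{\rho})\,\|\,E(\hat{\sigma}))\ . \tag{10}$$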

For $\alpha=0,\infty$ , see for example Lemma 7 of Ref. Datta (2009). The case of $\alpha=1$ is equivalent to the strong subadditivity of the von Neumann entropy Nielsen and Chuang (2000); Lieb and Ruskai (1973). Consequently, the entropies do not decrease under the action of a CPTP map $E$ that is unital, i.e., $E(\hat{I})=\hat{I}$ ,

$${S}_{\alpha}(\hat{\rho})\leqslant{S}_{\alpha}(E(\hat{\rho}))\ . \tag{11}$$

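A useful property of these divergences is a monotonicity property for the semidefinite ordering of the second argument: If $\hat{\sigma}\leqslant\hat{\sigma}^{\prime}$ , then for each $\alpha=0,1/2,1,\infty$ ,

$${S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma}^{\prime})\leqslant{S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})\ . \tag{12}$$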

The divergences obey a scaling property in the second argument. For $\alpha=0,1/2,1,\infty$ , we have for any $a>0$ ,

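$${S}_{\alpha}(\hat{\rho}\,\|\,a\hat{\sigma})={S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(a)\ . \tag{13}$$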

Under tensor product states, the divergences become additive. For $\alpha=0,1/2,1,\infty$ , we have for any $\hat{\rho}\in{\mathcal{S}_{\leq}}(\mathscr{H}),\hat{\rho}^{\prime}\in{\mathcal{S}_{\leq}}(\mathscr{H}^{\prime})$ , $\hat{\sigma}\geqslant 0,\hat{\sigma}^{\prime}\geqslant 0$ ,

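$${S}_{\alpha}(\hat{\rho}\otimes\hat{\rho}^{\prime}\,\|\,\hat{\sigma}\otimes\hat{\sigma}^{\prime})={S}_{\alpha}(\hat{\rho}\,\|\,\hat{\sigma})+{S}_{\alpha}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma}^{\prime})\ . \tag{14}$$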

To ensure that the operational quantities represented by these entropies and divergences do not significantly depend on events that only appear with vanishingly small probability, we “smooth” these entropies and divergences over a ball of states that are close to the original state Renner (2005); Datta (2009). First, we define the $\varepsilon$ -ball of states around a subnormalized state $\hat{\rho}\in{\mathcal{S}_{\leq}}(\mathscr{H})$ as

$$B^{\varepsilon}(\hat{\rho}):=\bigl\{\hat{\tau}\in{\mathcal{S}_{\leq}}(\mathscr{H}):D(\hat{\tau},\hat{\rho})\leqslant\varepsilon\bigr\}\ . \tag{15}$$

Definition 1 (Smooth divergences Datta (2009)).

The smooth divergences are defined as follows,

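$$\begin{aligned}{S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})&:=\min_{\hat{\tau}\in B^{\varepsilon}(\hat{\rho})}{S}_{\infty}(\hat{\tau}\,\|\,\hat{\sigma})\ ,\\ {S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})&:=\max_{\hat{\tau}\in B^{\varepsilon}(\hat{\rho})}{S}_{0}(\hat{\tau}\,\|\,\hat{\sigma})\ ,\\ {S}_{1/2}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})&:=\max_{\hat{\tau}\in B^{\varepsilon}(\hat{\rho})}{S}_{1/2}(\hat{\tau}\,\|\,\hat{\sigma})\ .\end{aligned} \tag{16}$$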

The smooth entropies are defined correspondingly as

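$${S}_{0}^{\varepsilon}(\hat{\rho}):=-{S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{I})\ ,\qquad{S}_{\infty}^{\varepsilon}(\hat{\rho}):=-{S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{I})\ . \tag{17}$$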

We introduce a further convenient divergence (relative entropy) that is based on hypothesis testing Tomamichel and Hayashi (2013); Dupuis et al. (2013); Faist and Renner (2018). This divergence allows one to interpolate between the min- and max-divergences in a different fashion than the Rényi entropies, and it comes with a simple formulation and a collection of useful properties. For a subnormalized state $\hat{\rho}$ and $\hat{\sigma}\geqslant 0$ , we define for any $0<\eta\leqslant\operatorname{tr}(\hat{\rho})$ ,

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma}):=-\ln\Bigl(\eta^{-1}\min_{\substack{0\leqslant\hat{Q}\leqslant\hat{I}\\ \operatorname{tr}[\hat{\rho}\hat{Q}]\geqslant\eta}}\operatorname{tr}\bigl[\hat{\sigma}\hat{Q}\bigr]\Bigr)\ . \tag{18}$$

The hypothesis testing divergence owes its name to the fact that if $\hat{\rho},\hat{\sigma}$ are two quantum states, $\eta\exp(-{S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma}))$ represents the probability of mistakenly reporting $\hat{\rho}$ in a hypothesis test between the two states, if we carry out a strategy that mistakenly reports $\hat{\sigma}$ with probability at most $1-\eta$ .
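In the special case where $\hat{\rho}$ and $\hat{\sigma}$ commute, the minimization over $\hat{Q}$ reduces to a linear program over the common eigenbasis, which a Neyman-Pearson style greedy rule solves exactly: accept outcomes in increasing order of the ratio $\sigma_i/\rho_i$ until the accepted $\hat{\rho}$-weight reaches $\eta$. The sketch below covers only this classical case, assuming strictly positive entries of `q` on the support of `p`; the general quantum case requires a semidefinite programming solver. The function name is hypothetical.

```python
import numpy as np

def hypothesis_testing_divergence(p, q, eta):
    """S_H^eta(p || q) for commuting arguments given as eigenvalue vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if not 0.0 < eta <= p.sum():
        raise ValueError("eta must satisfy 0 < eta <= tr(rho)")
    support = p > 0
    ps, qs = p[support], q[support]
    order = np.argsort(qs / ps)      # cheapest sigma-weight per unit of rho-weight first
    need, cost = eta, 0.0
    for i in order:
        take = min(1.0, need / ps[i])  # fraction of outcome i accepted by the test Q
        cost += take * qs[i]
        need -= take * ps[i]
        if need <= 0.0:
            break
    return float(-np.log(cost / eta))

# Illustrative distributions.
p = [0.5, 0.5]
q = [0.9, 0.1]
small_eta = hypothesis_testing_divergence(p, q, 0.01)
large_eta = hypothesis_testing_divergence(p, q, 1.0)
```

In this example the value at small $\eta$ equals the max divergence $\ln\max_i(p_i/q_i)$, while the value at $\eta=1$ equals the min divergence $-\ln\sum_{i:p_i>0}q_i$.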

The hypothesis testing divergence satisfies the data processing inequality Wang and Renner (2012): For any subnormalized state $\hat{\rho}$ , for any $\hat{\sigma}\geqslant 0$ , for any CP and trace-nonincreasing map $E$ , and for any $0<\eta\leqslant\operatorname{tr}(E(\hat{\rho}))$ , the hypothesis testing divergence is monotonic,

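$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{\mathrm{H}}^{\eta}(E(\hat{\rho})\,\|\,E(\hat{\sigma}))\ . \tag{19}$$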

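The hypothesis testing divergence also obeys a scaling property in the second argument: For any subnormalized state $\hat{\rho}$ , for any $\hat{\sigma}\geqslant 0$ , for any $a>0$ , and for any $0<\eta\leqslant\operatorname{tr}(\hat{\rho})$ ,

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,a\hat{\sigma})={S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(a)\ , \tag{20}$$

as can be directly seen from (18). Also, for any $\hat{\sigma},\hat{\sigma}^{\prime}\geqslant 0$ for which $\hat{\sigma}\leqslant\hat{\sigma}^{\prime}$ , the hypothesis testing divergence satisfies

$${S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma}^{\prime})\leqslant{S}_{\mathrm{H}}^{\eta}(\hat{\rho}\,\|\,\hat{\sigma})\ , \tag{21}$$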

for any subnormalized state $\hat{\rho}$ and for any $0<\eta\leqslant\operatorname{tr}(\hat{\rho})$ . Furthermore, if $D(\hat{\rho}^{\prime},\hat{\rho})\leqslant\varepsilon$ , then $\hat{\rho}^{\prime}\geqslant\hat{\rho}-\hat{\Delta}$ for some $\hat{\Delta}\geqslant 0$ with $\operatorname{tr}(\hat{\Delta})\leqslant\varepsilon$ and hence for any $0<\eta\leqslant\eta+\varepsilon\leqslant\operatorname{tr}(\hat{\rho})$ ,

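$${S}_{\mathrm{H}}^{\eta+\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{\eta}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})+\ln\Bigl(\frac{\eta+\varepsilon}{\eta}\Bigr)\ . \tag{22}$$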

A useful property of the hypothesis testing divergence is that it interpolates between the max and min divergences, which are approximately recovered in the regimes $\eta\simeq 0$ and $\eta\simeq 1$ , respectively Dupuis et al. (2013):

Proposition 1.

Let $\hat{\rho}$ be a (normalized) quantum state and let $\hat{\sigma}\geqslant 0$ . For any $0<\varepsilon<1/2$ ,

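$$\begin{aligned}{S}_{\mathrm{H}}^{1-\varepsilon^{2}/6}(\hat{\rho}\,\|\,\hat{\sigma})-\ln\Bigl(\frac{1-\varepsilon^{2}/6}{\varepsilon^{2}/6}\Bigr)&\leqslant{S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{1-\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(1-\varepsilon)\ ;\\ {S}_{\mathrm{H}}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})-\ln(2)&\leqslant{S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{\varepsilon^{2}/2}(\hat{\rho}\,\|\,\hat{\sigma})\ .\end{aligned} \tag{23}$$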

Proof.

The proof of (Faist and Renner, 2018, Lemma 40) carries through even for the slightly different smoothing of ${S}_{0}$ and ${S}_{\infty}$ , except for the upper bound on ${S}_{\infty}$ . There, we may apply (Dupuis et al., 2013, Proposition 4.1) directly. ∎

Finally, we note a pair of inequalities which establishes the approximate equivalence of the two kinds of min-divergences Tomamichel et al. (2011); Dupuis et al. (2013); Tomamichel (2016).

Proposition 2.

Let $\hat{\rho}$ be a normalized state and let $\hat{\sigma}\geqslant 0$ . For any $\varepsilon>0$ ,

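$${S}_{1/2}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{0}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{1/2}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})-6\ln\Bigl(\frac{3}{\varepsilon}\Bigr)\ . \tag{24}$$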

Proof.

The first inequality follows because of (6). For the second inequality, let $\hat{\rho}^{\prime}\in B^{\varepsilon}(\hat{\rho})$ such that ${S}_{1/2}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})={S}_{1/2}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})$ . Then from (Dupuis et al., 2013, Proposition 4.2), we have ${S}_{1/2}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})\leqslant{S}_{\mathrm{H}}^{1-\varepsilon^{\prime}}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})-\ln(\varepsilon^{\prime 2})$ for any $\varepsilon^{\prime}>0$ ; choosing $\varepsilon^{\prime}=\varepsilon^{2}/6$ and using Proposition 1, we find ${S}_{1/2}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})\leqslant{S}_{0}^{\varepsilon}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})+\ln[(1-\varepsilon^{\prime})/\varepsilon^{\prime}]-\ln(\varepsilon^{\prime 2})$ . The claim follows by noting that ${S}_{0}^{\varepsilon}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})\leqslant{S}_{0}^{2\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ along with $(1-\varepsilon^{\prime})/(\varepsilon^{\prime 3})\leqslant(6\varepsilon^{-2})^{3}\leqslant(3\varepsilon^{-1})^{6}$ . ∎
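The constant in the second inequality can be traced step by step. The following worked chain only spells out the arithmetic already used in the proof, with $\varepsilon^{\prime}=\varepsilon^{2}/6$ and assuming $\varepsilon^{\prime}<1$ so that the logarithms are defined:

```latex
\begin{aligned}
\ln\frac{1-\varepsilon'}{\varepsilon'} - \ln\bigl(\varepsilon'^{2}\bigr)
  &= \ln\frac{1-\varepsilon'}{\varepsilon'^{3}}
   \leqslant \ln\frac{1}{\varepsilon'^{3}}
   = \ln\Bigl(\frac{6}{\varepsilon^{2}}\Bigr)^{3}
   = \ln\frac{216}{\varepsilon^{6}} \\
  &\leqslant \ln\frac{729}{\varepsilon^{6}}
   = \ln\Bigl(\frac{3}{\varepsilon}\Bigr)^{6}
   = 6\ln\Bigl(\frac{3}{\varepsilon}\Bigr),
\end{aligned}
\qquad\text{hence}\qquad
S_{1/2}^{\varepsilon}(\hat\rho\,\|\,\hat\sigma)
  = S_{1/2}(\hat\rho'\,\|\,\hat\sigma)
  \leqslant S_{0}^{2\varepsilon}(\hat\rho\,\|\,\hat\sigma) + 6\ln\Bigl(\frac{3}{\varepsilon}\Bigr).
```

The step from $216$ to $729=3^{6}$ is a deliberate loosening that yields the compact form $6\ln(3/\varepsilon)$.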

II.2 Asymptotic spectral divergence rates

In statistical mechanics one is often interested in the thermodynamic limit, where the behavior of the system as it becomes arbitrarily large often no longer depends on microscopic details. The action of taking the thermodynamic limit is formalized by considering a sequence of states $\widehat{P}:=\{\hat{\rho}_{n}\}_{n\in\mathbb{N}}$ , where $\hat{\rho}_{n}$ is a quantum state on $\mathscr{H}^{\otimes n}$ .

The von Neumann entropy rate is defined as

[TABLE]

and the KL divergence rate with respect to the sequence of positive operators $\widehat{\Sigma}:=\{\hat{\sigma}_{n}\}_{n\in\mathbb{N}}$ is defined as

[TABLE]

We note that these limits do not necessarily exist in general.

We now introduce the spectral divergence rates, which are natural extensions of the min and max divergences to the thermodynamic limit.

Definition 2 (Spectral divergence rates).

Let $\widehat{P}=\{\hat{\rho}_{n}\}$ be a sequence of states and let $\widehat{\Sigma}=\{\hat{\sigma}_{n}\}$ be a sequence of positive operators. We define the upper spectral divergence rate,

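$${\overline{S}}(\widehat{P}\,\|\,\widehat{\Sigma}):=\lim_{\varepsilon\to+0}\limsup_{n\to\infty}\frac{1}{n}{S}_{\infty}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n}),$$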

and the lower spectral divergence rate,

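$${\underline{S}}(\widehat{P}\,\|\,\widehat{\Sigma}):=\lim_{\varepsilon\to+0}\liminf_{n\to\infty}\frac{1}{n}{S}_{0}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n}).$$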

These quantities have been introduced in Ref. Nagaoka and Hayashi (2007) in an equivalent but different expression:

[TABLE]

where $\operatorname{Proj}\bigl{\{}\hat{X}\geqslant 0\bigr{\}}$ represents the projector onto the eigenspaces of $\hat{X}$ corresponding to nonnegative eigenvalues. The equivalence of these two definitions has been proved in Theorems 2 and 3 of Ref. Datta (2009). We note that

[TABLE]

As a special case, we introduce the lower and the upper spectral entropy rates, which are respectively given by

[TABLE]

where $\widehat{\mathrm{ID}}:=\{\hat{I}^{\otimes n}\}_{n\in\mathbb{N}}$ is the sequence consisting of identity operators on $\mathscr{H}^{\otimes n}$ .

We can also define the hypothesis testing divergence rate

[TABLE]

noting that the limit does not necessarily exist. From Proposition 1, in general, ${S}_{\mathrm{H}}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ and ${S}_{\mathrm{H}}^{1-\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ respectively give the same upper and lower spectral divergence rates as those given by ${S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ and ${S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ :

[TABLE]

III Asymptotic state convertibility by thermal operations

In this section, we formulate thermal operations and prove our first main theorem on asymptotic state convertibility (Theorem 2). Importantly, in the microscopic regime, state transformations are not reversible in general, not even approximately. For general states $\hat{\rho},\hat{\rho}^{\prime}$ , it might happen that $\hat{\rho}$ can be approximately converted to $\hat{\rho}^{\prime}$ with work extraction $w$ , but that an approximate transformation from $\hat{\rho}^{\prime}$ to $\hat{\rho}$ requires much more work than $w$ Horodecki and Oppenheim (2013).

Then we can ask the question, under which conditions is reversibility restored? This is an important question, because reversibility implies that the optimal work cost derives from a potential, which in turn means that macroscopic thermodynamic behavior is restored. Here, we consider in fact a marginally stronger property. Under which conditions is a state reversibly convertible to the thermal state? Clearly, any two states that have this property can reversibly be converted into one another. This slightly stronger statement ensures that the thermodynamic potential is well defined for the thermal state itself, a desirable feature that allows the thermal state to take on the role of a “reference state.”

III.1 Thermodynamic operations

We now introduce our thermodynamic framework. The simple model we introduce captures the relevant features of thermodynamics at the microscopic scale, while providing a simple, abstract, and general formalism for analyzing the resource cost of transforming one quantum state into another Goold et al. (2016).

The goal is the following. Given a system $S$ , and two states $\hat{\rho}_{S},\hat{\rho}^{\prime}_{S}$ , we would like to quantify the resources required in order to convert $\hat{\rho}_{S}$ to $\hat{\rho}^{\prime}_{S}$ in some reasonable thermodynamic model. The resource theory of thermal operations is an established model that is particularly useful in such a context. It specifies the set of transformations that can be carried out for free, without the involvement of external resources such as thermodynamic work. In the model of thermal operations, one is allowed to carry out for free any unitary on the system and a heat bath at fixed background temperature, as long as the unitary commutes with the overall noninteracting Hamiltonian of the system and the bath. Here we introduce a slightly generalized notion of thermal operations, where different input and output systems are allowed.

Definition 3 ((Generalized) thermal Operation).

Consider systems $S,S^{\prime}$ with corresponding Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ . Then a CP and trace-nonincreasing map $\Phi^{[\mathrm{TO}]}_{S\to S^{\prime}}(\cdot)$ is a thermal operation at inverse temperature $\beta>0$ if it can be written as

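$$\Phi^{[\mathrm{TO}]}_{S\to S^{\prime}}(\cdot)=\operatorname{tr}_{B}\Bigl[\hat{V}_{SB\to S^{\prime}B}\bigl((\cdot)\otimes\hat{\gamma}_{B}\bigr)\hat{V}_{SB\to S^{\prime}B}^{\dagger}\Bigr],\qquad\hat{\gamma}_{B}:=\frac{e^{-\beta\hat{H}_{B}}}{\operatorname{tr}(e^{-\beta\hat{H}_{B}})},\tag{34}$$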

for some ancilla system $B$ of finite dimension with some corresponding Hamiltonian $\hat{H}_{B}$ , and for some partial isometry $\hat{V}_{SB\to S^{\prime}B}$ such that $\hat{V}_{SB\to S^{\prime}B}\,({\hat{H}_{S}+\hat{H}_{B}})=({\hat{H}^{\prime}_{S^{\prime}}+\hat{H}_{B}})\,\hat{V}_{SB\to{}S^{\prime}B}$ .

If there exists a thermal operation that maps $\hat{\rho}_{S}$ to $\hat{\rho}^{\prime}_{S^{\prime}}$ , we write $(\hat{\rho}_{S},\hat{H}_{S})\xrightarrow[\mathrm{TO}]{}(\hat{\rho}^{\prime}_{S^{\prime}},\hat{H}^{\prime}_{S^{\prime}})$ . We may omit the Hamiltonians if they are clear from context.

Furthermore, a process that is achieved in the limit of processes of the form (34) with arbitrarily large but finite bath systems, is also called a thermal operation.

The last condition is required to enable processes that decrease the rank of the input state, for instance, a process consisting of Landauer erasure of a single bit compensated by a suitable energy shift Horodecki and Oppenheim (2013).

An operator $\hat{V}$ is a partial isometry if it is an isometry on its support, or equivalently if $\hat{V}^{\dagger}\hat{V}$ and $\hat{V}\hat{V}^{\dagger}$ are projectors. We allow $\hat{V}$ in the definition above to be a partial isometry instead of a unitary as considered in Refs. Brandão et al. (2013); Horodecki and Oppenheim (2013); Brandão et al. (2015) because partial isometries are more convenient when considering input and output systems of different dimensions. Physically, this corresponds to specifying only a part of the process happening on an input subspace. Importantly, any partial isometry that conserves energy can be dilated to a full unitary that conserves energy on a larger system Faist et al. (2021), as illustrated in Fig. 1. We prove a corresponding general statement as Proposition 13 in Appendix B.

There are no known general conditions under which state transformations are possible with thermal operations in the quantum regime. For semiclassical states, i.e. states that are block-diagonal in energy, such conditions are provided in the form of thermomajorization, a generalization of matrix majorization Horodecki and Oppenheim (2013).

Now we introduce an alternative model known as Gibbs-preserving maps. This model has a simple technical formulation which makes it more convenient to prove some properties. Because any thermal operation is in particular a Gibbs-preserving map, all properties obeyed by Gibbs-preserving maps are inherited by thermal operations. As for thermal operations, it is technically more convenient to consider trace-nonincreasing maps; furthermore we allow these maps to be Gibbs-sub-preserving in the sense of the following definition.

Definition 4 (Gibbs-sub-preserving map).

Consider systems $S,S^{\prime}$ with corresponding Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ . Then a CP and trace-nonincreasing map $\Phi^{[\mathrm{GPM}]}_{S\to S^{\prime}}(\cdot)$ is said to be a Gibbs-sub-preserving map for some fixed inverse temperature $\beta$ if

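$$\Phi^{[\mathrm{GPM}]}_{S\to S^{\prime}}\bigl(e^{-\beta\hat{H}_{S}}\bigr)\leqslant e^{-\beta\hat{H}^{\prime}_{S^{\prime}}}.$$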

When there exists a Gibbs-sub-preserving map that maps $\hat{\rho}_{S}$ to $\hat{\rho}^{\prime}_{S^{\prime}}$ , we write $(\hat{\rho}_{S};\hat{H}_{S})\xrightarrow[\mathrm{GPM}]{}(\hat{\rho}^{\prime}_{S^{\prime}};\hat{H}^{\prime}_{S^{\prime}})$ . We may omit the Hamiltonians if they are clear from context.

We note that any Gibbs-sub-preserving map can be dilated into a fully trace-preserving map on a larger system which furthermore has the thermal state as a fixed point (Faist and Renner, 2018, Proposition 2).

Lemma 1.

Any thermal operation is also a Gibbs-sub-preserving map.

Proof.

A thermal operation $\Phi^{[\mathrm{TO}]}_{S\to S^{\prime}}$ can be written in the form (34). We abbreviate $\hat{V}_{SB\to S^{\prime}B}$ as $\hat{V}$ . Then with $Z_{B}=\operatorname{tr}(e^{-\beta\hat{H}_{B}})$ , we have

[TABLE]

where we have invoked Proposition 12 to see that $\hat{V}^{\dagger}\hat{V}$ commutes with $\hat{H}_{S}+\hat{H}_{B}$ (for the second equality) and that $\hat{V}\hat{V}^{\dagger}$ commutes with $\hat{H}^{\prime}_{S^{\prime}}+\hat{H}_{B}$ (for the final inequality). ∎
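Lemma 1 can be checked numerically on a toy example. The following is a minimal sketch, not taken from the paper: it assumes a qubit system and a qubit bath with equal energy gaps, takes $S^{\prime}=S$ with the same Hamiltonian, and builds an energy-conserving partial isometry from a rotation in the degenerate subspace followed by a projection. All names (`gibbs_weight`, `trace_out_bath`, `thermal_op`) and parameter values are illustrative choices.

```python
import numpy as np

def gibbs_weight(H, beta):
    """Return exp(-beta * H) for a Hermitian matrix H."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-beta * evals)) @ evecs.conj().T

def trace_out_bath(X, dS, dB):
    """Partial trace over the second (bath) tensor factor."""
    return np.einsum("ibjb->ij", X.reshape(dS, dB, dS, dB))

beta = 1.3                                    # assumed inverse temperature
H_S = np.diag([0.0, 1.0])                     # toy system Hamiltonian
H_B = np.diag([0.0, 1.0])                     # toy bath Hamiltonian
I2 = np.eye(2)
H_tot = np.kron(H_S, I2) + np.kron(I2, H_B)   # energies 0, 1, 1, 2

# Energy-conserving unitary: rotate only inside the degenerate
# subspace spanned by |01> and |10>.
theta = 0.7
c, s = np.cos(theta), np.sin(theta)
U = np.eye(4, dtype=complex)
U[1:3, 1:3] = [[c, -s], [s, c]]

# Partial isometry: discard the top energy level before applying U.
Pi = np.diag([1.0, 1.0, 1.0, 0.0])
V = U @ Pi

gamma_B = gibbs_weight(H_B, beta)
gamma_B = gamma_B / np.trace(gamma_B)         # normalized bath Gibbs state

def thermal_op(X):
    """X -> tr_B[ V (X tensor gamma_B) V^dagger ]."""
    return trace_out_bath(V @ np.kron(X, gamma_B) @ V.conj().T, 2, 2)

# V intertwines the total Hamiltonian with itself (here S' = S).
commutator_norm = np.linalg.norm(V @ H_tot - H_tot @ V)

# Gibbs-sub-preservation: exp(-beta H_S) - Phi(exp(-beta H_S)) >= 0.
G = gibbs_weight(H_S, beta)
gap = G - thermal_op(G)
min_gap_eig = np.linalg.eigvalsh(gap).min()
```

In this example the smallest eigenvalue of `gap` is zero up to rounding, and the inequality is strict on the excited level because the projection removes weight from the top energy eigenstate.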

While any thermal operation is a Gibbs-sub-preserving map as shown in Lemma 1, the converse is not true Faist et al. (2015b). A notable difference between thermal operations and Gibbs-preserving maps is the way the two models handle coherent superpositions of energy states. Thermal operations cannot create any coherent superpositions of energy levels because they commute with time evolution. However, there exist Gibbs-preserving maps that can generate coherent superpositions of energy levels Faist et al. (2015b).

The divergences defined above play an important role in our thermodynamic framework as they are monotones under thermodynamic transformations. In the following, we exploit the scaling property (13) of the divergences to write the expression ${S}_{\alpha}(\hat{\rho}_{S}\,\|\,e^{-\beta\hat{H}_{S}}/Z_{S})-\ln(Z_{S})={S}_{\alpha}(\hat{\rho}_{S}\,\|\,e^{-\beta\hat{H}_{S}})$ more compactly by absorbing the system free energy into the divergence term.

Proposition 3 (Monotonicity of divergences Horodecki and Oppenheim (2013); Brandão et al. (2015); Faist and Renner (2018); Tomamichel (2016)).

Consider systems $S,S^{\prime}$ with corresponding Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ . If $\hat{\rho}_{S},\hat{\rho}_{S^{\prime}}^{\prime}$ are (normalized) quantum states that satisfy $\hat{\rho}_{S}\xrightarrow[*]{}\hat{\rho}_{S^{\prime}}^{\prime}$ , where $*$ stands for either TO or GPM, then

[TABLE]

where $\alpha$ may be any of $0$ , $1/2$ , $1$ , or $\infty$ and where $0<\eta\leqslant 1$ .

The proof of Proposition 3 is essentially an application of the data processing inequality (10). The full proof requires a dilation of the trace-nonincreasing map into a trace-preserving one, and it is presented in Appendix B.

Now that we have specified the free operations, we need to specify how we can provide resources for thermodynamic operations that are not free, or how we can extract such resources from states.

Thermodynamic work can be provided with the help of an external work storage system, often called a “battery.” This can be any system which starts in a definite energy level and finishes in a different energy level; the difference in energy is then the amount of work furnished or extracted. In fact, a large collection of different battery models are equivalent Brandão et al. (2015); Faist and Renner (2018).

Thermal operations necessarily commute with the free time evolution, as can be seen from (34). This means that it is impossible to create any state that has a coherent superposition of energy levels, even with an arbitrary amount of work, without access to another resource that provides coherence Lostaglio et al. (2015a). Coherence is thus a valuable resource that should be accounted for Åberg (2014); Lostaglio et al. (2015a); Korzekwa et al. (2016); Winter and Yang (2016); Marvian (2020). Here, we adopt a rudimentary, ad hoc model. We suppose that we have access to an additional system $C$ initialized into a pure state of our choosing. Crucially, we assume that the range of energy values that can be stored in the system $C$ is bounded by some parameter $\eta$ , i.e., $\lVert{\hat{H}_{C}}\rVert_{\infty}\leqslant\eta$ where $\hat{H}_{C}$ is the Hamiltonian of $C$ . The system $C$ must be restored to a state that is close to a pure state. The bound on the norm of the Hamiltonian forbids any embezzlement of work beyond the order of $\eta$ Brandão et al. (2015). The requirement that the final state on $C$ is close to a pure state is necessary because there is no constraint on the dimensionality of $C$ ; with a suitable highly degenerate system, starting from a pure state and finishing in the maximally mixed state would allow one to extract an arbitrary amount of work that is not controlled by $\eta$ .

This crude model for accounting for coherence suffices for our purposes, as the protocols we construct only require an ancilla system $C$ with a parameter $\eta$ that is negligibly small compared to the overall work cost of the transformation. Note that this scheme differs from catalysis Brandão et al. (2015); Ng et al. (2015); Lostaglio et al. (2015b) as we do not require the final state to be related in any way to the initial state.

Definition 5 (Work/coherence-assisted process).

Consider systems $S,S^{\prime}$ with corresponding Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ and let $*$ stand for TO or GPM. We say that a CP and trace-nonincreasing map $\Phi_{S\to S^{\prime}}$ is a $(w,\eta)$ -work/coherence-assisted $*$ operation, if there exist systems $W,C,W^{\prime},C^{\prime}$ with respective Hamiltonians $\hat{H}_{W},\hat{H}_{C},\hat{H}_{W^{\prime}},\hat{H}_{C^{\prime}}$ satisfying $\lVert{\hat{H}_{C}}\rVert_{\infty}\leqslant\eta$ , $\lVert{\hat{H}_{C^{\prime}}}\rVert_{\infty}\leqslant\eta$ , and if there exist two energy eigenstates $\lvert{E}\rangle_{W},\lvert{E^{\prime}}\rangle_{W^{\prime}}$ of $\hat{H}_{W},\hat{H}_{W^{\prime}}$ respectively whose energies $E$ and $E^{\prime}$ satisfy $E-E^{\prime}=w$ , and if there exist two pure states $\lvert{\zeta}\rangle_{C},\lvert{\zeta^{\prime}}\rangle_{C^{\prime}}$ , and if there exists a $*$ operation $\tilde{\Phi}^{[*]}_{SCW\to S^{\prime}C^{\prime}W^{\prime}}$ , such that

[TABLE]

Here, we allow infinite-dimensional Hilbert spaces for $C$ and $C^{\prime}$ for technical reasons related to how to construct $\lvert{\zeta}\rangle_{C}$ states.

A $(w,\eta)$ -work/coherence-assisted thermal operation is thus simply a free process that is assisted by ancillas that provide an amount of work $w$ and an “amount of coherence” that is at most $\eta$ . If $w$ is negative, then this measures the amount of work that is extracted by the process.

Definition 6 (Approximate thermodynamic process using work and coherence).

Consider systems $S,S^{\prime}$ with Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ and let $*$ stand for TO or GPM. We say that the state $\hat{\rho}_{S}$ is $(w,\eta,\varepsilon)$ -transformable into $\hat{\rho}^{\prime}_{S^{\prime}}$ by a $*$ process, which we denote by $(\hat{\rho}_{S};\hat{H}_{S})\xrightarrow[\mathrm{*}]{w,\eta,\varepsilon}(\hat{\rho}^{\prime}_{S^{\prime}};\hat{H}^{\prime}_{S^{\prime}})$ , if there exists a $(w,\eta)$ -work/coherence-assisted $*$ process $\Phi_{S\to S^{\prime}}$ such that $D(\Phi_{S\to S^{\prime}}(\hat{\rho}_{S}),\hat{\rho}^{\prime}_{S^{\prime}})\leqslant\varepsilon$ . We may omit the Hamiltonians if they are clear from context.

The hypothesis testing divergence is a relatively good (quasi) monotone under assisted thermodynamic operations: It can only decrease, except for correction terms that depend on $w,\eta,\varepsilon$ . Because the proof is not particularly insightful, we defer it to Appendix B.

Proposition 4 (Quasi-monotonicity of the hypothesis testing divergence under resource-assisted transformations).

Consider systems $S,S^{\prime}$ with respective Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ . For a quantum state $\hat{\rho}_{S}$ and a subnormalized state $\hat{\rho}^{\prime}_{S^{\prime}}$ , suppose $\hat{\rho}_{S}\xrightarrow[\mathrm{*}]{w,\,\eta,\,\varepsilon}\hat{\rho}^{\prime}_{S^{\prime}}$ , where $*$ stands for TO or GPM. Then for any $0<\xi\leqslant\xi+\varepsilon\leqslant\operatorname{tr}(\hat{\rho}^{\prime}_{S^{\prime}})$ ,

[TABLE]

Finally, we define asymptotic transformations. These are transformations in the thermodynamic limit for which we are interested in the work cost rate, and which use only a sublinear amount of coherence.

Definition 7 (Asymptotic thermodynamic process).

Consider two sequences of states $\widehat{P}=\{\hat{\rho}_{n}\}$ and $\widehat{P}^{\prime}=\{\hat{\rho}^{\prime}_{n}\}$ and two sequences of Hamiltonians $\widehat{\mathcal{H}}=\{\hat{H}_{n}\}$ , $\widehat{\mathcal{H}}^{\prime}=\{\hat{H}^{\prime}_{n}\}$ . Let $*$ stand for TO or GPM. We say that $\widehat{P}$ can be asymptotically transformed into $\widehat{P}^{\prime}$ by an asymptotic $*$ process at a work rate $w$ , which we denote by $(\widehat{P},\widehat{\mathcal{H}})\xrightarrow[*]{w}(\widehat{P}^{\prime},\widehat{\mathcal{H}}^{\prime})$ , if there exist sequences $w_{n},\eta_{n},\varepsilon_{n}$ such that $\hat{\rho}_{n}\xrightarrow[*]{w_{n},\,\eta_{n},\,\varepsilon_{n}}\hat{\rho}_{n}^{\prime}$ for all $n$ and such that

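$$\lim_{n\to\infty}\frac{w_{n}}{n}=w,\qquad\lim_{n\to\infty}\frac{\eta_{n}}{n}=0,\qquad\lim_{n\to\infty}\varepsilon_{n}=0.$$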

The spectral rates are monotones under asymptotic transformations:

Proposition 5 (Monotonicity of spectral rates Bowen and Datta (2006a)).

Consider two sequences of states $\widehat{P}=\{\hat{\rho}_{n}\}$ and $\widehat{P}^{\prime}=\{\hat{\rho}^{\prime}_{n}\}$ and two sequences of Hamiltonians $\widehat{\mathcal{H}}=\{\hat{H}_{n}\}$ , $\widehat{\mathcal{H}}^{\prime}=\{\hat{H}^{\prime}_{n}\}$ . Define the sequences of Gibbs weight operators $\widehat{\Sigma}=\{e^{-\beta\hat{H}_{n}}\}$ and $\widehat{\Sigma}^{\prime}=\{e^{-\beta\hat{H}^{\prime}_{n}}\}$ . Let $w\in\mathbb{R}$ be such that $\widehat{P}\xrightarrow[*]{w}\widehat{P}^{\prime}$ where $*$ may stand for either TO or GPM. Then

[TABLE]

Proof.

This follows by applying Proposition 4 and taking the asymptotic limit using the expressions (33) of the asymptotic divergences. ∎

The monotonicity of the spectral rates implies that if a transformation is reversible at a given work cost rate, then that rate is necessarily optimal:

Proposition 6.

Consider two sequences of states $\widehat{P}=\{\hat{\rho}_{n}\}$ and $\widehat{P}^{\prime}=\{\hat{\rho}^{\prime}_{n}\}$ and two sequences of Gibbs weight operators $\widehat{\Sigma}=\{e^{-\beta\hat{H}_{n}}\}$ and $\widehat{\Sigma}^{\prime}=\{e^{-\beta\hat{H}^{\prime}_{n}}\}$ . Then if $w\in\mathbb{R}$ is such that $\widehat{P}\xrightarrow[*]{w}\widehat{P}^{\prime}$ and $\widehat{P}^{\prime}\xrightarrow[*]{-w}\widehat{P}$ , then for all $w^{\prime}<w$ , $\widehat{P}\cancel{\xrightarrow[*]{w^{\prime}\,}}\widehat{P}^{\prime}$ .

This is an expression of the second law of thermodynamics, or Kelvin’s principle, which states that one cannot extract a positive amount of work from a single heat bath by a cyclic protocol.

III.2 State convertibility by thermal operations

We now describe our main theorem for state convertibility by thermal operations. We first derive a sufficient condition for state conversion which is applicable to non-asymptotic cases. We then take the asymptotic limit and obtain a necessary and sufficient condition for asymptotic state conversion. The proofs of these theorems will be provided in the next subsection because of their technical nature.

First, we provide a new sufficient criterion for when a general non-semiclassical state can be approximately reversibly converted to the thermal state using thermal operations. Because thermal operations cannot create superpositions of energy eigenstates, arbitrary state transformations generally require a source of coherence. Here, we show that for any state whose min- and max-divergences are close, only a small source of coherence is needed to carry out a transformation to the Gibbs state.

Theorem 1.

Let $\hat{\rho}$ be any quantum state on a system with Hamiltonian $\hat{H}$ , and denote by $\Delta({\hat{H}})$ the spectral range of $\hat{H}$ , i.e., the difference between the maximum and minimum eigenvalue of $\hat{H}$ . Let $\hat{\gamma}^{\prime\prime}=1$ be the trivial thermal state on a trivial system with Hilbert space $\mathbb{C}$ with Hamiltonian $\hat{H}^{\prime\prime}=0$ . Let $0\leqslant\varepsilon<1/100$ . Suppose that there exist $S\in\mathbb{R}$ and $\Delta>0$ such that

[TABLE]

Let $\delta>0$ , $q\geqslant 2$ , and $m=\lceil\Delta({\hat{H}})/\delta\rceil$ . Then we have

[TABLE]

and

[TABLE]

Theorem 1 allows us to prove the emergence of a thermodynamic potential in the macroscopic regime. That is, there is a single quantity that characterizes exactly when a transformation by an asymptotic thermal operation is possible.

Theorem 2.

Consider sequences of states $\widehat{P}=\{\hat{\rho}_{n}\}$ , $\widehat{P}^{\prime}=\{\hat{\rho}^{\prime}_{n}\}$ and sequences of Hamiltonians $\widehat{\mathcal{H}}=\{\hat{H}_{n}\}$ , $\widehat{\mathcal{H}}^{\prime}=\{\hat{H}^{\prime}_{n}\}$ . Suppose that the spectral rates collapse for these states into a single monotone, i.e.:

[TABLE]

with the sequences $\widehat{\Sigma}=\{e^{-\beta\hat{H}_{n}}\}$ and $\widehat{\Sigma}^{\prime}=\{e^{-\beta\hat{H}^{\prime}_{n}}\}$ . Then

[TABLE]

Equivalently, $\widehat{P}\xrightarrow[\mathrm{TO}]{}\widehat{P}^{\prime}$ if and only if ${S}(\widehat{P}\,\|\,\widehat{\Sigma})\geqslant{S}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma}^{\prime})$ .

Crucially, these theorems are applicable even if the state is fully quantum. On the other hand, if the state is semiclassical, i.e., if it is block-diagonal in the energy basis, then the condition for state convertibility in Theorem 1 reduces to the known conditions of Refs. Åberg (2013); Horodecki and Oppenheim (2013) in terms of state preparation and work distillation as characterized, e.g., by thermo-majorization. In such cases, no source of coherence is required.

Indeed, for semiclassical states, the min-divergence quantifies the amount of work that can be extracted from a state when transforming it to the thermal state and the max-divergence quantifies the amount of work that is required to prepare the state out of the thermal state. If these divergences collapse, the state is reversibly convertible to and from the thermal state. For quantum states that are not semiclassical, the proof cannot proceed in the same way: Preparing a general state $\hat{\rho}$ starting from the thermal state requires an external source of coherence, and thus the work requirement of state preparation cannot be given by the max-divergence in the same way as for semiclassical states. For the proof of Theorem 1 we need the fact that the min and the max divergences collapse approximately in order to conclude that the state can be approximately reversibly transformed to and from the thermal state.

Theorem 2 generalizes and unifies several known situations. For i.i.d. states and Gibbs-preserving maps, our theorem reproduces the results of Ref. Matsumoto (2010). In the case of a trivial Hamiltonian, we recover the results of Ref. Jiao et al. (2018). Our theorem also provides a concrete application of the general results provided in Refs. Weilenmann et al. (2016); Weilenmann (2017); Weilenmann et al. (2018), in the context of the axiomatic thermodynamic framework of Lieb and Yngvason Lieb and Yngvason (1999, 2013).

We note that reversibility only applies to the leading order of the work cost rate and coherence rate. Consider two sequences of states $\widehat{P},\widehat{P}^{\prime}$ that satisfy ${\underline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})={\overline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})={\underline{S}}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma})={\overline{S}}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma})$ , which are asymptotically reversibly interconvertible thanks to Theorem 2. It is still in general necessary to invest a sublinear amount of work and coherence in the transformation $\widehat{P}\to\widehat{P}^{\prime}$ which cannot be recovered in general in the reverse transformation $\widehat{P}^{\prime}\to\widehat{P}$ . In our definition of an asymptotic transformation (Definition 7) we deliberately allow sublinear work and coherence costs for this reason, noting that these quantities are negligible with respect to the overall work cost of the transformation.

III.3 Proof of Theorems 1 and 2

Here we provide the proof of Theorem 1 and its asymptotic counterpart, Theorem 2. We proceed in sequential steps through several lemmas: Theorem 1 is proved through Section III.3.1 to Section III.3.4, and Theorem 2 is proved in Section III.3.5.

In order to simplify the notation and ease readability, we omit the hat symbols on operators in this subsection.

III.3.1 Discretizing the Hamiltonian

The first simplification that we make is to change the Hamiltonian from $H$ to a slightly different Hamiltonian $H^{\prime}$ where the eigenvalues are “coarse-grained” into blocks. That is, given $\delta>0$ , we subdivide the spectrum of $H$ into $m=\lceil\Delta({H})/\delta\rceil$ bins of width $\delta$ , where $\Delta({H})$ is the spectral range of $H$ , and we then clamp all eigenvalues in each bin to a single value which is a multiple of $\delta$ . This yields a Hamiltonian $H^{\prime}$ with $[H,H^{\prime}]=0$ and $\lVert{H-H^{\prime}}\rVert_{\infty}\leqslant\delta$ . Furthermore, $H^{\prime}$ only has $m$ distinct eigenvalues, which we denote by $\{E_{k}\}$ ; let also $\{P_{k}\}$ denote the projectors onto the corresponding eigenspaces. We may thus write

$$H^{\prime}=\sum_{k}E_{k}P_{k},\tag{47}$$

with $E_{k}=(k+k_{0})\delta$ for some fixed $k_{0}\in\mathbb{Z}$ .

Physically, the transformation $H\to H^{\prime}$ can be done by turning on a perturbation of magnitude at most $\delta$ . Furthermore, the perturbation commutes with the original Hamiltonian.
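The coarse-graining step can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the construction used in the proof: it rounds every eigenvalue down to a multiple of $\delta$ on a global grid, which guarantees $[H,H^{\prime}]=0$ and $\lVert{H-H^{\prime}}\rVert_{\infty}\leqslant\delta$ but may produce up to $m+1$ distinct levels instead of $m$ when the spectrum straddles a grid point. The function name `coarse_grain` and the random test Hamiltonian are hypothetical.

```python
import numpy as np

def coarse_grain(H, delta):
    """Round every eigenvalue of H down to a multiple of delta.

    Returns the coarse-grained Hamiltonian and its eigenvalues.
    """
    evals, evecs = np.linalg.eigh(H)
    binned = delta * np.floor(evals / delta)
    H_prime = (evecs * binned) @ evecs.conj().T
    return H_prime, binned

# Random 6x6 Hermitian matrix as a stand-in Hamiltonian.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
H = (A + A.conj().T) / 2
delta = 0.5

H_prime, binned = coarse_grain(H, delta)

# H and H' share an eigenbasis, so they commute.
commutator_norm = np.linalg.norm(H @ H_prime - H_prime @ H)

# Operator-norm distance equals the largest eigenvalue shift (< delta).
deviation = np.linalg.norm(H - H_prime, ord=2)

# Compare the number of levels with m = ceil(spectral range / delta).
evals = np.linalg.eigvalsh(H)
m = int(np.ceil((evals[-1] - evals[0]) / delta))
n_levels = len(np.unique(np.round(binned / delta)))
```

Because the perturbation $H^{\prime}-H$ is diagonal in the eigenbasis of $H$, the commutation property holds by construction and only the size of the shift depends on $\delta$.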

We note that ${e}^{-\beta H}\leqslant{e}^{-\beta H^{\prime}+\beta\delta}$ and ${e}^{-\beta H^{\prime}}\leqslant{e}^{-\beta H+\beta\delta}$ , where the operator inequalities hold because both sides commute with each other. This implies that, for any $\rho$ and for any $\varepsilon>0$ , we have

[TABLE]

We also define the dephasing operation for any Hermitian operator $X$ as a pinching in the energy blocks:

$$\mathcal{D}_{H^{\prime}}(X):=\sum_{k}P_{k}XP_{k}.$$

The following proposition asserts that this perturbation $H_{S}\to H^{\prime}_{S}$ can be carried out with a $(0,(q^{2}+1)\delta)$ -work/coherence-assisted thermal operation, for any value of $q>0$ which impacts the accuracy of the process as $1/q$ .

Proposition 7.

Consider a system $S$ with Hamiltonian $H_{S}$ and a copy $S^{\prime}\simeq S$ with a Hamiltonian $H^{\prime}_{S^{\prime}}$ . Suppose that $[H^{\prime}_{S^{\prime}},{\mathrm{id}}_{S\to S^{\prime}}(H_{S})]=0$ and let $\delta\geqslant 0$ such that $\bigl{\lVert}{{\mathrm{id}}_{S\to S^{\prime}}(H_{S})-H^{\prime}_{S^{\prime}}}\bigr{\rVert}_{\infty}\leqslant\delta$ . Then for any $q>0$ there exists a $(0,(q^{2}+1)\delta)$ -work/coherence-assisted transformation $\Phi_{S\to S^{\prime}}$ such that for any state $\rho_{SR}$ (with any reference system $R$ ), we have

[TABLE]

Proof.

Let $\lvert{k}\rangle_{S}$ be a simultaneous eigenbasis of ${\mathrm{id}}_{S^{\prime}\to S}(H^{\prime}_{S^{\prime}})$ and of $H_{S}$ , and write $\lvert{k}\rangle_{S^{\prime}}={\mathrm{id}}_{S\to S^{\prime}}(\lvert{k}\rangle_{S})$ . Then $H_{S}\lvert{k}\rangle_{S}=E_{k}\lvert{k}\rangle_{S}$ and $H^{\prime}_{S^{\prime}}\lvert{k}\rangle_{S^{\prime}}=E^{\prime}_{k}\lvert{k}\rangle_{S^{\prime}}$ for corresponding eigenvalues $E_{k}$ and $E^{\prime}_{k}$ including multiplicities, i.e., the $E_{k}$ (resp. $E^{\prime}_{k}$ ) need not be all different. The condition $\lVert{{\mathrm{id}}_{S\to S^{\prime}}(H_{S})-H^{\prime}_{S^{\prime}}}\rVert_{\infty}\leqslant\delta$ implies that $\lvert{E_{k}-E^{\prime}_{k}}\rvert\leqslant\delta$ .

Let $L:=q^{2}\delta$ . Let $C$ , $C^{\prime}$ be a particle on the intervals $[0,L]$ , $[-\delta,L+\delta]$ in $\mathbb{R}$ , respectively, which are described by the Hilbert spaces $L^{2}([0,L])$ , $L^{2}([-\delta,L+\delta])$ . There are natural embeddings $L^{2}([0,L])\subset L^{2}([-\delta,L+\delta])\subset L^{2}(\mathbb{R})$ .

Let $\chi_{I}(x)$ be the indicator function for a closed interval $I\subset\mathbb{R}$ . We define the Hamiltonians of $C$ and $C^{\prime}$ by $H_{C}:=x\chi_{[0,L]}(x)$ and $H_{C^{\prime}}:=x\chi_{[-\delta,L+\delta]}(x)$ , which are regarded as self-adjoint operators acting on $L^{2}([0,L])$ and $L^{2}([-\delta,L+\delta])$ , respectively. Obviously, $\|H_{C}\|_{\infty}=L$ , $\|H_{C^{\prime}}\|_{\infty}=L+\delta$ .

We also define the initial state of $C$ by $\zeta(x):=\chi_{[0,L]}(x)/\sqrt{L}\in L^{2}([0,L])$ . We can also regard $\zeta(x)$ as an element of $L^{2}([-\delta,L+\delta])$ , for which we use the same notation.

For $a\in\mathbb{R}$ with $|a|\leqslant\delta$ , we define the translation operator $V(a):L^{2}([0,L])\to L^{2}([-\delta,L+\delta])$ by $V(a)\varphi(x):=\varphi(x-a)$ . This is an isometry, where its adjoint $V(a)^{\dagger}$ is defined on $L^{2}([-\delta,L+\delta])$ by $V(a)^{\dagger}\psi(x)=\chi_{[0,L]}(x)\psi(x+a)$ for $\psi(x)\in L^{2}([-\delta,L+\delta])$ , because

[TABLE]

Now we define the isometry

[TABLE]

We can show that $V_{SC\to S^{\prime}C^{\prime}}(H_{S}+H_{C})=(H^{\prime}_{S^{\prime}}+H_{C^{\prime}})V_{SC\to S^{\prime}C^{\prime}}$ by acting with $V_{SC\to S^{\prime}C^{\prime}}$ on $\lvert{k}\rangle_{S}\otimes\varphi(x)$ for any $\varphi(x)\in L^{2}([0,L])$ . Then, we define the CP and trace-nonincreasing map

[TABLE]

By construction, $\Phi_{S\to S^{\prime}}$ is a $(0,(q^{2}+1)\delta)$ -work/coherence-assisted thermal operation.

Let $\rho_{SR}$ be any state with any reference system. Without loss of generality, assume that $\rho_{SR}$ is in fact a pure state (or consider a larger reference system $R$ ; the statement will still hold because trace distance can only decrease under partial trace). We remark that the fidelity and the trace distance can be defined for infinite-dimensional Hilbert spaces, and satisfy the same fundamental properties as in finite dimensions Belavkin et al. (2005); Furrer et al. (2011). Then, with $\rho_{S^{\prime}R}={\mathrm{id}}_{S\to S^{\prime}}(\rho_{SR})$ ,

[TABLE]

where the term on $C$ is real because $\zeta(x)$ is real. We can calculate for $|a|\leqslant\delta$

[TABLE]

Hence, since $\lvert{E_{k}-E^{\prime}_{k}}\rvert\leqslant\delta$ ,

[TABLE]

Recalling that $D(\rho,\rho^{\prime})\leqslant\sqrt{1-F^{2}(\rho,\rho^{\prime})}$ , and that the fidelity can only increase under partial trace, we have

[TABLE]

∎
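The elementary ingredient of this proof, the overlap between the flat wave function $\zeta$ and its translate, is easy to verify numerically. The sketch below is an illustration under stated assumptions: it uses $\zeta=\chi_{[0,L]}/\sqrt{L}$ and $L=q^{2}\delta$ as in the proof, computes the overlap $\langle\zeta,V(a)\zeta\rangle=1-\lvert a\rvert/L$ both in closed form and by a Riemann sum, and then evaluates $\sqrt{1-F^{2}}$ for the worst-case shift $\lvert a\rvert=\delta$. The resulting number is only a consistency check of the $1/q$ scaling; it is not claimed to reproduce the exact constant in Proposition 7. The function names and the values of $\delta$, $q$, $a$ are hypothetical.

```python
import numpy as np

def overlap_exact(a, L):
    """Overlap of chi_[0,L]/sqrt(L) with its translate by a, for |a| <= L."""
    return 1.0 - abs(a) / L

def overlap_numeric(a, L, delta, n=400_001):
    """Same overlap via a Riemann sum on the interval [-delta, L + delta]."""
    x = np.linspace(-delta, L + delta, n)
    dx = x[1] - x[0]
    zeta = ((x >= 0) & (x <= L)) / np.sqrt(L)
    shifted = ((x - a >= 0) & (x - a <= L)) / np.sqrt(L)
    return float(np.sum(zeta * shifted) * dx)

delta, q = 0.1, 3.0           # illustrative values
L = q**2 * delta              # width of the coherence ancilla
a = 0.6 * delta               # some energy mismatch with |a| <= delta

exact = overlap_exact(a, L)
numeric = overlap_numeric(a, L, delta)

# Worst case |a| = delta gives overlap 1 - 1/q^2, hence
# sqrt(1 - F^2) <= sqrt(2)/q.
worst_case_fidelity = overlap_exact(delta, L)
distance_bound = np.sqrt(1 - worst_case_fidelity**2)
```

Increasing $q$ widens the ancilla wave function relative to the shift, so the overlap approaches one at the price of a larger coherence parameter $(q^{2}+1)\delta$.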

III.3.2 Manipulating coherence in the state

For any state $\rho$ on any system with any Hamiltonian $H$ , we can decompose $\rho$ into modes of coherence Lostaglio et al. (2015a) as

[TABLE]

where $\rho^{(\omega)}$ are general operators satisfying

[TABLE]

for all $t$ . The $\rho^{(\omega)}$ are simply the off-diagonal elements of $\rho$ that connect two energy levels that differ by $\omega$ . For the Hamiltonian $H^{\prime}$ constructed in (47), with only energies that are multiples of $\delta$ , we have that the $\omega$ in (58) range over all possible differences of energies in $H^{\prime}$ , i.e., over all multiples of $\delta$ .
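The mode decomposition can be made concrete for a Hamiltonian that is diagonal with energies on a grid of spacing $\delta$. The following sketch assumes that setting and adopts the sign convention $e^{-iHt}\rho^{(\omega)}e^{iHt}=e^{-i\omega t}\rho^{(\omega)}$ with $\omega=E_{\mathrm{row}}-E_{\mathrm{col}}$; this convention, the helper names `coherence_modes` and `evolve`, and the example level structure are choices made for illustration and may differ from the paper's.

```python
import numpy as np

def coherence_modes(rho, levels):
    """Split rho into modes of coherence.

    `levels` holds integer level indices, so energies are levels * delta and
    rho is expressed in the energy eigenbasis. Returns {k: rho^(k * delta)},
    where mode k keeps the matrix elements with E_row - E_col = k * delta.
    """
    levels = np.asarray(levels)
    diff = levels[:, None] - levels[None, :]
    return {int(k): np.where(diff == k, rho, 0) for k in np.unique(diff)}

def evolve(X, levels, delta, t):
    """Return exp(-i H t) X exp(+i H t) for the diagonal H = delta * levels."""
    phases = np.exp(-1j * delta * np.asarray(levels) * t)
    return phases[:, None] * X * phases.conj()[None, :]

# Random full-rank state on six levels, two of them degenerate.
rng = np.random.default_rng(1)
levels = np.array([0, 0, 1, 2, 2, 3])
delta = 0.25
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
rho = A @ A.conj().T
rho /= np.trace(rho).real

modes = coherence_modes(rho, levels)
t = 0.8
reconstructed = sum(modes.values())
```

Each mode picks up a pure phase under time evolution, and the mode with $\omega=0$ is the block-diagonal (dephased) part of $\rho$.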

The following lemma states that if the large coherence modes in the state are suppressed, then it is possible to carry out the dephasing operation by mixing only a few differently time-evolved versions of $\rho$ .

Lemma 2.

Let $\rho$ be any state on any system with a Hamiltonian $H^{\prime}$ whose energies are multiples of $\delta$ as in (47). Let $\rho^{(\omega)}$ denote the coherence modes in the decomposition of $\rho$ as above. Let $K^{\prime}>0$ . Suppose that there exists $\xi>0$ such that for all $k$ with $\lvert{k}\rvert\geqslant K^{\prime}$ we have

[TABLE]

Define

[TABLE]

Then, if $m$ denotes the number of distinct eigenvalues of $H^{\prime}$ , we have that

[TABLE]

Proof.

For any $t>0$ , we write

[TABLE]

such that

[TABLE]

Recall that $\omega$ in the modes decomposition of $\rho$ is a multiple of $\delta$ and ranges over all off-diagonals of $\rho$ ; i.e., $\omega=k\delta$ for $k=-m+1,\ldots,m-1$ . Furthermore, we may split the sum over the modes as a sum over modes in $k=-K^{\prime}+1,...,K^{\prime}-1$ and a separate sum over the higher order modes. We can thus calculate:

[TABLE]

where we recall that $\mathcal{D}_{H^{\prime}}(\rho)=\rho^{(\omega=0)}$ and where we have defined $G$ as the second sum in the second-to-last line. We can bound the norm of $G$ as follows:

[TABLE]

where $m$ is a crude upper bound for the total number of terms in the first sum, and where each term $\lVert{\rho^{(k\delta)}}\rVert_{1}$ is individually bounded thanks to the assumption (60). We may conclude that $\bar{\rho}$ and $\mathcal{D}_{H^{\prime}}(\rho)$ are close in trace distance:

[TABLE]

∎
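The mechanism behind Lemma 2 can be illustrated numerically. The sketch below assumes that $\bar{\rho}$ is the uniform mixture of $K^{\prime}$ time-evolved copies of $\rho$ at the equally spaced times $t_{j}=2\pi j/(K^{\prime}\delta)$ , $j=0,\ldots,K^{\prime}-1$ . This specific choice of times is an assumption made for illustration and is not taken from the displayed definition. Under this choice, a mode $\rho^{(k\delta)}$ acquires the factor $\frac{1}{K^{\prime}}\sum_{j}e^{-2\pi ijk/K^{\prime}}$ , which equals $1$ if $k$ is a multiple of $K^{\prime}$ and $0$ otherwise, so only the modes with $\lvert{k}\rvert\geqslant K^{\prime}$ can contribute to $\bar{\rho}-\mathcal{D}_{H^{\prime}}(\rho)$ .

```python
import numpy as np


def time_average(rho, levels, delta, num_times):
    """Uniform mixture of num_times time-evolved copies of rho.

    H = delta * diag(levels); the times are t_j = 2*pi*j / (num_times * delta).
    """
    levels = np.asarray(levels)
    out = np.zeros_like(rho, dtype=complex)
    for j in range(num_times):
        t = 2 * np.pi * j / (num_times * delta)
        phase = np.exp(-1j * delta * levels * t)
        out += phase[:, None] * rho * phase.conj()[None, :]
    return out / num_times


def dephase(rho, levels):
    """Full dephasing in the eigenspaces of H (keeps entries with n_i = n_j)."""
    levels = np.asarray(levels)
    return np.where(levels[:, None] == levels[None, :], rho, 0)
```

If all modes with $\lvert{k}\rvert\geqslant K^{\prime}$ vanish exactly, `time_average(rho, levels, delta, K)` coincides with `dephase(rho, levels)`. If they are merely small, the two differ by the sum of the surviving high modes, which is the quantity bounded in the proof.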

Importantly, the min- and max-divergences are only known to quantify the extractable work and the work cost of formation for semiclassical states, i.e., those that commute with the Hamiltonian. For states that are not semiclassical, we need a more general statement. Here, we prove a lemma showing that the min- and max-divergences also accurately quantify the extractable work and the work cost of formation for general quantum states, as long as their large coherence modes are suppressed.

Lemma 3.

Let $\rho$ be any quantum state on a system with a Hamiltonian $H^{\prime}$ whose energies are multiples of $\delta$ as in (47), and let $c\geqslant\beta\delta$ . Let $\hat{\gamma}^{\prime\prime}=\lvert{0}\rangle\langle{0}\rvert$ be the thermal state of a trivial system with Hamiltonian $H^{\prime\prime}=0$ as in Theorem 1. Suppose that there exists $\xi^{\prime}>0$ such that for any $k,k^{\prime}$ with $\beta\lvert{E_{k}-E_{k^{\prime}}}\rvert\geqslant c$ we have

[TABLE]

Then, for any $\varepsilon^{\prime}\geqslant m^{2}\xi^{\prime}$ , we have

[TABLE]

Conversely, for any integer $q>0$ , we have

[TABLE]

Proof.

First, note that (67) asserts that the coherence modes $\rho^{(\omega)}$ of $\rho$ are small for large $\omega$ . More precisely: Let $K=\lceil c/(\beta\delta)\rceil$ , such that $(\lvert{k-k^{\prime}}\rvert\geqslant K)\Rightarrow(\beta\lvert{E_{k}-E_{k^{\prime}}}\rvert\geqslant c)$ . Then for all $\omega=k\delta$ such that $\lvert{k}\rvert\geqslant K$ , we have

[TABLE]

because the coherence modes are simply the combination of all the blocks in the $k$ -th off-diagonal of $\rho$ , whose individual norm is bounded by our assumption (67). We may invoke Lemma 2 to deduce that

[TABLE]

where $\bar{\rho}$ is defined in Lemma 2 with $K^{\prime}=K$ and $\xi=m\xi^{\prime}$ .

Work extraction from $\rho$ . Now we construct a strategy to transform $\rho$ into the trivial thermal state $\gamma^{\prime\prime}$ . First, we decohere the state in the energy blocks, effecting the transformation $\rho\to\mathcal{D}_{H^{\prime}}(\rho)$ at no work or coherence cost (this can be done by averaging over time, which is a thermal operation). Then we apply the incoherent work extraction protocol (Proposition 15 in Appendix 0.B) to transform $\mathcal{D}_{H^{\prime}}(\rho)\to\gamma^{\prime\prime}$ with an error parameter $\varepsilon^{\prime}\geqslant m^{2}\xi^{\prime}$ , while extracting an amount of work equal to ${S}_{0}^{\varepsilon^{\prime}}(\mathcal{D}_{H^{\prime}}(\rho)\,\|\,{e}^{-\beta H^{\prime}})$ , and at no coherence cost. Hence, we have $\rho\xrightarrow[\mathrm{TO}]{-\beta^{-1}{S}_{0}^{\varepsilon^{\prime}}(\mathcal{D}_{H^{\prime}}(\rho)\,\|\,{e}^{-\beta H^{\prime}}),\;0,\;\varepsilon^{\prime}}\gamma^{\prime\prime}$ . Using Proposition 2, observe that

[TABLE]

since $\bar{\rho}$ is a candidate in the optimization that defines the smooth min-divergence. Then we invoke the property of the fidelity that $F(A+B,C)\leqslant F(A,C)+F(B,C)$ (cf. (Audenaert and Mosonyi, 2014, Lemma 4.9)), to see that

[TABLE]

With the crude bound $K\leqslant m$ we finally see that

[TABLE]

which shows (68).

Formation of the state $\rho$ . We now devise a procedure to construct the state $\rho$ starting from the trivial thermal state $\gamma^{\prime\prime}$ . In the following, we refer to the system as $S$ , and write $\rho$ and $H^{\prime}$ as $\rho_{S}$ and $H^{\prime}_{S}$ .

The full protocol consists of three steps. The strategy will be to prepare a completely incoherent state $\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})$ on the system $S$ along with an ancilla system $C$ in such a way that the system $C$ serves as a reference frame that can be used to induce coherence in $S$ . Then, in the second and third steps, we “externalize” the reference frame by using $C$ to “induce” the necessary coherence modes in $S$ Bartlett et al. (2007).

Let $q>0$ be an integer. Let $C$ be an ancilla system of dimension $d_{C}=qK^{2}$ and with a Hamiltonian consisting of evenly $\delta$ -spaced levels, i.e., $H_{C}=\sum_{\ell=0}^{d_{C}-1}\ell\delta\,\lvert{\ell}\rangle\langle{\ell}\rvert_{C}$ . Define the state $\eta_{C}=\lvert{\eta}\rangle\langle{\eta}\rvert_{C}$ by

[TABLE]

By $\mathcal{D}_{H^{\prime}_{S}+H_{C}}$ we will denote the joint dephasing operation on $S$ and $C$ , i.e., the dephasing in the common global energy eigenspaces of $H^{\prime}_{S}+H_{C}$ .

In the first step of the protocol, starting from the trivial thermal state on $S\otimes C$ , we prepare the state $\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})$ at a cost given by the max-divergence

[TABLE]

We can bound this as follows. The max-divergence can only decrease under the dephasing operation; we have ${e}^{-\beta(H^{\prime}_{S}+H_{C})}={e}^{-\beta H^{\prime}_{S}}\otimes{e}^{-\beta H_{C}}\geqslant{e}^{-\beta d_{C}\delta}\,{e}^{-\beta H^{\prime}_{S}}\otimes I_{C}$ because $H_{C}\leqslant d_{C}\delta I_{C}$ with $I_{C}$ being the identity operator of $C$ ; finally, the max-divergence is additive for tensor product states. This gives us

[TABLE]

noting that ${S}_{\infty}(\eta_{C}\,\|\,I_{C})=0$ because $\eta_{C}$ is a pure state. Therefore:

[TABLE]

The next steps are to “consume” $C$ in order to induce $\rho_{S}$ on the system $S$ (we need to externalize the reference frame). This is done as follows.

In preparation for the further steps, we first note that if we post-select on the reference frame being in the state $\lvert{\eta}\rangle_{C}$ , then we approximately induce the correct state on $S$ . This is shown as follows:

[TABLE]

where we used the fact that $\operatorname{tr}(A^{(\omega)}B^{(\omega^{\prime})})=0$ unless $\omega=-\omega^{\prime}$ , and that $\operatorname{tr}\bigl{(}\eta_{C}^{(-k\delta)}\eta_{C}^{(k\delta)}\bigr{)}=(d_{C}-\lvert{k}\rvert)/d_{C}^{2}$ since $\eta_{C}^{(k\delta)}$ is the matrix of all zeros except for the $k$ -th off-diagonal in which all entries are equal to $1/d_{C}$ . Then

[TABLE]

where in the last line we used (70). Let $M^{(K)}$ be the matrix in which the $k$ -th off-diagonal is filled with the entries equal to $\lvert{k}\rvert$ , up to the $(K-1)$ -th off-diagonal, and the remaining matrix elements are zero. Then we note that

[TABLE]

where $A*B$ denotes the Hadamard (entry-wise) product. We note that $\lVert{A*B}\rVert_{1}\leqslant\lVert{A}\rVert_{\infty}\lVert{B}\rVert_{1}$ , and that $\lVert{M^{(K)}}\rVert_{\infty}\leqslant K^{2}$ (Suppl. Lemmas 3 and 4 of Åberg (2014), originally from Horn and Johnson (1985)). Hence, $\lVert{M^{(K)}*\rho_{S}}\rVert_{1}\leqslant K^{2}$ and we finally have

[TABLE]
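The two norm estimates used in this step, $\lVert{A*B}\rVert_{1}\leqslant\lVert{A}\rVert_{\infty}\lVert{B}\rVert_{1}$ and $\lVert{M^{(K)}}\rVert_{\infty}\leqslant K^{2}$ , can be checked on small instances. The sketch below builds $M^{(K)}$ for a system with nondegenerate, evenly spaced levels, which is a simplifying assumption made for illustration. The function names are hypothetical.

```python
import numpy as np


def ramp_matrix(dim, K):
    """M^(K): the k-th off-diagonal holds |k| for |k| <= K-1, zero elsewhere."""
    idx = np.arange(dim)
    k = np.abs(idx[:, None] - idx[None, :])
    return np.where(k < K, k, 0).astype(float)


def trace_norm(a):
    """Sum of the singular values of a."""
    return float(np.sum(np.linalg.svd(a, compute_uv=False)))


def operator_norm(a):
    """Largest singular value of a."""
    return float(np.linalg.norm(a, 2))
```

For a density matrix `rho`, the entry-wise product `ramp_matrix(dim, K) * rho` then has trace norm at most `operator_norm(ramp_matrix(dim, K))`, which in turn is at most `K**2`.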

We also note that $\lvert{\eta}\rangle_{C}$ passes through mutually orthogonal states at successive time steps of $2\pi/(d_{C}\delta)$ . More precisely, for $n=0,\ldots,d_{C}-1$ , the set $\bigl{\{}\lvert{n}\rangle_{C}\bigr{\}}_{n}$ forms an orthonormal basis of $C$ , where $\lvert{n}\rangle_{C}={e}^{-i\frac{2\pi n}{d_{C}\delta}H_{C}}\lvert{\eta}\rangle_{C}$ . Indeed,

[TABLE]
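Both properties of the reference-frame state used above can be verified numerically. The sketch assumes that $\lvert{\eta}\rangle_{C}$ is the uniform superposition $d_{C}^{-1/2}\sum_{\ell}\lvert{\ell}\rangle_{C}$ , which is the form implied by the statement that every entry of $\eta_{C}^{(k\delta)}$ on the $k$ -th off-diagonal equals $1/d_{C}$ . The function names are hypothetical.

```python
import numpy as np


def clock_state(d):
    """Uniform superposition over the d levels of C (assumed form of |eta>)."""
    return np.full(d, 1.0 / np.sqrt(d), dtype=complex)


def shifted_clock_state(d, n, delta=1.0):
    """|n> = exp(-i * 2*pi*n / (d*delta) * H_C) |eta>, with H_C = sum_l l*delta |l><l|."""
    ell = np.arange(d)
    t = 2 * np.pi * n / (d * delta)
    return np.exp(-1j * t * ell * delta) * clock_state(d)


def mode_overlap(d, k):
    """tr( eta^(-k*delta) eta^(k*delta) ) for the clock state on d levels."""
    eta = np.outer(clock_state(d), clock_state(d).conj())
    idx = np.arange(d)
    diff = idx[:, None] - idx[None, :]
    mode_k = np.where(diff == k, eta, 0)
    mode_minus_k = np.where(diff == -k, eta, 0)
    return float(np.trace(mode_minus_k @ mode_k).real)
```

The Gram matrix of the vectors `shifted_clock_state(d, n)` for `n = 0, ..., d-1` is the identity, and `mode_overlap(d, k)` reproduces $(d_{C}-\lvert{k}\rvert)/d_{C}^{2}$ .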

Step 2 of our protocol consists of flattening the Hamiltonian of $C$ so that we can perform nontrivial unitaries without worrying about coherences. From the state $\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})$ with Hamiltonian $H^{\prime}_{S}+H_{C}$ , we “flatten” the Hamiltonian of the ancilla system $C$ using (Faist et al., 2021, Lemma 8.1) and consuming an additional ancilla $C^{\prime}$ of dimension $d_{C}(q^{2}+2)$ , with the Hamiltonian $H_{C^{\prime}}$ being bounded as $\|H_{C^{\prime}}\|\leqslant d_{C}(q^{2}+2)\delta$ and with the original state surviving up to precision $1/q$ . That is, we achieve the following Hamiltonian transformation

[TABLE]

Finally, in Step 3 we carry out the following energy-conserving unitary controlled on the system $C$ :

[TABLE]

and we then use Landauer erasure to reset $C$ to a pure state and to trace it out. Note that ${e}^{-iH^{\prime}_{S}t}\,\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})\,{e}^{iH^{\prime}_{S}t}={e}^{-iH^{\prime}_{S}t}\,{e}^{i(H^{\prime}_{S}+H_{C})t}\,\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})\,{e}^{-i(H^{\prime}_{S}+H_{C})t}\,{e}^{iH^{\prime}_{S}t}={e}^{iH_{C}t}\,\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})\,{e}^{-iH_{C}t}$ because the dephased state is invariant under time evolution. Then, the application of the unitary $U_{SC}$ to $\mathcal{D}_{H^{\prime}_{S}+H_{C}}(\rho_{S}\otimes\eta_{C})$ , and tracing out $C$ , yields

[TABLE]

Recalling (81), we know that this state is close to the required $\rho_{S}$ . Noting that we need $\beta^{-1}\ln(d_{C})$ work to reset $C$ to a pure state, we find:

[TABLE]

Note that the final uniform Hamiltonian on the system $C$ can be restored to the original Hamiltonian at no work or coherence cost, by keeping the state of $C$ at a pure state of constant energy and changing the other levels to match those of the original Hamiltonian $H_{C}$ .

Combining together these three steps, we see that

[TABLE]

Recalling $K=\lceil c/(\beta\delta)\rceil\leqslant 2c/(\beta\delta)$ while assuming $c\geqslant\beta\delta$ , we obtain (69). ∎

III.3.3 Collapse of the min and max divergences suppresses coherence

Here we show that the difference between (alternative) min-divergence and the max-divergence is a quantity that provides a characterization of how much coherence there is in the state. Namely, if the divergences do not differ by more than $2\Delta^{\prime}$ , then the one-norm of off-diagonal energy blocks $P_{k}\hat{\rho}P_{k^{\prime}}$ is exponentially suppressed in $\lvert{E_{k}-E_{k^{\prime}}}\rvert$ as long as $\lvert{E_{k}-E_{k^{\prime}}}\rvert\gtrsim\Delta^{\prime}$ .

Lemma 4.

Let $\rho$ be a quantum state. Suppose there are $S\in\mathbb{R}$ and $\Delta^{\prime}>0$ such that

[TABLE]

Then for any $k,k^{\prime}$ , we have

[TABLE]

Proof.

Using Hölder’s inequality, we have

[TABLE]

By definition of the Rényi-1/2 divergence, we have for any $k$ ,

[TABLE]

and hence

[TABLE]

On the other hand, we have

[TABLE]

recalling that the square of the largest singular value of a matrix $A$ is the maximum eigenvalue of $AA^{\dagger}$ . Putting these together, and noting that the same argument holds if we exchange $k$ and $k^{\prime}$ , we obtain

[TABLE]

as claimed. ∎

III.3.4 Proof of Theorem 1

Finally, we can prove Theorem 1. If the smooth min and max Rényi divergences coincide approximately, we use the above lemmas to conclude that there exist protocols for work distillation and state formation with approximately matching work costs. The difficult part of the proof is to show that there is a single state that is a good enough smoothing candidate simultaneously in both (16a) and (16b).

Proof.

First, we need to connect the assumption on the smoothed entropy measures to a specific state which has a small gap between its non-smoothed min and max-divergences. Our specific goal below is to construct a state $\tilde{\rho}$ that satisfies the conditions of Lemma 4 and is sufficiently close to $\rho$ .

Because $H^{\prime}\leqslant H+\delta$ and $H\leqslant H^{\prime}+\delta$ , we have

[TABLE]

Both protocols, work extraction and state formation, start by shifting the Hamiltonian $H\to H^{\prime}$ and end by shifting it back, $H^{\prime}\to H$ . Thanks to Proposition 7, each of these shifts can be done at a cost in the total coherence parameter of $(q^{2}+1)\delta$ and at a precision cost of $1/q$ .

Let $\rho^{\prime}$ be the optimal subnormalized quantum state for ${S}_{1/2}^{\varepsilon}(\rho\,\|\,{e}^{-\beta H^{\prime}})={S}_{1/2}(\rho^{\prime}\,\|\,{e}^{-\beta H^{\prime}})$ , satisfying $D(\rho,\rho^{\prime})\leqslant\varepsilon$ and $\operatorname{tr}(\rho^{\prime})\geqslant 1-\varepsilon$ .

Let $\gamma^{\prime}={e}^{-\beta H^{\prime}}/\operatorname{tr}({e}^{-\beta H^{\prime}})$ and write

[TABLE]

Let $\alpha$ , $F$ denote optimal choices in the last optimization. Let

[TABLE]

where $\mathcal{D}_{H^{\prime}}(\cdot)$ denotes the dephasing operation in the eigenspaces of $H^{\prime}$ . Then, using the pinching inequality, and because $G$ commutes with time evolution,

[TABLE]

and thus ${S}_{\infty}(G\rho^{\prime}G^{\dagger}\,\|\,\gamma^{\prime})\leqslant\ln(m)+\ln(\alpha)\leqslant\ln(m)+{S}_{\infty}^{2\varepsilon}(\rho^{\prime}\,\|\,\gamma^{\prime})$ . Shifting back the normalization of the second argument gives

[TABLE]

because the optimal state in the last max-divergence is a candidate in the optimization for ${S}_{\infty}^{2\varepsilon}(\rho^{\prime}\,\|\,{e}^{-\beta H^{\prime}})$ . Also, taking the trace of the constraint $\rho^{\prime}\leqslant\alpha\gamma^{\prime}+F$ we obtain $\alpha\geqslant 1-4\varepsilon$ , and then using (Dupuis et al., 2013, Lemma A.4), we have $P(G\rho^{\prime}G^{\dagger}/\operatorname{tr}(\rho^{\prime}),\rho^{\prime}/\operatorname{tr}(\rho^{\prime}))\leqslant\sqrt{2\operatorname{tr}(\alpha^{-1}\mathcal{D}[F])/\operatorname{tr}(\rho^{\prime})}\leqslant 2\sqrt{\varepsilon/[(1-4\varepsilon)(1-\varepsilon)]}\leqslant 4\sqrt{\varepsilon}$ (using $\varepsilon\leqslant 1/8$ ), where $P(\sigma,\sigma^{\prime}):=\sqrt{1-F(\sigma,\sigma^{\prime})^{2}}\geqslant D(\sigma,\sigma^{\prime})$ is the purified distance for $\sigma,\sigma^{\prime}\in{\mathcal{S}}(\mathscr{H})$ . Hence, $D(G\rho^{\prime}G^{\dagger}/\operatorname{tr}(\rho^{\prime}),\rho^{\prime}/\operatorname{tr}(\rho^{\prime}))\leqslant 4\sqrt{\varepsilon}$ and thus $D(G\rho^{\prime}G^{\dagger},\rho^{\prime})\leqslant 4\operatorname{tr}(\rho^{\prime})\sqrt{\varepsilon}\leqslant 4\sqrt{\varepsilon}$ .
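The pinching inequality invoked above, which is responsible for the additive $\ln(m)$ term, states that $\rho\leqslant m\,\mathcal{D}_{H^{\prime}}(\rho)$ for any positive semidefinite $\rho$ , where $m$ is the number of distinct eigenvalues of $H^{\prime}$ . A minimal numerical sketch follows, assuming a Hamiltonian that is diagonal in the computational basis. The function names are hypothetical.

```python
import numpy as np


def dephase(rho, energies):
    """Dephasing D_H in the eigenspaces of H = diag(energies)."""
    energies = np.asarray(energies)
    same = np.isclose(energies[:, None], energies[None, :])
    return np.where(same, rho, 0)


def pinching_gap(rho, energies):
    """Smallest eigenvalue of m * D_H(rho) - rho, with m the number of distinct energies.

    The pinching inequality asserts that this number is nonnegative.
    """
    m = len(np.unique(energies))
    return float(np.linalg.eigvalsh(m * dephase(rho, energies) - rho).min())
```

The inequality is tight: for a uniform superposition over $m$ nondegenerate levels, `pinching_gap` vanishes up to rounding.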

On the other hand, we have

[TABLE]

using Hölder’s inequality. Conveniently, $[G,\gamma^{\prime}]=0$ by construction, and thus also $[G,\gamma^{\prime 1/2}]=0$ and $[G,\gamma^{\prime-1/2}]=0$ , and $\bigl{\lVert}{\gamma^{\prime-1/2}G^{\dagger}\gamma^{\prime 1/2}}\bigr{\rVert}_{\infty}=\bigl{\lVert}{G}\bigr{\rVert}_{\infty}\leqslant 1$ , since $G$ is a contraction (because $G^{\dagger}G\leqslant I$ ). Hence

[TABLE]

and thus

[TABLE]

Finally, we define

[TABLE]

We have $\operatorname{tr}(G\rho^{\prime}G^{\dagger})\geqslant 1-\varepsilon-4\sqrt{\varepsilon}\geqslant 1-5\sqrt{\varepsilon}$ , and thus

[TABLE]

and by a chain of triangle inequalities

[TABLE]

We can define $\Delta^{\prime}=\Delta+\beta\delta+\ln(m)-\ln(1-5\sqrt{\varepsilon})$ , while noting that $-\ln(1-5\sqrt{\varepsilon})\leqslant\ln(2)$ as $\varepsilon<1/100$ . Then, the state $\tilde{\rho}$ satisfies

[TABLE]

We then have $\Delta^{\prime}\leqslant\Delta+\beta\delta+\ln(2m)$ and $\Delta^{\prime}\geqslant\Delta+\beta\delta+\ln(m)$ .

Then, the conditions of Lemma 4 are fulfilled, and for any $k,k^{\prime}$ , we have that

[TABLE]

Now, for any $r>1$ we set $c=r\Delta^{\prime}$ . For any $k,k^{\prime}$ with $\lvert{k-k^{\prime}}\rvert\beta\delta\geqslant c$ , Equation 112 tells us that $\bigl{\lVert}{P_{k}\tilde{\rho}P_{k^{\prime}}}\bigr{\rVert}_{1}\leqslant{e}^{-(r-1)\Delta^{\prime}}=:\xi^{\prime}$ . We set $r=2$ in the following for convenience.

The conclusions of Lemma 3 apply to the interconversion of $\tilde{\rho}$ to and from the thermal state.

Distilling work from $\rho$ . Work can be distilled, i.e., the transition $\rho\to\gamma^{\prime\prime}$ is possible, with the parameters (we have set $\varepsilon^{\prime}=\sqrt{\varepsilon}$ in Lemma 3)

[TABLE]

Preparing the state $\rho$ . The state $\rho$ can be prepared, i.e., the transition $\gamma^{\prime\prime}\to\rho$ is possible, with the parameters

[TABLE]

Finally, letting $q\geqslant 2$ , we obtain the slightly simplified parameters in Theorem 1.

∎

III.3.5 Proof of Theorem 2

We now present the proof of Theorem 2, the main theorem of the first part of our main result. The proof proceeds by applying Theorem 1 in the thermodynamic limit.

Proof.

We use Theorem 1 to show asymptotic convertibility of $\widehat{P}$ (relative to $\widehat{\Sigma}$ ) to and from the Gibbs state $\gamma^{\prime\prime}$ on a trivial system at zero energy. We write $\widehat{\Sigma}^{\prime\prime}=\{\gamma^{\prime\prime}\}$ for the trivial sequence of trivial Gibbs states. For $\varepsilon>0$ , let

[TABLE]

and let $\Delta_{\infty,\varepsilon}:=\limsup_{n\to\infty}\Delta_{n,\varepsilon}/n$ . We have

[TABLE]

For $\varepsilon>0$ and for each $n$ , we apply Theorem 1 with the choices $S=S_{n,\varepsilon}$ , $\Delta=\Delta_{n,\varepsilon}$ , $\delta=\beta^{-1}\Delta_{n,\varepsilon}$ and $q=(\Delta_{\infty,\varepsilon})^{-1/4}$ . Then $m=O(\operatorname{poly}(n))/\Delta_{n,\varepsilon}$ . Observe that $\Delta_{n,\varepsilon}=O(n)$ and that $\Delta_{n,\varepsilon}$ increases at least as fast as $\sqrt{n}$ by definition; thus $m=O(\operatorname{poly}(n))$ . Let $w_{n,\varepsilon},\eta_{n,\varepsilon},\bar{\varepsilon}_{n,\varepsilon}$ be the parameters of the work extraction process given by Theorem 1 for these choices. Then

[TABLE]

and we can apply Lemma 13 in Appendix 0.A to conclude that $\widehat{P}\xrightarrow[\mathrm{TO}]{-\beta^{-1}\bar{S}}\widehat{\Sigma}^{\prime\prime}$ .

For the state formation process, we define $S^{\prime}_{n,\varepsilon}$ , $\bar{S}^{\prime}$ , $\Delta^{\prime}_{n,\varepsilon}$ , and $\Delta^{\prime}_{\infty,\varepsilon}$ similarly. Then the parameters $w^{\prime}_{n,\varepsilon},\,\eta^{\prime}_{n,\varepsilon},\bar{\varepsilon}^{\prime}_{n,\varepsilon}$ given by Theorem 1 satisfy

[TABLE]

where we used the fact that sublinear terms are suppressed, that $\limsup_{n\to\infty}[\Delta^{\prime}_{n,\varepsilon}+\beta\delta+\ln(am^{b})]/n=2\Delta^{\prime}_{\infty,\varepsilon}$ for any $a,b\geqslant 0$ , and that $\limsup_{n\to\infty}m^{2}e^{-(\Delta+\delta+\ln(m))}/n=0$ because $\Delta^{\prime}_{n,\varepsilon}+\delta+\ln(m)$ grows at least as fast as $\sqrt{n}$ and the exponential takes over the polynomial. Thus from Lemma 13 in Appendix 0.A we see that $\widehat{\Sigma}^{\prime\prime}\xrightarrow[\mathrm{TO}]{\beta^{-1}\bar{S}}\widehat{P}$ . Combining these two processes for different states immediately yields $\widehat{P}\xrightarrow[\mathrm{TO}]{\beta^{-1}[{S}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma}^{\prime})-{S}(\widehat{P}\,\|\,\widehat{\Sigma})]}\widehat{P}^{\prime}$ .

It is clear that if ${S}(\widehat{P}\,\|\,\widehat{\Sigma})\geqslant{S}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma}^{\prime})$ , then $\widehat{P}\xrightarrow[\mathrm{TO}]{}\widehat{P}^{\prime}$ from the above, using Property (e) of Proposition 14 in Appendix 0.A. Also, if $\widehat{P}\xrightarrow[\mathrm{TO}]{}\widehat{P}^{\prime}$ , then monotonicity of the spectral rates implies that ${S}(\widehat{P}\,\|\,\widehat{\Sigma})\geqslant{S}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma}^{\prime})$ . ∎

IV Collapse of the min and max divergences for ergodic states relative to local Gibbs states

In this section, we prove the main theorem of the second part of our main result (Theorem 3): for any $\widehat{P}$ that is translation invariant and ergodic, and for any local translation-invariant Gibbs state $\widehat{\Sigma}$ , we have ${\overline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})={\underline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})={S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})$ . Combined with Theorem 2, this implies that all such states can be reversibly converted into one another with thermal operations and a negligible amount of coherence.

We prove this assertion in two steps. First, we formulate a generalized version of Stein’s lemma Nagaoka and Ogawa (2000); Bjelakovic and Siegmund-Schultze (2004); Nagaoka and Hayashi (2007); Bjelakovic and Siegmund-Schultze (2003); Brandão and Plenio (2010). We derive a sufficient condition, heavily inspired by these references, for the min and max divergences to converge to the same value. The condition is the existence of an operator that obeys a simple set of properties and plays the role of a typical projector. In a second step, we prove that this condition is fulfilled for ergodic states and local translation-invariant Gibbs states.

IV.1 A sufficient condition for quantum Stein’s lemma

The quantum Stein’s lemma relates to a hypothesis test between two states $\hat{\rho}_{n}$ and $\hat{\sigma}_{n}$ using a single measurement. If we employ the optimal strategy that correctly reports $\hat{\rho}_{n}$ with probability at least $\eta$ , then the probability of erroneously reporting $\hat{\rho}_{n}$ decreases exponentially as $\exp(-n{S}_{1}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n}))$ , with the rate being given by the KL divergence. This statement holds in several known cases, such as for i.i.d. states, or if $\hat{\rho}_{n}$ is ergodic and $\hat{\sigma}_{n}$ is i.i.d. Bjelakovic and Siegmund-Schultze (2004).

Quantum Stein’s lemma can be formulated in terms of the hypothesis testing divergence. For sequences $\widehat{P}$ , $\widehat{\Sigma}$ , a quantum Stein’s lemma would state that for all $0<\eta<1$ ,

[TABLE]

Because the hypothesis testing divergence is monotonic in $\eta$ , and because it interpolates between the min and max divergences [cf. Eq. 33], we see that the hypothesis testing divergence converges to the KL divergence as per (125), if and only if the min and max divergences converge to the KL divergence,

[TABLE]

Therefore, to prove (126) for a class of states it suffices to prove (125).

The simplest situation where the quantum Stein’s lemma holds is the i.i.d. setting, i.e., $\widehat{P}:=\{\hat{\rho}^{\otimes n}\}$ and $\widehat{\Sigma}:=\{\hat{\sigma}^{\otimes n}\}$ . In this situation, for any $0<\eta<1$ ,

[TABLE]

and consequently,

[TABLE]

as was proved in (Nagaoka and Hayashi, 2007, Theorem 2).
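The i.i.d. statement can be illustrated in the commuting (classical) case, where the optimal test is the randomized Neyman-Pearson test. The sketch below takes $\hat{\rho}$ and $\hat{\sigma}$ to be Bernoulli distributions with parameters `p1 > q1` and evaluates $-\frac{1}{n}\ln\bigl(\min_{Q}\operatorname{tr}[Q\hat{\sigma}^{\otimes n}]/\eta\bigr)$ subject to $\operatorname{tr}[Q\hat{\rho}^{\otimes n}]\geqslant\eta$ . The normalization by $\eta$ is an assumed convention for the hypothesis testing divergence, and the function names are hypothetical.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of the Binomial(n, p) probability of k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def kl_bernoulli(p1, q1):
    """KL divergence between Bernoulli(p1) and Bernoulli(q1)."""
    return p1 * math.log(p1 / q1) + (1 - p1) * math.log((1 - p1) / (1 - q1))


def hypothesis_testing_rate(n, p1, q1, eta):
    """-(1/n) ln( beta / eta ), with beta the optimal type-II error.

    Assumes p1 > q1, so the likelihood ratio increases with the number k
    of ones and the optimal test accepts the largest values of k first.
    """
    acc_p = 0.0        # probability under P accepted so far
    log_terms = []     # logs of the accepted probability under Q
    for k in range(n, -1, -1):
        pk = math.exp(log_binom_pmf(n, k, p1))
        lq = log_binom_pmf(n, k, q1)
        if acc_p + pk >= eta:
            frac = (eta - acc_p) / pk   # randomize on the boundary value of k
            log_terms.append(math.log(frac) + lq)
            break
        acc_p += pk
        log_terms.append(lq)
    mx = max(log_terms)
    log_beta = mx + math.log(sum(math.exp(t - mx) for t in log_terms))
    return -(log_beta - math.log(eta)) / n
```

Evaluating `hypothesis_testing_rate(n, 0.5, 0.2, eta)` for increasing `n` gives values that approach `kl_bernoulli(0.5, 0.2)` for any fixed `eta` strictly between 0 and 1, with finite-size corrections that depend on `eta`.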

We now derive a sufficient condition for the convergence (125), providing a generalization of the quantum Stein’s lemma beyond i.i.d. states.

Lemma 5.

Let $\widehat{P}$ and $\widehat{\Sigma}$ be any sequences of states. Suppose that there exists $c\in\mathbb{R}$ such that for any $\varepsilon>0$ , there exists a sequence of operators $\hat{W}_{n}^{\varepsilon}$ that satisfy, for sufficiently large $n$ ,

[TABLE]

Then, for any $0<\eta<1$ , we have

[TABLE]

Our proof is based on tools from semidefinite programming Watrous (2009); Dupuis et al. (2013); Faist and Renner (2018), which imply that the hypothesis testing divergence is equivalently expressed using two different optimizations:

[TABLE]

The optimizations are called the primal problem and dual problem respectively. We note that our proof below only requires the so-called weak duality between the minimization and the maximization, which states that the optimal value of the minimization problem is an upper bound to the optimal value of the maximization problem.

The reason that we have equality in (131) is that for the hypothesis testing divergence, the stronger notion of strong duality holds, which states that both optimization problems have the same optimal value. We note that the reason we write a supremum for the dual problem is that for $\eta=1$ , even as strong duality holds, we are not guaranteed that the supremum is achieved by a specific choice of $\mu$ and $\hat{X}$ . In the primal problem the minimum is always achieved. This can be seen using Slater’s conditions Watrous (2009), noting that we can restrict the optimization to the support of $\hat{\sigma}$ .

Proof of Lemma 5.

Our proof proceeds by exhibiting explicit candidates in both optimizations in (131), yielding upper and lower bounds that both converge to $c$ as $n\to\infty$ .

Let $\hat{Q}_{n}^{\varepsilon}:=\hat{W}_{n}^{\varepsilon}{}^{\dagger}\hat{W}_{n}^{\varepsilon}$ . From condition (129d) and Lemma 9 (a) in Appendix 0.A, we have

[TABLE]

which implies that for any $0<\eta<1$ , we have $\operatorname{tr}\bigl{[}\hat{Q}_{n}^{\varepsilon}\hat{\rho}_{n}\bigr{]}>\eta$ for sufficiently large $n$ , so that $\hat{Q}_{n}^{\varepsilon}$ is a valid optimization candidate in the minimization in (131). Using (129b), the value attained by this candidate is

[TABLE]

and thus

[TABLE]

By taking $n\to\infty$ and then $\varepsilon\to+0$ , we conclude that ${S}_{\mathrm{H}}^{\eta}(\widehat{P}\,\|\,\widehat{\Sigma})\geqslant c$ .

Now we consider the second optimization in (131). First, we note that using a generalization of the pinching inequality (Lemma B.1 of Ref. Faist et al. (2021)),

[TABLE]

Let $\mu:=e^{-n(c+2\varepsilon)}/2>0$ and $\hat{X}:=2\mu(\hat{I}-\hat{W}_{n}^{\varepsilon}{}^{\dagger})\hat{\rho}_{n}(\hat{I}-\hat{W}_{n}^{\varepsilon})\geqslant 0$ . From inequality (135) and condition (129c), we have $\mu\hat{\rho}_{n}\leqslant\hat{\sigma}_{n}+\hat{X}$ , and hence $\mu,\hat{X}$ are valid optimization candidates in the maximization in (131). From Lemma 9 (b) in Appendix 0.A, we have $\operatorname{tr}\bigl{[}{(\hat{I}-\hat{W}_{n}^{\varepsilon\dagger})\hat{\rho}_{n}(\hat{I}-\hat{W}_{n}^{\varepsilon})}\bigr{]}\to 0$ as $n\to\infty$ , and therefore, for sufficiently large $n$ , we have $\operatorname{tr}\bigl{[}{(\hat{I}-\hat{W}_{n}^{\varepsilon\dagger})\hat{\rho}_{n}(\hat{I}-\hat{W}_{n}^{\varepsilon})}\bigr{]}\leqslant\eta/4$ . Therefore, for sufficiently large $n$ ,

[TABLE]

The value attained by the maximization is then

[TABLE]

Dividing by $n$ , taking $n\to\infty$ and then $\varepsilon\to+0$ , we deduce that ${S}_{\mathrm{H}}^{\eta}(\widehat{P}\,\|\,\widehat{\Sigma})\leqslant c$ . ∎

In fact, one can see that the product of two typical projectors constructed in Ref. Bjelakovic and Siegmund-Schultze (2003) for the i.i.d. case satisfies the conditions (129a)–(129d) above, with $c={S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})$ .

IV.2 Formulation of ergodic states and local Gibbs states

In the second step of our main result, we consider ergodic states and local Gibbs states. Here we show that for these states, it is possible to construct an operator that satisfies the conditions in Lemma 5, in turn proving the collapse of the min and max divergences to the KL divergence.

The standard way to rigorously formulate ergodicity invokes infinite-dimensional $C^{\ast}$ -algebras Bratteli and Robinson (1987, 1981); Ruelle (1999). Here, for the sake of broad readability, we introduce the relevant concepts directly in an equivalent — albeit perhaps less elegant — formulation that does not require the use of $C^{\ast}$ algebras. For completeness, we provide the construction based on $C^{\ast}$ algebras in Appendix 0.C.

We consider a spatially $d$ -dimensional system on the lattice $\mathbb{Z}^{d}$ . To each site $i\in\mathbb{Z}^{d}$ , we assign a copy $\mathscr{H}_{i}$ of a finite-dimensional Hilbert space, such that the Hilbert spaces for all sites are isomorphic. We denote the set of operators acting on $\mathscr{H}_{i}$ by $\mathcal{A}_{i}$ . For a bounded region $\Lambda\subset\mathbb{Z}^{d}$ , we define $\mathscr{H}_{\Lambda}:=\bigotimes_{i\in\Lambda}\mathscr{H}_{i}$ and $\mathcal{A}_{\Lambda}:=\bigotimes_{i\in\Lambda}\mathcal{A}_{i}$ . We note that these are finite-dimensional spaces because $\Lambda$ is bounded.

For a bounded region $\Lambda\subset\mathbb{Z}^{d}$ , we consider a density operator $\hat{\rho}_{\Lambda}$ whose support is $\Lambda$ , i.e., $\hat{\rho}_{\Lambda}\in{\mathcal{S}}(\mathscr{H}_{\Lambda})$ . We assume that we are given a collection $\{\hat{\rho}_{\Lambda}\}$ for all bounded subregions of the lattice, which furthermore obey the consistency condition, namely,

[TABLE]

This condition is necessary to ensure that all $\hat{\rho}_{\Lambda}$ are obtained from a common global state defined on the entire infinite lattice (see Appendix 0.C).

Consider now a sequence of bounded regions of the lattice defined as follows. For any $\ell\in\mathbb{N}$ , let $[-\ell,\ell]:=\{-\ell,-\ell+1,\cdots,\ell-1,\ell\}\subset\mathbb{Z}$ and $\Lambda_{\ell}:=[-\ell,\ell]^{d}\subset\mathbb{Z}^{d}$ . We define the sequence of quantum states $\widehat{P}=\{\hat{\rho}_{n}\}$ by $\hat{\rho}_{n}:=\hat{\rho}_{\Lambda_{\ell}}$ , where we set $n:=(2\ell+1)^{d}=|\Lambda_{\ell}|$ . While $n=(2\ell+1)^{d}$ with $\ell=1,2,\cdots$ does not run over all of the elements of $\mathbb{N}$ , it does not affect our following argument; indeed, it is straightforward to complete the sequence with intermediate states for all $n\in\mathbb{N}$ such that the limits that we derive are unaffected.

Before we can formulate ergodicity, we consider the shift superoperator. The shift superoperator $T_{i}$ is defined such that for any local operator $\hat{A}_{j}$ whose support is $j\in\mathbb{Z}^{d}$ , it is mapped by $T_{i}$ to the same operator at site $j+i\in\mathbb{Z}^{d}$ , i.e., $T_{i}(\hat{A}_{j})=\hat{A}_{j+i}$ , where we regard $i\in\mathbb{Z}^{d}$ as a $d$ -dimensional vector with the standard addition for such vectors.

Definition 8 (Translation invariance).

A sequence $\widehat{P}$ of the form above is translation invariant, if it satisfies the consistency condition (138), and for all $n=(2\ell+1)^{d}$ , all $\hat{A}\in\mathcal{A}_{\Lambda}$ with $\Lambda$ being bounded, and all $i\in\mathbb{Z}^{d}$ satisfying $T_{i}(\hat{A})\in\mathcal{A}_{\Lambda_{\ell}}$ , we have

[TABLE]

We note that “translation invariant” is often referred to as “stationary” in the context of ergodic theory. In our setup, we interpret $i\in\mathbb{Z}^{d}$ as a coordinate of the spatial position instead of time, and therefore we prefer the denomination “translation invariant.”

Translation invariance is a central ingredient for the definition of ergodicity:

Definition 9 (Ergodicity).

A sequence $\widehat{P}$ is translation-invariant and ergodic, if it is translation invariant, and for all self-adjoint $\hat{A}\in\mathcal{A}_{\Lambda}$ for a bounded region $\Lambda$ we have

[TABLE]

where $n=(2\ell+1)^{d}$ on the left-hand side is taken such that $T_{i}(\hat{A})\in\mathcal{A}_{\Lambda_{\ell}}$ for all $i\in\Lambda_{m}$ .

The limit on the right-hand side of (140) is not actually necessary, because the consistency condition (138) implies that $\operatorname{tr}[\hat{\rho}_{n}\hat{A}]$ does not depend on $n$ for large $n=(2\ell+1)^{d}$ satisfying $\Lambda\subset\Lambda_{\ell}$ . The equivalence of this definition and the standard definition is proved in Appendix 0.C.

This definition implies that the variance of the shift average (i.e., the spatial average) of any local observable vanishes in the thermodynamic limit. We emphasize that an ergodic state can be out of equilibrium, because ergodicity is defined with respect to the spatial shift instead of time evolution.

We now define the Hamiltonian of the system which determines the Gibbs state. Let $\hat{h}_{i}$ be a local operator describing interaction, whose support is a bounded region around site $i\in\mathbb{Z}^{d}$ . More precisely, we assume that the support of $\hat{h}_{i}$ is in $\{j:|j_{k}-i_{k}|\leqslant r,\ \forall k\}\subset\mathbb{Z}^{d}$ , where $0\leqslant r<\infty$ is an integer and $i_{k}$ , $j_{k}$ describe the $k$ -th components of $i,j\in\mathbb{Z}^{d}$ ( $k=1,\cdots,d$ ). We note that $r$ represents the interaction length, where $r=0$ describes non-interacting cases.

Then, for a bounded region $\Lambda\subset\mathbb{Z}^{d}$ , the truncated Hamiltonian is given by

[TABLE]

A Hamiltonian of this form is referred to as a local Hamiltonian. The Hamiltonian is translation invariant, if it can be written in the form

[TABLE]

for some fixed operator $h_{0}$ .

Let $\beta>0$ be the inverse temperature. The truncated Gibbs state on a bounded region $\Lambda$ is given by the density operator

[TABLE]

where $F_{\Lambda}:=-\beta^{-1}\ln\operatorname{tr}[\exp(-\beta\hat{H}_{\Lambda})]$ is the truncated free energy. We note that $\hat{\sigma}^{\Box}_{\Lambda}$ does not satisfy the consistency condition (138), because of the effects on the edges of the region $\Lambda$ where we have truncated the Hamiltonian.
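The edge effect mentioned above can be made concrete with a small numerical sketch. It assumes a one-dimensional open chain with a hypothetical nearest-neighbor model, $-J Z_{i}Z_{i+1}-gX_{i}-h_{z}Z_{i}$, and sums only the terms fully supported inside the chain; this convention, the model, and all function names are assumptions made for illustration. The sketch builds the truncated Gibbs state and free energy, and compares the marginal of the Gibbs state on four sites with the Gibbs state on three sites.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def embed(num_sites, site_ops):
    """Tensor product over the chain: given operators on the listed sites,
    identities elsewhere."""
    out = np.array([[1.0]])
    for site in range(num_sites):
        out = np.kron(out, site_ops.get(site, I2))
    return out

def chain_hamiltonian(num_sites, J=1.0, g=0.5, hz=0.5):
    """Open-chain Hamiltonian built from terms supported inside the chain."""
    dim = 2**num_sites
    H = np.zeros((dim, dim))
    for i in range(num_sites - 1):
        H -= J * embed(num_sites, {i: Z, i + 1: Z})
    for i in range(num_sites):
        H -= g * embed(num_sites, {i: X}) + hz * embed(num_sites, {i: Z})
    return H

def gibbs_state_and_free_energy(H, beta):
    """Return exp(beta * (F - H)) and F = -(1/beta) * ln tr exp(-beta * H)."""
    energies, vecs = np.linalg.eigh(H)
    weights = np.exp(-beta * (energies - energies[0]))  # shifted for stability
    partition = weights.sum()
    sigma = (vecs * (weights / partition)) @ vecs.conj().T
    free_energy = energies[0] - np.log(partition) / beta
    return sigma, free_energy

def trace_out_last_site(rho):
    d = rho.shape[0] // 2
    return np.einsum("iaja->ij", rho.reshape(d, 2, d, 2))

def trace_distance(a, b):
    return 0.5 * np.abs(np.linalg.eigvalsh(a - b)).sum()

def edge_mismatch(J, g, hz, beta=1.0, num_sites=4):
    """Distance between the marginal of the larger Gibbs state and the smaller one."""
    big, _ = gibbs_state_and_free_energy(chain_hamiltonian(num_sites, J, g, hz), beta)
    small, _ = gibbs_state_and_free_energy(chain_hamiltonian(num_sites - 1, J, g, hz), beta)
    return trace_distance(trace_out_last_site(big), small)

print("interacting:    ", edge_mismatch(J=1.0, g=0.5, hz=0.5))
print("non-interacting:", edge_mismatch(J=0.0, g=0.5, hz=0.5))
```

For $J=0$ the Gibbs state is a product state and the mismatch vanishes up to rounding, which corresponds to the non-interacting case $r=0$. With interactions, the boundary site of the smaller chain is missing a neighbor, so the two states generally differ.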

We consider a sequence of the truncated Gibbs states: We define $\widehat{\Sigma}^{\Box}:=\{\hat{\sigma}^{\Box}_{n}\}$ with $\hat{\sigma}^{\Box}_{n}:=\hat{\sigma}^{\Box}_{\Lambda_{m}}$ , where $n:=(2\ell+1)^{d}$ and $m:=\ell-r$ . We note that, with this definition, the supports of $\hat{\sigma}^{\Box}_{n}$ and $\hat{\rho}_{n}$ are the same. In the following we use the shorthands $\hat{H}_{n}:=\hat{H}_{\Lambda_{m}}$ and $F_{n}:=F_{\Lambda_{m}}$ .

IV.3 Generalized Stein’s lemma for ergodic states relative to local Gibbs states

We now consider a proof of a generalization of the quantum Stein’s lemma for ergodic states relative to local Gibbs states. We begin by proving that the limiting KL divergence is well defined:

Lemma 6.

Suppose that $\widehat{P}$ is translation invariant and $\widehat{\Sigma}^{\Box}$ is the sequence of truncated Gibbs states of a local and translation-invariant Hamiltonian in any dimension. Then $S_{1}(\widehat{P}\|\widehat{\Sigma}^{\Box})$ exists.

Proof.

This follows from the following well-known facts. From Eq. 143,

[TABLE]

The first term on the right-hand side converges to ${S}_{1}(\widehat{P})$ because $\widehat{P}$ is translation invariant (Proposition 6.2.38 of Ref. Bratteli and Robinson (1981)). It is also known that the second term converges to the free energy density (Theorem 6.2.40 of Ref. Bratteli and Robinson (1981)). The third term also converges, because $\hat{H}_{\Lambda}$ is local and translation invariant, and $\widehat{P}$ is translation invariant. ∎

One important ingredient in the proof of our generalization of the quantum Stein’s lemma is the following typical projector for ergodic states (Theorem 2.1 of Ref. Bjelaković et al. (2004); see also Theorem 5.1 of Ref. Bjelaković and Szkola (2005) and Theorem 1.4 of Ref. Ogata (2013)).

Proposition 8 (Quantum Shannon-McMillan Theorem).

Suppose that $\widehat{P}$ is ergodic. Then for any $\varepsilon>0$ there exists a sequence of projectors $\hat{\Pi}_{\widehat{P},n}^{\varepsilon}$ (called typical projectors) that satisfy, for sufficiently large $n$ ,

[TABLE]

where $s:={S}_{1}(\widehat{P})$ .

We now consider our main theorem for ergodic states and for the truncated Gibbs state.

Theorem 3 (Collapse of the spectral rates for the truncated Gibbs state).

Consider a lattice $\mathbb{Z}^{d}$ of spatial dimension $d$ and suppose that $\widehat{P}$ is translation invariant and ergodic, as in Section IV.2. Let $\widehat{\Sigma}^{\Box}$ be the sequence of truncated Gibbs states of a local and translation invariant Hamiltonian on the lattice. Then, for any $0<\eta<1$ ,

[TABLE]

and as a consequence,

[TABLE]

Proof.

From the proof of Lemma 6, the following limit exists,

[TABLE]

Let $s:=S_{1}(\widehat{P})$ . We define relative typical projectors (as inspired by Refs. Bjelakovic and Siegmund-Schultze (2004, 2003)) as

[TABLE]

which satisfy by definition

[TABLE]

We then define

[TABLE]

The remainder of the proof is devoted to showing that the operator $\hat{W}_{n}^{\varepsilon}$ satisfies the four conditions (129a)–(129d) in Lemma 5 with

[TABLE]

These conditions then immediately imply Eq. 148, as discussed in Section IV.1.

The condition (129a) is clear by definition. Condition (129b) is obtained from inequalities (146) and (152) as

[TABLE]

The third condition (129c) is obtained from inequalities (145) and (152) as

[TABLE]

The final condition (129d) follows from Lemma 8 in Appendix 0.A, Eq. 147 in Proposition 8, and from

[TABLE]

To show Eq. 157, we use the assumption of ergodicity of $\widehat{P}$ . Since the Hamiltonian is local and translation invariant, we have

[TABLE]

where $T_{i}$ is the shift operator. Then, denoting by $\operatorname{Proj}\{\cdots\}$ the projection operator onto a subspace satisfying the corresponding condition, we have

[TABLE]

where $h:=\lim_{n\to\infty}\frac{1}{n}\operatorname{tr}\bigl{[}\hat{\rho}_{n}\hat{H}_{n}\bigr{]}$ and $f:=\lim_{n\to\infty}\frac{F_{n}}{n}$ . For sufficiently large $n$ , we have $\lvert{\frac{F_{n}}{n}-f}\rvert<\frac{\varepsilon}{2}$ , and therefore,

[TABLE]

By definition of ergodicity, observables of the form (158) converge in probability; we have proven Eq. 157. ∎

The above proof reduces to the main theorem of Ref. Bjelakovic and Siegmund-Schultze (2004) in the special case where $\widehat{\Sigma}^{\Box}$ is i.i.d., i.e., if the system has a strictly local Hamiltonian with no interaction terms ( $r=0$ ).

Finally, we can ask whether the same theorem holds also for the sequence $\widehat{\Sigma}$ of reduced states of the full Gibbs state on the infinite lattice. We show that this is indeed the case. Because this theorem requires a rigorous formulation in terms of $C^{*}$ -algebras, we defer the precise claim and proof to Theorem 4 in Appendix 0.C .

IV.4 Remarks on ergodicity, mixtures, and the KL divergence

IV.4.1 The mixing property

A local Gibbs state with a mixing (or clustering) property is ergodic. However, we emphasize that the converse is false; ergodicity does not necessarily imply that the state can be written as a Gibbs state of a local Hamiltonian.

Definition 10 (Mixing).

Let $T_{(k)}:=T_{(0,\cdots,0,1,0,\cdots,0)}$ be the shift operator corresponding to the one-step shift to the $k$ -th direction ( $k=1,2,\cdots,d$ ). A sequence $\widehat{P}$ has the mixing (or clustering) property, if it satisfies the consistency condition (138), and if for all $\hat{A},\hat{B}\in\mathcal{A}_{\Lambda}$ with $\Lambda$ being bounded and if for all $k$ , we have

[TABLE]

where $n=(2\ell+1)^{d}$ on the left-hand side is taken such that the supports of $T_{(k)}^{m}(\hat{A})$ and $\hat{B}$ are included in $\Lambda_{\ell}$ .

The equivalence of this definition and the standard definition is proven in Appendix 0.C. It is well-known that mixing implies ergodicity (cf. Ref. Ruelle (1999)):

Proposition 9.

Any translation-invariant and mixing state is ergodic.

For local operators and the Gibbs state of a local and translation-invariant Hamiltonian, a stronger property called the exponential clustering property has been proven for any $\beta>0$ in one dimension Araki (1969) and in higher dimensions for sufficiently high temperature (see, for example, Ref. Tasaki (2018) and references therein). Therefore, the quantum Stein’s lemma is proved for two local Gibbs states $\widehat{P}$ and $\widehat{\Sigma}$ at least for sufficiently high temperature.

IV.4.2 Mixtures of ergodic states

Consider now the situation in which the state is a mixture of different ergodic states. In this setting, ergodicity is broken, and the existence of a thermodynamic potential is no longer guaranteed.

Let $\widehat{P}^{(k)}:=\{\hat{\rho}_{n}^{(k)}\}$ be ergodic states ( $k=1,2,\cdots,K<\infty$ ), and consider their mixture $\widehat{P}:=\{\hat{\rho}_{n}\}$ with $\hat{\rho}_{n}:=\sum_{k}r_{k}\hat{\rho}_{n}^{(k)}$ , where $r_{k}>0$ and $\sum_{k}r_{k}=1$ . We continue to suppose that $\widehat{\Sigma}$ is given by the Gibbs state of a local and translation-invariant Hamiltonian. In this setting, we can show that the min and max divergences are given by the minimal and maximal value of the KL divergence of the states in the mixture, respectively:

Lemma 7.

The spectral divergence rates are split as

[TABLE]

while the KL divergence rate is given by

[TABLE]

Proof.

Equation 162 immediately follows from Proposition 11 in Appendix 0.A. To prove (163), we note that $-\operatorname{tr}\bigl{[}\hat{\rho}^{(k)}_{n}\ln\hat{\sigma}_{n}\bigr{]}$ is additive with respect to $k$ , and thus we only need to show ${S}_{1}(\widehat{P})=\sum_{k}r_{k}{S}_{1}(\widehat{P}^{(k)})$ . This in turn follows from the fact that the von Neumann entropy satisfies the following inequalities, $\sum_{k}r_{k}{S}_{1}(\hat{\rho}_{n}^{(k)})\leqslant{S}_{1}(\hat{\rho}_{n})\leqslant\sum_{k}r_{k}{S}_{1}(\hat{\rho}_{n}^{(k)})+{S}_{1}(\{r_{k}\})$ . ∎
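The entropy inequalities used in this proof can be checked numerically. The following sketch assumes randomly generated density matrices and hypothetical helper names; it is meant only as a sanity check of the two bounds. Since the gap between the bounds is ${S}_{1}(\{r_{k}\})\leqslant\ln K$, which is independent of $n$, dividing by $n$ makes both bounds coincide in the limit.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative test input only)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-15]  # drop numerical zeros before taking logs
    return float(-np.sum(evals * np.log(evals)))

def shannon_entropy(probs):
    probs = np.asarray(probs, dtype=float)
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log(probs)))

def entropy_mixture_bounds(states, weights):
    """Return (lower bound, entropy of the mixture, upper bound)."""
    mixture = sum(r * rho for r, rho in zip(weights, states))
    lower = sum(r * von_neumann_entropy(rho) for r, rho in zip(weights, states))
    upper = lower + shannon_entropy(weights)
    return lower, von_neumann_entropy(mixture), upper

rng = np.random.default_rng(0)
states = [random_density_matrix(8, rng) for _ in range(3)]
print(entropy_mixture_bounds(states, [0.5, 0.3, 0.2]))
```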

IV.4.3 The role of the KL divergence for the thermodynamic potential

Usually, we have that if the min and max divergences coincide, then the limiting values coincide with the limiting value of the KL divergence. This is because in usual cases, the asymptotic divergences obey

[TABLE]

Indeed, this inequality follows in usual cases from the fact that ${S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})\leqslant{S}_{\infty}(\hat{\rho}\,\|\,\hat{\sigma})$ combined with a continuity argument of the KL divergence in $\hat{\rho}$ which ensures the inequality persists after smoothing with $\varepsilon>0$ . Indeed, for $D(\hat{\rho}^{\prime},\hat{\rho})\leqslant\varepsilon$ , we have $\lvert{{S}_{1}(\hat{\rho}\,\|\,\hat{\sigma})-{S}_{1}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma})}\rvert\leqslant\lvert{{S}_{1}(\hat{\rho})-{S}_{1}(\hat{\rho}^{\prime})}\rvert+2\varepsilon\lVert{\ln\hat{\sigma}}\rVert_{\infty}$ , where the first term can be bounded using the Fannes-Audenaert inequality Fannes (1973); Audenaert (2007) and where the second term behaves as $O(\varepsilon n)$ as long as $\lVert{\ln\hat{\sigma}}\rVert_{\infty}$ is at most linear in $n$ . In this case, $\frac{1}{n}{S}_{0}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})\leqslant\frac{1}{n}{S}_{1}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})+O(\varepsilon)$ and $\frac{1}{n}{S}_{1}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})-O(\varepsilon)\leqslant\frac{1}{n}{S}_{\infty}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})$ , which ensures that (164) holds. Notably, while this is the case in most usual settings such as the one considered in the present paper, this continuity argument does not hold in general for arbitrary sequences of states and operators.

As a simple toy example, consider a two-level system with states $\lvert{0}\rangle,\lvert{1}\rangle$ , fix an inverse temperature $\beta>0$ , and let $\{\varepsilon_{n}\}$ be a sequence of small positive nonzero reals with $\lim_{n\to\infty}\varepsilon_{n}=0$ . We consider the sequence of states $\widehat{P}$ with $\hat{\rho}_{n}=\varepsilon_{n}\lvert{1}\rangle\langle{1}\rvert+(1-\varepsilon_{n})\lvert{0}\rangle\langle{0}\rvert$ and a sequence of Hamiltonians $\widehat{\mathcal{H}}$ with $\hat{H}_{n}=(n/\varepsilon_{n})\lvert{1}\rangle\langle{1}\rvert$ . (The sequence is defined on a single copy of the Hilbert space; it is straightforward to embed these operators in $\mathscr{H}^{\otimes n}$ , though perhaps not in a local and translation-invariant way.) The corresponding sequence $\widehat{\Sigma}$ of Gibbs weights is $\hat{\sigma}_{n}=e^{-\beta\hat{H}_{n}}=e^{-(\beta n/\varepsilon_{n})}\lvert{1}\rangle\langle{1}\rvert+(\hat{I}-\lvert{1}\rangle\langle{1}\rvert)$ . We can calculate

[TABLE]

For the min divergence and for any $\varepsilon>0$ we have

[TABLE]

and hence ${\underline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})\geqslant 0$ . On the other hand, for any $\varepsilon>0$ we have that $\varepsilon_{n}\leqslant\varepsilon$ for $n$ large enough; then for $n$ large enough, $D\bigl{(}\lvert{0}\rangle\langle{0}\rvert,\hat{\rho}_{n}\bigr{)}\leqslant\varepsilon$ and

[TABLE]

and hence ${\overline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})\leqslant 0$ . Finally, recalling (30), we find

[TABLE]

Crucially, the operator $\hat{\sigma}_{n}$ has an eigenvalue that is at least exponentially small in $n$ , and $\lVert{\ln\hat{\sigma}}\rVert_{\infty}$ is superlinear in $n$ . This invalidates the usual continuity argument described above. Having $\lVert{\ln\hat{\sigma}}\rVert_{\infty}$ with such a behavior amounts to having a Hamiltonian (such as $\hat{H}_{n}$ in our example) with an energy level that scales superlinearly in $n$ . Physically, this means that the system does not have a sound thermodynamic limit; in practice, for instance in the case of all-to-all coupling, one prefers to normalize the full Hamiltonian to ensure a good behavior in the thermodynamic limit. Nevertheless, our toy example shows that in full generality, the min- and max-divergences can collapse to a single value and define a thermodynamic potential which does not coincide with the KL divergence in the thermodynamic limit.

We emphasize that this issue does not appear in usual settings such as the one considered in the present paper, where the energy is extensive. Also, this issue cannot appear with the spectral entropy rates (i.e., if $\hat{\sigma}=\hat{I}$ ), because of the argument above, or alternatively, thanks to Lemma 3 of Ref. Bowen and Datta (2006b). In those cases, the Kullback-Leibler divergence (or the von Neumann entropy rate) is the relevant thermodynamic potential that emerges from the reversibility of the resource theory.
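The toy example of this subsection can be evaluated in closed form. The sketch below makes several assumptions for illustration: the choice $\varepsilon_{n}=1/n$, the function names, and the unsmoothed definitions ${S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})=-\ln\operatorname{tr}[\hat{P}_{\hat{\rho}}\hat{\sigma}]$ and ${S}_{\infty}(\hat{\rho}\,\|\,\hat{\sigma})=\ln\min\{\lambda:\hat{\rho}\leqslant\lambda\hat{\sigma}\}$, which may differ in detail from the smoothed quantities used in the main text.

```python
import numpy as np

def binary_entropy(p):
    return float(-p * np.log(p) - (1 - p) * np.log(1 - p))

def kl_rate(n, beta, eps_n):
    """(1/n) S_1(rho_n || sigma_n) with rho_n = diag(1 - eps_n, eps_n) and
    ln sigma_n = diag(0, -beta * n / eps_n) in the basis |0>, |1>."""
    kl = -binary_entropy(eps_n) + eps_n * (beta * n / eps_n)
    return kl / n

def min_divergence_rate(n, beta, eps_n):
    """(1/n) S_0(rho_n || sigma_n), unsmoothed: rho_n has full rank, so the
    projector onto its support is the identity and tr[sigma_n] is 1 + exp(-beta*n/eps_n)."""
    return float(-np.log1p(np.exp(-beta * n / eps_n))) / n

def max_divergence_rate_of_ground_state(n):
    """(1/n) S_inf(|0><0| || sigma_n): sigma_n acts as the identity on |0>,
    so the smallest lambda with |0><0| <= lambda * sigma_n is 1."""
    return float(np.log(1.0)) / n

beta = 1.0
for n in (10, 100, 1000):
    eps_n = 1.0 / n  # illustrative choice of the vanishing sequence
    print(n, kl_rate(n, beta, eps_n), min_divergence_rate(n, beta, eps_n),
          max_divergence_rate_of_ground_state(n))
```

Under these assumptions the KL rate approaches $\beta$, because the rare excited level contributes $\varepsilon_{n}\cdot\beta n/\varepsilon_{n}=\beta n$ to the relative entropy, while the min- and max-type quantities have vanishing rates.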

V Discussion

Our results provide new insight on the role of ergodicity and typicality in many-body systems Anshu (2016); Wilming et al. (2019). Our two main theorems on one hand advance our understanding of the possible interconversion of states with thermal operations and a limited source of coherence, and on the other hand establish a generalized quantum Stein’s lemma for lattice systems with local and translation-invariant Hamiltonians. Together, these theorems prove our main result, namely, that a thermodynamic potential emerges in the resource theory of thermal operations for all ergodic states in lattices with a translation-invariant local Hamiltonian.

Thermal operations involving nonsemiclassical states.

While the possible state transformations under thermal operations are well understood for semiclassical states thanks to the notion of thermomajorization Horodecki and Oppenheim (2013), the picture becomes significantly more involved if we consider states that present coherences between energy eigenspaces Gour et al. (2018); Lostaglio et al. (2015a). The min- and max-divergence no longer represent the distillable work and the work cost of formation of a state, because in general one requires a suitable reference frame to accurately carry out those transformations Bartlett et al. (2007); Korzekwa et al. (2016); Gour et al. (2018); Popescu et al. (2018). Our Theorem 2 shows, however, that if the two divergences coincide approximately, then the coherences that are present in the state are necessarily small in a suitable sense, such that these transformations become approximately possible after all with only a small reference frame. In the thermodynamic limit, the size of the reference frame becomes negligible.

Our theorem provides a conceptually clear characterization of which states can be reversibly converted to the thermal state, and hence, for which class of states the thermodynamic potential emerges. Namely, approximately reversible conversion to the thermal state is possible if and only if the min and max divergences coincide approximately (although the error terms have to be adjusted in each direction of the proof).

We resort to a crude metric for the amount of coherence that was used in a process: We allow the use of an ancilla whose Hamiltonian is suitably bounded. Recently, more refined methods of accounting for coherence have been introduced, such as via coherent work Mingo and Jennings (2019) or with a more traditional resource-theoretic approach Marvian (2020). Using an improved measure of coherence would help clarify the amount of coherence used in the processes of Theorem 2.

One could ask for a characterization of which classes of states can be reversibly converted into one another, without being necessarily reversibly convertible to the thermal state. Consider for instance two states with the same spectrum that is not uniform, both living within a fixed energy subspace: They can be related by an energy-conserving unitary, but they cannot be reversibly converted to the thermal state. In this paper, we have adopted the convention that a thermodynamic potential should be well defined for the thermal state itself. Curiously however, it is also possible to define some kind of “alternative thermodynamic potentials” for such classes of states which cannot include the thermal state. It is not clear to us what the physical relevance of such classes of states would be.

We also note that ergodic states have off-diagonal elements that vanish exponentially, similarly to the behavior encountered in states obeying the eigenstate thermalization hypothesis (Lemma 4 combined with Theorem 3). It is then natural to ask whether there are properties of states that obey the eigenstate thermalization hypothesis (such as error-correcting properties Brandão et al. (2019)) that can be carried over to ergodic states.

Asymptotic Equipartition, the Shannon-McMillan theorem, and Stein’s lemma

The classical Shannon-McMillan theorem along with its quantum counterparts provide a collection of AEP statements that play an important role in information theory, statistics, and statistical physics, where ergodic processes are naturally encountered. Because of the stark formal differences between the quantum and the classical definitions of Markovianity, the quantum versions of these AEP theorems do not follow directly from their classical counterparts. Building on earlier proofs of the quantum Shannon-McMillan theorem Bjelaković et al. (2004); Bjelaković and Szkola (2005); Ogata (2013) and a relative AEP theorem with respect to product states Bjelakovic and Siegmund-Schultze (2004), we finally provide the full quantum version of the classical relative AEP theorem mentioned above, which applies to ergodic states relative to Gibbs states of a local Hamiltonian.

A main component of the proof of our main result is a generalized version of Stein’s lemma which is tightly related to the proof techniques of Ref. Bjelakovic and Siegmund-Schultze (2003). Namely, it suffices to find an operator obeying a set of simple conditions to conclude that the min and max divergences collapse, which can be seen partly thanks to ideas from semidefinite programming Dupuis et al. (2013); Tomamichel and Hayashi (2013). By constructing suitable typical projectors using the ergodicity property of the state, our Theorem 3 exploits this characterization and provides a new version of Stein’s lemma. The latter applies to situations beyond i.i.d. states, since we may consider any ergodic state with respect to any Gibbs state that arises from a local Hamiltonian.

Crucially, the states we consider are spatially ergodic, rather than ergodic with respect to time evolution. Spatially ergodic states can have a nontrivial time evolution, even producing significant changes of macroscopic quantities in time Faist et al. (2019a). Importantly, this shows that one can define a thermodynamic potential that has an operational interpretation even for certain states that are not in thermodynamic equilibrium.

By endowing a new class of states with a rigorous, well-justified thermodynamic potential, one may ask whether or not it is possible to find even larger classes of states that can be reversibly converted into one another. Thanks to Lemma 7, the thermodynamic potential also emerges for all finite mixtures of ergodic states with the same thermodynamic potential. Whether there are more translation-invariant states that have a well-defined thermodynamic potential in the sense of the present paper is an open question.

One may ask whether or not our results could be generalized to systems that violate translation-invariance. It might be possible to treat a weak violation by adapting the present argument with a suitable control of the relevant error terms. For systems that are fundamentally not translation-invariant, one could instead ask whether ideas from entropy accumulation could be leveraged to prove bounds on the min and max spectral rates in the thermodynamic limit, using local properties of the state (or of the local process that generates the state) rather than symmetry considerations Dupuis et al. (2020); Dupuis and Fawzi (2019). Conversely, insights gained from the behavior of the spectral rates in statistical mechanical systems might provide new ways of proving more general entropy accumulation theorems which might involve the divergence, the mutual information, or a channel capacity.

A further natural extension of our work would be to lift our results from transformations of quantum states to transformations of quantum channels, in line with the results of Ref. Faist et al. (2019b). Can non-i.i.d. quantum channels that have a suitable ergodic property be reversibly converted into one another?

The quantum Shannon-McMillan theorem moreover holds in a more general and abstract operator algebra context Ogata (2013). We might expect that additional AEP results in such settings can be shown using ideas put forward in the present paper.

Finally, one could attempt to further characterize the min and max divergence rates in natural situations where they do not coincide. These quantities are known to bound any extension of the thermodynamic potential outside of the set of reversibly interconvertible states Lieb and Yngvason (2013), and as such, the interval $[{\underline{S}}(\widehat{P}\,\|\,\widehat{\Sigma}),{\overline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})]$ provides a “best possible characterization” of the thermodynamic behavior of such states that takes into account the fluctuations in thermodynamic quantities that persist in the thermodynamic limit. We expect this to be the case, for instance, for many-body-localized states, or for states at critical points immediately before spontaneous symmetry breaking. The techniques put forward in the present paper might help derive bounds in such cases, which, while falling short of a collapse of the min and max divergences, would still provide a useful characterization for a greater class of states that are far out of equilibrium.

Acknowledgements.

The authors are grateful to Hiroyasu Tajima, Yoshiko Ogata and Matteo Lostaglio for valuable discussions. TS is supported by JSPS KAKENHI Grant Numbers JP16H02211 and JP19H05796. PhF is supported by the Institute for Quantum Information and Matter (IQIM) at Caltech, which is a National Science Foundation (NSF) Physics Frontiers Center (NSF Grant PHY-1733907), by the Department of Energy Award DE-SC0018407, by the Swiss National Science Foundation (SNSF) via the NCCR QSIT and project No. 200020_165843, and by the Deutsche Forschungsgemeinschaft (DFG) Research Unit FOR 2724. KK is supported by the Institute for Quantum Information and Matter (IQIM) at Caltech, which is a National Science Foundation (NSF) Physics Frontiers Center (NSF Grant PHY-1733907). FB is supported by the NSF.

Appendix 0.A General technical lemmas

The following gentle measurement lemma states that a measurement effect that is almost certain to appear does not disturb the state much Winter (1999); Ogawa and Nagaoka (2007).

Proposition 10.

For a state $\hat{\rho}$ and any operator with $0\leqslant\hat{Q}\leqslant\hat{I}$ , if $\operatorname{tr}[\hat{\rho}\hat{Q}]\geqslant 1-\varepsilon$ , then

[TABLE]

The following technical lemmas provide a few variations around the gentle measurement lemma, dealing with operators that capture most of the weight of a state.

Lemma 8.

Let $\hat{Q}$ and $\hat{Q}^{\prime}$ be projectors. Suppose that a state $\hat{\rho}$ satisfies $\operatorname{tr}\bigl{[}\hat{Q}\hat{\rho}\bigr{]}\geqslant 1-\varepsilon$ and $\operatorname{tr}\bigl{[}\hat{Q}^{\prime}\hat{\rho}\bigr{]}\geqslant 1-\varepsilon^{\prime}$ for $\varepsilon>0$ , $\varepsilon^{\prime}>0$ . Then,

[TABLE]

Proof.

We first note that

[TABLE]

From the Schwarz inequality, we have

[TABLE]

where we used that $\hat{Q}$ and $\hat{I}-\hat{Q}^{\prime}$ are projectors. Therefore, we obtain Eq. 170. ∎

Lemma 9.

Let $\hat{W}$ be an operator with $\lVert{\hat{W}}\rVert_{\infty}\leqslant 1$ . Suppose that a subnormalized state $\hat{\rho}\in{\mathcal{S}_{\leq}}(\mathscr{H})$ satisfies $\operatorname{Re}\bigl{(}\operatorname{tr}\bigl{[}\hat{W}\hat{\rho}\bigr{]}\bigr{)}\geqslant 1-\varepsilon$ with $\varepsilon>0$ . Then, both following statements are true:

(a) $\operatorname{tr}\bigl{[}\hat{W}^{\dagger}\hat{W}\hat{\rho}\bigr{]}\geqslant 1-2\varepsilon$ and $\operatorname{tr}\bigl{[}\hat{W}\hat{W}^{\dagger}\hat{\rho}\bigr{]}\geqslant 1-2\varepsilon$ ;

(b) $\operatorname{tr}\bigl{[}(\hat{I}-\hat{W})(\hat{I}-\hat{W}^{\dagger})\hat{\rho}\bigr{]}\leqslant 2\varepsilon$ .

Proof.

(a) From the Cauchy-Schwarz inequality,

[TABLE]

We can show the second inequality in the same manner.

(b) This follows from

[TABLE]

where we used $\lVert{\hat{W}}\rVert_{\infty}\leqslant 1$ . ∎


Next we show that for a mixture of states, the min and max spectral rates are given by the smallest or largest spectral rate in the mixture, respectively.

Proposition 11.

Consider a sequence of states $\widehat{P}:=\{\hat{\rho}_{n}\}$ where each state is given by a mixture $\hat{\rho}_{n}=\sum_{k=1}^{K}r_{k}\hat{\rho}_{n}^{(k)}$ for a given probability distribution $\{r_{k}\}_{k=1}^{K}$ independent of $n$ , and consider the individual sequences $\widehat{P}^{(k)}=\{\hat{\rho}_{n}^{(k)}\}$ . Then, the lower and the upper divergence rates of $\widehat{P}$ relative to a sequence of positive operators $\widehat{\Sigma}$ are given by

[TABLE]

This proposition immediately follows from the following three lemmas.

Lemma 10.

Consider a mixture of states $\hat{\rho}=\sum r_{k}\hat{\rho}^{(k)}$ with a probability distribution $\{r_{k}\}$ . Let $\hat{\tau}$ be a quantum state such that $F^{2}(\hat{\rho},\hat{\tau})\geqslant 1-\varepsilon^{2}$ . Then there exists a probability distribution $\{r_{k}^{\prime}\}$ and a collection of states $\hat{\tau}^{(k)}{}^{\prime}$ such that

[TABLE]

Proof.

Call our system of interest $A$ , and consider a copy $B\simeq A$ . Let $\{\lvert{j}\rangle_{A}\},\{\lvert{j}\rangle_{B}\}$ be orthonormal bases of $A$ and $B$ , respectively, and let $\lvert{\Phi}\rangle:=\sum_{j}\lvert{j}\rangle_{A}\lvert{j}\rangle_{B}$ be the reference unnormalized maximally entangled state. Consider the following purification of $\hat{\rho}^{(k)}$ ,

[TABLE]

Let $C$ be a register with an orthonormal basis $\{\lvert{k}\rangle_{C}\}$ and consider the following purification of $\hat{\rho}_{A}$ ,

[TABLE]

From Uhlmann’s theorem, there exists a purification $\lvert{\hat{\tau}}\rangle_{ABC}$ of $\hat{\tau}_{A}$ such that

[TABLE]

Invoking the Fuchs-van de Graaf relations between the fidelity and the trace distance Fuchs and van de Graaf (1999); Nielsen and Chuang (2000), $1-F(\cdot,\cdot)\leqslant D(\cdot,\cdot)\leqslant\sqrt{1-F^{2}(\cdot,\cdot)}$ , we find that $D(\lvert{\hat{\rho}}\rangle,\lvert{\hat{\tau}}\rangle)\leqslant\varepsilon$ . Now, define

[TABLE]

From the monotonicity of the trace norm under CPTP maps, we have $D(\{r_{k}\},\{r^{\prime}_{k}\})\leqslant\varepsilon$ , where the trace distance is calculated between the two classical probability distributions and is then known as the total variation distance. Furthermore, the trace norm cannot increase under any CP and trace-nonincreasing map, and hence,

[TABLE]

This implies

[TABLE]

which completes the proof. ∎
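The Fuchs-van de Graaf relations invoked in this proof, $1-F\leqslant D\leqslant\sqrt{1-F^{2}}$, can be checked numerically. The sketch below assumes the convention $F(\hat{\rho},\hat{\tau})=\lVert\sqrt{\hat{\rho}}\sqrt{\hat{\tau}}\rVert_{1}$ for the (non-squared) fidelity and $D=\frac{1}{2}\lVert\hat{\rho}-\hat{\tau}\rVert_{1}$ for the trace distance; the helper names and the random test states are illustrative.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative test input only)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def fidelity(rho, tau):
    """F = || sqrt(rho) sqrt(tau) ||_1, computed as a sum of singular values."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(tau), compute_uv=False)
    return float(singular_values.sum())

def trace_distance(rho, tau):
    return 0.5 * float(np.abs(np.linalg.eigvalsh(rho - tau)).sum())

def fuchs_van_de_graaf_holds(rho, tau, tol=1e-9):
    f, d = fidelity(rho, tau), trace_distance(rho, tau)
    upper = np.sqrt(max(0.0, 1.0 - f**2))
    return bool((1.0 - f <= d + tol) and (d <= upper + tol))

rng = np.random.default_rng(1)
rho, tau = random_density_matrix(4, rng), random_density_matrix(4, rng)
print(fidelity(rho, tau), trace_distance(rho, tau), fuchs_van_de_graaf_holds(rho, tau))
```

In particular, $F^{2}\geqslant 1-\varepsilon^{2}$ gives $D\leqslant\sqrt{1-F^{2}}\leqslant\varepsilon$, which is the direction used for the purifications above.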

Lemma 11.

Consider a mixture of states $\hat{\rho}=\sum_{k=1}^{K}r_{k}\hat{\rho}^{(k)}$ with a probability distribution $\{r_{k}\}$ . Let $\varepsilon>0$ be such that $2\sqrt{\varepsilon}<r_{k}$ for all $k$ . Then

[TABLE]

Proof.

We first show inequality (183). For each $k$ , there exists $\hat{\tau}^{(k)}\in B^{\varepsilon}(\hat{\rho}^{(k)})$ such that ${S}_{\infty}^{\varepsilon}(\hat{\rho}^{(k)}\,\|\,\hat{\sigma})={S}_{\infty}(\hat{\tau}^{(k)}\,\|\,\hat{\sigma})$ . Let $\hat{\tau}:=\sum_{k}r_{k}\hat{\tau}^{(k)}$ , which is a candidate for minimization in ${S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ , because $D(\hat{\tau},\hat{\rho})\leqslant\sum_{k}r_{k}D(\hat{\tau}^{(k)},\hat{\rho}^{(k)})\leqslant\varepsilon$ , using the joint convexity of the trace distance. Then,

[TABLE]

We next show inequality (184). There exists $\hat{\tau}\in B^{\varepsilon}(\hat{\rho})$ such that ${S}_{\infty}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})={S}_{\infty}(\hat{\tau}\,\|\,\hat{\sigma})$ . By the Fuchs-van de Graaf inequalities Fuchs and van de Graaf (1999); Nielsen and Chuang (2000), we have $F(\hat{\rho},\hat{\tau})\geqslant 1-D(\hat{\rho},\hat{\tau})$ and thus $F^{2}(\hat{\rho},\hat{\tau})\geqslant 1-2\varepsilon$ . Let $\{\hat{\tau}^{(k)}\}$ be quantum states and $\{r^{\prime}_{k}\}$ be a probability distribution that are given by Lemma 10, such that $D(\{r_{k}\},\{r^{\prime}_{k}\})\leqslant\sqrt{2\varepsilon}$ and $D(\hat{\rho}^{(k)},\hat{\tau}^{(k)})\leqslant 2\sqrt{2\varepsilon}/r_{k}$ . Noting that $r_{k}^{\prime}\hat{\tau}^{(k)}\leqslant\hat{\tau}$ , we have

[TABLE]

which implies inequality (184). ∎

Lemma 12.

Consider a mixture of states $\hat{\rho}=\sum_{k=1}^{K}r_{k}\hat{\rho}^{(k)}$ with a probability distribution $\{r_{k}\}$ , and let $\varepsilon>0$ . Then

[TABLE]

Proof.

We first show inequality (187). For each $k$ , there exists $\hat{\tau}^{(k)}\in B^{\varepsilon}(\hat{\rho}^{(k)})$ such that ${S}_{0}^{\varepsilon}(\hat{\rho}^{(k)}\,\|\,\hat{\sigma})={S}_{0}(\hat{\tau}^{(k)}\,\|\,\hat{\sigma})$ . Let $\hat{\tau}:=\sum_{k}r_{k}\hat{\tau}^{(k)}$ , which is a candidate for maximization in ${S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})$ . We note that $\hat{P}_{\hat{\tau}}\leqslant\sum_{k}\hat{P}_{\hat{\tau}^{(k)}}$ , because the kernel of $\hat{\tau}$ is larger than the intersection of the kernels of $\hat{\tau}^{(k)}$ ’s. Therefore,

[TABLE]

We next show inequality (188). There exists $\hat{\tau}\in B^{\varepsilon}(\hat{\rho})$ such that ${S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})={S}_{0}(\hat{\tau}\,\|\,\hat{\sigma})$ . By the Fuchs-van de Graaf inequalities, we have $F^{2}(\hat{\rho},\hat{\tau})\geqslant 1-2\varepsilon$ as above. Let $\{\hat{\tau}^{(k)}\}$ be states and $\{r^{\prime}_{k}\}$ be a probability distribution given by Lemma 10. For all $k$ ,

[TABLE]

and therefore

[TABLE]

which implies inequality (188). ∎

Appendix 0.B Properties of our thermodynamic framework and convertibility proof for Gibbs-preserving maps

In this section we derive a collection of useful properties of thermodynamic transformations that were introduced in Section III.1, and provide a simplified version of Theorem 1 that is specialized to Gibbs-preserving maps.

The partial isometry in the definition of a thermal operation commutes with the system-and-bath Hamiltonian in the following sense.

Proposition 12.

Let $K,L$ be systems with Hamiltonians $\hat{H}_{K},\hat{H}_{L}$ and let $\hat{V}_{K\to L}$ be a partial isometry such that $\hat{V}_{K\to L}\hat{H}_{K}=\hat{H}_{L}\hat{V}_{K\to L}$ . Then

[TABLE]

In consequence, $\hat{V}_{K\to L}$ is a mapping of a subset of initial energy eigenstates on $K$ to some final energy eigenstates on $L$ .

Proof.

We compute directly $[\hat{V}^{\dagger}_{K\leftarrow L}\hat{V}_{K\to L},\hat{H}_{K}]=\hat{V}^{\dagger}_{K\leftarrow L}(\hat{V}_{K\to{}L}\hat{H}_{K}-\hat{H}_{L}\hat{V}_{K\to{}L})+(\hat{V}^{\dagger}_{K\leftarrow{}L}\hat{H}_{L}-\hat{H}_{K}\hat{V}^{\dagger}_{K\leftarrow{}L})\hat{V}_{K\to L}=0$ and similarly $[\hat{V}_{K\to L}\hat{V}^{\dagger}_{K\leftarrow L},\hat{H}_{L}]=0$ . ∎
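The commutation relations in this proof can be illustrated on a small example. The sketch below assumes hypothetical diagonal Hamiltonians on a four-level system $K$ and a three-level system $L$, together with a partial isometry that maps a superposition inside a degenerate eigenspace of $\hat{H}_{K}$ to an eigenstate of $\hat{H}_{L}$ with the same energy; none of these specific choices come from the text.

```python
import numpy as np

# Hypothetical spectra: K has a doubly degenerate level at energy 1.
H_K = np.diag([0.0, 1.0, 1.0, 2.0])
H_L = np.diag([1.0, 0.0, 3.0])

def ket(dim, index):
    v = np.zeros(dim)
    v[index] = 1.0
    return v

def commutator(a, b):
    return a @ b - b @ a

# V sends (|1> + |2>)/sqrt(2) on K (energy 1) to |0> on L (energy 1),
# and |0> on K (energy 0) to |1> on L (energy 0); everything else is in its kernel.
plus_K = (ket(4, 1) + ket(4, 2)) / np.sqrt(2)
V = np.outer(ket(3, 0), plus_K) + np.outer(ket(3, 1), ket(4, 0))

intertwining_error = np.abs(V @ H_K - H_L @ V).max()
support_projector = V.T @ V   # V^dagger V on K (V is real here)
range_projector = V @ V.T     # V V^dagger on L

print("V H_K = H_L V violated by:", intertwining_error)
print("[V^dag V, H_K] norm:", np.abs(commutator(support_projector, H_K)).max())
print("[V V^dag, H_L] norm:", np.abs(commutator(range_projector, H_L)).max())
```

Both commutators vanish because the support and the range of the partial isometry are spanned by energy eigenvectors, in line with the statement that $\hat{V}_{K\to L}$ maps a subset of initial energy eigenstates to final energy eigenstates.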

Now we show that any partial isometry that is compatible with the system Hamiltonian (i.e., one that maps the input Hamiltonian to the output Hamiltonian on the range of the partial isometry) can be dilated into a full energy-conserving unitary on a larger system from which the partial isometry is recovered by preparing an ancilla in a pure state and post-selecting on a specific measurement outcome of an ancilla on the output of the unitary. The present proof is partly adapted from (Faist et al., 2021, Proposition C.2).

Proposition 13 (Dilation of a partial energy-conserving isometry).

Consider systems $K,L$ with Hamiltonians $\hat{H}_{K},\hat{H}_{L}$ . Let $\hat{V}_{K\to L}$ be a partial isometry such that $\hat{V}_{K\to L}\hat{H}_{K}=\hat{H}_{L}\hat{V}_{K\to L}$ . Let $M$ be a system with Hamiltonian $\hat{H}_{M}$ , and suppose that there exist (possibly trivial) systems $\bar{K}$ and $\bar{L}$ with respective Hamiltonians $\hat{H}_{\bar{K}}$ , $\hat{H}_{\bar{L}}$ along with unitaries $\hat{U}^{\prime}_{K\bar{K}\to M}$ , $\hat{U}^{\prime\prime}_{L\bar{L}\to M}$ such that $\hat{U}^{\prime}_{K\bar{K}\to M}(\hat{H}_{K}+\hat{H}_{\bar{K}})=\hat{H}_{M}\hat{U}^{\prime}_{K\bar{K}\to M}$ and $\hat{U}^{\prime\prime}_{L\bar{L}\to M}(\hat{H}_{L}+\hat{H}_{\bar{L}})=\hat{H}_{M}\hat{U}^{\prime\prime}_{L\bar{L}\to M}$ . Let $\lvert{\mathrm{i}}\rangle_{\bar{K}},\lvert{\mathrm{f}}\rangle_{\bar{L}}$ be two eigenstates of $\hat{H}_{\bar{K}}$ and $\hat{H}_{\bar{L}}$ of the same energy $e_{\mathrm{i}}=\langle{\mathrm{i}}\rvert\hat{H}_{\bar{K}}\lvert{\mathrm{i}}\rangle_{\bar{K}}=e_{\mathrm{f}}=\langle{\mathrm{f}}\rvert\hat{H}_{\bar{L}}\lvert{\mathrm{f}}\rangle_{\bar{L}}$ . Then there exists a unitary $\hat{U}_{M}$ such that $[\hat{U}_{M},\hat{H}_{M}]=0$ and

[TABLE]

where $\hat{\Pi}_{K}=\hat{V}^{\dagger}\hat{V}$ is the projector onto the support of $\hat{V}$ .

Furthermore, we can remove $\hat{\Pi}_{K}$ from (193) under the following additional assumption. Let $\{(\alpha_{j},\mu_{j})\}_{j=1}^{J}$ be the energy eigenvalues with the corresponding multiplicities of all energy eigenstates of $K$ that are in the kernel of $\hat{V}$ . Let $\{(\beta_{\ell},\nu_{\ell})\}_{\ell}$ be the energy eigenvalues with the corresponding multiplicities of all energy eigenstates $\lvert{\beta}\rangle_{L\bar{L}}$ of $\hat{H}_{L}+\hat{H}_{\bar{L}}$ that have no overlap with $\hat{I}_{L}\otimes\lvert{\mathrm{f}}\rangle\langle{\mathrm{f}}\rvert_{\bar{L}}$ , i.e., for which $(\hat{I}_{L}\otimes\lvert{\mathrm{f}}\rangle\langle{\mathrm{f}}\rvert_{\bar{L}})\,\lvert{\beta}\rangle_{L\bar{L}}=0$ . Suppose that for each $(\alpha_{j},\mu_{j})$ (for $j=1,\ldots,J$ ), there exists a corresponding $\ell$ with $\beta_{\ell}=e_{\mathrm{i}}+\alpha_{j}$ and $\nu_{\ell}\geqslant\mu_{j}$ . Then there exists a unitary operator $\hat{U}_{M}$ with $[\hat{H}_{M},\hat{U}_{M}]=0$ and such that

[TABLE]

Before delving into the proof of Proposition 13 we issue a few remarks to provide a better picture of the consequences of this general proposition and to identify a few interesting special cases.

(a)

The operator $\hat{\Pi}_{K}$ in (193) can be replaced by an operator $\hat{\Pi}^{\prime}_{L}$ acting after the unitaries, where $\hat{\Pi}^{\prime}_{L}=\hat{V}\hat{V}^{\dagger}$ is the projector onto the range of $\hat{V}$ .

(b)

If $K=L$ and $\hat{H}_{K}=\hat{H}_{L}$ , we can choose $M=K=L$ with trivial systems $\bar{K},\bar{L}$ . With this choice of $M$ , the projector in (193) is always necessary unless $\hat{V}_{K\to L}$ is already unitary. (The projector can be removed by choosing a larger system $M$ , see below.)

(c)

The additional assumption in the second part of the proposition amounts to requiring that, for the given $\lvert{\mathrm{i}}\rangle_{\bar{K}},\lvert{\mathrm{f}}\rangle_{\bar{L}}$ , it is possible to map the support of $(\hat{I}_{K}-\hat{\Pi}_{K})\otimes\lvert{\mathrm{i}}\rangle\langle{\mathrm{i}}\rvert_{\bar{K}}$ (i.e., the space spanned by all eigenstates outside of the support of $\hat{V}_{K\to L}$ and tensored with $\lvert{\mathrm{i}}\rangle_{\bar{K}}$ ), into a space of the global output system such that the mapping is energy conserving and such that the resulting space has no overlap with $\lvert{\mathrm{f}}\rangle_{\bar{L}}$ . As long as the input on $\bar{K}$ is initialized in the state $\lvert{\mathrm{i}}\rangle_{\bar{K}}$ , projecting the output onto $\lvert{\mathrm{f}}\rangle_{\bar{L}}$ automatically discards every component of the input state on $K$ that lies outside the support of $\hat{\Pi}_{K}$ . (Equivalently, the projector $\hat{\Pi}_{K}$ on the input becomes redundant.)

(d)

For any $K,L$ and for a general choice of $M$ , $\bar{K}$ , $\bar{L}$ with corresponding Hamiltonians along with energy-preserving embedding unitaries $\hat{U}^{\prime}_{K\bar{K}\to M}$ , $\hat{U}^{\prime\prime}_{L\bar{L}\to M}$ , there always exists a qubit system $Q$ with some Hamiltonian $\hat{H}_{Q}$ such that there exist $\lvert{\mathrm{i}}\rangle_{\bar{K}Q}$ and $\lvert{\mathrm{f}}\rangle_{\bar{L}Q}$ with the same eigenenergy.

This statement is shown as follows. We first pick any two energy eigenstates $\lvert{\mathrm{i}_{0}}\rangle_{\bar{K}}$ and $\lvert{\mathrm{f}_{0}}\rangle_{\bar{L}}$ of respective energies $e_{\mathrm{i}}$ and $e_{\mathrm{f}}$ . We then introduce a qubit $Q$ with the Hamiltonian $\hat{H}_{Q}=q_{0}\lvert{0}\rangle\langle{0}\rvert+q_{1}\lvert{1}\rangle\langle{1}\rvert$ , with $q_{0}=c-e_{\mathrm{i}}$ and $q_{1}=c-e_{\mathrm{f}}$ for any chosen constant $c$ . Define $M^{\prime}=M\otimes Q$ , $\bar{K}^{\prime}=\bar{K}\otimes Q$ , $\bar{L}^{\prime}=\bar{L}\otimes Q$ , etc., along with $\lvert{\mathrm{i}}\rangle_{\bar{K}^{\prime}}=\lvert{\mathrm{i}_{0}}\rangle_{\bar{K}}\otimes\lvert{0}\rangle_{Q}$ and $\lvert{\mathrm{f}}\rangle_{\bar{L}^{\prime}}=\lvert{\mathrm{f}_{0}}\rangle_{\bar{L}}\otimes\lvert{1}\rangle_{Q}$ , observing that $\lvert{\mathrm{i}}\rangle_{\bar{K}^{\prime}}$ and $\lvert{\mathrm{f}}\rangle_{\bar{L}^{\prime}}$ are both energy eigenstates with energy $c$ .

(e)

For any $M,K,L,\bar{K},\bar{L},\lvert{\mathrm{i}}\rangle_{\bar{K}},\lvert{\mathrm{f}}\rangle_{\bar{L}}$ satisfying the first part of the proposition, we can always introduce a qubit system $Q^{\prime}$ with a degenerate Hamiltonian $\hat{H}_{Q^{\prime}}=c^{\prime}$ for some arbitrary constant $c^{\prime}$ , and define $M^{\prime\prime}=M\otimes Q^{\prime}$ , $\bar{K}^{\prime\prime}=\bar{K}\otimes Q^{\prime}$ , $\bar{L}^{\prime\prime}=\bar{L}\otimes Q^{\prime}$ , etc., along with $\lvert{\mathrm{i}^{\prime}}\rangle_{\bar{K}^{\prime\prime}}=\lvert{\mathrm{i}}\rangle_{\bar{K}}\otimes\lvert{0}\rangle_{Q^{\prime}}$ and $\lvert{\mathrm{f}^{\prime}}\rangle_{\bar{L}^{\prime\prime}}=\lvert{\mathrm{f}}\rangle_{\bar{L}}\otimes\lvert{1}\rangle_{Q^{\prime}}$ , such that the additional condition of the second part of the proposition is satisfied. Indeed, from the unitary $\hat{U}_{M}$ given by the proposition without the extra qubit, we can define $\hat{\tilde{U}}_{MQ^{\prime}}=(\hat{U}_{M}\otimes\hat{I}_{Q^{\prime}})\bigl(\hat{\Pi}_{K}\otimes\hat{I}_{\bar{K}}\otimes(\lvert{0}\rangle\langle{1}\rvert+\lvert{1}\rangle\langle{0}\rvert)_{Q^{\prime}}+(\hat{I}_{K}-\hat{\Pi}_{K})\otimes\hat{I}_{\bar{K}Q^{\prime}}\bigr)$ , i.e., $\hat{\tilde{U}}_{MQ^{\prime}}$ conditionally flips the bit $Q^{\prime}$ if the input on $K$ is in the support of $\hat{V}_{K\to L}$ , before applying $\hat{U}_{M}$ . The effect of $\hat{\tilde{U}}_{MQ^{\prime}}$ is to map all states of the form $\lvert{\psi^{\prime}}\rangle_{K}\otimes\lvert{\mathrm{i}}\rangle_{\bar{K}}\otimes\lvert{0}\rangle_{Q^{\prime}}$ , with $\lvert{\psi^{\prime}}\rangle_{K}$ outside the support of $\hat{V}_{K\to L}$ , onto states with the $Q^{\prime}$ system remaining in the state $\lvert{0}\rangle_{Q^{\prime}}$ , ensuring that there is no overlap with $\lvert{\mathrm{f}^{\prime}}\rangle_{\bar{L}^{\prime\prime}}$ .

(f)

The qubits introduced in Points (d) and (e) may evidently be chosen to be larger systems that contain such qubits as subspaces.

(g)

For any $K,L$ and for a general choice of $M$ , $\bar{K}$ , $\bar{L}$ with corresponding Hamiltonians along with energy-preserving embedding unitaries $\hat{U}^{\prime}_{K\bar{K}\to M}$ , $\hat{U}^{\prime\prime}_{L\bar{L}\to M}$ , there might not always exist $\lvert{\mathrm{i}}\rangle_{\bar{K}}$ and $\lvert{\mathrm{f}}\rangle_{\bar{L}}$ with the same eigenenergy, even if $\hat{V}_{K\to L}\neq 0$ . As a counterexample, consider systems $K,L,\bar{K},\bar{L}$ where the system $K$ has energy levels $\{0,1\}$ , the system $\bar{K}$ has levels $\{0,1\}$ , the system $L$ has levels $\{-2,-1,-1,0\}$ , and the system $\bar{L}$ is trivial with the single level $\{2\}$ . In both cases, the joint energy levels are $\{0,1,1,2\}$ , and $\hat{V}_{K\to L}$ can be nonzero by mapping the $0$ level of $K$ to the $0$ level of $L$ , the only energy that the two systems have in common. Yet, $\bar{K}$ and $\bar{L}$ do not share a level of the same energy.

(h)

For arbitrary $K,L$ , a simple choice for the system $M$ is $M=K\otimes L$ with $\bar{K}=L$ , $\hat{H}_{\bar{K}}=\hat{H}_{L}$ , $\bar{L}=K$ , $\hat{H}_{\bar{L}}=\hat{H}_{K}$ , along with the trivial identity embedding maps $\hat{U}^{\prime}_{K\bar{K}\to M}=\hat{I}_{KL\to M}$ , $\hat{U}^{\prime\prime}_{L\bar{L}\to M}=\hat{I}_{KL\to M}$ . There always exist $\lvert{\mathrm{i}}\rangle_{\bar{K}}$ and $\lvert{\mathrm{f}}\rangle_{\bar{L}}$ with the same eigenenergy (as long as $\hat{V}_{K\to L}\neq 0$ ), by picking an eigenstate in the support of $\hat{V}_{K\to L}$ along with its associated image under $\hat{V}_{K\to L}$ .

Furthermore, with this choice it is always possible to satisfy our additional condition leading to (194). This can be seen as follows. Let $m=\operatorname{rank}(\hat{V})$ . We choose energy eigenbases $\{\lvert{u_{j}}\rangle_{K}\}_{j=1}^{d_{K}}$ of $K$ and $\{\lvert{v_{j^{\prime}}}\rangle_{L}\}_{j^{\prime}=1}^{d_{L}}$ of $L$ , with $\{\lvert{u_{j}}\rangle_{K}\}_{j=1}^{m}$ spanning the support of $\hat{V}$ and with $\lvert{v_{j}}\rangle_{L}=\hat{V}_{K\to L}\,\lvert{u_{j}}\rangle_{K}$ for those $j=1,\ldots,m$ . Then we choose $\lvert{\mathrm{i}}\rangle_{L}=\lvert{v_{1}}\rangle_{L}$ and $\lvert{\mathrm{f}}\rangle_{K}=\lvert{u_{1}}\rangle_{K}$ (assuming $\hat{V}\neq 0$ ), noting that they must have the same energy. We see that all states of the form $\lvert{u_{j}}\rangle_{K}\otimes\lvert{\mathrm{i}}\rangle_{L}$ for $j>m$ can be mapped onto themselves, with clearly $(\langle{\mathrm{f}}\rvert_{K}\otimes\hat{I}_{L})(\lvert{u_{j}}\rangle_{K}\otimes\lvert{\mathrm{i}}\rangle_{L})=0$ because $\lvert{\mathrm{f}}\rangle_{K}=\lvert{u_{1}}\rangle_{K}\perp\lvert{u_{j}}\rangle_{K}$ .

(i)

In the case of the generalized thermal operation depicted in Fig. 1, we have $K=SB$ and $L=S^{\prime}B$ , with a given energy-conserving partial isometry $\hat{V}_{SB\to S^{\prime}B}$ . In this case, we may choose $M=SS^{\prime}B$ , with $\bar{K}=S^{\prime}$ and $\bar{L}=S$ . If necessary, we can enlarge $\bar{K}$ and $\bar{L}$ to include qubit systems $Q$ and/or $Q^{\prime}$ as per Points (d) and (e) to ensure that all the conditions of Proposition 13 are satisfied. Then there exist $\lvert{\mathrm{i}}\rangle_{\bar{K}}$ , $\lvert{\mathrm{f}}\rangle_{\bar{L}}$ , along with an energy-conserving unitary $\hat{U}_{SB\bar{K}\to S^{\prime}B\bar{L}}$ , such that

[TABLE]
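The energy bookkeeping behind remarks (g) and (d) can be checked directly on the levels quoted in the counterexample. The sketch below assumes non-interacting composite systems, so that joint levels are sums of individual levels; the value of the constant `c` is an arbitrary illustrative choice.

```python
from itertools import product

def joint(a, b):
    """Sorted energy levels of a non-interacting composite system."""
    return sorted(x + y for x, y in product(a, b))

# Counterexample of remark (g): equal joint spectra, no shared ancilla level.
K, K_bar = [0, 1], [0, 1]
L, L_bar = [-2, -1, -1, 0], [2]

same_joint_spectrum = joint(K, K_bar) == joint(L, L_bar)
shared_levels = set(K_bar) & set(L_bar)

# Repair following remark (d): append a qubit Q with levels
# q0 = c - e_i and q1 = c - e_f.
e_i, e_f = K_bar[0], L_bar[0]   # eigenenergies of |i_0> and |f_0>
c = 5                           # arbitrary constant (assumption)
q0, q1 = c - e_i, c - e_f
energy_i = e_i + q0             # energy of |i_0> tensor |0>_Q
energy_f = e_f + q1             # energy of |f_0> tensor |1>_Q

# Adding the same qubit on both sides keeps the joint spectra equal.
Q = [q0, q1]
still_consistent = joint(joint(K, K_bar), Q) == joint(joint(L, L_bar), Q)

print(same_joint_spectrum, shared_levels, energy_i, energy_f, still_consistent)
```

The first two outputs reproduce the obstruction of remark (g), while the matching values of `energy_i` and `energy_f` show how the extra qubit of remark (d) supplies a pair of ancilla eigenstates with a common energy.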

We now turn to the proof of the proposition.

Proof of Proposition 13.

First we compute as in Proposition 12 the commutators $[\hat{V}^{\dagger}\hat{V},\hat{H}_{K}]=\hat{V}^{\dagger}\hat{V}\hat{H}_{K}-\hat{H}_{K}\hat{V}^{\dagger}\hat{V}=\hat{V}^{\dagger}(\hat{H}_{L}-\hat{H}_{L})\hat{V}=0$ and $[\hat{V}\hat{V}^{\dagger},\hat{H}_{L}]=\hat{V}\hat{V}^{\dagger}\hat{H}_{L}-\hat{H}_{L}\hat{V}\hat{V}^{\dagger}=\hat{V}(\hat{H}_{K}-\hat{H}_{K})\hat{V}^{\dagger}=0$ , as well as $[\hat{V}^{\dagger}\hat{H}_{L}\hat{V},\hat{H}_{K}]=\hat{V}^{\dagger}\hat{H}_{L}\hat{V}\hat{H}_{K}-\hat{H}_{K}\hat{V}^{\dagger}\hat{H}_{L}\hat{V}=\hat{V}^{\dagger}(\hat{H}_{L}^{2}-\hat{H}_{L}^{2})\hat{V}=0$ and $[\hat{V}^{\dagger}\hat{H}_{L}\hat{V},\hat{V}^{\dagger}\hat{V}]=\hat{V}^{\dagger}\hat{H}_{L}\hat{V}\hat{V}^{\dagger}\hat{V}-\hat{V}^{\dagger}\hat{V}\hat{V}^{\dagger}\hat{H}_{L}\hat{V}=\hat{V}^{\dagger}[\hat{H}_{L},\hat{V}\hat{V}^{\dagger}]\hat{V}=0$ .

Because $\hat{U}^{\prime}_{K\bar{K}\to M}$ and $\hat{U}^{\prime\prime}_{L\bar{L}\to M}$ are unitary we must have $d_{K}d_{\bar{K}}=d_{M}=d_{L}d_{\bar{L}}$ . Also, the operator $\hat{U}_{L\bar{L}\leftarrow M}^{\prime\prime\dagger}\hat{U}^{\prime}_{K\bar{K}\to M}$ is an energy-conserving unitary operator from $K\bar{K}$ to $L\bar{L}$ ; therefore, the Hamiltonians $\hat{H}_{K}+\hat{H}_{\bar{K}}$ and $\hat{H}_{L}+\hat{H}_{\bar{L}}$ must have the same eigenvalues with the same multiplicities.

Let $\hat{W}_{M}=\hat{U}^{\prime\prime}_{L\bar{L}\to M}\,(\hat{V}_{K\to L}\otimes\lvert{\mathrm{f}}\rangle_{\bar{L}}\langle{\mathrm{i}}\rvert_{\bar{K}})\,\hat{U}_{K\bar{K}\leftarrow M}^{\prime\dagger}$ , noting that $\hat{W}_{M}$ is a partial isometry. Furthermore, $\hat{W}_{M}\hat{H}_{M}=\hat{U}^{\prime\prime}_{L\bar{L}\to M}\,(\hat{V}_{K\to L}\otimes\lvert{\mathrm{f}}\rangle_{\bar{L}}\langle{\mathrm{i}}\rvert_{\bar{K}})\,(\hat{H}_{K}+\hat{H}_{\bar{K}})\,\hat{U}_{K\bar{K}\leftarrow M}^{\prime\dagger}=\ldots=\hat{H}_{M}\hat{W}_{M}$ , recalling that $\lvert{\mathrm{i}}\rangle_{\bar{K}}$ and $\lvert{\mathrm{f}}\rangle_{\bar{L}}$ have the same eigenvalue with respect to $\hat{H}_{\bar{K}}$ and $\hat{H}_{\bar{L}}$ , respectively; therefore $[\hat{W}_{M},\hat{H}_{M}]=0$ .

We can complete $\hat{W}_{M}$ into a fully energy-conserving unitary $\hat{U}_{M}$ by assigning to each remaining input energy eigenstate an output energy eigenstate of the same energy; this association is possible since the eigenvalues of the input and output systems coincide, including their multiplicities. Then (193) is satisfied by construction, as can be checked by verifying the action of both sides of the equation on an energy eigenbasis spanning the support of $\hat{V}_{K\to L}$ .
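The completion step can be pictured at the level of energy labels. The following sketch treats the special case in which the unitary is a permutation between fixed energy eigenbases; the function name `complete_assignment` and the example levels are hypothetical, and the partial assignment plays the role of $\hat{W}_{M}$.

```python
from collections import defaultdict

def complete_assignment(in_levels, out_levels, partial):
    """Extend `partial` (dict: input index -> output index, injective and
    energy matched) to a bijection pairing every input eigenstate with an
    output eigenstate of the same energy. Assumes both spectra coincide,
    including multiplicities."""
    assert sorted(in_levels) == sorted(out_levels)
    assert all(in_levels[i] == out_levels[j] for i, j in partial.items())
    used_out = set(partial.values())
    free_in, free_out = defaultdict(list), defaultdict(list)
    for i, e in enumerate(in_levels):
        if i not in partial:
            free_in[e].append(i)      # unassigned inputs, grouped by energy
    for j, e in enumerate(out_levels):
        if j not in used_out:
            free_out[e].append(j)     # unused outputs, grouped by energy
    full = dict(partial)
    for e, idxs in free_in.items():
        full.update(zip(idxs, free_out[e]))
    return full

in_levels = [0, 1, 1, 2]     # illustrative spectrum on the input side
out_levels = [1, 0, 2, 1]    # same spectrum, listed in another order
partial = {1: 0}             # the part already fixed by the partial isometry

full = complete_assignment(in_levels, out_levels, partial)
print(full)
```

Because the leftover input and output eigenstates have the same number of elements in every energy sector, the pairing inside each sector always succeeds; any other pairing within a sector would serve equally well.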

Now we assume that the additional condition stated in the claim holds, in order to prove (194).

Let $\{\lvert{\phi_{j}}\rangle_{M}\}_{j=1}^{d_{M}}$ be a basis of $M$ that is a simultaneous eigenbasis of $\hat{H}_{M}$ , $\hat{U}^{\prime}_{K\bar{K}\to M}\,(\hat{V}^{\dagger}\hat{V}\otimes\lvert{\mathrm{i}}\rangle\langle{\mathrm{i}}\rvert_{\bar{K}})\,\hat{U}_{K\bar{K}\leftarrow M}^{\prime\dagger}$ , and $\hat{U}^{\prime}_{K\bar{K}\to M}\,(\hat{V}^{\dagger}\hat{H}_{L}\hat{V}\otimes\lvert{\mathrm{i}}\rangle\langle{\mathrm{i}}\rvert_{\bar{K}})\,\hat{U}_{K\bar{K}\leftarrow M}^{\prime\dagger}$ , and furthermore chosen such that (i) the states $\{\lvert{\phi_{j}}\rangle_{M}\}_{j=1}^{\operatorname{rank}(\hat{V})}$ span the support of $\hat{U}^{\prime}_{K\bar{K}\to M}\,(\hat{\Pi}_{K}\otimes\lvert{\mathrm{i}}\rangle\langle{\mathrm{i}}\rvert_{\bar{K}})\,\hat{U}_{K\bar{K}\leftarrow M}^{\prime\dagger}$ , and (ii) the set $\{\lvert{\phi_{j}}\rangle_{M}\}_{j=\operatorname{rank}(\hat{V})+1}^{d_{K}}$ spans the subspace supported by $\hat{U}^{\prime}_{K\bar{K}\to M}\,\bigl{(}(\hat{I}_{K}-\hat{\Pi}_{K})\otimes\lvert{\mathrm{i}}\rangle\langle{\mathrm{i}}\rvert_{\bar{K}}\bigr{)}\,\hat{U}_{K\bar{K}\leftarrow M}^{\prime\dagger}$ .

Let $\{\lvert{\chi_{j^{\prime}}}\rangle_{M}\}_{j^{\prime}=1}^{d_{M}}$ be another basis of $M$ that is a simultaneous eigenbasis of $\hat{H}_{M}$ and $\hat{U}^{\prime\prime}_{L\bar{L}\to M}\,(\hat{V}\hat{V}^{\dagger}\otimes\lvert{\mathrm{f}}\rangle\langle{\mathrm{f}}\rvert_{\bar{L}})\,\hat{U}_{L\bar{L}\leftarrow M}^{\prime\prime\dagger}$ , and furthermore chosen such that (i) we have $\lvert{\chi_{j^{\prime}}}\rangle_{M}=\hat{U}^{\prime\prime}_{L\bar{L}\to M}(\hat{V}_{K\to L}\lvert{\phi_{j^{\prime}}}\rangle_{M}\otimes\lvert{\mathrm{f}}\rangle_{\bar{L}})$ for all $j^{\prime}=1,\ldots,\operatorname{rank}(\hat{V})$ , (ii) the set $\{\lvert{\chi_{j^{\prime}}}\rangle_{M}\}_{j^{\prime}=\operatorname{rank}(\hat{V})+1}^{d_{K}}$ is orthogonal to the support of $\hat{U}^{\prime\prime}_{L\bar{L}\to M}\,(\hat{I}_{L}\otimes\lvert{\mathrm{f}}\rangle\langle{\mathrm{f}}\rvert_{\bar{L}})\,\hat{U}_{L\bar{L}\leftarrow M}^{\prime\prime\dagger}$ , and (iii) each $\lvert{\chi_{j}}\rangle_{M}$ for $j=\operatorname{rank}(\hat{V})+1,\ldots,d_{K}$ is an energy eigenstate with the same energy as $\lvert{\phi_{j}}\rangle_{M}$ , which we can ensure thanks to our additional assumption stated in the claim.

Then we define $\hat{U}_{M}$ as

[TABLE]

The operator $\hat{U}_{M}$ is unitary and commutes with $\hat{H}_{M}$ , since it maps an energy eigenbasis onto an energy eigenbasis while preserving the energy of each basis vector. If the state $\lvert{\psi}\rangle_{K}$ is in the support of $\hat{V}$ , we have that $\hat{U}^{\prime}_{K\bar{K}\to M}(\lvert{\psi}\rangle_{K}\otimes\lvert{\mathrm{i}}\rangle_{\bar{K}})=\sum_{j=1}^{\operatorname{rank}(\hat{V})}c_{j}\lvert{\phi_{j}}\rangle_{M}$ for suitable complex coefficients $c_{j}$ . Then

[TABLE]

If the state $\lvert{\psi^{\prime}}\rangle_{K}$ lies outside the support of $\hat{V}$ , we have that $\hat{U}^{\prime}_{K\bar{K}\to M}(\lvert{\psi^{\prime}}\rangle_{K}\otimes\lvert{\mathrm{i}}\rangle_{\bar{K}})=\sum_{j=\operatorname{rank}(\hat{V})+1}^{d_{K}}c^{\prime}_{j}\lvert{\phi_{j}}\rangle_{M}$ for suitable complex coefficients $c_{j}^{\prime}$ , and

[TABLE]

We have therefore proven (194). ∎

Now we present some general properties of the thermodynamic operations introduced in Section III.1.

Proposition 14 (Elementary properties of thermodynamic operations).

Consider systems $S,S^{\prime}$ with corresponding Hamiltonians $\hat{H}_{S},\hat{H}^{\prime}_{S^{\prime}}$ . Let $*$ denote either TO or GPM. The following hold:

(a)

If $S\simeq S^{\prime}$ and $H^{\prime}_{S^{\prime}}=H_{S}+c$ for some $c\in\mathbb{R}$ , the identity process is a $(c,0)$ -work/coherence-assisted process in either model TO or GPM;

(b)

For two energy eigenstates $\lvert{E}\rangle_{S},\lvert{E^{\prime}}\rangle_{S^{\prime}}$ , we have $\lvert{E}\rangle_{S}\xrightarrow[*]{w,0,0}\lvert{E^{\prime}}\rangle_{S^{\prime}}$ if and only if $w\geqslant E^{\prime}-E$ ;

(c)

For any $w\in\mathbb{R},\eta>0,\varepsilon>0$ , we have $\hat{\rho}_{S}\otimes\lvert{E}\rangle\langle{E}\rvert_{A}\xrightarrow[*]{w,\eta,\varepsilon}\hat{\rho}^{\prime}_{S^{\prime}}\otimes\lvert{E^{\prime}}\rangle\langle{E^{\prime}}\rvert_{A^{\prime}}$ for energy eigenstates on ancillas $A,A^{\prime}$ if and only if $\hat{\rho}_{S}\xrightarrow[*]{E+w-E^{\prime},\eta,\varepsilon}\hat{\rho}^{\prime}_{S^{\prime}}$ ;

(d)

We have $\hat{\gamma}\xrightarrow[*]{F^{\prime}-F,\,0,\,0}\hat{\gamma}^{\prime}$ , where $\hat{\gamma}=e^{\beta(F-\hat{H})}$ , $\hat{\gamma}^{\prime}=e^{\beta(F^{\prime}-\hat{H}^{\prime})}$ with $F=-\beta^{-1}\ln\operatorname{tr}(e^{-\beta\hat{H}})$ , $F^{\prime}=-\beta^{-1}\ln\operatorname{tr}(e^{-\beta\hat{H}^{\prime}})$ ;

(e)

$\hat{\rho}\xrightarrow[*]{w,\,\eta,\,\varepsilon}\hat{\rho}^{\prime}$ implies $\hat{\rho}\xrightarrow[*]{w^{\prime},\,\eta^{\prime},\,\varepsilon^{\prime}}\hat{\rho}^{\prime}$ for any $w^{\prime}\geqslant w$ , $\eta^{\prime}\geqslant\eta$ and $\varepsilon^{\prime}\geqslant\varepsilon$ ;

(f)

If $\hat{\rho}\xrightarrow[*]{w,\,\eta,\,\varepsilon}\hat{\rho}^{\prime}$ and $\hat{\rho}^{\prime}\xrightarrow[*]{w^{\prime},\,\eta^{\prime},\,\varepsilon^{\prime}}\hat{\rho}^{\prime\prime}$ , then $\hat{\rho}\xrightarrow[*]{w+w^{\prime},\,\eta+\eta^{\prime},\,\varepsilon+\varepsilon^{\prime}}\hat{\rho}^{\prime\prime}$ .

Proof.

Property (a) for $H^{\prime}=H$ is obvious because the identity process is itself both a thermal operation and a Gibbs-preserving map. For $H^{\prime}_{S^{\prime}}=H_{S}+c\hat{I}$ with $c\neq 0$ we use a two-level battery $W$ with energy eigenstates $\lvert{0}\rangle_{W},\lvert{c}\rangle_{W}$ and $H_{W}=c\,\lvert{c}\rangle\langle{c}\rvert_{W}$ ; then $\hat{I}_{S\to S^{\prime}}\otimes\lvert{0}\rangle\langle{c}\rvert$ is an energy-conserving partial isometry, and thus a thermal operation, on the system $S$ and the battery $W$ with $c$ work expended. The statement in the GPM model follows from Lemma 1. Property (b) is clear; the only nontrivial aspect is that we may have strict inequality. That a thermal operation can perform this transformation can be seen using thermo-majorization Horodecki and Oppenheim (2013). The statement for GPM follows because a thermal operation is also Gibbs-preserving. Property (c) holds by definition of a $(w,\eta)$ -work/coherence-assisted process; the systems $A,A^{\prime}$ may be combined together with the battery system $W$ in the transformation. Property (d) holds because the thermo-majorization curve of the thermal state is the line connecting $(0,0)$ to $(e^{-\beta F},1)$ , where $e^{-\beta F}=\operatorname{tr}(e^{-\beta\hat{H}})$ is the partition function Horodecki and Oppenheim (2013). Property (e) follows from (b). To show Property (f), let $\Phi$ (respectively $\Phi^{\prime}$ ) be a work/coherence-assisted process with parameters $(w,\eta)$ (respectively $(w^{\prime},\eta^{\prime})$ ). Then $\Phi^{\prime}\circ\Phi$ is a $(w+w^{\prime},\eta+\eta^{\prime})$ -work/coherence-assisted process, and we have $D(\Phi^{\prime}(\Phi(\hat{\rho})),\hat{\rho}^{\prime\prime})\leqslant D(\Phi^{\prime}(\Phi(\hat{\rho})),\Phi^{\prime}(\hat{\rho}^{\prime}))+D(\Phi^{\prime}(\hat{\rho}^{\prime}),\hat{\rho}^{\prime\prime})\leqslant D(\Phi(\hat{\rho}),\hat{\rho}^{\prime})+D(\Phi^{\prime}(\hat{\rho}^{\prime}),\hat{\rho}^{\prime\prime})\leqslant\varepsilon+\varepsilon^{\prime}$ . ∎
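The error accounting used for Property (f) relies only on the triangle inequality and on the contractivity of the trace distance. A minimal classical sketch is given below, with probability vectors in place of density operators and column-stochastic matrices in place of the maps $\Phi,\Phi^{\prime}$ ; the specific matrices and states are illustrative assumptions and are not claimed to be thermal operations or Gibbs-preserving maps.

```python
from fractions import Fraction as Fr

def trace_distance(p, q):
    """Classical trace distance (total variation) between distributions."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2

def apply(channel, p):
    """Apply a column-stochastic matrix (list of rows) to a distribution."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in channel]

# Two illustrative stochastic maps standing in for Phi and Phi'.
Phi = [[Fr(3, 4), Fr(1, 4)],
       [Fr(1, 4), Fr(3, 4)]]
Phi_prime = [[Fr(1, 2), Fr(0)],
             [Fr(1, 2), Fr(1)]]

rho = [Fr(1), Fr(0)]            # initial state
rho_p = [Fr(4, 5), Fr(1, 5)]    # intermediate target rho'
rho_pp = [Fr(1, 2), Fr(1, 2)]   # final target rho''

eps = trace_distance(apply(Phi, rho), rho_p)                  # error of step 1
eps_prime = trace_distance(apply(Phi_prime, rho_p), rho_pp)   # error of step 2
total = trace_distance(apply(Phi_prime, apply(Phi, rho)), rho_pp)

print(eps, eps_prime, total, total <= eps + eps_prime)
```

Exact rational arithmetic is used so that the comparison `total <= eps + eps_prime` is not affected by rounding.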

Now we present the proofs of Propositions 3 and 4 stated in Section III.1 regarding the monotonicity of the various divergences under thermodynamic operations.

Proof of Proposition 3.

We have $\hat{\rho}_{S}\xrightarrow[\mathrm{GPM}]{}\hat{\rho}_{S^{\prime}}^{\prime}$ (invoking Lemma 1 if necessary); let $\Phi^{[\mathrm{GPM}]}$ be the corresponding Gibbs-sub-preserving map. The monotonicity of the hypothesis testing divergence follows directly from the properties (19) and (21).

The monotonicity of the Rényi divergences is trickier to prove because the corresponding data processing inequality only holds for trace-preserving mappings. Using (Faist and Renner, 2018, Proposition 2), there exists a qubit system $Q$ with a basis $\{\lvert{\mathrm{i}}\rangle_{Q},\lvert{\mathrm{f}}\rangle_{Q}\}$ and with a Hamiltonian $H_{Q}=q_{\mathrm{i}}\lvert{\mathrm{i}}\rangle\langle{\mathrm{i}}\rvert_{Q}+q_{\mathrm{f}}\lvert{\mathrm{f}}\rangle\langle{\mathrm{f}}\rvert_{Q}$ , as well as eigenstates $\lvert{\mathrm{i}}\rangle_{S^{\prime}},\lvert{\mathrm{f}}\rangle_{S}$ of $\hat{H}^{\prime}_{S^{\prime}},\hat{H}_{S}$ , respectively, and a trace-preserving map $\mathcal{K}^{[\mathrm{GPM}]}_{SS^{\prime}Q\to SS^{\prime}Q}$ such that

[TABLE]

Since $\operatorname{tr}(\Phi^{[\mathrm{GPM}]}(\hat{\rho}_{S}))=\operatorname{tr}(\hat{\rho}^{\prime}_{S^{\prime}})=1$ , we can invoke (Faist and Renner, 2018, Corollary 3(b)) to see that

[TABLE]

Also, using (Faist and Renner, 2018, Proposition 17) and (199c), we have that

[TABLE]

Then using the property (14) of the Rényi $\alpha$ -entropies and the above identities, we have

[TABLE]

where the inequality holds by the data processing inequality (10). ∎

Proof of Proposition 4.

We prove the statement for the GPM model, invoking Lemma 1 if necessary. Let $C,C^{\prime},W,W^{\prime}$ be systems with Hamiltonians $\hat{H}_{C},\hat{H}_{C^{\prime}},\hat{H}_{W},\hat{H}_{W^{\prime}}$ from Definition 5 and let $\tilde{\Phi}^{[\mathrm{GPM}]}_{SCW\to S^{\prime}C^{\prime}W^{\prime}}$ be the GPM operation in (38). Let $\hat{\tilde{\rho}}_{S^{\prime}C^{\prime}W^{\prime}}=\tilde{\Phi}^{[\mathrm{GPM}]}_{SCW\to{}S^{\prime}C^{\prime}W^{\prime}}(\hat{\rho}_{S}\otimes\lvert{E}\rangle\langle{E}\rvert_{W}\otimes\lvert{\zeta}\rangle\langle{\zeta}\rvert_{C})$ , with $D\bigl{(}\langle{E^{\prime},\zeta^{\prime}}\rvert_{W^{\prime}C^{\prime}}\hat{\tilde{\rho}}_{S^{\prime}C^{\prime}W^{\prime}}\lvert{E^{\prime},\zeta^{\prime}}\rangle_{W^{\prime}C^{\prime}}\,,\,\hat{\rho}^{\prime}_{S^{\prime}}\bigr{)}\leqslant\varepsilon$ . Using property (22) we have

[TABLE]

Now compute

[TABLE]

because $\langle{\zeta^{\prime}}\rvert e^{-\beta\hat{H}_{C^{\prime}}}\lvert{\zeta^{\prime}}\rangle\geqslant\lambda_{\mathrm{min}}(e^{-\beta\hat{H}_{C^{\prime}}})\geqslant e^{-\beta\lVert{\hat{H}_{C^{\prime}}}\rVert_{\infty}}\geqslant e^{-\beta\eta}$ where $\lambda_{\mathrm{min}}(\cdot)$ denotes the smallest eigenvalue of its argument. Observe that the operation $\operatorname{tr}_{C^{\prime}W^{\prime}}\bigl{[}\lvert{E^{\prime},\zeta^{\prime}}\rangle\langle{E^{\prime},\zeta^{\prime}}\rvert_{W^{\prime}C^{\prime}}\,(\cdot)\bigr{]}$ is a completely positive, trace-nonincreasing map. Then thanks to (19) and (204) along with the scaling property (20),

[TABLE]

where the last two inequalities hold using, respectively, (21), noting that $\tilde{\Phi}^{[\mathrm{GPM}]}_{SCW\to S^{\prime}C^{\prime}W^{\prime}}$ is Gibbs-sub-preserving, and the data processing inequality (19).

Let $\hat{Q}_{SCW}$ with $0\leqslant\hat{Q}_{SCW}\leqslant\hat{I}$ be an optimal choice for the last divergence term in (205), such that ${S}_{\mathrm{H}}^{\xi}(\hat{\rho}_{S}\otimes\lvert{E,\zeta}\rangle\langle{E,\zeta}\rvert_{WC}\,\|\,e^{-\beta(\hat{H}_{S}+\hat{H}_{W}+\hat{H}_{C})})=-\ln\operatorname{tr}(\hat{Q}_{SCW}e^{-\beta(\hat{H}_{S}+\hat{H}_{W}+\hat{H}_{C})})$ . Let $\hat{Q}^{\prime}_{S}=\langle{E,\zeta}\rvert_{WC}\,\hat{Q}_{SCW}\,\lvert{E,\zeta}\rangle_{WC}$ , noting that $0\leqslant\hat{Q}^{\prime}_{S}\leqslant\hat{I}_{S}$ . Then we have $\operatorname{tr}(\hat{Q}^{\prime}_{S}\hat{\rho}_{S})=\operatorname{tr}\bigl{(}\hat{Q}_{SCW}\,(\hat{\rho}_{S}\otimes\lvert{E,\zeta}\rangle\langle{E,\zeta}\rvert_{WC})\bigr{)}\geqslant\xi$ , and thus

[TABLE]

where in the last inequality we used $e^{-\beta\hat{H}_{C}}\geqslant\lambda_{\mathrm{min}}(e^{-\beta\hat{H}_{C}})\lvert{\zeta}\rangle\langle{\zeta}\rvert_{C}\geqslant e^{-\beta\lVert{\hat{H}_{C}}\rVert_{\infty}}\lvert{\zeta}\rangle\langle{\zeta}\rvert_{C}\geqslant e^{-\beta\eta}\lvert{\zeta}\rangle\langle{\zeta}\rvert_{C}$ and $e^{-\beta\hat{H}_{W}}\geqslant e^{-\beta E}\lvert{E}\rangle\langle{E}\rvert_{W}$ which imply together that $\lvert{E,\zeta}\rangle\langle{E,\zeta}\rvert_{WC}\leqslant e^{\beta(E+\eta)}\,e^{-\beta(\hat{H}_{C}+\hat{H}_{W})}$ . Rewriting (206), we have

[TABLE]

and finally,

[TABLE]

Following the chain of inequalities proves the claim. ∎
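The operator inequality $\lvert{\zeta}\rangle\langle{\zeta}\rvert_{C}\leqslant e^{\beta\eta}\,e^{-\beta\hat{H}_{C}}$ used in the proof above can be checked numerically for a single qubit. The sketch below assumes a diagonal $\hat{H}_{C}$ with eigenvalues in $[-\eta,\eta]$ and a real superposition state; all numerical values are illustrative assumptions.

```python
import math

beta, eta = 1.0, 0.5      # illustrative values (assumptions)
h0, h1 = -0.3, 0.5        # eigenvalues of H_C, both within [-eta, eta]
theta = 0.7               # |zeta> = cos(theta)|0> + sin(theta)|1>

c, s = math.cos(theta), math.sin(theta)
a0 = math.exp(beta * (eta - h0))   # diagonal entries of e^{beta*eta} e^{-beta H_C}
a1 = math.exp(beta * (eta - h1))

# Entries of the real symmetric matrix
# A = e^{beta*eta} e^{-beta H_C} - |zeta><zeta|.
A00, A01, A11 = a0 - c * c, -c * s, a1 - s * s

# A real symmetric 2x2 matrix is positive semidefinite
# if and only if its trace and determinant are both nonnegative.
trace = A00 + A11
det = A00 * A11 - A01 * A01
is_psd = trace >= 0 and det >= 0

print(is_psd)
```

The check passes even though $\lvert{\zeta}\rangle$ is a coherent superposition of energy eigenstates, which is the situation relevant for the coherence source $C$ .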

We present a convenient lemma that can ensure asymptotic convertibility if good enough asymptotic convertibility can be achieved for any fixed $\varepsilon>0$ . We first note that, thanks to Property (e) of Proposition 14, we may equivalently replace all limits “ $\lim_{n\to\infty}$ ” in Definition 7 by “ $\limsup_{n\to\infty}$ ”.

Lemma 13.

For sequences of states $\widehat{P}=\{\hat{\rho}_{n}\}$ , $\widehat{P}^{\prime}=\{\hat{\rho}^{\prime}_{n}\}$ and sequences of Hamiltonians $\widehat{\mathcal{H}}=\{\hat{H}_{n}\}$ , $\widehat{\mathcal{H}}^{\prime}=\{\hat{H}^{\prime}_{n}\}$ , suppose that for all $\varepsilon>0$ there exist $w_{n,\varepsilon},\eta_{n,\varepsilon},\bar{\varepsilon}_{n,\varepsilon}$ such that $\hat{\rho}_{n}\xrightarrow[*]{w_{n,\varepsilon},\,\eta_{n,\varepsilon},\,\bar{\varepsilon}_{n,\varepsilon}}\hat{\rho}^{\prime}_{n}$ for all $n$ , where $*$ denotes TO or GPM. If $r\in\mathbb{R}$ is such that

[TABLE]

then $\widehat{P}\xrightarrow[*]{r}\widehat{P}^{\prime}$ .

Proof.

Let $w_{\varepsilon}:=\limsup_{n\to\infty}w_{n,\varepsilon}/n$ , $\eta_{\varepsilon}:=\limsup_{n\to\infty}\eta_{n,\varepsilon}/n$ , and $\bar{\varepsilon}_{\varepsilon}:=\limsup_{n\to\infty}\bar{\varepsilon}_{n,\varepsilon}$ . Define

[TABLE]

Now let $\varepsilon(n):=\inf\{\varepsilon:N(\varepsilon)\leqslant n\}$ and observe that $\lim_{n\to\infty}\varepsilon(n)=0$ because $N(\varepsilon)$ is finite for any small $\varepsilon>0$ thanks to the existence of the limit superior defining $w_{\varepsilon}$ , $\eta_{\varepsilon}$ and $\bar{\varepsilon}_{\varepsilon}$ . Then let $w_{n}:=w_{n,\varepsilon(n)}$ , $\eta_{n}:=\eta_{n,\varepsilon(n)}$ , and $\bar{\varepsilon}_{n}:=\bar{\varepsilon}_{n,\varepsilon(n)}$ , such that $\hat{\rho}_{n}\xrightarrow[*]{w_{n},\,\eta_{n},\,\bar{\varepsilon}_{n}}\hat{\rho}^{\prime}_{n}$ for all $n$ . We have $w_{n}/n=w_{n,\varepsilon(n)}/n\leqslant w_{\varepsilon(n)}+\varepsilon(n)$ by definition of $\varepsilon(n)$ and hence $\limsup_{n\to\infty}w_{n}/n\leqslant\limsup_{n\to\infty}[w_{\varepsilon(n)}+\varepsilon(n)]=r$ . Similarly, $\eta_{n}/n=\eta_{n,\varepsilon(n)}/n\leqslant\eta_{\varepsilon(n)}+\varepsilon(n)$ and thus $\lim_{n\to\infty}\eta_{n}/n=0$ . Also, $\bar{\varepsilon}_{n}=\bar{\varepsilon}_{n,\varepsilon(n)}\leqslant\bar{\varepsilon}_{\varepsilon(n)}+\varepsilon(n)$ and thus $\lim_{n\to\infty}\bar{\varepsilon}_{n}=0$ . ∎
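The diagonal choice $\varepsilon(n)=\inf\{\varepsilon:N(\varepsilon)\leqslant n\}$ can be illustrated with a toy threshold function. In the sketch below, `N` is a hypothetical stand-in for the threshold defined in the proof (any finite, nonincreasing function of $\varepsilon$ would behave similarly), and the infimum is approximated on a dyadic grid.

```python
import math

def N(eps):
    """Hypothetical threshold: for n >= N(eps), the finite-n rates are
    assumed to be within eps of their limits superior."""
    return math.ceil(1 / eps ** 2)

GRID = [2.0 ** -k for k in range(30)]   # candidate values of eps

def eps_of_n(n):
    """Grid approximation of eps(n) = inf{eps : N(eps) <= n}."""
    feasible = [e for e in GRID if N(e) <= n]
    return min(feasible) if feasible else None

values = [eps_of_n(n) for n in (1, 100, 10 ** 4, 10 ** 6)]
print(values)
```

The computed values are nonincreasing in $n$ and shrink toward zero, which is the only feature of $\varepsilon(n)$ that the proof uses.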

An important known result is the fact that the min and max divergences quantify the amount of work that is necessary to convert a semiclassical state $\hat{\rho}$ to and from the thermal state.

Proposition 15 (Work distillation and state formation for semiclassical states Åberg (2013); Horodecki and Oppenheim (2013)).

Let $\hat{\rho}$ be a quantum state on a system with Hamiltonian $\hat{H}$ , and suppose that $[\hat{\rho},\hat{H}]=0$ . Let $\gamma^{\prime\prime}=1$ denote the trivial thermal state on the trivial system $\mathbb{C}$ with the trivial Hamiltonian $\hat{H}^{\prime\prime}=0$ . Then

[TABLE]

We now present a central proposition of this appendix, namely a simplified form of Theorem 1 that is specific to Gibbs-preserving maps. The error terms as well as the proof itself are significantly simpler than the full result for thermal operations.

Proposition 16 (Work distillation and state formation Horodecki and Oppenheim (2013); Faist and Renner (2018)).

Let $\hat{\rho}$ be a quantum state on a system with a Hamiltonian $\hat{H}$ . Let $\gamma^{\prime\prime}=1$ denote the trivial thermal state on the trivial system $\mathbb{C}$ with the trivial Hamiltonian $\hat{H}^{\prime\prime}=0$ . Then for any $\varepsilon\geqslant 0$ we have

[TABLE]

Consequently, for any $\hat{\rho},\hat{\rho}^{\prime}$ , and for any Hamiltonians $\hat{H},\hat{H}^{\prime}$ ,

[TABLE]

For asymptotic sequences of states $\widehat{P}=\{\hat{\rho}_{n}\}$ , $\widehat{P}^{\prime}=\{\hat{\rho}^{\prime}_{n}\}$ and sequences of Hamiltonians $\widehat{\mathcal{H}}=\{\hat{H}_{n}\}$ , $\widehat{\mathcal{H}}^{\prime}=\{\hat{H}^{\prime}_{n}\}$ , we have

[TABLE]

where we denote by $\widehat{\Sigma}$ (respectively $\widehat{\Sigma}^{\prime}$ ) the sequence $\{e^{-\beta\hat{H}_{n}}\}$ (respectively $\{e^{-\beta\hat{H}^{\prime}_{n}}\}$ ).

Proof.

The statements (212) are proven in Ref. Faist and Renner (2018). The result for semiclassical states and thermal operations was shown in the earlier Ref. Horodecki and Oppenheim (2013). The statement (213) follows directly by combining the processes in (212). To prove (214), observe that for any $\varepsilon>0$ , we have for sufficiently large $n$ that $[{S}_{\infty}^{\varepsilon}(\hat{\rho}_{n}^{\prime}\,\|\,\hat{\sigma}_{n}^{\prime})-{S}_{0}^{\varepsilon}(\hat{\rho}_{n}\,\|\,\hat{\sigma}_{n})]/n\leqslant{\overline{S}}(\widehat{P}^{\prime}\,\|\,\widehat{\Sigma}^{\prime})-{\underline{S}}(\widehat{P}\,\|\,\widehat{\Sigma})+g(\varepsilon)$ where $g(\varepsilon)$ is some function of $\varepsilon$ with $g(\varepsilon)\to 0$ as $\varepsilon\to 0$ . Then (214) follows from (213) and Lemma 13. ∎

For completeness, we prove (213) directly with an explicit transformation (see also Theorem 6.3 of Sagawa (2021)).

Alternative direct proof of (213).

We prove the following equivalent statement: Assuming that ${S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,e^{-\beta\hat{H}})\geqslant{S}_{\infty}^{\varepsilon}(\hat{\rho}^{\prime}\,\|\,e^{-\beta\hat{H}^{\prime}})$ , we explicitly construct a Gibbs-preserving operation that performs the given transformation using a hypothesis test. The equivalence with (213) follows from Proposition 14 (c), the scaling property (13) of the divergences, and their additivity under tensor products (14). Without loss of generality we may assume that $\operatorname{tr}(e^{-\beta\hat{H}})=\operatorname{tr}(e^{-\beta\hat{H}^{\prime}})=1$ ; otherwise, shift the Hamiltonians by suitable constants and apply Proposition 14 (a) whose cost cancels the shift (13). Let $\hat{\sigma}=e^{-\beta\hat{H}}$ and $\hat{\sigma}^{\prime}=e^{-\beta\hat{H}^{\prime}}$ , which are now quantum states.

First, consider the case of $\varepsilon=0$ . We explicitly construct a CPTP map $E$ that maps $(\hat{\rho},\hat{\sigma})$ to $(\hat{\rho}^{\prime},\hat{\sigma}^{\prime})$ , by using a “measure-and-prepare” method. Let $c:=e^{-{S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})}$ , and let $\hat{P}_{\rho}$ be the projection onto the support of $\hat{\rho}$ . If $c=1$ , the situation is trivial: the assumption then gives ${S}_{\infty}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma}^{\prime})\leqslant 0$ , which for normalized states forces $\hat{\rho}^{\prime}=\hat{\sigma}^{\prime}$ , so the map that discards its input and prepares $\hat{\sigma}^{\prime}$ suffices. If $c\neq 1$ , we can construct the desired CPTP map $E$ as

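$$E(\hat{X}):=\operatorname{tr}\bigl(\hat{P}_{\rho}\hat{X}\bigr)\,\hat{\rho}^{\prime}+\operatorname{tr}\bigl((\hat{1}-\hat{P}_{\rho})\hat{X}\bigr)\,\frac{\hat{\sigma}^{\prime}-c\hat{\rho}^{\prime}}{1-c},$$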

where the condition ${S}_{0}(\hat{\rho}\,\|\,\hat{\sigma})\geqslant{S}_{\infty}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma}^{\prime})$ is used to guarantee that $\hat{\sigma}^{\prime}-c\hat{\rho}^{\prime}\geqslant 0$ .
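The construction can be checked numerically. The following is a minimal sketch, assuming the two-outcome measure-and-prepare form (test whether the input lies in the support of $\hat{\rho}$ , then prepare $\hat{\rho}^{\prime}$ or the normalized remainder $(\hat{\sigma}^{\prime}-c\hat{\rho}^{\prime})/(1-c)$ ); the function names and the example matrices are illustrative choices, not taken from the paper.

```python
import numpy as np

def support_projector(rho, tol=1e-12):
    """Projector onto the support of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(rho)
    kept = vecs[:, vals > tol]
    return kept @ kept.conj().T

def measure_and_prepare(rho, sigma, rho_p, sigma_p):
    """Return (channel, c) for the two-outcome measure-and-prepare map.

    Assumes tr(sigma) = tr(sigma_p) = 1 and c = tr(P_rho sigma) < 1.
    """
    proj = support_projector(rho)
    c = np.trace(proj @ sigma).real          # c = exp(-S_0(rho || sigma))
    if c >= 1.0:
        raise ValueError("c = 1 is the trivial case treated separately")
    remainder = (sigma_p - c * rho_p) / (1.0 - c)
    if np.linalg.eigvalsh(remainder).min() < -1e-10:
        raise ValueError("sigma' - c rho' is not positive semidefinite")

    def channel(x):
        hit = np.trace(proj @ x)             # weight inside the support of rho
        miss = np.trace(x) - hit             # weight on the complement
        return hit * rho_p + miss * remainder

    return channel, c

# Illustrative data: a three-level input and a two-level output system.
sigma = np.diag([0.5, 0.3, 0.2])
rho = np.diag([1.0, 0.0, 0.0])
sigma_p = np.diag([0.6, 0.4])
rho_p = np.array([[0.8, 0.1], [0.1, 0.2]])   # carries some coherence

channel, c = measure_and_prepare(rho, sigma, rho_p, sigma_p)
```

Here `channel(rho)` reproduces `rho_p` and `channel(sigma)` reproduces `sigma_p`, so the map is Gibbs-preserving for this pair. The second `ValueError` is raised exactly when the positivity condition on $\hat{\sigma}^{\prime}-c\hat{\rho}^{\prime}$ fails.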

We next consider the case of $\varepsilon>0$ . By definition of the smooth divergences, there exist $\hat{\tau},\hat{\tau}^{\prime}$ such that ${S}_{\infty}^{\varepsilon}(\hat{\rho}^{\prime}\,\|\,\hat{\sigma}^{\prime})={S}_{\infty}(\hat{\tau}^{\prime}\,\|\,\hat{\sigma}^{\prime})$ and ${S}_{0}^{\varepsilon}(\hat{\rho}\,\|\,\hat{\sigma})={S}_{0}(\hat{\tau}\,\|\,\hat{\sigma})$ , with $D(\hat{\tau},\hat{\rho})\leqslant\varepsilon$ , $D(\hat{\tau}^{\prime},\hat{\rho}^{\prime})\leqslant\varepsilon$ . The assumption then reads ${S}_{0}(\hat{\tau}\,\|\,\hat{\sigma})\geqslant{S}_{\infty}(\hat{\tau}^{\prime}\,\|\,\hat{\sigma}^{\prime})$ , so from the case $\varepsilon=0$ we have that $\hat{\tau}\xrightarrow[\mathrm{GPM}]{}\hat{\tau}^{\prime}$ with respect to the thermal states $\hat{\sigma},\hat{\sigma}^{\prime}$ ; let $E$ denote the corresponding map, so that $E(\hat{\tau})=\hat{\tau}^{\prime}$ . By the triangle inequality and because quantum operations can only decrease the trace distance, we have that $D(E(\hat{\rho}),\hat{\rho}^{\prime})\leqslant D(E(\hat{\rho}),E(\hat{\tau}))+D(E(\hat{\tau}),\hat{\rho}^{\prime})\leqslant D(\hat{\rho},\hat{\tau})+D(\hat{\tau}^{\prime},\hat{\rho}^{\prime})\leqslant 2\varepsilon$ . Hence $\hat{\rho}\xrightarrow[\mathrm{GPM}]{0,0,2\varepsilon}\hat{\rho}^{\prime}$ . ∎
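The two inequalities used in the last step (the triangle inequality and the contractivity of the trace distance under CPTP maps) can be illustrated with a small numerical sketch. The depolarizing map below is only a stand-in for $E$ , and all matrices are hypothetical example data chosen so that $E(\hat{\tau})=\hat{\tau}^{\prime}$ holds by construction.

```python
import numpy as np

def trace_distance(a, b):
    """D(a, b) = (1/2) * trace norm of (a - b) for Hermitian a, b."""
    return 0.5 * np.abs(np.linalg.eigvalsh(a - b)).sum()

def depolarize(x, strength=0.4):
    """Stand-in CPTP map: mix the input with the maximally mixed state."""
    dim = x.shape[0]
    return (1.0 - strength) * x + strength * np.trace(x) * np.eye(dim) / dim

rho = np.array([[0.70, 0.20], [0.20, 0.30]])
tau = np.array([[0.75, 0.15], [0.15, 0.25]])           # smoothed input near rho
tau_p = depolarize(tau)                                # E(tau) = tau'
rho_p = tau_p + np.array([[0.02, 0.0], [0.0, -0.02]])  # target near tau'

contracted = trace_distance(depolarize(rho), depolarize(tau))
lhs = trace_distance(depolarize(rho), rho_p)
bound = trace_distance(rho, tau) + trace_distance(tau_p, rho_p)
```

In this example `contracted` does not exceed `trace_distance(rho, tau)`, and `lhs` stays below `bound`, which mirrors the chain of inequalities leading to the $2\varepsilon$ error.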

As an immediate consequence, any state that satisfies ${S}_{0}(\hat{\rho}\,\|\,e^{-\beta\hat{H}})={S}_{\infty}(\hat{\rho}\,\|\,e^{-\beta\hat{H}})$ can be reversibly converted to and from the thermal state $e^{-\beta\hat{H}}/\operatorname{tr}(e^{-\beta\hat{H}})$ with Gibbs-preserving operations. The same holds for thermal operations if the state is semiclassical. Consequently, the common value of the divergences, which we can denote as $S(\hat{\rho})$ , is the thermodynamic potential: It characterizes exactly which state transformations are possible within this class of states.

Appendix 0.C $C^{\ast}$ -algebra formulation

In this appendix, we provide an overview of the standard formulation of ergodicity with $C^{\ast}$ -algebras Bratteli and Robinson (1987, 1981); Ruelle (1999), and prove that it is equivalent to our formulation in Section IV. Furthermore, we prove Theorem 3 in the alternative setting where we consider a sequence of reduced states of the infinite Gibbs state, rather than a sequence of finite Gibbs states corresponding to Hamiltonians truncated to finite regions. In the following, we use the notation of Section IV.

The set of local operators is given by $\mathcal{A}_{\mathrm{loc}}:=\cup_{\Lambda}\mathcal{A}_{\Lambda}$ , where the union runs over all bounded lattice regions $\Lambda\subset\mathbb{Z}^{d}$ . Then, the $C^{\ast}$ -algebra $\mathcal{A}$ is defined as the $C^{\ast}$ -inductive limit of $\mathcal{A}_{\mathrm{loc}}$ , which is often written as $\mathcal{A}=\overline{\bigotimes_{i\in\mathbb{Z}^{d}}\mathcal{A}_{i}}$ .

We consider a (normal) state $\Psi:\mathcal{A}\to\mathbb{C}$ , where $\Psi(\hat{A})\in\mathbb{C}$ is interpreted as the expectation value of observable $\hat{A}$ . We consider a reduced state to a bounded region $\Lambda\subset\mathbb{Z}^{d}$ . By definition, the reduced density operator on this region, written as $\hat{\rho}_{\Lambda}$ , satisfies

$$\operatorname{tr}\bigl[\hat{\rho}_{\Lambda}\hat{A}\bigr]=\Psi(\hat{A})$$

for all $\hat{A}\in\mathcal{A}_{\Lambda}$ . We note that the consistency condition (138) is automatically satisfied for this $\{\hat{\rho}_{\Lambda}\}$ .

By using the shift superoperator $T_{i}$ introduced in Section IV.2, we first define translation invariance.

Definition 11 (Translation invariance).

A state $\Psi$ is translation invariant, if for all $\hat{A}\in\mathcal{A}$ and for all $i\in\mathbb{Z}^{d}$ ,

$$\Psi\bigl(T_{i}(\hat{A})\bigr)=\Psi(\hat{A}).\qquad(217)$$

The above definition of translation invariance is equivalent to the definition in Section IV.2; this is guaranteed by the following lemma, which states that it is sufficient to take $\hat{A}$ above to be local.

Lemma 14.

If Eq. 217 is satisfied for all $\hat{A}\in\mathcal{A}_{\mathrm{loc}}$ and all $i\in\mathbb{Z}^{d}$ , then $\Psi$ is translation invariant.

Proof.

Suppose that Eq. 217 is satisfied for all $\hat{A}\in\mathcal{A}_{\mathrm{loc}}$ . For any $\hat{A}\in\mathcal{A}$ , there exists a sequence $\{\hat{A}_{m}\}_{m\in\mathbb{N}}\subset\mathcal{A}_{\mathrm{loc}}$ such that $\hat{A}_{m}\in\mathcal{A}_{\Lambda_{m}}$ and $\lim_{m\to\infty}\lVert{\hat{A}-\hat{A}_{m}}\rVert_{\infty}=0$ . Let $\hat{\Delta}_{m}:=\hat{A}-\hat{A}_{m}$ . Then we have

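$$\Psi\bigl(T_{i}(\hat{A})\bigr)-\Psi(\hat{A})=\bigl[\Psi\bigl(T_{i}(\hat{A}_{m})\bigr)-\Psi(\hat{A}_{m})\bigr]+\bigl[\Psi\bigl(T_{i}(\hat{\Delta}_{m})\bigr)-\Psi(\hat{\Delta}_{m})\bigr].$$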

The first term on the right-hand side vanishes. The second term is bounded as

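$$\bigl|\Psi\bigl(T_{i}(\hat{\Delta}_{m})\bigr)-\Psi(\hat{\Delta}_{m})\bigr|\leqslant 2\lVert{\hat{\Delta}_{m}}\rVert_{\infty},$$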

which goes to zero as $m\to\infty$ . ∎

We now define ergodicity in a more standard and mathematically elegant way Bratteli and Robinson (1987); Ruelle (1999) (see also Refs. Bjelaković et al. (2004); Bjelaković and Szkola (2005)).

Definition 12 (Ergodicity).

A state $\Psi$ is translation-invariant and ergodic, if it is an extremal point of the set of translation-invariant states.

Physically, an ergodic state corresponds to a “pure thermodynamic phase” without phase mixture, which is consistent with this mathematical definition.
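The contrast between a pure phase and a phase mixture can be made concrete in the simplest classical (commutative) setting. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it compares an i.i.d. Bernoulli process, which is ergodic, with an equal-weight mixture of two i.i.d. Bernoulli processes, which is translation-invariant but not extremal. The variance of the spatial average of $n$ symbols is computed in closed form; the function names are hypothetical.

```python
def variance_iid(p, n):
    """Variance of the average of n i.i.d. Bernoulli(p) symbols."""
    return p * (1.0 - p) / n

def variance_mixture(p1, p2, weight, n):
    """Variance of the same average when the whole sequence is drawn from
    Bernoulli(p1) with probability `weight` and from Bernoulli(p2) otherwise."""
    mean = weight * p1 + (1.0 - weight) * p2
    second_moment = (weight * (variance_iid(p1, n) + p1 ** 2)
                     + (1.0 - weight) * (variance_iid(p2, n) + p2 ** 2))
    return second_moment - mean ** 2

sizes = [10, 1000, 100000]
pure = [variance_iid(0.5, n) for n in sizes]            # decays like 1/n
mixed = [variance_mixture(0.2, 0.8, 0.5, n) for n in sizes]
plateau = 0.5 * 0.5 * (0.2 - 0.8) ** 2                  # weight*(1-weight)*(p1-p2)^2
```

For the ergodic process the fluctuations of the spatial average vanish as $n$ grows, whereas for the mixture they saturate at `plateau`, reflecting the coexistence of two macroscopically distinct components.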

The following lemma establishes the equivalence of the definition above with the definition presented in Section IV.2. This is a reformulation of Theorem 6.3.3, Proposition 6.3.5, and Lemma 6.5.1 of Ref. Ruelle (1999); see also Ref. Bjelaković and Szkola (2005).

Lemma 15.

Using the notation of Section IV, the following are equivalent for any translation-invariant state $\Psi$ :

(a) $\Psi$ is ergodic;

(b) For all self-adjoint $\hat{A}\in\mathcal{A}$ ,

[TABLE]

(c) For all $\hat{A},\hat{B}\in\mathcal{A}$ ,

[TABLE]

(d) Equation 220 is satisfied for all self-adjoint $\hat{A}\in\mathcal{A}_{\mathrm{loc}}$ .

For completeness, we prove the equivalence of (d) with the other points.

Proof.

It suffices to check that (d) $\,\Rightarrow\,$ (b). The proof is similar to that of Lemma 14, and we use the same notation: For any $\hat{A}\in\mathcal{A}$ , there exists a sequence $\{\hat{A}_{m}\}_{m\in\mathbb{N}}\subset\mathcal{A}_{\mathrm{loc}}$ such that $\hat{A}_{m}\in\mathcal{A}_{\Lambda_{m}}$ and $\lim_{m\to\infty}\lVert{\hat{A}-\hat{A}_{m}}\rVert_{\infty}=0$ ; let $\hat{\Delta}_{m}:=\hat{A}-\hat{A}_{m}$ . Now suppose that Eq. 220 is satisfied for all self-adjoint $\hat{A}\in\mathcal{A}_{\mathrm{loc}}$ . We first note that

[TABLE]

We then have

[TABLE]

From Eq. 220 for $\hat{A}_{m}\in\mathcal{A}_{\mathrm{loc}}$ , we have, for a fixed $m$ ,

[TABLE]

Since $m$ can be taken arbitrarily large, the right-hand side above can be arbitrarily small. Therefore, Eq. 220 is satisfied for all $\hat{A}\in\mathcal{A}$ . ∎

We now provide a definition of mixing that is suited to the formalism in this section.

Definition 13 (Mixing).

Let $T_{(k)}$ be the shift operator in Definition 10 in Section IV. A state $\Psi$ has the mixing property, if for all $\hat{A},\hat{B}\in\mathcal{A}$ and all $k$ ,

[TABLE]

Definition 14 (Weak mixing).

A state $\Psi$ has the weak mixing property, if for all $\hat{A},\hat{B}\in\mathcal{A}$ ,

[TABLE]

Mixing implies weak mixing, and weak mixing implies ergodicity; the converse implications do not hold in general. In particular, weak mixing in the above sense should not be confused with Eq. 221.

The following lemma guarantees that the above definition of mixing is equivalent to Definition 10 in Section IV.

Lemma 16.

In the definitions of mixing and weak mixing above, it is sufficient to take $\hat{A},\hat{B}\in\mathcal{A}_{\mathrm{loc}}$ .

Proof.

The proof of (d) $\Rightarrow$ (b) in Lemma 15 provided above can be straightforwardly adapted to prove this lemma. ∎

We next consider the concept of local Gibbs states for the infinite-dimensional setup Bratteli and Robinson (1981). We here assume that the Kubo-Martin-Schwinger (KMS) state is unique at $\beta$ , which physically implies no phase coexistence. This is provable for any $\beta>0$ in one dimension Araki (1975), but in higher dimensions it is known to hold only at sufficiently high temperature Bratteli and Robinson (1981).

Let $\varphi^{\Box}_{\Lambda}:\mathcal{A}\to\mathbb{C}$ be the Gibbs state corresponding to the truncated Hamiltonian associated with the region $\Lambda$ , and represented by the density operator $\hat{\sigma}^{\Box}_{\Lambda}$ in Eq. 143 of Section IV.2. Then, it is known that a state

[TABLE]

exists, where the limit is given by the weak- $\ast$ (or ultraweak) topology of the dual of $\mathcal{A}$ (cf. Proposition 6.2.15 of Ref. Bratteli and Robinson (1981)). We can then define the global Gibbs state on the entire lattice by $\Phi$ . This global state satisfies the following condition for any $\hat{A}\in\mathcal{A}_{\mathrm{loc}}$ ,

[TABLE]

Then, we define the reduced state of $\Phi$ on a bounded region $\Lambda$ , which is written as $\varphi_{\Lambda}$ . Let $\hat{\sigma}_{\Lambda}$ be the corresponding reduced density operator. For any observable $\hat{A}\in\mathcal{A}_{\Lambda}$ , we have

[TABLE]

In the following, let $\widehat{\Sigma}:=\{\hat{\sigma}_{n}\}$ be the sequence of the reduced Gibbs states, where $\hat{\sigma}_{n}:=\hat{\sigma}_{\Lambda_{\ell}}$ and $n=(2\ell+1)^{d}$ . We note that the reduced state $\hat{\sigma}_{\Lambda}$ and the truncated state $\hat{\sigma}^{\Box}_{\Lambda}$ are different in general, where only $\hat{\sigma}_{\Lambda}$ satisfies the consistency condition (138).
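The difference between truncated and reduced states can be seen already in a classical nearest-neighbour chain. The following sketch is an illustration under assumptions that are not from the paper: an Ising-type bond energy with a field term (the parameters `BETA`, `J`, `B` are arbitrary), open boundary conditions, and consistency understood as "marginalizing the state of a larger region gives the state of the smaller region". It shows that truncated Gibbs distributions fail this consistency, while marginals taken deep inside a long chain stabilize, as expected for reduced states of the infinite-volume Gibbs state.

```python
import itertools
import numpy as np

BETA, J, B = 1.0, 0.5, 0.3     # illustrative parameters
SPINS = (-1, 1)

def bond_energy(x, y):
    """Symmetric nearest-neighbour energy; the field is split over bonds."""
    return -J * x * y - 0.5 * B * (x + y)

def truncated_gibbs(n):
    """Gibbs distribution of the open chain on n sites, as {config: prob}."""
    configs = list(itertools.product(SPINS, repeat=n))
    weights = np.array([
        np.exp(-BETA * sum(bond_energy(c[i], c[i + 1]) for i in range(n - 1)))
        for c in configs
    ])
    return dict(zip(configs, weights / weights.sum()))

def marginal(dist, sites):
    """Marginal distribution of `dist` on the given site indices."""
    out = {}
    for config, prob in dist.items():
        key = tuple(config[i] for i in sites)
        out[key] = out.get(key, 0.0) + prob
    return out

def max_gap(d1, d2):
    return max(abs(d1[key] - d2[key]) for key in d1)

# Truncated states are not consistent under marginalization ...
gap_truncated = max_gap(truncated_gibbs(2), marginal(truncated_gibbs(3), (0, 1)))
# ... but marginals on a central pair barely change once the chain is long.
gap_interior = max_gap(marginal(truncated_gibbs(11), (5, 6)),
                       marginal(truncated_gibbs(13), (6, 7)))
```

With these parameters `gap_truncated` is of order $10^{-1}$ while `gap_interior` is much smaller, because boundary effects decay exponentially with the distance to the boundary in one dimension.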

We now prove another version of Theorem 3 in Section IV, where $\widehat{\Sigma}$ is the sequence of reduced states of the full Gibbs state on the infinite lattice, instead of the sequence $\widehat{\Sigma}^{\Box}$ of Gibbs states corresponding to truncated Hamiltonians associated with a sequence of finite regions.

Our proof strategy is to show that the asymptotic min divergence rate, the max divergence rate and the KL divergence rate remain unchanged if we substitute $\widehat{\Sigma}^{\Box}$ by $\widehat{\Sigma}$ . For this, we invoke the following result, given as Theorem 3.11 in Ref. Lenci and Rey-Bellet (2005) (see in particular the second proof provided in that reference, which holds for observables that are not necessarily positive and proves the uniformity of the convergence).

Proposition 17 (Lenci and Rey-Bellet (2005), Theorem 3.11).

Suppose that the KMS state is unique. For any observable $\hat{A}_{\Lambda}\in\mathcal{A}_{\Lambda}$ for a bounded region $\Lambda\subset\mathbb{Z}^{d}$ , we have

[TABLE]

where the convergence is uniform in $\hat{A}_{\Lambda}$ .

The above result allows us to prove that the KL divergence rate does not change if we replace the Gibbs state of the truncated Hamiltonian by the reduced state of the infinite Gibbs state.

Lemma 17.

Suppose that the KMS state is unique and that ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma}^{\Box})$ exists. Then ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})$ exists and equals ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma}^{\Box})$ .

Proof.

Proposition 17 implies that

[TABLE]

which implies ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})={S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma}^{\Box})$ . ∎

Similarly, we may use Proposition 17 to show that the min and max divergence rates (via the hypothesis testing divergence rate) remain unchanged if we replace $\widehat{\Sigma}^{\Box}$ by $\widehat{\Sigma}$ .

Lemma 18.

Suppose that the KMS state is unique and that ${S}_{\mathrm{H}}^{\eta}(\widehat{P}\,\|\,\widehat{\Sigma}^{\Box})$ exists for any $0<\eta<1$ . Then, for any $0<\eta<1$ , the rate ${S}_{\mathrm{H}}^{\eta}(\widehat{P}\,\|\,\widehat{\Sigma})$ exists and equals ${S}_{\mathrm{H}}^{\eta}(\widehat{P}\,\|\,\widehat{\Sigma}^{\Box})$ .

Proof.

From Eq. 230 in Proposition 17, there exists $\delta_{n}>0$ satisfying $\lim_{n\to\infty}\frac{\delta_{n}}{n}=0$ such that for any $\hat{A}_{n}\in\mathcal{A}_{\Lambda_{\ell}}$ ,

[TABLE]

Combined with Eq. 18, this implies that

[TABLE]

The claim follows by dividing this equation by $n$ and taking the limit $n\to\infty$ . ∎

It is now straightforward to combine Lemmas 17 and 18 to prove another version of Theorem 3 for the infinite Gibbs state, rather than the limit of Gibbs states of the truncated Hamiltonian of increasingly large finite regions.

Theorem 4 (Collapse of the spectral rates for the reduced Gibbs state).

Suppose that $\widehat{P}$ is translation invariant and ergodic, and $\widehat{\Sigma}$ is the reduced Gibbs state of a local and translation invariant Hamiltonian in any dimension, where the KMS state is unique. Then, for any $0<\eta<1$ ,

[TABLE]

and as a consequence,

[TABLE]

Appendix 0.D An alternative proof of Theorem 4

Here we provide an alternative proof of Theorem 4 presented above, in the case of a one-dimensional chain, by combining a known result by Hiai, Mosonyi, and Ogawa Hiai et al. (2007) with the ergodic theorem of Bjelaković and Siegmund-Schultze (2004). We state these results here:

Proposition 18 (Hiai, Mosonyi, and Ogawa (2007), Lemma 4.2).

Let $\hat{\sigma}_{n}$ be the reduced local Gibbs state on $n$ sites in one dimension. There exist $\alpha_{1},\alpha_{2}>0$ and $m_{0}\in\mathbb{N}$ such that for all $m\geqslant m_{0}$ and $k\in\mathbb{N}$ we have

[TABLE]

Proposition 19 (Bjelaković and Siegmund-Schultze (2004), Theorem 2.1).

Suppose that $\widehat{P}$ is translation-invariant and ergodic, and $\widehat{\Sigma}=\{\sigma^{\otimes n}\}$ is i.i.d. Then, for any $0<\eta<1$ ,

[TABLE]

The proof strategy is thus to use Proposition 18 to reduce the problem for a local Gibbs state to a problem with a tensor product Gibbs state, by coarse-graining the $n$ -site chain into $k$ blocks of $m$ sites. The problem then falls in the scope of Proposition 19 which gives the desired result.

Alternative proof of Theorem 4 in one dimension.

We fix $m\in\mathbb{N}$ , and let $n=km+r$ with $0\leqslant r\leqslant m-1$ . First we argue that we can essentially ignore the $r$ remaining sites and focus on the $km$ sites. From the monotonicity of the hypothesis testing divergence under CPTP maps, and therefore under the partial trace, we have for any $0<\eta<1$ ,

[TABLE]

Fix $0<\eta<1$ and let $\hat{Q}_{km}$ denote an optimal operator in (18) such that $\eta^{-1}\operatorname{tr}\bigl{(}\hat{Q}_{km}\hat{\sigma}_{m}^{\otimes k}\bigr{)}=\exp\bigl{(}-{S}_{\mathrm{H}}^{\eta}(\hat{\rho}_{km}\,\|\,\hat{\sigma}_{m}^{\otimes k})\bigr{)}$ . Then, from Proposition 18,

[TABLE]

Therefore,

[TABLE]

From Proposition 19, we have for large $k$ and at fixed $m$ ,

[TABLE]

where $\lim_{k\to\infty}\delta_{k}=0$ . Using the fact that the logarithm is operator monotone and with (236),

[TABLE]

Hence, we obtain

[TABLE]

Taking $\liminf_{n\to\infty}$ while fixing $m$ , we obtain

[TABLE]

where we used Lemma 17 to get the first term on the right-hand side. Since $m$ can be taken arbitrarily large, we obtain

[TABLE]

We next show the opposite direction. Again from the monotonicity of the hypothesis testing divergence under partial trace,

[TABLE]

Fix $0<\eta<1$ and let $\hat{Q}^{\prime}_{(k+1)m}$ denote an optimal operator in (18) such that $\eta^{-1}\operatorname{tr}\bigl{(}\hat{Q}^{\prime}_{(k+1)m}\hat{\sigma}_{(k+1)m}\bigr{)}=\exp\bigl{(}-{S}_{\mathrm{H}}^{\eta}(\hat{\rho}_{(k+1)m}\,\|\,\hat{\sigma}_{(k+1)m})\bigr{)}$ . Then, using Proposition 18,

[TABLE]

Therefore,

[TABLE]

From Proposition 19, we have for large $k$ and for fixed $m$ ,

[TABLE]

where $\lim_{k\to\infty}\delta^{\prime}_{k}=0$ . Since the logarithm is operator monotone, we have from inequality (236),

[TABLE]

Therefore, we obtain

[TABLE]

By taking $\limsup_{n\to\infty}$ while fixing $m$ , we obtain

[TABLE]

where we again used Lemma 17. Since $m$ can be taken arbitrarily large, we obtain

[TABLE]

Equation 234 then follows from inequalities (245) and (253). ∎

Appendix 0.E The classical case

If we restrict the $C^{\ast}$ -algebra $\mathcal{A}$ in one dimension to a commutative subalgebra, we obtain a classical stochastic process. Here, we flesh out explicitly the classical ergodic theorem that our argument in Section IV and Appendix 0.C reduces to in the classical case.

The classical counterpart of the setup in these sections is a two-sided stochastic process over $\mathbb{Z}$ with a finite alphabet. Let $\{x_{\ell}\}_{\ell\in\mathbb{Z}}$ be the stochastic process, where $x_{\ell}\in B$ with $B$ being a finite alphabet, and let $X_{n}:=(x_{-\ell},x_{-\ell+1},\cdots,x_{\ell})$ with $n:=2\ell+1$ . We consider sequences of probability distributions $\widehat{P}:=\{\rho_{n}(X_{n})\}_{n\in\mathbb{N}}$ and $\widehat{\Sigma}:=\{\sigma_{n}(X_{n})\}_{n\in\mathbb{N}}$ .

First of all, we briefly comment on mathematical details about the correspondence between the classical case and the quantum case (see also Refs. Bjelaković et al. (2004); Bjelaković and Siegmund-Schultze (2004)). Let $\mathcal{A}$ be the $C^{\ast}$ -algebra of an infinite spin chain. We consider a unital Abelian $C^{\ast}$ -subalgebra $\mathcal{B}\subset\mathcal{A}$ , which is interpreted as a set of classical observables. Let $\Phi$ be a quantum state on $\mathcal{A}$ , and $\Phi|_{\mathcal{B}}$ be its restriction to $\mathcal{B}$ . By the Gelfand-Naimark theorem, $\mathcal{B}$ is identified with the Banach space $C_{0}(K)$ , which is the space of $\mathbb{C}$ -valued continuous functions on a compact Hausdorff space $K$ . In our setup, $K=B^{\mathbb{Z}}$ , which is compact by Tychonoff’s theorem. By the Riesz-Markov-Kakutani representation theorem, the dual of $C_{0}(K)$ is the space of regular Borel measures on $K$ . Thus $\Phi|_{\mathcal{B}}$ is identified with a probability measure on $K$ (i.e., a stochastic process over $\mathbb{Z}$ ).

Classical ergodicity can be defined in the same manner as in the quantum case (Definition 9), i.e., as a commutative case of quantum ergodicity. On the other hand, the standard definition of classical ergodicity is that any subset of trajectories in a stochastic process that is invariant under $T$ has measure $0$ or $1$ . These definitions are equivalent for the finite-alphabet case. In fact, a classical stochastic process is translation-invariant ergodic if and only if it is an extremal point of the set of translation-invariant processes. Also, as mentioned before, Definition 9 is equivalent to the definition by extremality for quantum spin systems Ruelle (1999); Bjelaković and Szkola (2005).

All the quantum divergences introduced in Section II can be computed using as arguments a probability distribution and a vector of positive entries of same length, by embedding both classical vectors into the diagonal entries of an operator in a Hilbert space whose dimension is the same as the number of entries in the vectors.
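As a minimal sketch of this embedding, the code below evaluates the min, KL, and max divergences both directly on vectors and on the corresponding diagonal operators. It assumes the standard unsmoothed definitions ( ${S}_{0}=-\ln\operatorname{tr}[\hat{P}_{\rho}\hat{\sigma}]$ , ${S}_{1}$ the KL divergence, ${S}_{\infty}=\ln\min\{\lambda:\hat{\rho}\leqslant\lambda\hat{\sigma}\}$ ); the function names and example vectors are illustrative.

```python
import numpy as np

def s_min(p, q):
    """S_0(p || q): minus the log of the q-weight on the support of p."""
    return float(-np.log(q[p > 0].sum()))

def s_kl(p, q):
    """S_1(p || q): Kullback-Leibler divergence (natural logarithm)."""
    support = p > 0
    return float(np.sum(p[support] * np.log(p[support] / q[support])))

def s_max(p, q):
    """S_inf(p || q): log of the largest likelihood ratio p/q."""
    support = p > 0
    return float(np.log(np.max(p[support] / q[support])))

def s_min_operator(rho, sigma, tol=1e-12):
    vals, vecs = np.linalg.eigh(rho)
    kept = vecs[:, vals > tol]
    proj = kept @ kept.conj().T                   # projector onto supp(rho)
    return float(-np.log(np.trace(proj @ sigma).real))

def s_max_operator(rho, sigma):
    """Log of the smallest lam with rho <= lam * sigma (sigma invertible)."""
    vals, vecs = np.linalg.eigh(sigma)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.conj().T
    return float(np.log(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt).max()))

q = np.array([0.25, 0.25, 0.5])
p_flat = np.array([0.5, 0.5, 0.0])    # flat relative to q on its support
p_skew = np.array([0.7, 0.3, 0.0])

flat = (s_min(p_flat, q), s_kl(p_flat, q), s_max(p_flat, q))
skew = (s_min(p_skew, q), s_kl(p_skew, q), s_max(p_skew, q))
```

For `p_flat` all three values equal $\ln 2$ , a one-shot instance of the collapse of the min and max divergences. For `p_skew` they are strictly ordered, with the min divergence smallest and the max divergence largest. The operator versions applied to `np.diag(p)` and `np.diag(q)` return the same numbers as the vector versions.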

In the following, we show explicitly for the classical case that, if $\widehat{P}$ and $\widehat{\Sigma}$ satisfy a relative asymptotic equipartition property (relative AEP), then the lower and the upper divergence rates coincide, and they must equal the KL divergence rate. We first define the relative AEP in the form of a convergence in probability, a classical counterpart of our quantum formulation in Section IV.

Definition 15 (Relative asymptotic equipartition property (relative AEP)).

We say that $\widehat{P}$ and $\widehat{\Sigma}$ satisfy the relative AEP if the KL divergence rate ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})$ exists and if $\frac{1}{n}\ln\frac{\rho_{n}(X_{n})}{\sigma_{n}(X_{n})}$ converges to ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})$ in probability by sampling $X_{n}$ according to $\rho_{n}$ .

This is equivalently formulated as follows (see, for example, Theorem 11.8.2 of Ref. Cover and Thomas (2006)):

Proposition 20.

Suppose that ${S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})$ exists. The sequences $\widehat{P}$ and $\widehat{\Sigma}$ satisfy the relative AEP if and only if for any $\varepsilon>0$ , there exists a set $Q_{n}\subset\{X_{n}\}$ (the relative typical set) such that for sufficiently large $n$ :

(a) For any $X_{n}\in Q_{n}$ ,

[TABLE]

(b) $\rho_{n}[Q_{n}]>1-\varepsilon$ ; and

(c) $(1-\varepsilon)\exp(-n({S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})+\varepsilon))<\sigma_{n}[Q_{n}]<\exp(-n({S}_{1}(\widehat{P}\,\|\,\widehat{\Sigma})-\varepsilon))$ .

Here, $\rho_{n}[Q_{n}]$ and $\sigma_{n}[Q_{n}]$ represent the probability of $Q_{n}$ according to distributions $\rho_{n}$ and $\sigma_{n}$ , respectively.
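Properties (b) and (c) can be verified exactly in the simplest case of two i.i.d. binary sources. The sketch below assumes that the relative typical set consists of the sequences whose empirical log-likelihood ratio per symbol lies within $\varepsilon$ of the KL divergence; the source parameters `P`, `Q` and the function names are hypothetical choices for illustration. Since the log-likelihood ratio depends only on the number of ones, the masses reduce to binomial sums.

```python
import math

def log_binomial_pmf(n, k, p):
    """Log-probability of k ones among n i.i.d. Bernoulli(p) symbols."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def relative_typical_masses(n, p, q, eps):
    """Return (KL divergence, rho_n[Q_n], sigma_n[Q_n]) for Bernoulli sources."""
    ratio_one = math.log(p / q)                  # log-ratio contributed by a 1
    ratio_zero = math.log((1.0 - p) / (1.0 - q)) # log-ratio contributed by a 0
    kl = p * ratio_one + (1.0 - p) * ratio_zero
    rho_mass = sigma_mass = 0.0
    for k in range(n + 1):
        rate = (k * ratio_one + (n - k) * ratio_zero) / n
        if abs(rate - kl) < eps:                 # sequence type is typical
            rho_mass += math.exp(log_binomial_pmf(n, k, p))
            sigma_mass += math.exp(log_binomial_pmf(n, k, q))
    return kl, rho_mass, sigma_mass

P, Q, EPS = 0.3, 0.5, 0.05
kl, rho_small, _ = relative_typical_masses(20, P, Q, EPS)
_, rho_large, sigma_large = relative_typical_masses(2000, P, Q, EPS)
lower = (1.0 - EPS) * math.exp(-2000 * (kl + EPS))
upper = math.exp(-2000 * (kl - EPS))
```

For short sequences the typical set carries only part of the $\rho_{n}$ mass, while for $n=2000$ it carries almost all of it, and `sigma_large` lies between `lower` and `upper` as in (c).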

The relative AEP ensures that the min and max divergence rates converge to the KL divergence rate:

Proposition 21.

If $\widehat{P}$ and $\widehat{\Sigma}$ satisfy the relative AEP, we have

[TABLE]

Proof.

Although this proposition follows easily from Eqs. 29a and 29b, we here note an alternative proof based on Definition 2 with a slightly different intuition. We consider a subnormalized probability distribution $\tau_{n}(X_{n})$ defined by $\tau_{n}(X_{n}):=\rho_{n}(X_{n})$ for $X_{n}\in Q_{n}$ and $\tau_{n}(X_{n}):=0$ for $X_{n}\notin Q_{n}$ . From $\operatorname{tr}[\tau_{n}]>1-\varepsilon$ and with Proposition 10, we see that $\tau_{n}$ is a candidate for the maximization in ${S}_{0}^{{2\sqrt{\varepsilon}}}(\rho_{n}\,\|\,\sigma_{n})$ . Therefore,

[TABLE]

where we used that $Q_{n}$ cannot be smaller than the support of $\tau_{n}$ to obtain the right inequality. From the right inequality of (c) in Proposition 20, we have

[TABLE]

By taking the limit $n\to\infty$ , we obtain

[TABLE]

Similarly, we have

[TABLE]

From the right-hand inequality of Proposition 20 (a), we have

[TABLE]

By taking the limit, we obtain

[TABLE]

By combining Eqs. 258 and 261, we obtain (255). ∎

In the following, we assume that $\widehat{P}:=\{\rho_{n}\}$ is translation-invariant (i.e., stationary) and ergodic. In this case the non-relative AEP (i.e., the classical counterpart of Proposition 8) is satisfied, as a consequence of the Shannon-McMillan theorem.

As in the quantum case, we define the reduced state $\widehat{\Sigma}:=\{\sigma_{n}\}$ of the global Gibbs state $\sigma$ of a local and translation-invariant Hamiltonian in one dimension, where $\sigma_{n}(X_{n}):=\sigma(X_{n})$ (i.e., $\sigma_{n}$ is a marginal distribution of $\sigma$ ). We can also define the truncated Gibbs state $\widehat{\Sigma}^{\Box}:=\{\sigma^{\Box}_{n}\}$ . The global Gibbs state $\sigma$ is obtained as the limit of the truncated Gibbs states Ruelle (1968):

[TABLE]

where convergence is given by the weak- $\ast$ topology (or the vague topology) of the dual of the Banach space $C_{0}(K)$ .

We remark that the case of the reduced Gibbs state $\widehat{\Sigma}$ can also be obtained from the well-known fact that the relative AEP is satisfied for a translation-invariant ergodic process with respect to a translation-invariant Markov process. (The relative AEP has also been proved in a stronger sense, i.e., almost sure convergence. See Ref. Algoet and Cover (1988) and references therein. For our purpose here, however, convergence in probability is enough.) In fact, we have the following lemma.

Lemma 19.

The global Gibbs state $\sigma$ of a local and translation-invariant Hamiltonian in one dimension is translation-invariant Markovian.

Proof.

From the Hammersley-Clifford theorem Hammersley and Clifford (1971) (see also Ref. Kato and Brandão (2019)), it is known that the Gibbs state of a local Hamiltonian on an arbitrary finite graph is Markovian. On the other hand, here we directly prove this lemma by explicitly calculating the global Gibbs distribution $\sigma$ , without using the Hammersley-Clifford theorem.

For simplicity, we assume that the local interaction is given in the form of $h_{i}=h(x_{i},x_{i+1})$ and satisfies $h(x,y)=h(y,x)$ . We introduce the transfer matrix $T$ , whose $(x_{i},x_{i+1})$ -element is given by

[TABLE]

Here, we used the bra-ket notation to represent the classical probability vectors. We denote the spectral decomposition of $T$ as

[TABLE]

We also assume that $T$ has a non-degenerate maximum eigenvalue $e^{\lambda_{\ast}}$ .

For the truncated Hamiltonian $H_{[-\ell,\ell]}:=\sum_{i=-\ell}^{\ell}h(x_{i},x_{i+1})$ , the truncated Gibbs distribution is given by

[TABLE]

where $\lvert{1}\rangle:=\sum_{x_{i}}\lvert{x_{i}}\rangle$ is the column vector whose entries are all unity. Its marginal distribution for an interval $[-\ell^{\prime},n]$ with $\ell^{\prime}<\ell$ , $n<\ell$ is given by

[TABLE]

The conditional probability is then given by

[TABLE]

which depends only on $x_{n-1}$ and $x_{n}$ — as expected from the Hammersley-Clifford theorem — with also an explicit dependency on $n$ . From (264),

[TABLE]

By taking the limit of $\ell$ while fixing $\ell^{\prime}$ and $n$ , we obtain

[TABLE]

where the right-hand side depends only on $x_{n}$ and $x_{n+1}$ and no longer explicitly depends on $n$ . Therefore, the global Gibbs distribution $\sigma$ satisfies

[TABLE]

We note that it is straightforward to remove the assumption that $T$ has a non-degenerate maximum eigenvalue. In fact, we can just replace the right-hand side of (269) by multiple eigenvectors with the maximum eigenvalue of $T$ .

In general, a stochastic process $\sigma$ is defined as Markovian, if for any $n$

[TABLE]

holds almost surely (see, for example, Chapter 2 of Ref. Doob (1990)). Also, by Lévy’s martingale convergence theorem, $\lim_{\ell^{\prime}\to\infty}\sigma(\mathopen{}x_{n}\mathclose{}\,|\,\mathopen{}x_{n-1},\cdots,x_{-\ell^{\prime}}\mathclose{})=\sigma(\mathopen{}x_{n}\mathclose{}\,|\,\mathopen{}x_{n-1},\cdots\mathclose{})$ holds almost surely. The claim then follows from Eq. 270. ∎
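The transfer-matrix argument can be reproduced numerically. The sketch below rests on assumptions that should be read as such: the transfer matrix is taken to be $\langle x|T|y\rangle=e^{-\beta h(x,y)}$ , the interaction and its parameters are arbitrary illustrative choices, and the limiting transition probabilities are taken in the standard form $T(x,y)\,v(y)/(e^{\lambda_{\ast}}v(x))$ with $v$ the eigenvector of the maximum eigenvalue. The code compares the conditional probabilities of the truncated chain, which depend on the distance `m` to the right boundary, with this limit.

```python
import numpy as np

BETA = 1.0
ALPHABET = (-1, 1)

def interaction(x, y):
    """Illustrative symmetric nearest-neighbour interaction h(x, y)."""
    return -0.5 * x * y - 0.15 * (x + y)

# Assumed transfer matrix: T[x, y] = exp(-beta * h(x, y)).
T = np.array([[np.exp(-BETA * interaction(x, y)) for y in ALPHABET]
              for x in ALPHABET])

def truncated_conditional(m):
    """Conditional probability of x_n given x_{n-1} in the truncated chain
    when m bonds lie to the right of site n (rows: x_{n-1}, columns: x_n)."""
    right = np.linalg.matrix_power(T, m) @ np.ones(len(ALPHABET))
    right_prev = T @ right
    return T * right[None, :] / right_prev[:, None]

def limiting_conditional():
    """Transition matrix built from the top eigenpair of T."""
    eigvals, eigvecs = np.linalg.eigh(T)       # eigenvalues in ascending order
    top_value = eigvals[-1]
    top_vector = np.abs(eigvecs[:, -1])        # fix the overall sign
    return T * top_vector[None, :] / (top_value * top_vector[:, None])

limit = limiting_conditional()
gaps = [np.abs(truncated_conditional(m) - limit).max() for m in (1, 5, 30)]
```

Each row of `limit` sums to one, so it is a stochastic matrix that no longer depends on the position $n$ , and `gaps` shrinks geometrically with the distance to the boundary, at a rate set by the ratio of the two eigenvalues of `T`.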

Bibliography

1. Callen (1985) H. B. Callen, Thermodynamics and an Introduction to Thermostatistics (Wiley, 1985).
2. Lieb and Yngvason (1999) E. H. Lieb and J. Yngvason, The physics and mathematics of the second law of thermodynamics, Physics Reports 310, 1 (1999), arXiv:cond-mat/9708200.
3. Sagawa (2012) T. Sagawa, Second law-like inequalities with quantum relative entropy: An introduction, in Lectures on Quantum Computing, Thermodynamics and Statistical Physics, edited by M. Nakahara and S. Tanaka (World Scientific, 2012) pp. 125–190, arXiv:1202.0983.
4. Parrondo et al. (2015) J. M. R. Parrondo, J. M. Horowitz, and T. Sagawa, Thermodynamics of information, Nature Physics 11, 131 (2015), arXiv:0903.2792.
5. Goold et al. (2016) J. Goold, M. Huber, A. Riera, L. d. Rio, and P. Skrzypczyk, The role of quantum information in thermodynamics—a topical review, Journal of Physics A: Mathematical and Theoretical 49, 143001 (2016), arXiv:1505.07835.
6. Chitambar and Gour (2019) E. Chitambar and G. Gour, Quantum resource theories, Reviews of Modern Physics 91, 025001 (2019), arXiv:1806.06107.
7. Sagawa (2021) T. Sagawa, Entropy, Divergence, and Majorization in Classical and Quantum Thermodynamics (Springer Briefs in Mathematical Physics, 2021) in press, arXiv:2007.09974.
8. Brandão et al. (2013) F. G. S. L. Brandão, M. Horodecki, J. Oppenheim, J. M. Renes, and R. W. Spekkens, Resource theory of quantum states out of thermal equilibrium, Physical Review Letters 111, 250404 (2013), arXiv:1111.3882.
