A minimax approach to one-shot entropy inequalities

Anurag Anshu; Mario Berta; Rahul Jain; Marco Tomamichel

arXiv:1906.00333·quant-ph·August 24, 2020

TL;DR

This paper introduces a minimax approach to derive tighter inequalities among one-shot entropic quantities in quantum information theory, simplifying complex quantum problems to commutative cases and potentially advancing quantum Shannon theory.

Contribution

It presents a novel minimax method that simplifies quantum entropy inequalities to commutative cases, leading to tighter bounds and new insights.

Findings

01 Derived tighter entropy inequalities using the minimax approach

02 Simplified quantum problems to commutative cases for easier analysis

03 Applied method to a joint smoothing problem in quantum Shannon theory

Abstract

One-shot information theory entertains a plethora of entropic quantities, such as the smooth max-divergence, hypothesis testing divergence and information spectrum divergence, that characterize various operational tasks and are used to prove the asymptotic behavior of various tasks in quantum information theory. Tight inequalities between these quantities are thus of immediate interest. In this note we use a minimax approach (appearing previously for example in the proofs of the quantum substate theorem), to simplify the quantum problem to a commutative one, which allows us to derive such inequalities. Our derivations are conceptually different from previous arguments and in some cases lead to tighter relations. We hope that the approach discussed here can lead to progress in open problems in quantum Shannon theory, and exemplify this by applying it to a simple case of the joint…

Equations

$$\mathcal{B}_{T}^{\varepsilon}(\rho):=\{\tilde{\rho}\in\mathcal{S}:T(\rho,\tilde{\rho})\leq\varepsilon\}\,.$$

$$\mathcal{B}_{P}^{\varepsilon}(\rho):=\{\tilde{\rho}\in\mathcal{S}_{\bullet}:P(\rho,\tilde{\rho})\leq\varepsilon\}\,,$$

$$D_{\max}(\rho\|\sigma):=\inf\{\lambda\in\mathbb{R}:\rho\leq 2^{\lambda}\sigma\}\,.$$

$$D_{\max}^{\varepsilon}(\rho\|\sigma):=\inf_{\tilde{\rho}\in\mathcal{B}^{\varepsilon}(\rho)}D_{\max}(\tilde{\rho}\|\sigma)\,.$$

$$D_{\alpha}(\rho\|\sigma):=\frac{1}{\alpha-1}\log\frac{\operatorname{tr}\big(\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\big)^{\alpha}}{\operatorname{tr}\rho}\,.$$

$$D(\rho\|\sigma):=\frac{1}{\operatorname{tr}\rho}\operatorname{tr}\rho(\log\rho-\log\sigma)\,.$$

$$D_{h}^{\varepsilon}(\rho\|\sigma):=-\log\inf_{\substack{0\leq\Lambda\leq 1\\ \operatorname{tr}\Lambda\rho\geq 1-\varepsilon}}\operatorname{tr}\Lambda\sigma\,.$$

$$D_{s}^{\varepsilon}(\rho\|\sigma):=\sup\{\lambda\in\mathbb{R}:\operatorname{tr}\rho\{2^{\lambda}\sigma-\rho\}_{+}\leq\varepsilon\}\,.$$

$$D_{s}^{\varepsilon}(P\|Q):=\sup\left\{\lambda\in\mathbb{R}:\Pr_{P}\bigg[\log\frac{P}{Q}\leq\lambda\bigg]\leq\varepsilon\right\}\,.$$

$$D_{s}^{\varepsilon}(\rho\|\sigma)\leq D_{h}^{\varepsilon}(\rho\|\sigma)\leq D_{s}^{\varepsilon+\delta}(\rho\|\sigma)+\log\frac{1}{\delta}\,.\tag{9}$$

$$P(\rho,\tilde{\rho})=\sqrt{\operatorname{tr}P\rho}\quad\text{for}\quad\tilde{\rho}=\frac{(1-P)\rho(1-P)}{1-\operatorname{tr}P\rho}\,.$$

$$D_{h}^{\varepsilon}(\rho\|\sigma)\geq D_{h}^{\varepsilon}(\mathcal{E}(\rho)\|\mathcal{E}(\sigma)),\quad D_{\alpha}(\rho\|\sigma)\geq D_{\alpha}(\mathcal{E}(\rho)\|\mathcal{E}(\sigma)),\quad D_{\max}^{\varepsilon}(\rho\|\sigma)\geq D_{\max}^{\varepsilon}(\mathcal{E}(\rho)\|\mathcal{E}(\sigma))\,.$$

$$D_{\max}^{\varepsilon}(\rho\|\sigma)=\sup_{\substack{M\geq 0\\ \mathrm{Tr}[M\sigma]\leq 1}}\ \inf_{\tilde{\rho}\in\mathcal{B}^{\varepsilon}(\rho)}\log\mathrm{Tr}[M\tilde{\rho}]\,.$$

$$D_{\max}^{\varepsilon,P}(\rho\|\sigma)\leq D_{\alpha}(\rho\|\sigma)+\frac{1}{\alpha-1}\log\frac{1}{\varepsilon^{2}}+\log\frac{1}{1-\varepsilon^{2}}\,.$$

$$\sup_{\substack{M\geq 0\\ \mathrm{Tr}[M\sigma]\leq 1}}\ \inf_{\tilde{\rho}\in\mathcal{B}(\rho)}\operatorname{tr}[M\tilde{\rho}]\leq 2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}h(\varepsilon)\,,$$

$$p_{i}:=\langle v_{i}|\rho|v_{i}\rangle,\quad q_{i}:=\langle v_{i}|\sigma|v_{i}\rangle,\quad\text{and}\quad I:=\Big\{i:\frac{p_{i}}{q_{i}}>2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\Big\}$$

$$2^{(\alpha-1)\cdot D_{\alpha}(\rho\|\sigma)}\geq\sum_{i}p_{i}^{\alpha}q_{i}^{1-\alpha}\geq\sum_{i\in I}p_{i}\Big(\frac{p_{i}}{q_{i}}\Big)^{\alpha-1}\geq\sum_{i\in I}p_{i}\Big(2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\Big)^{\alpha-1}\,,$$

$$\operatorname{tr}\Pi\rho=\sum_{i\in I}p_{i}\leq g(\varepsilon)^{-1}=\varepsilon^{2}\,.$$

$$\tilde{\rho}:=\frac{(1-\Pi)\rho(1-\Pi)}{1-\operatorname{tr}\Pi\rho}\,,$$

$$(1-\operatorname{tr}\Pi\rho)\operatorname{tr}M\tilde{\rho}=\operatorname{tr}M(1-\Pi)\rho(1-\Pi)=\sum_{i\notin I}m_{i}p_{i}\leq 2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\sum_{i\notin I}m_{i}q_{i}\leq 2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\,,$$

$$D_{h}^{1-\varepsilon}(\rho\|\sigma)\geq D_{\max}^{\sqrt{\varepsilon},P}(\rho\|\sigma)-\log\frac{1}{1-\varepsilon}\geq D_{h}^{1-\varepsilon-\delta}(\rho\|\sigma)-\log\frac{4}{\delta^{2}}\,.$$

$$\operatorname{tr}M\tilde{\rho}\leq\frac{1}{\varepsilon'}2^{D_{h}^{\varepsilon'}(\rho\|\sigma)}\,,$$

$$D_{h}^{\varepsilon'}(\rho\|\sigma)\geq D_{h}^{\varepsilon'}(P\|Q)\geq D_{s}^{\varepsilon'}(P\|Q)=:K\,.$$

$$\operatorname{tr}\Pi\rho=\operatorname{tr}\mathcal{M}(\Pi)\rho=\operatorname{tr}\Pi\mathcal{M}(\rho)=1-P(I)\leq\varepsilon\,.\tag{23}$$

$$\operatorname{tr}M\tilde{\rho}\leq\frac{1}{\varepsilon'}\sum_{i\in I}m_{i}P(i)\leq\frac{2^{K+\eta}}{\varepsilon'}\cdot\sum_{i\in I}m_{i}Q(i)\leq\frac{2^{K+\eta}}{\varepsilon'}\mathrm{Tr}[M\sigma]\leq\frac{2^{D_{h}^{\varepsilon'}(\rho\|\sigma)+\eta}}{\varepsilon'}\,.$$

$$\tilde{\rho}\leq 2^{\lambda}\sigma\quad\text{with}\quad\lambda=D_{\max}^{\sqrt{\varepsilon},P}(\rho\|\sigma)\,,$$

$$\begin{aligned}\sqrt{1-\varepsilon}&=\sqrt{\bar{F}(\rho,\tilde{\rho})}\\&\leq\sqrt{\operatorname{tr}Q\tilde{\rho}}+\sqrt{\operatorname{tr}(1-Q)\rho}\\&\leq\sqrt{2^{\lambda}\operatorname{tr}Q\sigma}+\sqrt{1-\varepsilon-\delta}\,.\end{aligned}$$

$$\log\big(\sqrt{1-\varepsilon}-\sqrt{1-\varepsilon-\delta}\big)^{2}\leq D_{\max}^{\sqrt{\varepsilon},P}(\rho\|\sigma)-D_{h}^{1-\varepsilon-\delta}(\rho\|\sigma)\,.$$

$$D_{\max}(\tilde{\rho}_{A}\|\sigma_{A})\leq D_{h}^{1-\varepsilon}(\rho_{A}\|\sigma_{A})+\Delta\quad\text{and}\quad D_{\max}(\tilde{\rho}_{B}\|\sigma_{B})\leq D_{h}^{1-\varepsilon'}(\rho_{B}\|\sigma_{B})+\Delta\tag{30}$$

Full text

A minimax approach to one-shot entropy inequalities

Anurag Anshu

Institute for Quantum Computing, University of Waterloo, Waterloo, Canada

Perimeter Institute for Theoretical Physics, Waterloo, Canada

Mario Berta

Department of Computing, Imperial College London, England

Rahul Jain

Center for Quantum Technologies, National University of Singapore and MajuLab, UMI 3654, Singapore

Marco Tomamichel

Centre for Quantum Software and Information, University of Technology Sydney, Sydney

Center for Quantum Technologies, National University of Singapore, Singapore

Abstract

One-shot information theory entertains a plethora of entropic quantities, such as the smooth max-divergence, hypothesis testing divergence and information spectrum divergence, that characterize various operational tasks and are used to prove the asymptotic behavior of various tasks in quantum information theory. Tight inequalities between these quantities are thus of immediate interest. In this note we use a minimax approach (appearing previously for example in the proofs of the quantum substate theorem), to simplify the quantum problem to a commutative one, which allows us to derive such inequalities. Our derivations are conceptually different from previous arguments and in some cases lead to tighter relations. We hope that the approach discussed here can lead to progress in open problems in quantum Shannon theory, and exemplify this by applying it to a simple case of the joint smoothing problem.

I Introduction

Recent years have seen remarkable progress in the area of one-shot quantum Shannon theory, which generalizes the standard asymptotic and i.i.d. (independent and identically distributed) quantum Shannon theory and also eases the notational complications in the latter. Achievability results in the one-shot setting clarify a lot about the structure of the protocol, as various entropic quantities that are equivalent in the asymptotic and i.i.d. setting are vastly different in the one-shot setting. This setting also forces the development of novel encoding and decoding schemes that would have been trivial if the time-sharing method were used in the asymptotic and i.i.d. setting.

A (minor) downside of one-shot information theory is that there can be various quantities that seem to generalize the entropic quantities such as the relative entropy. Below, we introduce various such quantities that will be considered in this work. We focus here on relative entropies, but relations for other entropic quantities like entropy, conditional entropy and mutual information can often be derived readily using the fact that they can be expressed in terms of relative entropies.

I.1 Notation and definitions

We will fix a finite-dimensional Hilbert space throughout most of this manuscript and denote with $\mathcal{P}$ and $\mathcal{S}$ the set of positive semi-definite operators and the subset of trace-normalized quantum states, respectively. Sometimes we will refer to the set of sub-normalized states, denoted $\mathcal{S}_{\bullet}$ , which contains all positive semi-definite operators $\rho\geq 0$ (using the Löwner partial order) with $0<\operatorname{tr}(\rho)\leq 1$ . When joint quantum systems are considered, we use the notation $\mathcal{S}(AB)$ etc. to denote joint quantum states on the Hilbert spaces $A$ and $B$ .

Some of the entropic quantities will require the concept of a neighbourhood, namely a function $\mathcal{B}$ that maps $\rho\in\mathcal{S}$ to an $\varepsilon$ -neighbourhood $\mathcal{B}^{\varepsilon}(\rho)\subset\mathcal{S}$ of $\rho$ . We can also define neighbourhoods of sub-normalized states in the same way. We will always require that, for any $\rho\in\mathcal{S}_{\bullet}$ , the set $\mathcal{B}^{\varepsilon}(\rho)$ is convex and at least contains $\rho$ . Such $\varepsilon$ -neighbourhoods can easily be constructed from any metric on states, and the two most prominent examples are defined below for any $\varepsilon\in[0,1)$ . The first is the neighbourhood of states that are close in trace distance, $T(\rho,\sigma):=\frac{1}{2}\|\rho-\sigma\|_{1}$ , given as

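$$\mathcal{B}_{T}^{\varepsilon}(\rho):=\{\tilde{\rho}\in\mathcal{S}:T(\rho,\tilde{\rho})\leq\varepsilon\}\,.$$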

The second is the neighbourhood of sub-normalized states that are close in purified distance tomamichel09 ,

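$$\mathcal{B}_{P}^{\varepsilon}(\rho):=\{\tilde{\rho}\in\mathcal{S}_{\bullet}:P(\rho,\tilde{\rho})\leq\varepsilon\}\,,$$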

where $P(\rho,\sigma)=\sqrt{1-\bar{F}(\rho,\sigma)}$ and $\bar{F}(\rho,\sigma)=\big{(}\|\sqrt{\rho}\sqrt{\sigma}\|_{1}+\sqrt{(1-\operatorname{tr}\rho)(1-\operatorname{tr}\sigma)}\big{)}^{2}$ is a generalization of the fidelity to sub-normalized states.

We are now ready to define our entropic quantities of interest. The max-divergence is defined for any $\rho\in\mathcal{S}_{\bullet}$ and $\sigma\in\mathcal{P}$ as

$$D_{\max}(\rho\|\sigma):=\inf\{\lambda\in\mathbb{R}:\rho\leq 2^{\lambda}\sigma\}\,.$$

Note that by definition of the infimum this quantity takes on the value $+\infty$ in case there does not exist a $\lambda$ satisfying the constraint $\rho\leq 2^{\lambda}\sigma$ , which happens if and only if the support of $\rho$ is not contained in the support of $\sigma$ . Otherwise, the minimum is achieved and takes the value $\lambda^{*}=\log\|\sigma^{-\frac{1}{2}}\rho\sigma^{-\frac{1}{2}}\|_{\infty}$ , where we used the Moore-Penrose inverse. Using any neighbourhood ball $\mathcal{B}^{\varepsilon}$ , we define an $\varepsilon$ -smooth max-divergence as renner05 ; datta08

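$$D_{\max}^{\varepsilon}(\rho\|\sigma):=\inf_{\tilde{\rho}\in\mathcal{B}^{\varepsilon}(\rho)}D_{\max}(\tilde{\rho}\|\sigma)\,.$$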

We will use the notation $D_{\max}^{\varepsilon,P}$ and $D_{\max}^{\varepsilon,T}$ to specify the balls $\mathcal{B}_{P}^{\varepsilon}$ and $\mathcal{B}_{T}^{\varepsilon}$ , respectively.
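The closed form $\lambda^{*}=\log\|\sigma^{-\frac{1}{2}}\rho\sigma^{-\frac{1}{2}}\|_{\infty}$ for the unsmoothed max-divergence can be evaluated numerically. The following is a minimal sketch, assuming base-2 logarithms and NumPy; the function name `d_max` and the tolerances are illustrative choices, not part of the original text.

```python
import numpy as np

def d_max(rho, sigma, tol=1e-12):
    """log2 of the largest eigenvalue of sigma^{-1/2} rho sigma^{-1/2}.

    Returns +inf when the support of rho is not contained in the support of sigma.
    """
    w, v = np.linalg.eigh(sigma)
    keep = w > tol
    # Projector onto the support of sigma.
    proj = v[:, keep] @ v[:, keep].conj().T
    # A positive rho lies inside supp(sigma) exactly when proj rho proj = rho.
    if np.linalg.norm(rho - proj @ rho @ proj) > 1e-9:
        return np.inf
    # Moore-Penrose inverse square root of sigma (columns scaled by w^{-1/2}).
    inv_sqrt = (v[:, keep] / np.sqrt(w[keep])) @ v[:, keep].conj().T
    x = inv_sqrt @ rho @ inv_sqrt
    x = (x + x.conj().T) / 2  # symmetrize against rounding errors
    return float(np.log2(np.linalg.eigvalsh(x).max()))

rho = np.diag([0.5, 0.5])
sigma = np.diag([0.25, 0.75])
print(d_max(rho, sigma))  # commuting case: log2(0.5 / 0.25) = 1.0
```

The smoothed quantity additionally requires an optimization over the ball $\mathcal{B}^{\varepsilon}(\rho)$, which this sketch does not attempt.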

The max-divergence is a limiting case of a Rényi divergence lennert13 ; wilde13 , namely the family

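$$D_{\alpha}(\rho\|\sigma):=\frac{1}{\alpha-1}\log\frac{\operatorname{tr}\big(\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\big)^{\alpha}}{\operatorname{tr}\rho}\,.$$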

for $\alpha\in[\frac{1}{2},1)\cup(1,\infty)$ defined for any $\rho\in\mathcal{S}_{\bullet}$ and $\sigma\in\mathcal{P}$ . The max-divergence is recovered in the limit $\alpha\to\infty$ and the name is justified since the family is monotonically increasing as a function of $\alpha$ . In the limit $\alpha\to 1$ , we recover the relative entropy:

$$D(\rho\|\sigma):=\frac{1}{\operatorname{tr}\rho}\operatorname{tr}\rho(\log\rho-\log\sigma)\,.$$
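The monotonicity of the Rényi family in $\alpha$ and its relation to the max-divergence can be illustrated numerically. The sketch below assumes full-rank $\sigma$, base-2 logarithms and NumPy; the helper names `mpow` and `renyi` and the example states are hypothetical.

```python
import numpy as np

def mpow(a, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh(a)
    keep = w > tol
    return (v[:, keep] * w[keep] ** p) @ v[:, keep].conj().T

def renyi(rho, sigma, alpha):
    """Sandwiched Renyi divergence D_alpha(rho || sigma) in bits, alpha != 1."""
    s = mpow(sigma, (1 - alpha) / (2 * alpha))
    inner = s @ rho @ s
    inner = (inner + inner.conj().T) / 2  # symmetrize against rounding errors
    w = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.log2(np.sum(w ** alpha) / np.trace(rho).real) / (alpha - 1))

rho = np.array([[0.6, 0.2], [0.2, 0.4]])
sigma = np.array([[0.5, -0.1], [-0.1, 0.5]])

alphas = (1.5, 2.0, 5.0, 50.0)
values = [renyi(rho, sigma, a) for a in alphas]

# Max-divergence via the closed form log2 ||sigma^{-1/2} rho sigma^{-1/2}||_inf.
inv_sqrt = mpow(sigma, -0.5)
d_max = float(np.log2(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt).max()))

for a, val in zip(alphas, values):
    print(f"alpha={a:5.1f}  D_alpha={val:.4f}")
print(f"D_max = {d_max:.4f}")
```

The printed values should increase with $\alpha$ and stay below the max-divergence, in line with the statement above.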

Asymmetric quantum hypothesis testing plays a crucial role in one-shot quantum information theory. The fundamental relationship between errors of the first and second kind can be cast as an entropic quantity. Bounding the error of the first kind with $\varepsilon\in[0,1)$ and minimizing the error of the second kind, the $\varepsilon$ -hypothesis testing divergence is defined as

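$$D_{h}^{\varepsilon}(\rho\|\sigma):=-\log\inf_{\substack{0\leq\Lambda\leq 1\\ \operatorname{tr}\Lambda\rho\geq 1-\varepsilon}}\operatorname{tr}\Lambda\sigma\,.$$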

For any Hermitian operator $X$ , let $\{X\}_{+}$ be the projector onto the subspace spanned by all the eigenvectors with positive eigenvalue. We define the $\varepsilon$ -information spectrum divergence as

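$$D_{s}^{\varepsilon}(\rho\|\sigma):=\sup\{\lambda\in\mathbb{R}:\operatorname{tr}\rho\{2^{\lambda}\sigma-\rho\}_{+}\leq\varepsilon\}\,.$$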

This quantity gives a potential quantum generalization of the notion of $\varepsilon$ -tail bounds of the log-likelihood ratio function. To see this, note that for $P,Q$ two probability distributions, the above expression simplifies to

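$$D_{s}^{\varepsilon}(P\|Q):=\sup\left\{\lambda\in\mathbb{R}:\Pr_{P}\bigg[\log\frac{P}{Q}\leq\lambda\bigg]\leq\varepsilon\right\}\,.$$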

Its usefulness, apart from this simple interpretation, is mainly due to its close relation to hypothesis testing, shown in the following relation from (tomamichel12, , Lemma 12): For any $\rho\in\mathcal{S}$ , $\sigma\in\mathcal{P}$ and $\varepsilon,\delta\in(0,1)$ with $\varepsilon+\delta<1$ , it holds that

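$$D_{s}^{\varepsilon}(\rho\|\sigma)\leq D_{h}^{\varepsilon}(\rho\|\sigma)\leq D_{s}^{\varepsilon+\delta}(\rho\|\sigma)+\log\frac{1}{\delta}\,.\tag{9}$$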
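For commuting states, both quantities in this relation reduce to simple computations on probability vectors, so the two inequalities can be checked on a small example. The sketch below assumes distributions with full support, $\varepsilon<1$ and base-2 logarithms; the function names `d_s` and `d_h` and the example are illustrative, and the hypothesis test is found with a greedy (Neyman-Pearson style) fill rather than a general solver.

```python
import numpy as np

def d_s(p, q, eps):
    """Classical information spectrum divergence sup{lam : Pr_P[log2(P/Q) <= lam] <= eps}."""
    llr = np.log2(p / q)
    order = np.argsort(llr)
    cum = np.cumsum(p[order])
    # The supremum is the atom at which the CDF of the log-likelihood ratio first exceeds eps.
    k = np.searchsorted(cum, eps, side="right")
    return float(llr[order][k])

def d_h(p, q, eps):
    """Classical hypothesis testing divergence: -log2 min sum(L*q) s.t. sum(L*p) >= 1-eps, 0<=L<=1."""
    order = np.argsort(-(p / q))  # accept outcomes with the largest likelihood ratio first
    need, cost = 1.0 - eps, 0.0
    for i in order:
        take = min(1.0, need / p[i])  # fractional acceptance on the last outcome
        cost += take * q[i]
        need -= take * p[i]
        if need <= 1e-15:
            break
    return float(-np.log2(cost))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
eps, delta = 0.1, 0.2

lower = d_s(p, q, eps)
middle = d_h(p, q, eps)
upper = d_s(p, q, eps + delta) + np.log2(1 / delta)
print(lower, middle, upper)
```

In this example the optimal test accepts the first two outcomes fully and the third with weight one half, so the second-kind error is 0.75.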

I.2 Some useful properties of above quantities

The purified distance satisfies the following ‘gentle measurement’ property, which was first established in (tomamichel17b, , Lemma 7). Since the relation between the lemma below and the result in tomamichel17b is not immediately obvious, we provide a proof in Appendix A for the convenience of the reader.

Lemma 1.

For any projector $P$ and $\rho\in\mathcal{S}_{\bullet}$ , we have

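$$P(\rho,\tilde{\rho})=\sqrt{\operatorname{tr}P\rho}\quad\text{for}\quad\tilde{\rho}=\frac{(1-P)\rho(1-P)}{1-\operatorname{tr}P\rho}\,.$$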

It is worth noting that the state $\tilde{\rho}$ is only normalized if $\rho\in\mathcal{S}$ and sub-normalized otherwise. The special case of normalized $\rho$ is in fact well-known, and in that case we also have $T(\rho,\tilde{\rho})\leq P(\rho,\tilde{\rho})=\sqrt{\operatorname{tr}P\rho}$ by the Fuchs-van de Graaf inequality.
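Lemma 1 lends itself to a quick numerical sanity check. The sketch below, assuming NumPy, evaluates the purified distance from the generalized fidelity $\bar{F}$ defined earlier and compares it with $\sqrt{\operatorname{tr}P\rho}$ for a normalized and a sub-normalized state; the helper names, the random seed and the choice of projector are hypothetical.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def purified_distance(rho, tau):
    """P = sqrt(1 - F_bar), with F_bar the generalized fidelity for sub-normalized states."""
    # The trace norm of sqrt(rho) sqrt(tau) is the sum of its singular values.
    overlap = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(tau), compute_uv=False).sum()
    slack = np.sqrt(max(0.0, (1 - np.trace(rho).real) * (1 - np.trace(tau).real)))
    return float(np.sqrt(max(0.0, 1.0 - (overlap + slack) ** 2)))

def project_out(rho, proj):
    """The state (1-P) rho (1-P) / (1 - tr P rho) appearing in Lemma 1."""
    comp = np.eye(len(rho)) - proj
    return comp @ rho @ comp / (1.0 - np.trace(proj @ rho).real)

rng = np.random.default_rng(0)
a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = a @ a.conj().T
rho /= np.trace(rho).real          # normalized, generically not commuting with proj
proj = np.diag([1.0, 0.0, 0.0])

for scale in (1.0, 0.8):           # 0.8 gives a sub-normalized state
    r = scale * rho
    lhs = purified_distance(r, project_out(r, proj))
    rhs = float(np.sqrt(np.trace(proj @ r).real))
    print(scale, lhs, rhs)
```

Both printed pairs should agree up to numerical precision.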

Many of these entropic quantities satisfy the data-processing inequality datta08 ; beigi13 ; frank13 . That is, for any quantum channel (a completely positive and trace-preserving map) $\mathcal{E}$ , it holds that

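$$D_{h}^{\varepsilon}(\rho\|\sigma)\geq D_{h}^{\varepsilon}(\mathcal{E}(\rho)\|\mathcal{E}(\sigma)),\quad D_{\alpha}(\rho\|\sigma)\geq D_{\alpha}(\mathcal{E}(\rho)\|\mathcal{E}(\sigma)),\quad D_{\max}^{\varepsilon}(\rho\|\sigma)\geq D_{\max}^{\varepsilon}(\mathcal{E}(\rho)\|\mathcal{E}(\sigma))\,.$$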

Data processing for the information spectrum divergence is not as simple, but an approximate data-processing inequality can be deduced from (9). Thus, information spectrum divergence is known to satisfy data processing only up to an additive logarithmic term.

II Relating various information theoretic measures

Our central idea is inspired by the works JainRS02 ; Jain:2009 ; JainN12 on the quantum substate theorem, which show that we can use a minimax approach to find the optimal smoothing of the max-divergence. More precisely, we use the following straight-forward generalization of a key result from JainN12 , a proof of which is given in Appendix B for the convenience of the reader.

Lemma 2.

Let $\rho\in\mathcal{S}_{\bullet}$ , $\sigma\in\mathcal{P}$ . For any convex $\varepsilon$ -neighbourhood $\mathcal{B}^{\varepsilon}(\rho)$ , we have

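$$D_{\max}^{\varepsilon}(\rho\|\sigma)=\sup_{\substack{M\geq 0\\ \mathrm{Tr}[M\sigma]\leq 1}}\ \inf_{\tilde{\rho}\in\mathcal{B}^{\varepsilon}(\rho)}\log\mathrm{Tr}[M\tilde{\rho}]\,.$$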

II.1 Smooth max-divergence and Rényi relative entropies

Our first application is a relation between $\varepsilon$ -smooth max-divergence and Rényi divergence, which improves on (mythesis, , Proposition 6.5) for the purified distance smoothing (which was shown using a different method) and is new for normalized trace distance smoothing. Our proof closely follows the proof of the quantum substate theorem in JainN12 .

Theorem 3.

Let $\rho\in\mathcal{S}_{\bullet}$ , $\sigma\in\mathcal{P}$ . For any $\varepsilon\in(0,1)$ and $\alpha>1$ , we have

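$$D_{\max}^{\varepsilon,P}(\rho\|\sigma)\leq D_{\alpha}(\rho\|\sigma)+\frac{1}{\alpha-1}\log\frac{1}{\varepsilon^{2}}+\log\frac{1}{1-\varepsilon^{2}}\,.$$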

The same inequality also holds with $D_{\max}^{\varepsilon,P}$ replaced by $D_{\max}^{\varepsilon,T}$ with $\rho\in\mathcal{S}$ .

Proof.

Invoking Lemma 2 the claim becomes equivalent to

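$$\sup_{\substack{M\geq 0\\ \mathrm{Tr}[M\sigma]\leq 1}}\ \inf_{\tilde{\rho}\in\mathcal{B}(\rho)}\operatorname{tr}[M\tilde{\rho}]\leq 2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}h(\varepsilon)\,,$$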

where we introduced $g(\varepsilon)=\frac{1}{\varepsilon^{2}}$ and $h(\varepsilon)=\frac{1}{1-\varepsilon^{2}}$ for convenience. That is, for every $M$ with $\operatorname{tr}(M\sigma)\leq 1$ it is sufficient to produce a corresponding $\tilde{\rho}\in\mathcal{B}(\rho)$ that fulfils the bound. For such an $M$ with spectral decomposition $M=\sum_{i}m_{i}|v_{i}\rangle\langle v_{i}|$ , and $\alpha>1$ , define

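$$p_{i}:=\langle v_{i}|\rho|v_{i}\rangle,\quad q_{i}:=\langle v_{i}|\sigma|v_{i}\rangle,\quad\text{and}\quad I:=\Big\{i:\frac{p_{i}}{q_{i}}>2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\Big\}$$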

and finally $\Pi:=\sum_{i\in I}|v_{i}\rangle\langle v_{i}|$ . We now invoke the data-processing inequality for the quantum Rényi divergences under the projective measurement $\{|v_{i}\rangle\!\langle v_{i}|\}_{i}$ , leading to

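$$2^{(\alpha-1)\cdot D_{\alpha}(\rho\|\sigma)}\geq\sum_{i}p_{i}^{\alpha}q_{i}^{1-\alpha}\geq\sum_{i\in I}p_{i}\Big(\frac{p_{i}}{q_{i}}\Big)^{\alpha-1}\geq\sum_{i\in I}p_{i}\Big(2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\Big)^{\alpha-1}\,,$$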

where the last inequality follows from the definition of $I$ . This implies that

$$\operatorname{tr}\Pi\rho=\sum_{i\in I}p_{i}\leq g(\varepsilon)^{-1}=\varepsilon^{2}\,.$$

We are now ready to define our smoothed state,

$$\tilde{\rho}:=\frac{(1-\Pi)\rho(1-\Pi)}{1-\operatorname{tr}\Pi\rho}\,,$$

which is normalized if and only if $\rho$ is normalized (and otherwise sub-normalized). By Lemma 1 we find that $P(\rho,\tilde{\rho})=\sqrt{\operatorname{tr}\Pi\rho}\leq\varepsilon$ , and thus this state lies in both $\mathcal{B}_{P}^{\varepsilon}$ and $\mathcal{B}_{T}^{\varepsilon}$ . Furthermore,

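$$(1-\operatorname{tr}\Pi\rho)\operatorname{tr}M\tilde{\rho}=\operatorname{tr}M(1-\Pi)\rho(1-\Pi)=\sum_{i\notin I}m_{i}p_{i}\leq 2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\sum_{i\notin I}m_{i}q_{i}\leq 2^{D_{\alpha}(\rho\|\sigma)}\cdot g(\varepsilon)^{\frac{1}{\alpha-1}}\,,$$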

where the penultimate inequality follows from the definition of $I$ and the last inequality follows from $\sum_{i}q_{i}\cdot m_{i}=\operatorname{tr}M\sigma\leq 1$ . Finally, we bound $\frac{1}{1-\operatorname{tr}\Pi\rho}\leq\frac{1}{1-\varepsilon^{2}}=h(\varepsilon)$ , concluding the proof. ∎
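In the fully commutative case, where $\rho$ and $\sigma$ are diagonal in the same basis, the truncation used in this proof can be carried out directly on probability vectors. The following is a minimal sketch under that assumption (normalized `p`, strictly positive `q`, base-2 logarithms); the function name `truncate` and the example numbers are hypothetical and only illustrate the construction.

```python
import numpy as np

def truncate(p, q, alpha, eps):
    """Remove the outcomes in I = {i : p_i/q_i > 2^{D_alpha} g(eps)^{1/(alpha-1)}} and renormalize."""
    d_alpha = np.log2(np.sum(p ** alpha * q ** (1 - alpha))) / (alpha - 1)
    g = 1.0 / eps ** 2
    threshold = 2.0 ** d_alpha * g ** (1.0 / (alpha - 1))
    bad = p / q > threshold            # the index set I
    removed = p[bad].sum()             # plays the role of tr(Pi rho)
    p_t = np.where(bad, 0.0, p) / (1.0 - removed)
    return d_alpha, removed, p_t

p = np.array([0.97, 0.02, 0.01])
q = np.array([0.98, 0.019, 0.001])
alpha, eps = 2.0, 0.5

d_alpha, removed, p_t = truncate(p, q, alpha, eps)
d_max_smoothed = float(np.log2(np.max(p_t / q)))
bound = d_alpha + np.log2(1 / eps ** 2) / (alpha - 1) + np.log2(1 / (1 - eps ** 2))

print("removed weight:", removed, "<= eps^2 =", eps ** 2)
print("D_max of truncated state:", d_max_smoothed, "<= bound:", bound)
```

Here only the third outcome exceeds the threshold, so the removed weight is 0.01 and the purified distance to the truncated distribution is its square root, 0.1, which is within the allowed $\varepsilon$.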

II.2 Relating smooth max-divergence and asymmetric hypothesis testing

One of the main results in tomamichel12 was to establish a close relation between the smooth max-divergence and asymmetric hypothesis testing, which was then used to derive asymptotic bounds. The following relation improves on two bounds established in (tomamichel12, , Proposition 13) and (dupuis12, , Proposition 4.1).

Theorem 4.

Let $\rho\in\mathcal{S}$ , $\sigma\in\mathcal{P}$ and $\varepsilon\in(0,1)$ and $\delta\in(0,1-\varepsilon^{2})$ . It holds that

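$$D_{h}^{1-\varepsilon}(\rho\|\sigma)\geq D_{\max}^{\sqrt{\varepsilon},P}(\rho\|\sigma)-\log\frac{1}{1-\varepsilon}\geq D_{h}^{1-\varepsilon-\delta}(\rho\|\sigma)-\log\frac{4}{\delta^{2}}\,.$$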

We note in particular that our new upper bound on $D_{\max}^{\varepsilon,P}(\rho\|\sigma)$ does not depend on the number of distinct eigenvalues of $\sigma$ , in contrast to the result in (tomamichel12, , Proposition 13). It is also tight in $\varepsilon$ , unlike the bound in (dupuis12, , Proposition 4.1). This is particularly relevant when attempting to generalize these relations to the infinite-dimensional case.

Proof.

We start with the first inequality. Using Lemma 2, we fix an arbitrary $M\geq 0$ such that $\mathrm{Tr}[M\sigma]\leq 1$ and it suffices to construct a state $\tilde{\rho}\in\mathcal{B}_{P}^{\varepsilon}$ such that

$$\operatorname{tr}M\tilde{\rho}\leq\frac{1}{\varepsilon'}2^{D_{h}^{\varepsilon'}(\rho\|\sigma)}\,,$$

where we set $\varepsilon^{\prime}=1-\varepsilon$ for convenience. Given the spectral decomposition $M=\sum_{i}m_{i}|v_{i}\rangle\!\langle v_{i}|$ , we define $\mathcal{M}$ as the measurement in the basis $\{|v_{i}\rangle\}_{i}$ and two probability distributions $P:=\mathcal{M}(\rho)$ and $Q:=\mathcal{M}(\sigma)$ obtained by measuring $\rho$ and $\sigma$ in this basis. The data-processing inequality for the hypothesis testing divergence and (9) yield

$$D_{h}^{\varepsilon'}(\rho\|\sigma)\geq D_{h}^{\varepsilon'}(P\|Q)\geq D_{s}^{\varepsilon'}(P\|Q)=:K\,.$$

Let us now, for any $\eta>0$ , define the set $I:=\{i:P(i)\leq 2^{K+\eta}Q(i)\}$ such that $P(I)>\varepsilon^{\prime}$ by definition of $D_{s}^{\varepsilon^{\prime}}(P\|Q)$ . Moreover, let $\Pi:=\sum_{i\not\in I}|v_{i}\rangle\!\langle v_{i}|$ . We have

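$$\operatorname{tr}\Pi\rho=\operatorname{tr}\mathcal{M}(\Pi)\rho=\operatorname{tr}\Pi\mathcal{M}(\rho)=1-P(I)\leq\varepsilon\,.\tag{23}$$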

And, thus, according to Lemma 1, we have $P(\rho,\tilde{\rho})\leq\sqrt{\varepsilon}$ for the choice $\tilde{\rho}:=\frac{(1-\Pi)\rho(1-\Pi)}{1-\operatorname{tr}\Pi\rho}$ . Finally, using that $1-\operatorname{tr}\Pi\rho>\varepsilon^{\prime}$ by (23), we find

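$$\operatorname{tr}M\tilde{\rho}\leq\frac{1}{\varepsilon'}\sum_{i\in I}m_{i}P(i)\leq\frac{2^{K+\eta}}{\varepsilon'}\cdot\sum_{i\in I}m_{i}Q(i)\leq\frac{2^{K+\eta}}{\varepsilon'}\mathrm{Tr}[M\sigma]\leq\frac{2^{D_{h}^{\varepsilon'}(\rho\|\sigma)+\eta}}{\varepsilon'}\,.$$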

The first inequality then follows in the limit $\eta\to 0$ .

To show the second inequality, we follow the ideas in tomamichel12 . Let $\tilde{\rho}\in\mathcal{B}_{P}^{\varepsilon}$ be such that

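$$\tilde{\rho}\leq 2^{\lambda}\sigma\quad\text{with}\quad\lambda=D_{\max}^{\sqrt{\varepsilon},P}(\rho\|\sigma)\,,$$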

that is, the state $\tilde{\rho}$ is an optimal smooth state. Moreover, consider the optimal hypothesis test $0\leq Q\leq 1$ satisfying $\operatorname{tr}(1-Q)\rho=1-\varepsilon-\delta$ and $\log\operatorname{tr}Q\sigma=-D_{h}^{1-\varepsilon-\delta}(\rho\|\sigma)$ . Then, the data-processing inequality for the fidelity, applied to the positive operator-valued measurement $\{Q,1-Q\}$ , yields the following sequence of inequalities:

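$$\begin{aligned}\sqrt{1-\varepsilon}&=\sqrt{\bar{F}(\rho,\tilde{\rho})}\\&\leq\sqrt{\operatorname{tr}Q\tilde{\rho}}+\sqrt{\operatorname{tr}(1-Q)\rho}\\&\leq\sqrt{2^{\lambda}\operatorname{tr}Q\sigma}+\sqrt{1-\varepsilon-\delta}\,.\end{aligned}$$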

Substituting for $\lambda$ and $\operatorname{tr}Q\sigma$ , we thus arrive at the inequality

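$$\log\big(\sqrt{1-\varepsilon}-\sqrt{1-\varepsilon-\delta}\big)^{2}\leq D_{\max}^{\sqrt{\varepsilon},P}(\rho\|\sigma)-D_{h}^{1-\varepsilon-\delta}(\rho\|\sigma)\,.$$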

Further bounding $\sqrt{1-\varepsilon}-\sqrt{1-\varepsilon-\delta}\geq\frac{\delta}{2\sqrt{1-\varepsilon}}$ yields the desired result. ∎
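The elementary bound used in this last step is not spelled out in the text; one standard way to see it, given here as a sketch, is to rationalize the difference of square roots:

$$\sqrt{1-\varepsilon}-\sqrt{1-\varepsilon-\delta}=\frac{(1-\varepsilon)-(1-\varepsilon-\delta)}{\sqrt{1-\varepsilon}+\sqrt{1-\varepsilon-\delta}}=\frac{\delta}{\sqrt{1-\varepsilon}+\sqrt{1-\varepsilon-\delta}}\geq\frac{\delta}{2\sqrt{1-\varepsilon}}\,,$$

where the inequality uses $\sqrt{1-\varepsilon-\delta}\leq\sqrt{1-\varepsilon}$ . Squaring and taking logarithms then gives

$$\log\big(\sqrt{1-\varepsilon}-\sqrt{1-\varepsilon-\delta}\big)^{2}\geq\log\frac{\delta^{2}}{4(1-\varepsilon)}=\log\frac{1}{1-\varepsilon}-\log\frac{4}{\delta^{2}}\,,$$

which, combined with the preceding inequality, rearranges to the second inequality of Theorem 4.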

III Joint smoothing relative to arbitrary states

Simultaneous smoothing is a question of great interest in quantum Shannon theory, with recent progress such as in Sen18 ; drescher13 having new consequences in network scenarios. Here we show simultaneous smoothing for the two marginals of a joint quantum system $AB$ . In contrast to earlier results on joint smoothing, our technique allows us to smooth relative to an arbitrary positive operator, and this operator can in fact be different for the two marginals. If we choose these operators to be the identity, our result reduces to the usual case considered in the literature drescher13 . We hope that the approach can lead to more progress on the simultaneous smoothing question.

Theorem 5.

Let $\rho_{AB}\in\mathcal{S}(AB)$ with marginals $\rho_{A}$ and $\rho_{B}$ , and let $\sigma_{A}\in\mathcal{P}(A)$ , $\sigma_{B}\in\mathcal{P}(B)$ . For any $\varepsilon,\varepsilon^{\prime}\in(0,1)$ such that $\varepsilon+\varepsilon^{\prime}<1$ , there exists a state $\tilde{\rho}_{AB}\in\mathcal{S}(AB)$ with $P(\rho_{AB},\tilde{\rho}_{AB})\leq\sqrt{\varepsilon+\varepsilon^{\prime}}$ such that its marginals $\tilde{\rho}_{A}$ and $\tilde{\rho}_{B}$ satisfy

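$$D_{\max}(\tilde{\rho}_{A}\|\sigma_{A})\leq D_{h}^{1-\varepsilon}(\rho_{A}\|\sigma_{A})+\Delta\quad\text{and}\quad D_{\max}(\tilde{\rho}_{B}\|\sigma_{B})\leq D_{h}^{1-\varepsilon'}(\rho_{B}\|\sigma_{B})+\Delta\tag{30}$$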

for $\Delta=-\log(1-\varepsilon-\varepsilon^{\prime})$ .

Proof.

Let us first confirm that it suffices, for every $\eta>0$ , to construct a normalized state $\tilde{\rho}_{AB}\in\mathcal{B}_{P}^{\sqrt{\delta}}(\rho_{AB})$ for $\delta=\varepsilon+\varepsilon^{\prime}$ that satisfies the following operator inequalities:

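$$\tilde{\rho}_{A}\leq 2^{\lambda_{A}}\sigma_{A}\quad\text{and}\quad\tilde{\rho}_{B}\leq 2^{\lambda_{B}}\sigma_{B}\,,\tag{31}$$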

where $\lambda_{A}=D_{h}^{1-\varepsilon}(\rho_{A}\|\sigma_{A})+\Delta+\eta$ and $\lambda_{B}=D_{h}^{1-\varepsilon^{\prime}}(\rho_{B}\|\sigma_{B})+\Delta+\eta$ . Then, the inequalities in (30) are implied since $\eta>0$ is arbitrarily small. Consider now

[TABLE]

where the infimum and supremum can be interchanged using Sion’s minimax theorem sion58 . Clearly $\mathrm{Opt}\leq 0$ implies the existence of a state satisfying the desiderata in (31). Using the minimax principle on (32), it thus suffices to construct, for every fixed $M_{A}$ and $M_{B}$ , a $\tilde{\rho}_{AB}\in\mathcal{S}(AB)$ with $P(\rho_{AB},\tilde{\rho}_{AB})\leq\sqrt{\delta}$ such that $\operatorname{tr}M_{A}\tilde{\rho}_{A}\leq 2^{\lambda_{A}}\operatorname{tr}M_{A}\sigma_{A}$ and $\operatorname{tr}M_{B}\tilde{\rho}_{B}\leq 2^{\lambda_{B}}\operatorname{tr}M_{B}\sigma_{B}$ .

The proof now proceeds similarly to the proof of Theorem 4, where more detail is given. Given the eigenvalue decomposition $M_{A}=\sum_{i}m_{A}(i)|v_{i}\rangle\!\langle v_{i}|_{A}$ of $M_{A}$ , the measurement $\mathcal{M}_{A}$ in its eigenbasis, and the two probability distributions $P_{A}=\mathcal{M}_{A}(\rho_{A})$ and $Q_{A}=\mathcal{M}_{A}(\sigma_{A})$ , we find

[TABLE]

We then define the set $I_{A}=\{i:P_{A}(i)\leq 2^{K_{A}+\eta}Q_{A}(i)\}$ such that $P_{A}(I_{A})>1-\varepsilon$ . As a consequence, the projector $\Pi_{A}:=\sum_{i\in I_{A}}|v_{i}\rangle\!\langle v_{i}|_{A}$ satisfies

[TABLE]

The exact same construction for $B$ yields $\Pi_{B}$ with $\operatorname{tr}\Pi_{B}\rho_{B}\geq 1-\varepsilon^{\prime}$ . Consequently, we establish

[TABLE]

where we used the fact that $\operatorname{tr}_{A}(P_{A}\otimes 1_{B})X_{AB}\leq X_{B}$ for every projector $P_{A}$ and positive operator $X_{AB}$ with marginal $X_{B}$ (see, e.g., (mythesis, , Lemma A.1) for a proof of a more general statement).

Now, we are ready to define the (normalized) smoothed state

[TABLE]

such that Lemma 1 together with (36) yields $P(\rho,\tilde{\rho})\leq\sqrt{\delta}$ . Moreover,

[TABLE]

Finally, since $\sum_{i\in I_{A}}m_{A}(i)Q_{A}(i)\leq\operatorname{tr}(M_{A}\sigma_{A})$ and $\frac{2^{K_{A}+\eta}}{1-\delta}=2^{\lambda_{A}}$ , the first inequality in (31) follows. The analogous argument for $B$ also verifies the second inequality in (31), concluding the proof. ∎

Using Theorem 4, we can further replace $D_{h}^{1-\varepsilon}$ with $D_{\max}^{\sqrt{\varepsilon}}$ and $D_{h}^{1-\varepsilon^{\prime}}$ with $D_{\max}^{\sqrt{\varepsilon^{\prime}}}$ (introducing some small correction), which yields the following corollary.

Corollary 6.

Let $\rho_{AB}\in\mathcal{S}(AB)$ with marginals $\rho_{A}$ and $\rho_{B}$ , and let $\sigma_{A}\in\mathcal{P}(A)$ , $\sigma_{B}\in\mathcal{P}(B)$ . For any $\varepsilon,\varepsilon^{\prime},\delta\in(0,1)$ such that $\varepsilon+\varepsilon^{\prime}+2\delta<1$ , there exists a state $\tilde{\rho}_{AB}\in\mathcal{S}(AB)$ with $P(\rho_{AB},\tilde{\rho}_{AB})\leq\sqrt{\varepsilon+\varepsilon^{\prime}+2\delta}$ such that its marginals $\tilde{\rho}_{A}$ and $\tilde{\rho}_{B}$ satisfy

[TABLE]

for $\Delta=2-2\log\delta-\log(1-\varepsilon-\varepsilon^{\prime}-2\delta)$ .

Acknowledgements.

The work was done when AA was affiliated to the Centre for Quantum Technologies, National University of Singapore. We thank David Sutter for help with the proof of Theorem 3 for normalized trace distance. AA and RJ were supported by the Singapore Ministry of Education and the National Research Foundation through the “NRF2017-NRF-ANR004 VanQuTe” grant. RJ is also supported by VAJRA Faculty Scheme of the Science and Engineering Board (SERB), Department of Science and Technology (DST), Government of India.

Appendix A Proof of Lemma 1

Proof of Lemma 1.

We need to verify that $\bar{F}(\rho,\tilde{\rho})=1-\operatorname{tr}P\rho$ for $\tilde{\rho}=\frac{(1-P)\rho(1-P)}{1-\operatorname{tr}P\rho}$ . Indeed,

[TABLE]

by a simple computation. ∎

Appendix B Proof of Lemma 2

Proof of Lemma 2.

Recall the definition of the max-divergence, $D_{\max}(\rho\|\sigma)=\log\inf_{\rho\leq\lambda\sigma}\lambda$ . We first show the following identity:

[TABLE]

The direction ‘ $\geq$ ’ follows directly from the fact that

[TABLE]

for all $X\geq 0$ , since the restriction on $\lambda$ on the right-hand side is less restrictive.

For the direction ‘ $\leq$ ’, we simply need to construct an operator $X\geq 0$ such that the infimum on the right-hand side of (45) matches the left-hand side. We first consider the case where $\inf_{\rho\leq\lambda\sigma}\lambda=\infty$ , i.e. the case where the support of $\rho$ is not contained in the support of $\sigma$ . In this case we can choose $X$ to be orthogonal to $\sigma$ but with $\operatorname{tr}X\rho>0$ , such that indeed also $\inf_{\operatorname{tr}X\tilde{\rho}\leq\lambda\operatorname{tr}X\sigma}\lambda=\infty$ . Otherwise, choose $\lambda^{*}=\operatorname{argmin}_{\rho\leq\lambda\sigma}\lambda$ . With $X$ the projector onto the kernel of $\lambda^{*}\sigma-\rho$ , we find

[TABLE]

and thus $\inf_{\lambda\geq 0,\operatorname{tr}\,X\tilde{\rho}\leq\lambda\operatorname{tr}\,X\sigma}\lambda=\lambda^{*}$ , as required. Normalising $X$ such that $\operatorname{tr}X\sigma=1$ then yields

[TABLE]

And finally, using the definition of $\varepsilon$ -smooth max-divergence, we find

[TABLE]

Sion’s minimax theorem sion58 ensures that we can swap the infimum and the supremum since $\mathcal{B}^{\varepsilon}(\rho)$ and $\{X\geq 0,\operatorname{tr}X\sigma\leq 1\}$ are convex sets, which completes the proof. ∎

Bibliography

[1] S. Beigi. Sandwiched Rényi Divergence Satisfies Data Processing Inequality. J. Math. Phys., 54(12):122202, 2013. DOI: 10.1063/1.4838855.
[2] N. Datta. Min- and Max-Relative Entropies and a New Entanglement Monotone. IEEE Trans. on Inf. Theory, 55(6):2816–2826, 2009. DOI: 10.1109/TIT.2009.2018325.
[3] L. Drescher and O. Fawzi. On simultaneous min-entropy smoothing. In 2013 IEEE International Symposium on Information Theory, pages 161–165, 2013. DOI: 10.1109/ISIT.2013.6620208.
[4] F. Dupuis, L. Kraemer, P. Faist, J. M. Renes, and R. Renner. Generalized Entropies. In Proc. of the XVIIth Int. Congress on Math. Phys., pages 134–153, Aalborg, Denmark, 2012. DOI: 10.1142/9789814449243_0008.
[5] R. L. Frank and E. H. Lieb. Monotonicity of a Relative Rényi Entropy. J. Math. Phys., 54(12):122201, 2013. DOI: 10.1063/1.4838835.
[6] R. Jain and A. Nayak. Short proofs of the quantum substate theorem. IEEE Transactions on Information Theory, 58(6):3664–3669, 2012. DOI: 10.1109/TIT.2012.2184522.
[7] R. Jain, J. Radhakrishnan, and P. Sen. Privacy and interaction in quantum communication complexity and a theorem about the relative entropy of quantum states. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings, pages 429–438, 2002. DOI: 10.1109/SFCS.2002.1181967.
[8] R. Jain, J. Radhakrishnan, and P. Sen. A property of quantum relative entropy with an application to privacy in quantum communication. J. ACM, 56(6):33:1–33:32, 2009. DOI: 10.1145/1568318.1568323.
