A minimax approach to one-shot entropy inequalities
Anurag Anshu, Mario Berta, Rahul Jain, Marco Tomamichel

Anurag Anshu
Institute for Quantum Computing, University of Waterloo, Waterloo, Canada
Perimeter Institute for Theoretical Physics, Waterloo, Canada
Mario Berta
Department of Computing, Imperial College London, England
Rahul Jain
Center for Quantum Technologies, National University of Singapore and MajuLab, UMI 3654, Singapore
Marco Tomamichel
Centre for Quantum Software and Information, University of Technology Sydney, Sydney
Center for Quantum Technologies, National University of Singapore, Singapore
Abstract
One-shot information theory entertains a plethora of entropic quantities, such as the smooth max-divergence, hypothesis testing divergence and information spectrum divergence, that characterize various operational tasks and are used to prove the asymptotic behavior of various tasks in quantum information theory. Tight inequalities between these quantities are thus of immediate interest. In this note we use a minimax approach (appearing previously, for example, in the proofs of the quantum substate theorem) to simplify the quantum problem to a commutative one, which allows us to derive such inequalities. Our derivations are conceptually different from previous arguments and in some cases lead to tighter relations. We hope that the approach discussed here can lead to progress in open problems in quantum Shannon theory, and exemplify this by applying it to a simple case of the joint smoothing problem.
I Introduction
Recent years have seen remarkable progress in the area of one-shot quantum Shannon theory, which generalizes the standard asymptotic and i.i.d. (independent and identically distributed) quantum Shannon theory and also eases the notational complications in the latter. Achievability results in the one-shot setting clarify much about the structure of the protocol, as various entropic equalities that are equivalent in the asymptotic and i.i.d. setting are vastly different in the one-shot setting. This setting also forces the development of novel encoding and decoding schemes that would have been trivial if the time-sharing method were used in the asymptotic and i.i.d. setting.
A (minor) downside of one-shot information theory is that several different quantities can each generalize a given entropic quantity such as the relative entropy. Below, we introduce the quantities of this kind that are considered in this work. We focus here on relative entropies, but relations for other entropic quantities like entropy, conditional entropy and mutual information can often be derived readily using the fact that they can be expressed in terms of relative entropies.
I.1 Notation and definitions
We will fix a finite-dimensional Hilbert space throughout most of this manuscript and denote with and the set of positive semi-definite operators and the subset of trace-normalized quantum states, respectively. Sometimes we will refer to the set of sub-normalized states, denoted , which contains all positive semi-definite operators (using the Löwner partial order) with . When joint quantum systems are considered, we use the notation etc. to denote joint quantum states on the Hilbert spaces and .
Some of the entropic quantities will require the concept of a neighbourhood, namely a function that maps to an -neighbourhood of . We can also define neighbourhoods of sub-normalized states in the same way. We will always require that, for any , the set is convex and at least contains . Such -neighbourhoods can easily be constructed from any metric on states, and the two most prominent examples are defined below for any . The first is the neighbourhood of states that are close in trace distance, , given as
[TABLE]
The second is the neighbourhood of sub-normalized states that are close in purified distance tomamichel09 ,
[TABLE]
where $P(\rho,\sigma)=\sqrt{1-\bar{F}(\rho,\sigma)}$ and $\bar{F}(\rho,\sigma)=\big(\|\sqrt{\rho}\sqrt{\sigma}\|_{1}+\sqrt{(1-\operatorname{tr}\rho)(1-\operatorname{tr}\sigma)}\big)^{2}$ is a generalization of the fidelity to sub-normalized states.
We are now ready to define our entropic quantities of interest. The max-divergence is defined for any and as
[TABLE]
Note that by definition of the infimum this quantity takes on the value in case there does not exist a satisfying the constraint , which happens if and only if the support of is not contained in the support of . Otherwise, the minimum is achieved and takes the value , where we used the Moore-Penrose inverse. Using any neighbourhood ball , we define an -smooth max-divergence as renner05 ; datta08
[TABLE]
We will use the notation and to specify the balls and , respectively.
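For commuting states the max-divergence and its trace-distance smoothing reduce to statements about probability vectors, which can be sketched numerically. The snippet below is a minimal illustration under assumptions that are ours and not fixed by the text: logarithms are taken to base 2, both `p` and `q` are normalized probability vectors, the smoothed vector is required to stay normalized, and the trace distance is normalized as half the 1-norm. The function names are hypothetical.

```python
import math

def max_divergence(p, q):
    """Classical max-divergence: log2 of the largest ratio p_i / q_i.

    Returns +inf when p has weight outside the support of q.
    """
    ratio = 0.0
    for pi, qi in zip(p, q):
        if pi > 0 and qi == 0:
            return math.inf
        if pi > 0:
            ratio = max(ratio, pi / qi)
    return math.log2(ratio) if ratio > 0 else -math.inf

def smooth_max_divergence(p, q, eps, iters=200):
    """Smoothed classical max-divergence over a trace-distance ball.

    Assumption: the ball contains normalized vectors within (1/2)*L1
    distance eps of p. Then the optimum is log2 of the smallest
    lam >= 1 such that the clipped mass sum_i max(p_i - lam*q_i, 0)
    is at most eps, which we locate by bisection.
    """
    def excess(lam):
        return sum(max(pi - lam * qi, 0.0) for pi, qi in zip(p, q))

    leaked = sum(pi for pi, qi in zip(p, q) if qi == 0)
    if leaked > eps:
        return math.inf          # cannot move enough mass into supp(q)
    if excess(1.0) <= eps:
        return 0.0               # lam cannot drop below 1 for normalized vectors
    lo, hi = 1.0, 2.0
    while excess(hi) > eps:      # grow the bracket until hi is feasible
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) <= eps:
            hi = mid
        else:
            lo = mid
    return math.log2(hi)

p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
print(max_divergence(p, q))              # unsmoothed value
print(smooth_max_divergence(p, q, 0.1))  # smoothing can only decrease it
```

With `eps = 0` the smoothed routine falls back to the unsmoothed quantity, which is a quick consistency check on the bisection.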
The max-divergence is a limiting case of a Rényi divergence lennert13 ; wilde13 , namely the family
[TABLE]
for defined for any and . The max-divergence is recovered in the limit $\alpha\to\infty$, and the name is justified since the family is monotonically increasing as a function of $\alpha$. In the limit $\alpha\to 1$, we recover the relative entropy:
[TABLE]
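The monotonicity in the Rényi parameter and the two limits can be checked numerically in the commutative case, where the Rényi divergence takes its familiar classical form. The sketch below assumes base-2 logarithms, normalized probability vectors, and that `q` is strictly positive wherever `p` is; the function name and the parameter `alpha` are our own labels.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence of order alpha (base-2 logarithm).

    alpha == 1 is treated as the relative entropy (the limiting case).
    Assumes q_i > 0 whenever p_i > 0.
    """
    if alpha == 1:
        return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    total = sum(pi ** alpha * qi ** (1 - alpha)
                for pi, qi in zip(p, q) if pi > 0)
    return math.log2(total) / (alpha - 1)

p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
alphas = [0.5, 0.9, 1, 1.1, 2, 10, 100]
values = [renyi_divergence(p, q, a) for a in alphas]
d_max = math.log2(max(pi / qi for pi, qi in zip(p, q)))

for a, v in zip(alphas, values):
    print(f"alpha={a:>5}: {v:.4f}")
print(f"max-divergence: {d_max:.4f}")
```

For this pair the values increase with `alpha` and approach the max-divergence from below as `alpha` grows, while the values just below and just above `alpha = 1` bracket the relative entropy.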
Asymmetric quantum hypothesis testing plays a crucial role in one-shot quantum information theory. The fundamental relationship between errors of the first and second kind can be cast as an entropic quantity. Bounding the error of the first kind with and minimizing the error of the second kind, the -hypothesis testing divergence is defined as
[TABLE]
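In the commutative case the optimization behind the hypothesis testing divergence is a small linear program that the Neyman-Pearson construction solves greedily. The sketch below assumes the convention in which the quantity is minus the base-2 logarithm of the smallest error of the second kind, subject to the test accepting `p` with probability at least `1 - eps`; some references include an additional normalization, so this convention is an assumption on our part, as is the function name.

```python
import math

def hypothesis_testing_divergence(p, q, eps):
    """Classical hypothesis testing divergence via a greedy likelihood-ratio test.

    Minimizes sum_i t_i * q_i over tests 0 <= t_i <= 1 subject to
    sum_i t_i * p_i >= 1 - eps, then returns -log2 of the minimum.
    """
    need = 1.0 - eps          # probability mass of p still to be accepted
    beta = 0.0                # accumulated error of the second kind
    # Accept outcomes in order of increasing cost q_i / p_i per unit of p-mass.
    outcomes = sorted((qi / pi, pi, qi) for pi, qi in zip(p, q) if pi > 0)
    for _, pi, qi in outcomes:
        if need <= 0:
            break
        take = min(1.0, need / pi)   # fractional acceptance on the last outcome
        beta += take * qi
        need -= take * pi
    return math.inf if beta == 0 else -math.log2(beta)

p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
print(hypothesis_testing_divergence(p, q, 0.1))
```

The fractional acceptance on the last outcome is what makes the greedy solution optimal; a purely deterministic test would in general be suboptimal.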
For any Hermitian operator , let be the projector onto the subspace spanned by all the eigenvectors with positive eigenvalue. We define the -information spectrum divergence as
[TABLE]
This quantity gives a potential quantum generalization of the notion of -tail bounds of the log-likelihood ratio function. To see this, note that for two probability distributions, the above expression simplifies to
[TABLE]
Its usefulness, apart from this simple interpretation, is mainly due to its close relation to hypothesis testing, shown in the following relation from (tomamichel12, Lemma 12): For any , and with , it holds that
[TABLE]
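The tail-bound reading of the information spectrum divergence for two probability distributions can be made concrete with a short computation. The sketch assumes the classical form is the supremum of all thresholds `R` for which the probability, under `p`, that the base-2 log-likelihood ratio is at most `R` does not exceed `eps`; this reading and the function name are assumptions for illustration.

```python
import math

def information_spectrum_divergence(p, q, eps):
    """Classical information spectrum divergence as a tail threshold.

    Returns sup{R : Pr_{x~p}[log2(p(x)/q(x)) <= R] <= eps}. The tail
    probability is a step function of R, so the supremum is the smallest
    log-likelihood ratio at which the cumulative p-mass first exceeds eps.
    """
    ratios = sorted(
        (math.inf if qi == 0 else math.log2(pi / qi), pi)
        for pi, qi in zip(p, q) if pi > 0
    )
    cumulative = 0.0
    for llr, pi in ratios:
        cumulative += pi
        if cumulative > eps:
            return llr
    return math.inf   # only reached when eps >= total mass of p

p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
for eps in (0.1, 0.25, 0.6):
    print(eps, information_spectrum_divergence(p, q, eps))
```

The value is non-decreasing in `eps`, since tolerating a heavier lower tail allows a larger threshold.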
I.2 Some useful properties of above quantities
The purified distance satisfies the following ‘gentle measurement’ property, which was first established in (tomamichel17b, Lemma 7). Since the relation between the lemma below and the result in tomamichel17b is not immediately obvious, we provide a proof in Appendix A for the convenience of the reader.
Lemma 1.
For any projector and , we have
[TABLE]
It is worth noting that the state is only normalized if and sub-normalized otherwise. The special case of normalized is in fact well-known, and in that case we also have by the Fuchs-van de Graaf inequality.
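A commutative sanity check of the gentle measurement property can be sketched as follows. Here a projector becomes a subset of outcomes and projecting a distribution means zeroing the entries outside that subset. We assume, as our reading of the lemma, that the bound has the form sqrt(2*delta - delta**2), where `delta` is the weight of the state outside the projector; the helper names are hypothetical.

```python
import math

def generalized_fidelity(p, q):
    """Generalized fidelity for (sub-)normalized probability vectors."""
    overlap = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    deficit = math.sqrt(max(0.0, 1.0 - sum(p)) * max(0.0, 1.0 - sum(q)))
    return (overlap + deficit) ** 2

def purified_distance(p, q):
    """Purified distance sqrt(1 - generalized fidelity)."""
    return math.sqrt(max(0.0, 1.0 - generalized_fidelity(p, q)))

def project(p, keep):
    """Classical analogue of a projector: keep only the outcomes in `keep`."""
    return [pi if i in keep else 0.0 for i, pi in enumerate(p)]

def gentle_bound(p, keep):
    """Assumed form of the bound, with delta the weight outside `keep`."""
    delta = sum(pi for i, pi in enumerate(p) if i not in keep)
    return math.sqrt(2.0 * delta - delta ** 2)

normalized = [0.5, 0.3, 0.2]
sub_normalized = [0.4, 0.3, 0.1]
keep = {0, 1}

for state in (normalized, sub_normalized):
    distance = purified_distance(state, project(state, keep))
    print(distance, gentle_bound(state, keep))
```

For a normalized distribution the two numbers coincide, while for a sub-normalized one the distance is strictly smaller in this example, consistent with an inequality rather than an identity.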
Many of these entropic quantities satisfy the data-processing inequality datta08 ; beigi13 ; frank13 . That is, for any quantum channel (a completely positive and trace-preserving map) , it holds that
[TABLE]
Data processing for the information spectrum divergence is not as simple, but an approximate data-processing inequality can be deduced from (9). Thus, information spectrum divergence is known to satisfy data processing only up to an additive logarithmic term.
II Relating various information theoretic measures
Our central idea is inspired by the works JainRS02 ; Jain:2009 ; JainN12 on the quantum substate theorem, which show that we can use a minimax approach to find the optimal smoothing of the max-divergence. More precisely, we use the following straightforward generalization of a key result from JainN12 , a proof of which is given in Appendix B for the convenience of the reader.
Lemma 2.
Let , . For any convex -neighbourhood , we have
[TABLE]
II.1 Smooth max-divergence and Rényi relative entropies
Our first application is a relation between -smooth max-divergence and Rényi divergence, which improves on (mythesis, Proposition 6.5) for the purified distance smoothing (which was shown using a different method) and is new for normalized trace distance smoothing. Our proof closely follows the proof of the quantum substate theorem in JainN12 .
Theorem 3.
Let , . For any and , we have
[TABLE]
The same inequality also holds with replaced by with .
Proof.
Invoking Lemma 2 the claim becomes equivalent to
[TABLE]
where we introduced and for convenience. That is, for every with it is sufficient to produce a corresponding that fulfils the bound. For such an with spectral decomposition , and , define
[TABLE]
and finally . We now invoke the data-processing inequality for the quantum Rényi divergences under the projective measurement , leading to
[TABLE]
where the last inequality follows from the definition of . This implies that
[TABLE]
We are now ready to define our smoothed state,
[TABLE]
which is normalized if and only if is normalized (and otherwise sub-normalized). By Lemma 1 we find that , and thus this state lies in both and . Furthermore,
[TABLE]
where the penultimate inequality follows from the definition of and the last inequality follows from . Finally, we bound , concluding the proof. ∎
II.2 Relating smooth max-divergence and asymmetric hypothesis testing
One of the main results in tomamichel12 was to establish a close relation between the smooth max-divergence and asymmetric hypothesis testing, which was then used to derive asymptotic bounds. The following relation improves on two bounds established in (tomamichel12, Proposition 13) and (dupuis12, Proposition 4.1).
Theorem 4.
Let , and and . It holds that
[TABLE]
We note in particular that our new upper bound on does not depend on the number of distinct eigenvalues of , in contrast to the result in (tomamichel12, Proposition 13). It is also tight in , unlike the bound in (dupuis12, Proposition 4.1). This is particularly relevant when attempting to generalize these relations to the infinite-dimensional case.
Proof.
We start with the first inequality. Using Lemma 2, we fix an arbitrary such that and it suffices to construct a state such that
[TABLE]
where we set for convenience. Given the spectral decomposition , we define as the measurement in the basis and two probability distributions and obtained by measuring and in this basis. The data-processing inequality for the hypothesis testing divergence and (9) yield
[TABLE]
Let us now, for any , define the set such that by definition of . Moreover, let . We have
[TABLE]
And, thus, according to Lemma 1, we have for the choice . Finally, using that by (23), we find
[TABLE]
The first inequality then follows in the limit .
To show the second inequality, we follow the ideas in tomamichel12 . Let be such that
[TABLE]
that is, the state is an optimal smooth state. Moreover, consider the optimal hypothesis test satisfying and . Then, the data-processing inequality for the fidelity and applied to the positive operator-valued measurement yields the following sequence of inequalities:
[TABLE]
Substituting for and , we thus arrive at the inequality
[TABLE]
Further bounding yields the desired result. ∎
III Joint smoothing relative to arbitrary states
Simultaneous smoothing is a question of great interest in quantum Shannon theory, with recent progress such as in Sen18 ; drescher13 having new consequences in network scenarios. Here we show simultaneous smoothing for the two marginals of a joint quantum system . In contrast to earlier results on joint smoothing, our technique allows us to smooth relative to an arbitrary positive operator, and this operator can in fact be different for the two marginals. If we choose these operators to be the identity, our result reduces to the usual case considered in the literature drescher13 . We hope that the approach can lead to more progress on the simultaneous smoothing question.
Theorem 5.
Let with marginals and , and let , . For any such that , there exists a state with such that its marginals and satisfy
[TABLE]
for .
Proof.
Let us first confirm that it suffices, for every , to construct a normalized state for that satisfies the following operator inequalities:
[TABLE]
where and . Then, the inequalities in (30) are implied since is arbitrarily small. Consider now
[TABLE]
where the infimum and supremum can be interchanged using Sion’s minimax theorem sion58 . Clearly implies the existence of a state satisfying the desiderata in (31). Using the minimax principle on (32), it thus suffices to construct, for every fixed and , a with such that and .
The proof now proceeds similarly to the proof of Theorem 4, where more detail is given. Given the eigenvalue decomposition of , the measurement in its eigenbasis, and the two probability distributions and , we find
[TABLE]
We then define the set such that . As a consequence, the projector satisfies
[TABLE]
The exact same construction for yields with . Consequently, we establish
[TABLE]
where we used the fact that for every projector and positive operator with marginal (see, e.g., (mythesis, Lemma A.1) for a proof of a more general statement).
Now, we are ready to define the (normalized) smoothed state
[TABLE]
such that Lemma 1 together with (36) yields . Moreover,
[TABLE]
Finally, since and , the first inequality in (31) follows. The analogous argument for also verifies the second inequality in (31), concluding the proof. ∎
Using Theorem 4, we can further replace with and with (introducing some small correction), which yields the following corollary.
Corollary 6.
Let with marginals and , and let , . For any such that , there exists a state with such that its marginals and satisfy
[TABLE]
for .
Acknowledgements.
The work was done while AA was affiliated with the Centre for Quantum Technologies, National University of Singapore. We thank David Sutter for help with the proof of Theorem 3 for normalized trace distance. AA and RJ were supported by the Singapore Ministry of Education and the National Research Foundation through the “NRF2017-NRF-ANR004 VanQuTe” grant. RJ is also supported by the VAJRA Faculty Scheme of the Science and Engineering Research Board (SERB), Department of Science and Technology (DST), Government of India.
Appendix A Proof of Lemma 1
Proof of Lemma 1.
We need to verify that for . Indeed,
[TABLE]
by a simple computation. ∎
Appendix B Proof of Lemma 2
Proof of Lemma 2.
Recall the definition of the max-divergence, . We first show the following identity:
[TABLE]
The direction ‘’ follows directly from the fact that
[TABLE]
for all , since the restriction on on the right-hand side is less restrictive.
For the direction ‘’, we simply need to construct an operator such that the infimum on the right-hand side of (45) matches the left-hand side. We first consider the case where , i.e. the case where the support of is not contained in the support of . In this case we can choose to be orthogonal to but with , such that indeed also . Otherwise, choose . With the projector onto the kernel of , we find
[TABLE]
and thus , as required. Normalising such that then yields
[TABLE]
And finally, using the definition of -smooth max-divergence, we find
[TABLE]
Sion’s minimax theorem sion58 ensures that we can swap the infimum and the supremum since and are convex sets, which completes the proof. ∎
References
- (1) S. Beigi. Sandwiched Rényi Divergence Satisfies Data Processing Inequality. J. Math. Phys., 54(12):122202, 2013. DOI: 10.1063/1.4838855.
- (2) N. Datta. Min- and Max- Relative Entropies and a New Entanglement Monotone. IEEE Trans. on Inf. Theory, 55(6):2816–2826, 2009. DOI: 10.1109/TIT.2009.2018325.
- (3) L. Drescher and O. Fawzi. On simultaneous min-entropy smoothing. In 2013 IEEE International Symposium on Information Theory, pages 161–165, 2013. DOI: 10.1109/ISIT.2013.6620208.
- (4) F. Dupuis, L. Kraemer, P. Faist, J. M. Renes, and R. Renner. Generalized Entropies. In Proc. of the XVIIth Int. Congress on Math. Phys., pages 134–153, Aalborg, Denmark, 2012. DOI: 10.1142/9789814449243_0008.
- (5) R. L. Frank and E. H. Lieb. Monotonicity of a Relative Rényi Entropy. J. Math. Phys., 54(12):122201, 2013. DOI: 10.1063/1.4838835.
- (6) R. Jain and A. Nayak. Short proofs of the quantum substate theorem. IEEE Transactions on Information Theory, 58(6):3664–3669, 2012. DOI: 10.1109/TIT.2012.2184522.
- (7) R. Jain, J. Radhakrishnan, and P. Sen. Privacy and interaction in quantum communication complexity and a theorem about the relative entropy of quantum states. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings., pages 429–438, 2002. DOI: 10.1109/SFCS.2002.1181967.
- (8) R. Jain, J. Radhakrishnan, and P. Sen. A property of quantum relative entropy with an application to privacy in quantum communication. J. ACM, 56(6):33:1–33:32, 2009. DOI: 10.1145/1568318.1568323.
