Reversibility of distance measures of states with some focus on total variation distance
Keiji Matsumoto

TL;DR
This paper investigates how classical probability distances relate to quantum states, showing that total variation distance can sometimes be preserved under measurement, contrary to previous assumptions, with specific conditions identified.
Contribution
The paper extends the understanding of distance measure reversibility from operator convex functions to strictly convex functions and provides conditions under which total variation distance remains unchanged.
Findings
Total variation distance can be preserved under measurement for certain quantum states.
Extension of reversibility results to strictly convex functions beyond operator convex functions.
Necessary and sufficient conditions identified for qubit states regarding total variation distance preservation.
Reversibility of distance measures of states with some focus on total variation distance
Keiji Matsumoto
Quantum Computation Group, National Institute of Informatics,
2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430,
e-mail:[email protected]
Abstract
Consider a classical system that is in the state described by the probability distribution $p$ or $q$, and embed this classical information into a quantum system by a physical map $\Gamma$: $\rho = \Gamma(p)$ and $\sigma = \Gamma(q)$. Intuitively, the pair $\{P_\rho^M, P_\sigma^M\}$ of the distributions of the data of a measurement $M$ on the pair $\{\rho, \sigma\}$ should contain strictly less information than the pair $\{p, q\}$, provided the pair $\{\rho, \sigma\}$ is non-commutative. Indeed, this statement has been shown when the information is measured by an $f$-divergence such that $f$ is operator convex. In this paper, the statement is extended to the case where $f$ is strictly convex. Also, we disprove the assertion for the total variation distance $\Vert p - q \Vert_1$, the $f$-divergence with $f(r) = |1 - r|$: if $\{\rho, \sigma\}$ satisfies some not very restrictive conditions, $\Vert P_\rho^M - P_\sigma^M \Vert_1$ equals $\Vert p - q \Vert_1$. Here we present a sufficient condition for the general case, and a necessary and sufficient condition for qubit states.
1 Introduction
Consider a classical system that is in the state described by the probability distribution $p$ or $q$, and embed this classical information into a quantum system by a physical map $\Gamma$: $\rho = \Gamma(p)$ and $\sigma = \Gamma(q)$. Intuitively, the pair $\{P_\rho^M, P_\sigma^M\}$ of the distributions of the data of a measurement $M$ on the pair $\{\rho, \sigma\}$ should contain strictly less information than the pair $\{p, q\}$, provided the pair $\{\rho, \sigma\}$ is non-commutative. Indeed, this statement has been shown when the information is measured by an $f$-divergence such that $f$ is operator convex [1]. In this paper, the statement is extended to the case where $f$ is strictly convex. Also, we disprove the assertion for the total variation distance $\Vert p - q \Vert_1$, the $f$-divergence with $f(r) = |1 - r|$: if $\{\rho, \sigma\}$ satisfies some not very restrictive conditions, $\Vert P_\rho^M - P_\sigma^M \Vert_1$ equals $\Vert p - q \Vert_1$. Here we present a sufficient condition for the general case, and a necessary and sufficient condition for qubit states.
2 Embedding Classical Information Into Quantum States
Consider a classical memory system whose state is described by the probability distribution $p$ or $q$, depending on the value of the bit recorded. Suppose we embed this information into a quantum system by some physical operation $\Gamma$, that is, a completely positive trace preserving (CPTP) map from a commutative system into the operators $\mathcal{B}(\mathcal{H})$ over a Hilbert space $\mathcal{H}$. (In this paper, we stick to the finite dimensional case.) Then we obtain a quantum system whose state is either $\rho = \Gamma(p)$ or $\sigma = \Gamma(q)$, depending on the value of the bit.
Suppose now we are given $\{\rho, \sigma\}$, and the question is how much information is contained in $\{\rho, \sigma\}$. The answer depends on the measure of information, and also on the choice of $(\Gamma, p, q)$. In this paper, we use the $f$-divergence between $p$ and $q$ to measure the amount of information:
$$D_f(p \Vert q) := \sum_{x \in \mathcal{X}} q(x)\, f\!\left(\frac{p(x)}{q(x)}\right),$$
where $\mathcal{X}$ is a finite set, and $f$ is a convex function. In the definition, we used the convention
[TABLE]
By choosing $f$ properly, the $f$-divergence represents almost all frequently used distance measures (or monotone functions of them): relative entropy (Kullback-Leibler divergence), Rényi relative entropy, total variation distance, and so on.
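As a minimal numerical sketch of this unification, assuming the common form $D_f(p \Vert q) = \sum_x q(x) f(p(x)/q(x))$ and strictly positive distributions (so that no convention for zero probabilities is needed), the following Python snippet evaluates the $f$-divergence for $f(r) = r \log r$ and $f(r) = |1 - r|$ and compares the results with the direct formulas. The function name `f_divergence` and the sample distributions are illustrative choices, not taken from the paper.

```python
import numpy as np

def f_divergence(p, q, f):
    """Sum over x of q(x) * f(p(x) / q(x)); assumes every q(x) > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * f(p / q)))

# Illustrative, strictly positive distributions (hypothetical values).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])

# f(r) = r log r gives the relative entropy (Kullback-Leibler divergence).
kl_via_f = f_divergence(p, q, lambda r: r * np.log(r))
kl_direct = float(np.sum(p * np.log(p / q)))

# f(r) = |1 - r| gives the total variation distance sum_x |p(x) - q(x)|.
tv_via_f = f_divergence(p, q, lambda r: np.abs(1.0 - r))
tv_direct = float(np.sum(np.abs(p - q)))

print(kl_via_f, kl_direct)
print(tv_via_f, tv_direct)
```

Passing $f$ as a callable keeps the divergence generic, so other convex functions can be tried without changing `f_divergence`.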
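As for the dependence on $(\Gamma, p, q)$, we suppose the encoder did their best; thus our question is to find
$$D_f^{\max}(\rho \Vert \sigma) := \min_{(\Gamma, p, q)} D_f(p \Vert q),$$
where $(\Gamma, p, q)$ moves over all the triples satisfying
$$\Gamma(p) = \rho, \qquad \Gamma(q) = \sigma. \tag{1}$$
Since $D_f^{\max}$ results from an optimization problem, it is monotone decreasing under CPTP maps. Also, when $[\rho, \sigma] = 0$, it reduces to its classical version.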
If is operator convex,
[TABLE]
provided and [2]. Examples are , and
[TABLE]
where the sign is chosen so that the function is convex; both are operator convex. The former and the latter correspond to the relative entropy and the Rényi relative entropy, respectively.
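However, the function $f(r) = |1 - r|$, which corresponds to the total variation distance
$$\Vert p - q \Vert_1 = \sum_{x \in \mathcal{X}} |p(x) - q(x)|,$$
is not operator convex, and there is no known closed formula for its quantum version defined by the above optimization.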
3 Reversibility
To read classical information from a quantum source $\{\rho, \sigma\}$, a measurement $M$ is applied to the system, producing the probability distributions:
[TABLE]
Obviously,
[TABLE]
If $[\rho, \sigma] = 0$, the equality in the above inequality holds; in fact, if $f$ is strictly convex
[TABLE]
this is the only possible case for the equality to hold. Some preparations are necessary to prove the assertion. Let and be probability distributions over finite sets and , respectively. Define , and
[TABLE]
and are defined almost analogously.
Lemma 1
Suppose there is a transition probability with
[TABLE]
and there is a strictly convex function on with
[TABLE]
Then for all and with .
Proof. First, we prove the case where . Then decomposes into
[TABLE]
where is monotone non-increasing. Then
[TABLE]
[TABLE]
where
[TABLE]
Since and ,
[TABLE]
Since is strictly convex, holds only if
[TABLE]
and
[TABLE]
These are equivalent to
[TABLE]
Also, the condition implies
[TABLE]
Therefore, we have the assertion provided .
Next, we study the case where . Then implies and . Then doing almost analogously as above, we have the assertion.
Theorem 2
Suppose $f$ is a strictly convex function on , and . Then the equality in the inequality (2) holds only if $\rho$ and $\sigma$ commute.
If $f$ is non-linear and operator convex, it is strictly convex. Therefore, the theorem applies to the relative entropy and the Rényi relative entropy.
Proof. Let be a triplet achieving . Then the equality in the inequality (2) holds only if
[TABLE]
Since the composition of followed by the measurement is a linear, positive, and probability preserving map, there is a transition probability such that
[TABLE]
and
[TABLE]
where is delta distribution at . Therefore, by Lemma 1, provided .
Define
[TABLE]
Then if , observe , and
[TABLE]
Therefore, supports of positive operators
[TABLE]
are non-overlapping with each other.
Therefore, the assertion follows since
[TABLE]
This theorem means that classical information embedded into non-orthogonal states cannot be recovered completely by any measurement. At first glance, the statement seems almost trivial, but the proof fully exploits the fact that $f$ is strictly convex, and in fact the statement is not true if the information measure is the total variation distance.
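The role of strict convexity can already be seen in a purely classical setting. The sketch below uses illustrative distributions and a hypothetical deterministic transition probability `T` (none of them taken from the paper), and assumes the $f$-divergence has the form $\sum_x q(x) f(p(x)/q(x))$. The map merges two outcomes whose likelihood ratios $p(x)/q(x)$ differ but both exceed 1. The relative entropy strictly decreases, whereas the total variation distance is unchanged, because $f(r) = |1 - r|$ is affine on $r \ge 1$ and hence not strictly convex.

```python
import numpy as np

def f_divergence(p, q, f):
    """Sum over x of q(x) * f(p(x) / q(x)); assumes every q(x) > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * f(p / q)))

def rel_entropy_f(r):
    return r * np.log(r)        # strictly convex

def total_variation_f(r):
    return np.abs(1.0 - r)      # convex, but affine on each side of r = 1

# Likelihood ratios p/q are 2.5, 1.5 and 1/3 (hypothetical values).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])

# Transition probability T[y, x]: merge outcomes 0 and 1, keep outcome 2.
T = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
p_out, q_out = T @ p, T @ q

kl_before = f_divergence(p, q, rel_entropy_f)
kl_after = f_divergence(p_out, q_out, rel_entropy_f)
tv_before = f_divergence(p, q, total_variation_f)
tv_after = f_divergence(p_out, q_out, total_variation_f)

print("relative entropy:", kl_before, "->", kl_after)
print("total variation :", tv_before, "->", tv_after)
```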
4 Total variation distance
4.1 Set up and a general formula
Total variation distance, or the $f$-divergence corresponding to $f(r) = |1 - r|$, is one of the most frequently used distance measures between two probability distributions. Its most common quantum version is
[TABLE]
where is the distribution of the outcome of the measurement under . Obviously,
[TABLE]
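As a numerical illustration, assuming the quantum version meant here is the trace norm $\Vert \rho - \sigma \Vert_1$, the following sketch checks for a pair of qubit states (hypothetical Bloch vectors) that the projective measurement in the eigenbasis of $\rho - \sigma$ reproduces the trace norm as a classical total variation distance, while a measurement in the computational basis gives a smaller value. The helper names `qubit_state` and `measured_tv` are illustrative.

```python
import numpy as np

def qubit_state(bloch):
    """Density matrix (I + x X + y Y + z Z) / 2 for a Bloch vector of norm <= 1."""
    x, y, z = bloch
    return 0.5 * np.array([[1 + z, x - 1j * y],
                           [x + 1j * y, 1 - z]])

def measured_tv(rho, sigma, projectors):
    """Sum over outcomes x of |tr(rho M_x) - tr(sigma M_x)|."""
    p_rho = np.array([np.trace(rho @ m).real for m in projectors])
    p_sigma = np.array([np.trace(sigma @ m).real for m in projectors])
    return float(np.sum(np.abs(p_rho - p_sigma)))

rho = qubit_state([0.6, 0.0, 0.3])
sigma = qubit_state([0.0, 0.2, -0.5])

# Trace norm = sum of absolute eigenvalues of the Hermitian difference.
eigvals, eigvecs = np.linalg.eigh(rho - sigma)
trace_norm = float(np.sum(np.abs(eigvals)))

# Projective measurement in the eigenbasis of rho - sigma.
optimal = [np.outer(eigvecs[:, k], eigvecs[:, k].conj()) for k in range(2)]
# Projective measurement in the computational basis, for comparison.
z_basis = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

tv_optimal = measured_tv(rho, sigma, optimal)
tv_z = measured_tv(rho, sigma, z_basis)
print(trace_norm, tv_optimal, tv_z)
```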
Given a triple of , we define , where are probability distributions on :
[TABLE]
where
[TABLE]
Then satisfies (1) and .
(Intuitively, takes care of the common part of the two states, and and compensates the remainder.)
Therefore, without loss of generality, we may restrict ourselves to the one in the form of (3), where is an operator with
[TABLE]
Therefore, we have:
[TABLE]
4.2 Reversibility
In this subsection and the next, suppose . We study the conditions for
[TABLE]
This implies that any quantum version of statistical distance equals . Intuitively, this means that the classical statistical distance encoded into quantum states can be completely retrieved. As stated, such complete retrieval of an $f$-divergence scarcely occurs if $f$ is operator convex and $\rho$ and $\sigma$ do not commute. The statistical distance is very different from the $f$-divergences induced by operator convex functions in this respect.
If we drop the constraint and suppose ,
[TABLE]
Here, the minimum in the third line is achieved if . ( is the positive part of the self-adjoint operator .)
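The positive part of a self-adjoint operator can be computed from its spectral decomposition by keeping only the positive eigenvalues. The following sketch (the function name `positive_part` and the test matrix are illustrative, not from the paper) also checks the decomposition $A = A_+ - A_-$ with $A_+ A_- = 0$.

```python
import numpy as np

def positive_part(a):
    """Positive part A_+ of a Hermitian matrix A via its eigendecomposition."""
    w, u = np.linalg.eigh(a)
    # Scale each eigenvector column by max(eigenvalue, 0) and recombine.
    return (u * np.clip(w, 0.0, None)) @ u.conj().T

# Hypothetical self-adjoint matrix with eigenvalues 2 and -3.
a = np.array([[1.0, 2.0],
              [2.0, -2.0]])

a_plus = positive_part(a)
a_minus = positive_part(-a)   # the negative part, itself a positive operator

print(np.trace(a_plus), np.trace(a_minus))
```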
Therefore, (5) holds iff
[TABLE]
(Here, .) Another necessary and sufficient condition is the existence of , , with
[TABLE]
To see this, observe
[TABLE]
For (5) to hold, existence of , with is necessary and sufficient. Thus .
Of course, in general, (6) is not true. For example, if is a pure state,
[TABLE]
where and denotes its generalized inverse [2].
However, if and are very close so that
[TABLE]
it is true.
Another sufficient condition is
[TABLE]
To see that this is sufficient, take the square root of both sides of the inequality: then we obtain (6). (Recall that the square root $t \mapsto \sqrt{t}$ is operator monotone. This condition is not necessary, since $t \mapsto t^2$ is not operator monotone.) Rearranging the terms, we have
[TABLE]
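The asymmetry between the square root and the square, which is why the condition above is sufficient but not necessary, can be checked on a standard $2 \times 2$ example. In the sketch below (the matrices `A`, `B` and the helper names are illustrative, not from the paper), $A \le B$ holds and $\sqrt{A} \le \sqrt{B}$ follows, yet $A^2 \le B^2$ fails.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, u = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (u * np.sqrt(w)) @ u.conj().T

def is_psd(a, tol=1e-10):
    """True if the Hermitian matrix a has no eigenvalue below -tol."""
    return bool(np.min(np.linalg.eigvalsh(a)) >= -tol)

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
B = np.array([[2.0, 1.0],
              [1.0, 1.0]])

order_holds = is_psd(B - A)                       # A <= B
sqrt_order_holds = is_psd(psd_sqrt(B) - psd_sqrt(A))
square_order_holds = is_psd(B @ B - A @ A)        # B^2 - A^2 = [[3, 1], [1, 0]]

print(order_holds, sqrt_order_holds, square_order_holds)
```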
4.3 2-dimensional case
In this subsection, we assume and , and compute the set for each fixed , using the necessary and sufficient condition given by (7) and (8). As it turns out, this set is a spheroid with focal points and , touching the surface of the Bloch sphere at each end of its longest axis.
Since ,
[TABLE]
and
[TABLE]
Let , , , , and be the Bloch vectors of , , , , and , respectively. Also, (8) holds iff and are rank-1 and Therefore, by (7),
[TABLE]
Therefore,
[TABLE]
Let denote the Euclidean norm in , and
[TABLE]
The set is fairly large. For example, if the largest eigenvalue of is , it occupies more than half of the volume of the Bloch sphere.
If
[TABLE]
the minimization problem (4) is solved explicitly. With , , . Thus, if satisfies the constraints of (4), so does , and . Therefore, without loss of generality, we suppose is diagonal. After some elementary analysis, the optimal turns out to be
[TABLE]
and we have
[TABLE]
References
[1] F. Hiai and M. Mosonyi, "Different quantum f-divergences and the reversibility of quantum operations," Reviews in Mathematical Physics, vol. 29, no. 07 (2017).
[2] K. Matsumoto, "A new quantum version of f-divergence," arXiv:1311.4722 (2013).
