Resource theory of asymmetric distinguishability for quantum channels

Xin Wang; Mark M. Wilde

arXiv:1907.06306·quant-ph·December 18, 2019

TL;DR

This paper develops a resource theory for the asymmetric distinguishability of quantum channels, providing operational interpretations for related entropic quantities and analyzing transformation tasks in various regimes.

Contribution

It generalizes the resource theory of state distinguishability to channels, introduces operational meanings for channel min- and max-relative entropies, and analyzes asymptotic and one-shot transformation costs.

Findings

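1. In the one-shot setting, the optimal values of distinguishability distillation and dilution equal the non-smooth or smooth channel min- and max-relative entropies.

2. In the asymptotic sequential setting, the exact distinguishability cost equals the channel max-relative entropy.

3. In the asymptotic sequential setting, the distillable distinguishability equals the amortized channel relative entropy.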

Equations

$$(\mathcal{N},\mathcal{M}),$$

$$(\mathcal{P}_{RB\rightarrow S}(\mathcal{N}_{A\rightarrow B}(\rho_{RA})),\mathcal{P}_{RB\rightarrow S}(\mathcal{M}_{A\rightarrow B}(\rho_{RA}))).$$

$$\mathcal{N}_{A\rightarrow B}(\omega_{A})$$

$$\mathcal{M}_{A\rightarrow B}(\omega_{A})$$

$$(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})\rightarrow(\mathcal{N}_{A\rightarrow B}(\omega_{A}),\mathcal{M}_{A\rightarrow B}(\omega_{A}))$$

$$(\rho_{E},\sigma_{E})\rightarrow(\mathcal{P}_{AE\rightarrow B}(\omega_{A}\otimes\rho_{E}),\mathcal{P}_{AE\rightarrow B}(\omega_{A}\otimes\sigma_{E}))=(\mathcal{N}_{A\rightarrow B}(\omega_{A}),\mathcal{M}_{A\rightarrow B}(\omega_{A})),$$

$$\mathcal{P}_{AE\rightarrow B}(\zeta_{AE})=\mathcal{N}_{A\rightarrow B}(\langle 0|_{E}\zeta_{AE}|0\rangle_{E})+\mathcal{M}_{A\rightarrow B}(\langle 1|_{E}\zeta_{AE}|1\rangle_{E}).$$

$$\mathcal{D}_{RB\rightarrow E}(\mathcal{N}_{A\rightarrow B}(\tau_{RA}))$$

$$\mathcal{D}_{RB\rightarrow E}(\mathcal{M}_{A\rightarrow B}(\tau_{RA}))$$

$$(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})\leftrightarrow(\rho_{E},\sigma_{E}),$$

$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{N}_{A\rightarrow B})=\mathcal{K}_{C\rightarrow D}.$$

$$(\operatorname{id}_{(R)\rightarrow(R)}\otimes\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)})(\mathcal{M}_{RA\rightarrow RB}),$$

$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{N}_{A\rightarrow B})=\mathcal{D}_{BM\rightarrow D}\circ\mathcal{N}_{A\rightarrow B}\circ\mathcal{E}_{C\rightarrow AM},$$

$$(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})\xrightarrow{\Theta}(\mathcal{K}_{C\rightarrow D},\mathcal{L}_{C\rightarrow D}),$$

$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{N}_{A\rightarrow B})$$

$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{M}_{A\rightarrow B})$$

$$\varepsilon((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L})):=\inf_{\Theta\in\mathrm{SC}}\{\varepsilon\in[0,1]:\Theta(\mathcal{N})\approx_{\varepsilon}\mathcal{K},\ \Theta(\mathcal{M})=\mathcal{L}\},$$

$$\mathcal{N}^{1}\approx_{\varepsilon}\mathcal{N}^{2}\iff\frac{1}{2}\left\|\mathcal{N}^{1}-\mathcal{N}^{2}\right\|_{\diamond}\leq\varepsilon.$$

$$\left\|\mathcal{P}_{A\rightarrow B}\right\|_{\diamond}:=\sup_{\rho_{RA}}\left\|\mathcal{P}_{A\rightarrow B}(\rho_{RA})\right\|_{1},$$

$$\left\|\mathcal{P}_{A\rightarrow B}\right\|_{\diamond}:=\sup_{\psi_{RA}}\left\|\mathcal{P}_{A\rightarrow B}(\psi_{RA})\right\|_{1},$$

$$\sup_{\rho_{RA},\Lambda_{RB}}\left|\operatorname{Tr}[\Lambda_{RB}\mathcal{N}_{A\rightarrow B}(\rho_{RA})]-\operatorname{Tr}[\Lambda_{RB}\mathcal{M}_{A\rightarrow B}(\rho_{RA})]\right|\leq\varepsilon,$$

$$\sup_{\rho_{RA},\Lambda_{RB}}\left|\operatorname{Tr}[\Lambda_{RB}(\mathcal{N}_{A\rightarrow B}-\mathcal{M}_{A\rightarrow B})(\rho_{RA})]\right|=\sup_{\rho_{RA}}\frac{1}{2}\left\|\mathcal{N}_{A\rightarrow B}(\rho_{RA})-\mathcal{M}_{A\rightarrow B}(\rho_{RA})\right\|_{1}=\frac{1}{2}\left\|\mathcal{N}-\mathcal{M}\right\|_{\diamond},$$

$$(\mathcal{R}_{|0\rangle\langle 0|},\mathcal{R}_{\pi_{M}}),$$

$$\mathcal{R}_{\sigma}(\rho)=\operatorname{Tr}[\rho]\sigma,$$

$$\pi_{M}:=\frac{1}{M}|0\rangle\langle 0|+\left(1-\frac{1}{M}\right)|1\rangle\langle 1|,$$

$$D_{d}^{0}(\mathcal{N},\mathcal{M}):=\log_{2}\sup_{\Theta\in\mathrm{SC}}\{M:\Theta(\mathcal{N})=\mathcal{R}_{|0\rangle\langle 0|},\ \Theta(\mathcal{M})=\mathcal{R}_{\pi_{M}}\}.$$

$$D_{c}^{0}(\mathcal{N},\mathcal{M}):=\log_{2}\inf_{\Theta\in\mathrm{SC}}\{M:\Theta(\mathcal{R}_{|0\rangle\langle 0|})=\mathcal{N},\ \Theta(\mathcal{R}_{\pi_{M}})=\mathcal{M}\}.$$

$$D_{\min}(\mathcal{N}\|\mathcal{M}):=\sup_{\psi_{RA}}D_{\min}(\mathcal{N}_{A\rightarrow B}(\psi_{RA})\|\mathcal{M}_{A\rightarrow B}(\psi_{RA})),$$

Full text

Resource theory of asymmetric distinguishability for quantum channels

Xin Wang

Joint Center for Quantum Information and Computer Science, University of Maryland, College Park, Maryland 20742, USA

Institute for Quantum Computing, Baidu Research, Beijing 100193, China

Mark M. Wilde

Hearne Institute for Theoretical Physics, Department of Physics and Astronomy, Center for Computation and Technology, Louisiana State University, Baton Rouge, Louisiana 70803, USA

Abstract

This paper develops the resource theory of asymmetric distinguishability for quantum channels, generalizing the related resource theory for states [Matsumoto, arXiv:1010.1030; Wang and Wilde, Phys. Rev. Research 1, 033170 (2019)]. The key constituents of the channel resource theory are quantum channel boxes, consisting of a pair of quantum channels, which can be manipulated for free by means of an arbitrary quantum superchannel (the most general physical transformation of a quantum channel). One main question of the resource theory is the approximate channel box transformation problem, in which the goal is to transform an initial channel box (or boxes) to a final channel box (or boxes), while allowing for an asymmetric error in the transformation. The channel resource theory is richer than its counterpart for states because there is a wider variety of ways in which this question can be framed, either in the one-shot or $n$ -shot regimes, with the latter having parallel and sequential variants. As in our prior work [Wang and Wilde, Phys. Rev. Research 1, 033170 (2019)], we consider two special cases of the general channel box transformation problem, known as distinguishability distillation and dilution. For the one-shot case, we find that the optimal values of the various tasks are equal to the non-smooth or smooth channel min- or max-relative entropies, thus endowing all of these quantities with operational interpretations. In the asymptotic sequential setting, we prove that the exact distinguishability cost is equal to the channel max-relative entropy and the distillable distinguishability is equal to the amortized channel relative entropy of [arXiv:1808.01498]. This latter result can also be understood as a solution to Stein’s lemma for quantum channels in the sequential setting. Finally, the theory simplifies significantly for environment-seizable and classical–quantum channel boxes.

I Introduction

In many scientific fields of interest, distinguishability is an important concept. More generally, it can be considered as a resource in that it allows for making decisions, and furthermore, the more distinguishable that two possibilities are, the easier and faster it is to make a decision.

In a recent paper, we formalized the notion of distinguishability as a resource by developing the resource theory of asymmetric distinguishability in detail WW (19), following the original proposal from Mat (10, 11). This resource theory demonstrates that distinguishability is truly a fundamental resource that can be manipulated and interconverted into different forms. The benefit of developing this resource theory is that, not only can fundamental tasks such as quantum hypothesis testing HP (91); ON (00); Hay (03, 04); WR (12); Hay (17) be recast into an intuitive approach based on resource-theoretic thinking, but also new information processing tasks emerge, such as distinguishability dilution, which is related to concepts such as simulation and synthesis of quantum states. The present paper illustrates further benefits of the resource-theoretic approach by using it to solve some outstanding questions in the theory of quantum channel discrimination.

In the resource theory of asymmetric distinguishability for states WW (19), the basic object to be manipulated is a quantum “box” $(\rho,\sigma)$ consisting of two quantum states $\rho$ and $\sigma$ . The descriptor “asymmetric” applies to this resource theory because it allows for a slight error in the transformation of the first state of the box, while not allowing for any error in the transformation of the second state of the box. One basic task is to distill as many bits of asymmetric distinguishability as possible from this box by processing it with an arbitrary quantum channel WW (19). Another basic task is to dilute bits of asymmetric distinguishability to prepare the box $(\rho,\sigma)$ , with the goal being to use as few bits of asymmetric distinguishability as possible in order to do so WW (19). These tasks give operational meaning to fundamental entropic measures such as the min-relative entropy Dat (09), the smooth min-relative entropy BD (10, 11); WR (12), the max-relative entropy Dat (09), and the smooth max-relative entropy Dat (09). One of the core results for this resource theory is that it is reversible, and the fundamental rate of interconversion is characterized by the quantum relative entropy WW (19).

The main goal of the present paper is to generalize these concepts from quantum states to quantum channels, given the prominent role of the latter in quantum information and beyond. We note here that recently there has been much effort more generally in extending concepts from resource theories of quantum states to resource theories for quantum channels (see, e.g., BHLS (03); BGMW (17); DBW (17); GFW+ (18); LBL (18); BDW (18); TR (19); TEZP (19); WW (18); SC (19); WWS (19); LY (19); LW (19); YLZ+ (19)). In the resource theory of asymmetric distinguishability for quantum channels presented here, the basic object to be manipulated is a quantum channel box $(\mathcal{N},\mathcal{M})$, which consists of two quantum channels $\mathcal{N}$ and $\mathcal{M}$. The idea is that the input and output ports of the channel box are accessible to an agent in the resource theory, while the particular choice of the channel is unknown to the agent. A key difference between this resource theory and the former one for quantum states is that a quantum channel can be probed by means of both an input port and an output port, which implies that channel boxes are manipulated by means of a quantum superchannel CDP08b. As a simple example of a superchannel, consider that the encoding and decoding, i.e., pre- and post-processing, of a channel commonly employed in quantum Shannon theory Wil (17) realize a physical transformation of a channel. The incorporation of superchannels into the resource theory implies that the channel resource theory is more involved than it is for boxes consisting only of states.

More generally, we allow for quantum strategy boxes GW (07); CDP08a ; CDP (09); Gut (09, 12); GRS (18) and manipulate them by means of general physical transformations CDP (09) that take quantum strategies to other quantum strategies (note that quantum strategies are in one-to-one correspondence with quantum combs CDP (09)). By the results of CDP (09), such physical transformations are in fact quantum strategies themselves, so that our generalization of the resource theory to quantum strategies is a significant generalization.

We consider several fundamental tasks in this resource theory, which can be understood as extensions of the tasks considered in WW (19). The first basic one is distinguishability distillation, in which the goal is to distill as many bits of asymmetric distinguishability as possible from a single channel box in the one-shot setting, or multiple channel boxes in the $n$-shot setting. This task is intimately related to asymmetric hypothesis testing for quantum channels CMW (16) (see Hay (09) for the classical case), which is a particular kind of quantum channel discrimination. For this task, there are a variety of possibilities to consider, including the one-shot case and the $n$-shot case, in the latter using either a parallel or sequential strategy GW (07); CDP08a; Gut (09); DFY (09); HHLW (10); Gut (12); CMW (16). We also consider this task for quantum strategy boxes. Another basic task of interest is distinguishability dilution, in which the goal is to dilute bits of asymmetric distinguishability to a single or multiple channel boxes, using as few bits of asymmetric distinguishability as possible. This task also has a variety of possibilities, including one- and $n$-shot, the latter having parallel and sequential variants as well. We likewise consider this task for quantum strategy boxes. This task is also intimately related to quantum channel simulation BSST (02); BDH+ (14); BCR (11); BBCW (13); BRW (14); Ber (13); BGMW (17); GFW+ (18); FBB (19); FWTB (19); Wil (18), but here takes on a specific form due to the structure of the resource theory of asymmetric distinguishability.

One of the major tasks in this resource theory is to convert one channel box to another, doing so either exactly or approximately. As a variant of this problem, another task is to determine the rate at which it is possible to convert $n$ channel boxes, with each box consisting of the same pair of channels, to $m$ boxes consisting of another pair of channels, when $n$ is allowed to be arbitrarily large. More generally, we consider the conversion of an $n$ -round quantum strategy box to an $m$ -round strategy box. The simpler transformation problem for state boxes was solved in WW (19) and is relevant for addressing the channel box transformation problem for particular channel boxes that are environment-seizable, as defined in BHKW (18).

II Summary of results

We now summarize the main contributions and results of our paper:

1. We establish the resource theory of asymmetric distinguishability for quantum channels, with the basic objects being quantum channel boxes, the free operations to manipulate them being quantum superchannels CDP08b, and the basic units of currency being bits of asymmetric distinguishability (see Section III). Later we accomplish the same for quantum strategy boxes, with the free operations to manipulate them being quantum strategies (see Section VIII).

2. We prove that the approximate channel box transformation problem is characterized by a semi-definite program and thus can be calculated efficiently with respect to the input and output dimensions of the channels (see Section IV).

3. The exact one-shot distillable distinguishability of a quantum channel box is equal to the channel min-relative entropy, which is a particular case of the generalized channel divergence of CMW (16); LKDW (18). The exact one-shot distinguishability cost of a quantum channel box is equal to the channel max-relative entropy, which is a particular case of the generalized channel divergence of CMW (16); LKDW (18) and explored in more detail in GFW+ (18); BHKW (18). See Section V.1 for both of these results.

4. The approximate one-shot distillable distinguishability of a quantum channel box is equal to the smooth channel min-relative entropy of CMW (16), the latter also known as channel hypothesis testing relative entropy CMW (16). The approximate one-shot distinguishability cost of a quantum channel box is equal to the smooth channel max-relative entropy, again a particular case of the generalized channel divergence of LKDW (18) and explored in more detail in GFW+ (18). See Section V.2 for both of these results.

5. We consider asymptotic parallel versions of the above tasks in Section VI. We find that the exact distillable distinguishability is given by the regularized channel min-relative entropy (see Section VI.1). By means of an example from Aci (01), we conclude that the regularization seems to be necessary because the channel min-relative entropy is highly non-additive. We then prove that the exact distinguishability cost is equal to the channel max-relative entropy (see Section VI.2). The distillable distinguishability is equal to the regularized channel relative entropy (see Section VI.3), and the same quantity is a lower bound on the distinguishability cost (see Section VI.4). These latter operational tasks simplify for both environment-seizable and classical–quantum channel boxes.

6. Section VII considers the asymptotic parallel version of the general channel box transformation problem, giving basic definitions and some bounds that apply to this case. Again, the results simplify for the case of environment-seizable and classical–quantum channel boxes.

7. Section VIII considers the quantum strategy box transformation problem. To begin with, this section introduces the generalized quantum strategy divergence as a generalization of the strategy distance of CDP08a; CDP (09); Gut (12) and establishes a data processing inequality for this distinguishability measure. The section then establishes several bounds on how well one can perform a physical transformation from one strategy box to another strategy box. All of the results apply to sequential channel boxes because these are special cases of strategy boxes. Furthermore, we consider an asymptotic version of the box transformation problem for sequential channel boxes and prove concrete results for environment-seizable and classical–quantum channel boxes.

8. We then consider distillation and dilution of strategy boxes in Section IX. Our key results here, specialized to sequential channel boxes, include single-letter formulas for the asymptotic exact sequential distinguishability cost and the asymptotic sequential distillable distinguishability, expressed respectively as the channel max-relative entropy and the amortized channel relative entropy of BHKW (18), giving these quantities fundamental operational interpretations in the resource theory of asymmetric distinguishability. The latter result can be alternatively understood as a solution to Stein’s lemma for quantum channels in the sequential setting.

In the rest of the paper, we discuss details of the resource theory of asymmetric distinguishability for quantum channels, as well as the contributions listed above.

III Resource theory of asymmetric distinguishability for quantum channels

We begin by generalizing the resource theory of asymmetric distinguishability from WW (19) to the setting of quantum channels, by considering a channel box of the following form:

$$(\mathcal{N},\mathcal{M}), \tag{1}$$

where $\mathcal{N}$ and $\mathcal{M}$ are quantum channels, each acting on an input system $A$ and outputting a system $B$ . Recall that a quantum channel is a completely positive, trace-preserving (CPTP) map. We also write these as $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ in what follows in order to indicate the input and output systems explicitly. The channel box generalizes the state box $(\rho,\sigma)$ from WW (19), which consists of a pair of quantum states $\rho$ and $\sigma$ . In fact, a state box is a special case of a channel box in which the input systems are trivial.

One interpretation of the channel box in (1) is that a distinguisher is allowed to prepare any state $\rho_{RA}$ of a reference system $R$ and the channel input $A$ , either the channel $\mathcal{N}_{A\rightarrow B}$ or $\mathcal{M}_{A\rightarrow B}$ is applied, and then the distinguisher is allowed to perform any post-processing on the reference $R$ and the channel output $B$ in order to decide which channel was applied. That is, by inputting an arbitrary state $\rho_{RA}$ to the channel box (pre-processing) and then applying the channel $\mathcal{P}_{RB\rightarrow S}$ (post-processing), one can transform it to the following state box:

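$$(\mathcal{P}_{RB\rightarrow S}(\mathcal{N}_{A\rightarrow B}(\rho_{RA})),\mathcal{P}_{RB\rightarrow S}(\mathcal{M}_{A\rightarrow B}(\rho_{RA}))). \tag{2}$$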

More generally, the agent who has access to the channel box in (1) can perform a quantum superchannel CDP08b on it in order to transform it to another channel box, as discussed in Section III.2 below.

As stated earlier, the channel box in (1) indeed generalizes the state box $(\rho,\sigma)$ considered previously in WW (19). Another way of seeing this is to take the channels $\mathcal{N}$ and $\mathcal{M}$ in (1) to be replacer channels with the following action:

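$$\mathcal{N}_{A\rightarrow B}(\omega_{A})=\operatorname{Tr}[\omega_{A}]\rho_{B}, \tag{3}$$

$$\mathcal{M}_{A\rightarrow B}(\omega_{A})=\operatorname{Tr}[\omega_{A}]\sigma_{B}. \tag{4}$$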

Then no matter what state $\tau_{RA}$ is input to the channel box $(\mathcal{N},\mathcal{M})$, it reduces to the state box $(\tau_{R}\otimes\rho_{B},\tau_{R}\otimes\sigma_{B})$, which, by the discussion in WW (19, Section III), is equivalent by a free operation to the state box $(\rho_{B},\sigma_{B})$.

III.1 Environment-parametrized and -seizable channels

Other simple classes of channel boxes that are strongly related to state boxes, generalizing the above example of a replacer channel box in (3)–(4), include those that are environment parametrized TW (16) and the subclass of environment-seizable channel boxes BHKW (18). Note that environment-parametrized channel boxes are related to programmable channels NC (97); DP (05).

A channel box $(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})$ is environment parametrized with associated environment states $\rho_{E}$ and $\sigma_{E}$ if there exists a common interaction channel $\mathcal{P}_{AE\rightarrow B}$ such that

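$$\mathcal{N}_{A\rightarrow B}(\omega_{A})=\mathcal{P}_{AE\rightarrow B}(\omega_{A}\otimes\rho_{E}), \tag{5}$$

$$\mathcal{M}_{A\rightarrow B}(\omega_{A})=\mathcal{P}_{AE\rightarrow B}(\omega_{A}\otimes\sigma_{E}), \tag{6}$$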

for all inputs $\omega_{A}$ TW (16). In this way, any pre-processing of an environment-parametrized channel box as

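$$(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})\rightarrow(\mathcal{N}_{A\rightarrow B}(\omega_{A}),\mathcal{M}_{A\rightarrow B}(\omega_{A})) \tag{7}$$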

can be viewed as a postprocessing of the state box $(\rho_{E},\sigma_{E})$ , via

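$$(\rho_{E},\sigma_{E})\rightarrow(\mathcal{P}_{AE\rightarrow B}(\omega_{A}\otimes\rho_{E}),\mathcal{P}_{AE\rightarrow B}(\omega_{A}\otimes\sigma_{E}))=(\mathcal{N}_{A\rightarrow B}(\omega_{A}),\mathcal{M}_{A\rightarrow B}(\omega_{A})), \tag{8}$$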

so that the distinguishability of the channel box $(\mathcal{N},\mathcal{M})$ is always limited by that of the state box $(\rho_{E},\sigma_{E})$, as observed in TW (16) (see JWD+ (08); DDM (14) for related observations in quantum estimation theory).
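The limitation just described can be illustrated with a small classical sketch. Everything in it is an assumption made for illustration rather than something taken from the paper: the environment distributions, the XOR interaction, and the use of total variation distance as the distinguishability measure. It checks that, for several inputs, the two channel outputs are never further apart than the two environment states.

```python
def total_variation(p, q):
    """Classical trace distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def interaction(omega, env):
    """Hypothetical common interaction: the output bit is b = a XOR e."""
    out = [0.0, 0.0]
    for a, prob_a in enumerate(omega):
        for e, prob_e in enumerate(env):
            out[a ^ e] += prob_a * prob_e
    return out

# Hypothetical environment states (classical stand-ins for rho_E and sigma_E).
rho_env = [0.9, 0.1]
sigma_env = [0.6, 0.4]
env_gap = total_variation(rho_env, sigma_env)

# A few input distributions omega_A fed into both channels of the box.
inputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.8, 0.2]]
output_gaps = [
    total_variation(interaction(w, rho_env), interaction(w, sigma_env))
    for w in inputs
]

for w, gap in zip(inputs, output_gaps):
    print(w, gap, "<=", env_gap)
```

In this toy case the deterministic input `[1.0, 0.0]` reproduces the environment distributions at the output, so the bound is attained there, while the uniform input erases all distinguishability.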

We should emphasize that an arbitrary channel box $(\mathcal{N},\mathcal{M})$ is environment-parametrized with associated environment states that are orthogonal DW (19). That is, we can set $\rho_{E}=|0\rangle\langle 0|_{E}$ and $\sigma_{E}=|1\rangle\langle 1|_{E}$ and the common interaction channel $\mathcal{P}_{AE\to B}$ as

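$$\mathcal{P}_{AE\rightarrow B}(\zeta_{AE})=\mathcal{N}_{A\rightarrow B}(\langle 0|_{E}\zeta_{AE}|0\rangle_{E})+\mathcal{M}_{A\rightarrow B}(\langle 1|_{E}\zeta_{AE}|1\rangle_{E}). \tag{9}$$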

In this way, the channels $\mathcal{N}_{A\to B}$ and $\mathcal{M}_{A\to B}$ are realized as in (5)–(6), by starting from the state box $(|0\rangle\langle 0|_{E},|1\rangle\langle 1|_{E})$ and applying the common interaction channel $\mathcal{P}_{AE\to B}$ in (9). However, this realization of the channels is the least efficient from the perspective of the resource theory of asymmetric distinguishability, because a state box consisting of a pair of orthogonal states is equivalent to an infinite number of bits of asymmetric distinguishability WW (19). (See WW (19) for the notion of bits of asymmetric distinguishability, and Section V for this notion in the channel resource theory.) In this sense, the realization of an arbitrary channel box in the above way is trivial because it requires an infinite number of bits of asymmetric distinguishability in order to do so. The concept of environment-parametrized channel boxes becomes non-trivial when the background environment states have finite distinguishability, when measured according to some divergence, so that the channel box can be realized starting from a finite number of bits of asymmetric distinguishability.
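The construction in (9) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the two qubit channels (amplitude damping and a bit flip), their parameters, and all helper names are hypothetical choices, and matrices are plain nested lists so that nothing beyond the standard library is needed.

```python
from math import sqrt

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(len(x))] for i in range(len(x[0]))]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def kron(x, y):
    return [[a * b for a in rx for b in ry] for rx in x for ry in y]

def apply_channel(kraus, rho):
    """Return sum_k K rho K^dagger for a channel given by Kraus operators."""
    d_out = len(kraus[0])
    out = [[0.0] * d_out for _ in range(d_out)]
    for k in kraus:
        out = add(out, matmul(matmul(k, rho), dagger(k)))
    return out

# Hypothetical channel box: N is amplitude damping, M is a bit flip.
GAMMA, P_FLIP = 0.3, 0.2
N_KRAUS = [[[1.0, 0.0], [0.0, sqrt(1 - GAMMA)]], [[0.0, sqrt(GAMMA)], [0.0, 0.0]]]
M_KRAUS = [[[sqrt(1 - P_FLIP), 0.0], [0.0, sqrt(1 - P_FLIP)]],
           [[0.0, sqrt(P_FLIP)], [sqrt(P_FLIP), 0.0]]]

def env_block(zeta, e):
    """Return <e|_E zeta_AE |e>_E for a two-qubit operator ordered as A (x) E."""
    return [[zeta[2 * a + e][2 * b + e] for b in range(2)] for a in range(2)]

def interaction(zeta):
    """Common interaction channel: N on the |0>_E block plus M on the |1>_E block."""
    return add(apply_channel(N_KRAUS, env_block(zeta, 0)),
               apply_channel(M_KRAUS, env_block(zeta, 1)))

def close(x, y, tol=1e-12):
    return all(abs(a - b) < tol for r, s in zip(x, y) for a, b in zip(r, s))

omega = [[0.6, 0.2], [0.2, 0.4]]   # an arbitrary qubit input state
KET0 = [[1.0, 0.0], [0.0, 0.0]]    # |0><0|_E
KET1 = [[0.0, 0.0], [0.0, 1.0]]    # |1><1|_E

print(close(interaction(kron(omega, KET0)), apply_channel(N_KRAUS, omega)))
print(close(interaction(kron(omega, KET1)), apply_channel(M_KRAUS, omega)))
```

Setting the environment to one of the two orthogonal states simply selects which channel acts, which is why this realization uses perfectly distinguishable environment states.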

Environment-seizable channel boxes are defined to be environment-parametrized with associated environment states $\rho_{E}$ and $\sigma_{E}$ and additionally have the property that it is possible to find a common pre- and post-processing of the channel box $(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})$ to retrieve the state box $(\rho_{E},\sigma_{E})$ from it BHKW (18). That is, for environment-seizable channels, there exists a common input state $\tau_{RA}$ and a common post-processing channel $\mathcal{D}_{RB\rightarrow E}$ such that

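$$\mathcal{D}_{RB\rightarrow E}(\mathcal{N}_{A\rightarrow B}(\tau_{RA}))=\rho_{E}, \tag{10}$$

$$\mathcal{D}_{RB\rightarrow E}(\mathcal{M}_{A\rightarrow B}(\tau_{RA}))=\sigma_{E}. \tag{11}$$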

In this way, we have the following equivalence for environment-seizable channels:

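$$(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})\leftrightarrow(\rho_{E},\sigma_{E}), \tag{12}$$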

with the direction $\leftarrow$ of the equivalence following from (8) and the other direction $\rightarrow$ following from the seizable property. Thus, environment-seizable channel boxes represent a broader generalization of state boxes than do channel boxes consisting of replacer channels. Furthermore, environment-seizable channel boxes are fully identified with the background environment states $\rho_{E}$ and $\sigma_{E}$ in the above sense. As we show later, and as observed in earlier work TW (16); BHKW (18), the equivalence in (12) simplifies the resource theory of asymmetric distinguishability significantly for environment-seizable channel boxes. Finally, several examples of environment-seizable channel boxes were presented in BHKW (18), and the notion of environment-seizable channel boxes is related to the notion of resource-seizable channels from Wil (18).

III.2 Superchannels as transformations of channel boxes

The most general physical transformation allowed on a channel box is a superchannel $\Theta$ , which is a quantum physical transformation of channels CDP08b . That is, a superchannel is a linear map that preserves the set of quantum channels, even when the quantum channel is an arbitrary bipartite channel with external input and output systems that are arbitrarily large. In this sense, superchannels are completely CPTP preserving. Note that the terminology “superchannel” was introduced in Gou (19).

To see this, a superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ takes as input a quantum channel $\mathcal{N}_{A\rightarrow B}$ and outputs a quantum channel $\mathcal{K}_{C\rightarrow D}$ , which we denote by

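$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{N}_{A\rightarrow B})=\mathcal{K}_{C\rightarrow D}. \tag{13}$$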

The superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ is completely CPTP preserving in the sense that the following output channel

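$$(\operatorname{id}_{(R)\rightarrow(R)}\otimes\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)})(\mathcal{M}_{RA\rightarrow RB}), \tag{14}$$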

is a CPTP map for all input quantum channels $\mathcal{M}_{RA\rightarrow RB}$ , where $\operatorname{id}_{\left(R\right)\rightarrow\left(R\right)}$ denotes the identity superchannel CDP08b .

One of the fundamental theorems of superchannels is that each superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ has a physical realization as a pre- and post-processing of the channel $\mathcal{N}_{A\rightarrow B}$ along with a quantum memory system:

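$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{N}_{A\rightarrow B})=\mathcal{D}_{BM\rightarrow D}\circ\mathcal{N}_{A\rightarrow B}\circ\mathcal{E}_{C\rightarrow AM}, \tag{15}$$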

where $\mathcal{E}_{C\rightarrow AM}$ and $\mathcal{D}_{BM\rightarrow D}$ are pre- and post-processing quantum channels, respectively CDP08b . This transformation is depicted in Figure 1.
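The pre- and post-processing realization can be sketched at the level of Kraus operators. The following is a minimal illustration, not the paper's construction: the encoder (an isometry that copies the computational basis into a one-qubit memory), the decoder (a partial trace over the memory), the amplitude damping channel, and every helper name are assumptions chosen to keep the example small.

```python
from math import sqrt

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(len(x))] for i in range(len(x[0]))]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def kron(x, y):
    return [[a * b for a in rx for b in ry] for rx in x for ry in y]

def identity(d):
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

def apply_channel(kraus, rho):
    """Return sum_k K rho K^dagger for a channel given by Kraus operators."""
    d_out = len(kraus[0])
    out = [[0.0] * d_out for _ in range(d_out)]
    for k in kraus:
        out = add(out, matmul(matmul(k, rho), dagger(k)))
    return out

def superchannel(n_kraus, enc_kraus, dec_kraus, dim_m):
    """Kraus operators of D o (N (x) id_M) o E, with systems ordered as A (x) M."""
    return [matmul(matmul(d, kron(n, identity(dim_m))), e)
            for e in enc_kraus for n in n_kraus for d in dec_kraus]

def is_trace_preserving(kraus, tol=1e-12):
    """Check sum_k K^dagger K = I on the input space."""
    d_in = len(kraus[0][0])
    total = [[0.0] * d_in for _ in range(d_in)]
    for k in kraus:
        total = add(total, matmul(dagger(k), k))
    return all(abs(total[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(d_in) for j in range(d_in))

GAMMA = 0.3  # hypothetical damping parameter
N_KRAUS = [[[1.0, 0.0], [0.0, sqrt(1 - GAMMA)]], [[0.0, sqrt(GAMMA)], [0.0, 0.0]]]
# Encoder E_{C->AM}: isometry |c> -> |c>_A |c>_M (a 4x2 matrix).
ENC = [[[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]]
# Decoder D_{BM->D}: partial trace over M, with Kraus operators I_B (x) <m|.
DEC = [kron(identity(2), [[1.0, 0.0]]), kron(identity(2), [[0.0, 1.0]])]

K_KRAUS = superchannel(N_KRAUS, ENC, DEC, 2)
plus = [[0.5, 0.5], [0.5, 0.5]]          # the state |+><+| on system C
output = apply_channel(K_KRAUS, plus)
print(is_trace_preserving(K_KRAUS), output)
```

With these particular choices the output channel is the input channel preceded by a complete dephasing, so the example only shows that the composed map is again a channel; it does not explore the full set of superchannels.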

IV General channel box transformation problem

We can now state one main problem for the resource theory of asymmetric distinguishability for quantum channels, which we call the channel box transformation problem. The goal of this problem is to determine, for an input channel box $(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})$ and an output channel box $(\mathcal{K}_{C\rightarrow D},\mathcal{L}_{C\rightarrow D})$ , whether there exists a superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ such that the following transformation is possible:

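$$(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})\xrightarrow{\Theta}(\mathcal{K}_{C\rightarrow D},\mathcal{L}_{C\rightarrow D}), \tag{16}$$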

where the notation means that the following equations should be satisfied

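$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{N}_{A\rightarrow B})=\mathcal{K}_{C\rightarrow D}, \tag{17}$$

$$\Theta_{(A\rightarrow B)\rightarrow(C\rightarrow D)}(\mathcal{M}_{A\rightarrow B})=\mathcal{L}_{C\rightarrow D}. \tag{18}$$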

This problem was introduced and solved in Gou (19), in the sense that the answer to this question can be determined by means of a semi-definite program or by employing the extended conditional min-entropy and a quantum dynamic generalization of majorization. The problem there was called “comparison of quantum channels.” Note that the simpler problem regarding transformation of state boxes via a common quantum channel has a long history, having been considered extensively both in classical and quantum information theory Bla (53); AU (80); CJW (04); MOA (11); Bus (12); HJRW (12); BDS (14); BaHN+ (15); Ren (16); BD (16); Bus (16); GJB+ (18); Bus (17); BG (17).

In many cases of interest, the transformation in (16) is simply not possible. Thus, it is sensible to modify the problem to allow for approximation, and the way that we do so is consistent with how we did so for the related problem in the resource theory of asymmetric distinguishability for states WW (19). Namely, we allow for an approximation error in the transformation of the first channel in the box, but we demand that the second channel be simulated exactly (hence the descriptor “asymmetric” in “resource theory of asymmetric distinguishability”). Mathematically, this corresponds to the following optimization problem:

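$$\varepsilon((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L})):=\inf_{\Theta\in\mathrm{SC}}\{\varepsilon\in[0,1]:\Theta(\mathcal{N})\approx_{\varepsilon}\mathcal{K},\ \Theta(\mathcal{M})=\mathcal{L}\}, \tag{19}$$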

where SC denotes the set of superchannels and the shorthand $\mathcal{N}^{1}\approx_{\varepsilon}\mathcal{N}^{2}$ for channels $\mathcal{N}^{1}$ and $\mathcal{N}^{2}$ is defined as follows:

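$$\mathcal{N}^{1}\approx_{\varepsilon}\mathcal{N}^{2}\iff\frac{1}{2}\left\|\mathcal{N}^{1}-\mathcal{N}^{2}\right\|_{\diamond}\leq\varepsilon. \tag{20}$$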

In the above, $\left\|\mathcal{P}_{A\rightarrow B}\right\|_{\diamond}$ denotes the diamond norm Kit (97) of a Hermiticity-preserving map $\mathcal{P}_{A\rightarrow B}$ , defined as

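$$\left\|\mathcal{P}_{A\rightarrow B}\right\|_{\diamond}:=\sup_{\rho_{RA}}\left\|\mathcal{P}_{A\rightarrow B}(\rho_{RA})\right\|_{1}, \tag{21}$$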

where the optimization is with respect to quantum states $\rho_{RA}$ and the reference system $R$ can be arbitrarily large. However, note that the following significant simplification holds

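$$\left\|\mathcal{P}_{A\rightarrow B}\right\|_{\diamond}:=\sup_{\psi_{RA}}\left\|\mathcal{P}_{A\rightarrow B}(\psi_{RA})\right\|_{1}, \tag{22}$$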

where the optimization is with respect to pure-state inputs $\psi_{RA}$ with the reference system $R$ isomorphic to the input system $A$ .
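As a quick sanity check of the definition, consider a channel box made of two replacer channels, $\mathcal{R}_{\rho}(\omega)=\operatorname{Tr}[\omega]\rho$ and $\mathcal{R}_{\sigma}$ defined likewise. The short derivation below is an illustration added for the reader rather than a result quoted from the paper; it assumes only that the trace norm is multiplicative under tensor products and that $\psi_{R}$ is a normalized state.

```latex
\begin{aligned}
\left\| \mathcal{R}_{\rho} - \mathcal{R}_{\sigma} \right\|_{\diamond}
&= \sup_{\psi_{RA}} \left\| \left(\operatorname{id}_{R} \otimes (\mathcal{R}_{\rho} - \mathcal{R}_{\sigma})\right)(\psi_{RA}) \right\|_{1} \\
&= \sup_{\psi_{RA}} \left\| \psi_{R} \otimes (\rho - \sigma) \right\|_{1} \\
&= \sup_{\psi_{RA}} \left\| \psi_{R} \right\|_{1} \left\| \rho - \sigma \right\|_{1} \\
&= \left\| \rho - \sigma \right\|_{1}.
\end{aligned}
```

For replacer channels the normalized diamond distance therefore coincides with the trace distance of the underlying states, in line with a state box being a special case of a channel box.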

Why do we adopt the diamond norm to measure the distance between two quantum channels $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$? Relatedly, how should we assess the performance of a quantum information processing protocol in which the ideal channel to be simulated is $\mathcal{N}_{A\rightarrow B}$ but the channel realized in practice is $\mathcal{M}_{A\rightarrow B}$? Suppose that a third party is trying to assess how distinguishable the actual channel $\mathcal{M}_{A\rightarrow B}$ is from the ideal channel $\mathcal{N}_{A\rightarrow B}$. Such an individual has access to both the input and output ports of the channel, and so the most general strategy for the distinguisher to employ is to prepare a state $\rho_{RA}$ of a reference system $R$ and the channel input system $A$. The distinguisher transmits the $A$ system of $\rho_{RA}$ into the unknown channel. After that, the distinguisher receives the channel output system $B$ and then performs a measurement described by the POVM $\{\Lambda_{RB}^{x}\}_{x}$ on the reference system $R$ and the channel output system $B$. The probability of obtaining a particular outcome $\Lambda_{RB}^{x}$ is given by the Born rule. In the case that the unknown channel is $\mathcal{N}_{A\rightarrow B}$, this probability is $\operatorname{Tr}[\Lambda_{RB}^{x}\mathcal{N}_{A\rightarrow B}(\rho_{RA})]$, and in the case that the unknown channel is $\mathcal{M}_{A\rightarrow B}$, this probability is $\operatorname{Tr}[\Lambda_{RB}^{x}\mathcal{M}_{A\rightarrow B}(\rho_{RA})]$. What we demand is that the deviation between the two probabilities $\operatorname{Tr}[\Lambda_{RB}^{x}\mathcal{N}_{A\rightarrow B}(\rho_{RA})]$ and $\operatorname{Tr}[\Lambda_{RB}^{x}\mathcal{M}_{A\rightarrow B}(\rho_{RA})]$ is no larger than some tolerance $\varepsilon$. Since this should be the case for all possible input states and measurement outcomes, what we demand mathematically is that

[TABLE]

where $\rho_{RA}\geq 0$ , $\operatorname{Tr}[\rho_{RA}]=1$ , and $0\leq\Lambda_{RB}\leq I_{RB}$ . As a consequence of a well known characterization of trace distance from Hel (69, 76), we have that

[TABLE]

where $\frac{1}{2}\left\|\mathcal{N}-\mathcal{M}\right\|_{\diamond}$ is the normalized diamond distance between $\mathcal{N}$ and $\mathcal{M}$ . This indicates that if $\frac{1}{2}\left\|\mathcal{N}-\mathcal{M}\right\|_{\diamond}\leq\varepsilon$ , then the deviation between probabilities for any possible input state and measurement operator never exceeds $\varepsilon$ , so that the approximation between quantum channels $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ is naturally quantified by the normalized diamond distance $\frac{1}{2}\left\|\mathcal{N}-\mathcal{M}\right\|_{\diamond}$ . We note that related interpretations of the diamond distance of channels have been given in KW (04); RW (05); GLN (05).
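As a rough numerical illustration of the operational reading above, any fixed input state gives a lower bound on the normalized diamond distance. The sketch below assumes NumPy and uses hypothetical helper names (`choi_state`, `trace_distance`) that are not part of the paper. It feeds the maximally entangled state into the qubit identity channel and the completely depolarizing channel and evaluates the normalized trace distance of the two outputs. This is only a lower bound on $\frac{1}{2}\left\|\mathcal{N}-\mathcal{M}\right\|_{\diamond}$, since no optimization over input states is performed.

```python
import numpy as np

def choi_state(kraus_ops, d):
    """Output state (id_R x N_A)(Phi_RA) for a channel given by Kraus operators on C^d."""
    phi = np.zeros((d * d, 1), dtype=complex)
    for i in range(d):
        phi[i * d + i, 0] = 1.0 / np.sqrt(d)   # |Phi> = sum_i |i>_R |i>_A / sqrt(d)
    Phi = phi @ phi.conj().T
    out = np.zeros((d * d, d * d), dtype=complex)
    for K in kraus_ops:
        IK = np.kron(np.eye(d), K)             # the channel acts on system A only
        out += IK @ Phi @ IK.conj().T
    return out

def trace_distance(rho, sigma):
    """Normalized trace distance (1/2)||rho - sigma||_1 for Hermitian inputs."""
    eigs = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * np.abs(eigs).sum()

# Pauli matrices
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

identity_kraus = [I2]
depolarizing_kraus = [0.5 * P for P in (I2, X, Y, Z)]  # maps every state to I/2

lower_bound = trace_distance(choi_state(identity_kraus, 2),
                             choi_state(depolarizing_kraus, 2))
print(lower_bound)
```

For this pair the two output states are $\Phi_{RA}$ and $I_{RA}/4$, so the value is $3/4$ up to rounding; a distinguisher using this single input already sees a deviation of that size.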

As we indicated above, the approximate channel box transformation problem is fundamental to the resource theory of asymmetric distinguishability, indicating exactly how well one can convert channel boxes. It captures distinguishability in a fundamental way: as pointed out in Gou (19), a necessary condition for a transformation to be possible exactly is that the two channels in the initial channel box are more distinguishable than the two channels in the target channel box, as quantified by a channel divergence LKDW (18). Thinking along the lines of BaHN+ (15), these kinds of limitations from channel divergences can be interpreted as “second laws” for distinguishability that draw the line between the possible and impossible. Because these lines might be too sharp for practical purposes (i.e., the transformation might be possible with small error even when it is not possible exactly), it is sensible to consider the relaxation presented in (19). Furthermore, generalizations of the approximate box transformation problem will have applications in other resource theories of channels, such as entanglement, thermodynamics, purity, magic, etc., and therein can also be interpreted as second laws or approximate second laws.

In Appendix B, we show that the optimization in (19) for the approximate channel box transformation problem can be calculated by a semi-definite program, and thus can be efficiently solved, where the complexity of the problem is polynomial in the dimension of the inputs and outputs of the channels $(\mathcal{N}_{A\rightarrow B},\mathcal{M}_{A\rightarrow B})$ and $(\mathcal{K}_{C\rightarrow D},\mathcal{L}_{C\rightarrow D})$ . This result generalizes the recent finding in Gou (19) mentioned after (16) above.

V One-shot distillation and dilution of quantum channel boxes

Another way of addressing the general approximate channel box transformation problem, which is helpful for considering asymptotic versions of the problem, is to break it into two steps, as was done in WW (19) for the case of states. Namely, one can first distill a standard channel box from the original one, and then dilute this standard channel box to the final target one. In this work, we take the standard channel box to be the following one:

$$(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}}), \tag{26}$$

where $\mathcal{R}_{\sigma}$ denotes a replacer channel, which has the following action on an arbitrary input $\rho$ :

$$\mathcal{R}_{\sigma}(\rho)=\operatorname{Tr}[\rho]\sigma, \tag{27}$$

which is simply to discard the input $\rho$ and replace it with a state $\sigma$ . Also, the state $\pi_{M}$ is defined as

$$\pi_{M}:=\frac{1}{M}|0\rangle\langle 0|+\left(1-\frac{1}{M}\right)|1\rangle\langle 1|, \tag{28}$$

for $M\geq 1$ . Our interpretation of the channel box in (26) is that it contains $\log_{2}M$ bits of asymmetric distinguishability. Since the replacer channel box in (26) is equivalent to the state channel box $(|0\rangle\langle 0|,\pi_{M})$ , this interpretation is consistent with the interpretation given in WW (19).
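One way to make the "$\log_{2}M$ bits" reading concrete is to evaluate the min- and max-relative entropies of states (Dat (09), defined later in this section) on the equivalent state box $(|0\rangle\langle 0|,\pi_{M})$. The sketch below is illustrative only: it assumes NumPy, assumes the form $\pi_{M}=\frac{1}{M}|0\rangle\langle 0|+(1-\frac{1}{M})|1\rangle\langle 1|$ used in WW (19), and the helper names `d_min` and `d_max` are hypothetical.

```python
import numpy as np

def d_min(rho, sigma, tol=1e-12):
    """Min-relative entropy -log2 Tr[Pi_rho sigma], Pi_rho = projector onto supp(rho)."""
    vals, vecs = np.linalg.eigh(rho)
    support = vecs[:, vals > tol]
    proj = support @ support.conj().T
    return -np.log2(np.real(np.trace(proj @ sigma)))

def d_max(rho, sigma):
    """Max-relative entropy log2 lambda_max(sigma^{-1/2} rho sigma^{-1/2}).

    Assumes sigma is positive definite (true for pi_M when M >= 2).
    """
    vals, vecs = np.linalg.eigh(sigma)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.conj().T
    return np.log2(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt).max())

M = 8
zero = np.diag([1.0, 0.0])                 # |0><0|
pi_M = np.diag([1.0 / M, 1.0 - 1.0 / M])   # assumed form of pi_M (see lead-in)

bits_min = d_min(zero, pi_M)
bits_max = d_max(zero, pi_M)
print(bits_min, bits_max)
```

Under these assumptions both quantities equal $\log_{2}M$ (here $3$, up to rounding), which is consistent with interpreting the box as holding $\log_{2}M$ bits of asymmetric distinguishability.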

V.1 Exact one-shot distillation and dilution of quantum channel boxes

A primary goal in this setting is the task of exact distillation of as many bits of asymmetric distinguishability as possible, which is similar to the task for states considered in WW (19), but instead we allow for the most general processing of the channel box according to a superchannel. Mathematically, we can phrase this problem as the following optimization:

[TABLE]

We also consider exact dilution of the channel box, starting from as few bits of asymmetric distinguishability as possible. The requirement here is to convert bits of asymmetric distinguishability by the action of a common superchannel to the channel box $(\mathcal{N},\mathcal{M})$ exactly, in such a way that the number of bits $\log_{2}M$ of asymmetric distinguishability is as small as possible. Mathematically, this corresponds to the following optimization problem:

[TABLE]

Let $D_{\min}(\mathcal{N}\|\mathcal{M})$ denote the channel min-relative entropy, defined as

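$$D_{\min}(\mathcal{N}\|\mathcal{M}):=\sup_{\psi_{RA}}D_{\min}(\mathcal{N}_{A\rightarrow B}(\psi_{RA})\|\mathcal{M}_{A\rightarrow B}(\psi_{RA})), \tag{31}$$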

with the min-relative entropy of states $\rho$ and $\sigma$ defined as Dat (09)

$$D_{\min}(\rho\|\sigma):=-\log_{2}\operatorname{Tr}[\Pi_{\rho}\sigma], \tag{32}$$

and $\Pi_{\rho}$ is the projection onto the support of $\rho$ . Note that the min-relative entropy of states is also equal to the Petz–Rényi relative entropy of order zero Pet (85, 86), as observed in Dat (09).

Let $D_{\max}(\mathcal{N}\|\mathcal{M})$ denote the channel max-relative entropy CMW (16); LKDW (18); GFW+ (18), defined as

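\begin{align}
D_{\max}(\mathcal{N}\|\mathcal{M}) &:=\sup_{\psi_{RA}}D_{\max}(\mathcal{N}_{A\rightarrow B}(\psi_{RA})\|\mathcal{M}_{A\rightarrow B}(\psi_{RA})) \tag{33}\\
&=D_{\max}(\mathcal{N}_{A\rightarrow B}(\Phi_{RA})\|\mathcal{M}_{A\rightarrow B}(\Phi_{RA})), \tag{34}
\end{align}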

with the maximally entangled state $\Phi_{RA}$ of Schmidt rank $d$ defined as

$$\Phi_{RA}:=\frac{1}{d}\sum_{i,j}|i\rangle\langle j|_{R}\otimes|i\rangle\langle j|_{A}, \tag{35}$$

and the max-relative entropy of states $\rho$ and $\sigma$ defined as Dat (09)

$$D_{\max}(\rho\|\sigma):=\inf\left\{\lambda:\rho\leq 2^{\lambda}\sigma\right\}. \tag{36}$$

The equality in (34) was proved in GFW+ (18); BHKW (18).
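Because (34) reduces the channel max-relative entropy to a max-relative entropy of Choi states, it can be evaluated with plain linear algebra. The following sketch assumes NumPy, uses hypothetical helper names, and takes as an illustrative example (not from the paper) the qubit identity channel for $\mathcal{N}$ and the depolarizing channel $\rho\mapsto(1-p)\rho+p\,I/2$ for $\mathcal{M}$.

```python
import numpy as np

def d_max(rho, sigma):
    """log2 lambda_max(sigma^{-1/2} rho sigma^{-1/2}); assumes sigma is positive definite."""
    vals, vecs = np.linalg.eigh(sigma)
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.conj().T
    return np.log2(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt).max())

def depolarizing_choi(p):
    """Choi state of the qubit channel rho -> (1 - p) rho + p I/2."""
    phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
    Phi = np.outer(phi, phi)
    return (1 - p) * Phi + p * np.eye(4) / 4

p = 0.5
choi_N = depolarizing_choi(0.0)   # identity channel: Choi state is Phi itself
choi_M = depolarizing_choi(p)

channel_d_max = d_max(choi_N, choi_M)
closed_form = -np.log2(1 - 3 * p / 4)   # Phi is an eigenvector of choi_M with eigenvalue 1 - 3p/4
print(channel_d_max, closed_form)
```

For this example the numerical value agrees with the closed form $-\log_{2}(1-3p/4)$, which diverges only as the second channel's Choi state loses support on $\Phi_{RA}$.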

We then have the following fundamental result for exact distillation and dilution:

Theorem 1

The exact one-shot distillable distinguishability of the channel box $(\mathcal{N},\mathcal{M})$ is equal to the channel min-relative entropy:

$$D_{d}^{0}(\mathcal{N},\mathcal{M})=D_{\min}(\mathcal{N}\|\mathcal{M}), \tag{37}$$

and the exact one-shot distinguishability cost is equal to the channel max-relative entropy:

$$D_{c}^{0}(\mathcal{N},\mathcal{M})=D_{\max}(\mathcal{N}\|\mathcal{M}). \tag{38}$$

The equality in (37) is proved in Appendix C.1, and the equality in (38) is proved in Appendix C.2.

We remark that it is appealing that the exact one-shot distinguishability cost of a channel box has a simple characterization in terms of the Choi states of the channels $\mathcal{N}$ and $\mathcal{M}$ , as indicated by the equality in (34).

V.2 Approximate one-shot distillation and dilution of quantum channel boxes

We also consider approximate versions of these tasks. The goal of approximate distillation is to transform the channel box $(\mathcal{N},\mathcal{M})$ into as many $\varepsilon$ -approximate bits of asymmetric distinguishability as possible. Mathematically, this corresponds to the following optimization:

[TABLE]

The goal of approximate dilution is to transform as few bits of asymmetric distinguishability into a channel box $(\widetilde{\mathcal{N}},\mathcal{M})$ , such that $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ . Mathematically, this corresponds to the following optimization:

[TABLE]

Let $D_{\min}^{\varepsilon}(\mathcal{N}\|\mathcal{M})$ denote the smooth channel min-relative entropy from CMW (16), defined as

[TABLE]

with the optimization being with respect to all pure states $\psi_{RA}$ with system $R$ isomorphic to the channel input system $A$ . The smooth min-relative entropy of states $\rho$ and $\sigma$ is defined as BD (10, 11); WR (12)

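$$D_{\min}^{\varepsilon}(\rho\|\sigma):=-\log_{2}\inf_{\Lambda}\left\{\operatorname{Tr}[\Lambda\sigma]:0\leq\Lambda\leq I,\ \operatorname{Tr}[\Lambda\rho]\geq 1-\varepsilon\right\}. \tag{42}$$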

The quantity $D_{\min}^{\varepsilon}(\rho\|\sigma)$ is also known as the hypothesis testing relative entropy WR (12), and $D_{\min}^{\varepsilon}(\mathcal{N}\|\mathcal{M})$ is also known as the channel hypothesis testing relative entropy CMW (16).
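For commuting states the hypothesis testing relative entropy reduces to a small linear program that a fractional Neyman–Pearson test solves greedily. The sketch below assumes the standard definition from WR (12), $D_{\min}^{\varepsilon}(\rho\|\sigma)=-\log_{2}\inf\{\operatorname{Tr}[\Lambda\sigma]:0\leq\Lambda\leq I,\ \operatorname{Tr}[\Lambda\rho]\geq 1-\varepsilon\}$, and the form of $\pi_{M}$ from WW (19); the function name is hypothetical. It evaluates the quantity on the state box $(|0\rangle\langle 0|,\pi_{M})$.

```python
import math

def hypothesis_testing_relative_entropy(p, q, eps):
    """D_min^eps(p||q) in bits for probability vectors p and q (commuting states)."""
    # Minimise sum_i t_i q_i subject to 0 <= t_i <= 1 and sum_i t_i p_i >= 1 - eps.
    # Outcomes with p_i = 0 never help the constraint, so they get t_i = 0.
    # Greedy rule: accept outcomes in order of increasing likelihood ratio q_i / p_i.
    items = sorted((qi / pi, pi, qi) for pi, qi in zip(p, q) if pi > 0)
    needed, cost = 1.0 - eps, 0.0
    for _, pi, qi in items:
        take = min(1.0, needed / pi)   # fraction t_i accepted for this outcome
        cost += take * qi
        needed -= take * pi
        if needed <= 1e-15:
            break
    return math.inf if cost == 0 else -math.log2(cost)

M, eps = 4, 0.5
value = hypothesis_testing_relative_entropy([1.0, 0.0], [1 / M, 1 - 1 / M], eps)
print(value)
```

With $M=4$ and $\varepsilon=1/2$ the result is $3=\log_{2}M+\log_{2}\frac{1}{1-\varepsilon}$: allowing error $\varepsilon$ buys at most $\log_{2}\frac{1}{1-\varepsilon}$ extra bits from this box.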

Let $D_{\max}^{\varepsilon}(\mathcal{N}\|\mathcal{M})$ denote the smooth channel max-relative entropy (GFW+, 18, Definition 19), defined as

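$$D_{\max}^{\varepsilon}(\mathcal{N}\|\mathcal{M}):=\inf_{\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}}D_{\max}(\widetilde{\mathcal{N}}\|\mathcal{M}), \tag{43}$$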

with the optimization being with respect to all quantum channels $\widetilde{\mathcal{N}}$ satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ , in the sense of (20). We note here that the smooth channel max-relative entropy has been studied extensively in LW (19), in the context of resource erasure.

In Appendix C.3, we prove that the smooth channel min- and max-relative entropies can be calculated by semi-definite programs. It follows from these characterizations that the non-smooth quantities can be as well.

We then have the following result, endowing both the smooth channel min- and max-relative entropies with fundamental operational meanings in the context of the resource theory of asymmetric distinguishability:

Theorem 2

The approximate one-shot distillable distinguishability of the channel box $(\mathcal{N},\mathcal{M})$ is equal to the smooth channel min-relative entropy:

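$$D_{d}^{\varepsilon}(\mathcal{N},\mathcal{M})=D_{\min}^{\varepsilon}(\mathcal{N}\|\mathcal{M}), \tag{44}$$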

and the approximate one-shot distinguishability cost is equal to the smooth channel max-relative entropy:

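$$D_{c}^{\varepsilon}(\mathcal{N},\mathcal{M})=D_{\max}^{\varepsilon}(\mathcal{N}\|\mathcal{M}). \tag{45}$$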

The equality in (44) is proved in Appendix C.4, and the equality in (45) is proved in Appendix C.5.

As a consequence of Theorems 1 and 2, and the facts that

[TABLE]

we conclude the following limits:

[TABLE]

We give alternative proofs of these limits in Appendix C.6.

As an application of the operational approach taken here, we arrive at the following bound relating $D_{\min}^{\varepsilon_{1}}$ and $D_{\max}^{\varepsilon_{2}}$ :

[TABLE]

where $\varepsilon_{1},\varepsilon_{2}\geq 0$ and $\varepsilon_{1}+\varepsilon_{2}<1$ . This bound represents a generalization of a related bound for quantum states in WW (19), and it in fact reduces to it when the channel box $(\mathcal{N},\mathcal{M})$ is environment seizable.

The main idea for arriving at the bound in (50) can be understood as a channel generalization of the operational argument from WW (19). As shown in WW (19), any approximate distillation protocol performed on the state box $(|0\rangle\langle 0|,\pi_{M})$ that leads to the state box $(\widetilde{0}_{\varepsilon},\pi_{K})$ , for $\varepsilon\in[0,1)$ and $\widetilde{0}_{\varepsilon}$ a state such that $\widetilde{0}_{\varepsilon}\approx_{\varepsilon}|0\rangle\langle 0|$ , is required to obey the bound

[TABLE]

One way to realize the full transformation

[TABLE]

is to proceed in steps. Using the equivalence $(|0\rangle\langle 0|,\pi_{M})\leftrightarrow(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$, first perform an optimal dilution protocol $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})\rightarrow(\widetilde{\mathcal{N}},\mathcal{M})$, where $\widetilde{\mathcal{N}}$ is a channel satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon_{2}}\mathcal{N}$ and $\log_{2}M=D_{\max}^{\varepsilon_{2}}(\mathcal{N}\|\mathcal{M})$. Then perform an optimal distillation protocol $(\mathcal{N},\mathcal{M})\rightarrow(\widetilde{\mathcal{R}}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{K}})$ such that $\log_{2}K=D_{\min}^{\varepsilon_{1}}(\mathcal{N}\|\mathcal{M})$. Finally, we realize the transformation $(\widetilde{\mathcal{R}}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{K}})\rightarrow(\widetilde{0}_{\varepsilon_{1}},\pi_{K})$ by inputting any state to the final channel box. By the triangle inequality for the diamond distance, the error of the overall transformation is no larger than $\varepsilon_{1}+\varepsilon_{2}$. Since the fundamental limitation in (51) applies to any protocol, the bound in (50) follows.

VI Parallel $n$ -shot distillation and dilution of quantum channel boxes

An important case to consider in the resource theory of asymmetric distinguishability for channels is the case of parallel tasks. In particular, we are interested in $n$ -shot parallel distillation and dilution of channel boxes, which essentially amounts to the replacement $(\mathcal{N},\mathcal{M})\rightarrow(\mathcal{N}^{\otimes n},\mathcal{M}^{\otimes n})$ in our previous one-shot results from Section V. However, here we are interested in optimal rates at which one can distill or dilute bits of asymmetric distinguishability from or to a channel box, respectively, both in the exact and approximate cases.

VI.1 Exact case: distillable distinguishability

We define the $n$ -shot, parallel, exact distillable distinguishability of a channel box $(\mathcal{N},\mathcal{M})$ as follows:

[TABLE]

noting that it is equal to the optimal rate at which one can distill exact bits of asymmetric distinguishability for fixed $n\geq 1$ . The equality above is a direct consequence of (37).

The asymptotic parallel exact distillable distinguishability is then defined as

[TABLE]

where the equality is again a direct consequence of (37).

We note that the regularization in (55) seems to be necessary in general, due to the fact that $D_{\min}$ for channels can be non-additive. As an example, suppose that $\mathcal{N}$ is the identity channel and $\mathcal{M}$ is a unitary channel characterized by a unitary operator $U$ . Then it follows that

[TABLE]

It is known from Aci (01) that there are unitaries for which $F(I,U)\in(0,1)$ but $F(I^{\otimes n},U^{\otimes n})=0$ for some finite $n$ . Turning this around, we conclude that there are channels for which

[TABLE]

but

[TABLE]

for some finite $n$ , indicating that the channel min-relative entropy exhibits an extreme form of non-additivity.
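The non-additivity can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the construction of Aci (01): it takes $F(I,U)$ to mean $\min_{\psi}|\langle\psi|(I\otimes U)|\psi\rangle|^{2}$, uses the fact that the numerical range of a unitary is the convex hull of its eigenvalues, and picks the hypothetical example $U=\operatorname{diag}(1,e^{2\pi i/3})$. The function names are likewise hypothetical.

```python
import math

def min_fidelity(angles):
    """min over pure states psi of |<psi|U|psi>|^2 for a unitary with eigenvalues e^{i*angle}."""
    a = sorted(x % (2 * math.pi) for x in angles)
    # Angular gaps between consecutive eigenvalues around the unit circle.
    gaps = [b - c for b, c in zip(a[1:] + [a[0] + 2 * math.pi], a)]
    spread = 2 * math.pi - max(gaps)        # smallest arc containing all eigenvalues
    if spread >= math.pi - 1e-12:           # convex hull of the eigenvalues contains 0
        return 0.0
    return math.cos(spread / 2) ** 2        # squared distance from 0 to the hull

def channel_d_min_identity_vs_unitary(theta, n):
    """D_min between n copies of the identity channel and n copies of U = diag(1, e^{i theta})."""
    # The eigenvalues of the n-fold tensor power of U are e^{i k theta}, k = 0, ..., n.
    f = min_fidelity([k * theta for k in range(n + 1)])
    return math.inf if f == 0.0 else -math.log2(f)

theta = 2 * math.pi / 3
one_copy = channel_d_min_identity_vs_unitary(theta, 1)
two_copies = channel_d_min_identity_vs_unitary(theta, 2)
print(one_copy, two_copies)
```

For this choice a single copy gives a finite value (about $2$ bits), while two parallel copies already give an infinite channel min-relative entropy, since the eigenvalues $1$, $e^{2\pi i/3}$, $e^{4\pi i/3}$ have the origin in their convex hull.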

A special case in which the exact distillable distinguishability simplifies is that of environment-seizable channel boxes. As a consequence of the observation in (12), an immediate conclusion is the following equality:

[TABLE]

which holds for any channel box $(\mathcal{N},\mathcal{M})$ that is environment seizable in the sense of (12). The first equality follows from (12), and the second follows from the additivity of the min-relative entropy for states. We thus conclude that the asymptotic exact parallel distillable distinguishability has the following single-letter formula for the case of environment-seizable channel boxes:

[TABLE]

VI.2 Exact case: distinguishability cost

We define the $n$ -shot, parallel, exact distinguishability cost of a channel box $(\mathcal{N},\mathcal{M})$ as follows:

[TABLE]

noting that it is equal to the optimal rate at which one can dilute exact bits of asymmetric distinguishability to the channel box $(\mathcal{N}^{\otimes n},\mathcal{M}^{\otimes n})$ for fixed $n\geq 1$ . The equality above is a direct consequence of (38) and the additivity of the max-relative entropy of channels, due to the fact that (34) holds.

The asymptotic exact distinguishability cost is then defined as

[TABLE]

where the equality is again a direct consequence of (38).

Thus, exact distinguishability dilution in the parallel case is rather different from exact distinguishability distillation, given that we have a simple single-letter formula characterizing all channel boxes for the former case but not for the latter.

VI.3 Approximate case: distillable distinguishability

We define the $n$ -shot, parallel, $\varepsilon$ -approximate distillable distinguishability as follows:

[TABLE]

noting that it is equal to the optimal rate at which one can distill approximate bits of asymmetric distinguishability for fixed $n\geq 1$ and $\varepsilon\in(0,1)$ . The asymptotic parallel distillable distinguishability of the channel box $(\mathcal{N},\mathcal{M})$ is then defined as the following limit of the above formula:

[TABLE]

where the latter equality follows from (44).

Note that the quantity in (66) is equal to the optimal exponent in Stein’s lemma for the case of parallel quantum channel discrimination CMW (16). The following theorem gives a formal expression for this quantity in terms of the regularized channel relative entropy.

Theorem 3

The parallel distillable distinguishability of the channel box $(\mathcal{N},\mathcal{M})$ is equal to the regularized channel relative entropy:

[TABLE]

and it is finite if and only if $D_{\max}(\mathcal{N}\|\mathcal{M})<\infty$ .

Proof. By exploiting the following bound for states $\rho$ and $\sigma$ WR (12); MW (14); KW (17),

[TABLE]

where $h_{2}(\varepsilon):=-\varepsilon\log_{2}\varepsilon-\left(1-\varepsilon\right)\log_{2}(1-\varepsilon)$ is the binary entropy, we conclude the following bound for channels after an optimization:

[TABLE]

By making the substitution $(\mathcal{N},\mathcal{M})\rightarrow(\mathcal{N}^{\otimes m},\mathcal{M}^{\otimes m})$ , dividing by $m$ , and taking the limit $m\rightarrow\infty$ followed by $\varepsilon\rightarrow 0$ , we conclude that

[TABLE]

Also, note that the following lower bound holds as a consequence of the lower bound from Li (14); TH (13):

[TABLE]

where $\Phi^{-1}$ is the inverse of the cumulative standard normal distribution function, $V_{\varepsilon}(\mathcal{N}\|\mathcal{M})$ is the channel relative entropy variance, defined as

[TABLE]

with $\Pi\,$ the set of all bipartite pure states achieving the optimal value of $D(\mathcal{N}\|\mathcal{M})$ and the relative entropy variance $V(\rho\|\sigma)$ of states $\rho$ and $\sigma$ defined as

[TABLE]

Taking the limit as $n\rightarrow\infty$ and $\varepsilon\rightarrow 0$ , we find that

[TABLE]

However, we can also conclude the following bound

[TABLE]

by making the substitution $(\mathcal{N},\mathcal{M})\rightarrow(\mathcal{N}^{\otimes m},\mathcal{M}^{\otimes m})$ in (72), from which we conclude that

[TABLE]

for all $m\geq 1$ . Since this bound holds for all $m$ , we can take the limit, and when combining with (71), we conclude that

[TABLE]

As observed in (BHKW, 18, Remark 19), the regularized channel relative entropy on the right-hand side is finite if and only if $D_{\max}(\mathcal{N}\|\mathcal{M})<\infty$ .

An important case in which the situation simplifies considerably is that of environment-seizable channel boxes, as identified in BHKW (18). As a consequence of the observation in (12), an immediate conclusion is the following equality

[TABLE]

for any channel box $(\mathcal{N},\mathcal{M})$ that is environment seizable in the sense of (12). For such channels, we can even conclude the following expansion:

[TABLE]

so that

[TABLE]

for such environment-seizable channel boxes.

Another important case for which we have a handle on the distillable distinguishability is classical–quantum channel boxes, defined as

[TABLE]

where $\omega_{X}$ is an arbitrary input state, $\{|x\rangle_{X}\}_{x}$ is an orthonormal basis, and $\left\{\rho_{B}^{x}\right\}_{x}$ and $\left\{\sigma_{B}^{x}\right\}_{x}$ are sets of states. An immediate consequence of (BHKW, 18, Corollary 28) is the following equality for classical–quantum channel boxes:

[TABLE]

Eq. (84) indicates that the asymptotic parallel distillable distinguishability of a classical–quantum channel box depends only on the maximum quantum relative entropy that can be realized by the input of a single classical state to the channels.
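The single-letter character of this statement makes it easy to evaluate: one computes the quantum relative entropy for each classical input letter and takes the largest value. A minimal sketch follows, assuming NumPy, full-rank output states, and a made-up two-letter channel box; the helper name `relative_entropy` and the example states are illustrative and not taken from the paper.

```python
import numpy as np

def relative_entropy(rho, sigma):
    """Quantum relative entropy D(rho||sigma) in bits; assumes both states are full rank."""
    def log2m(a):
        vals, vecs = np.linalg.eigh(a)
        return vecs @ np.diag(np.log2(vals)) @ vecs.conj().T
    return float(np.real(np.trace(rho @ (log2m(rho) - log2m(sigma)))))

# Hypothetical classical-quantum channel box: letter x is mapped to rho^x by the
# first channel and to sigma^x by the second channel.
rhos = [np.diag([0.5, 0.5]), np.diag([0.9, 0.1])]
sigmas = [np.diag([0.25, 0.75]), np.diag([0.9, 0.1])]

per_letter = [relative_entropy(r, s) for r, s in zip(rhos, sigmas)]
best = max(per_letter)   # maximum over single classical input letters
print(per_letter, best)
```

In this toy box the second letter yields identical outputs and contributes nothing, so the rate is set entirely by the first letter.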

VI.4 Approximate case: distinguishability cost

We define the $n$ -shot, parallel, $\varepsilon$ -approximate distinguishability cost as follows:

[TABLE]

noting that it is equal to the optimal rate at which one can dilute the channel box $(\mathcal{N}^{\otimes n},\mathcal{M}^{\otimes n})$ approximately from bits of asymmetric distinguishability for fixed $n\geq 1$ and $\varepsilon\in(0,1)$. The asymptotic parallel distinguishability cost of the channel box $(\mathcal{N},\mathcal{M})$ is then defined as the following limit of the above formula:

[TABLE]

where the latter equality follows from (45).

As a direct consequence of the inequality in (50) and Theorem 3, we find that

[TABLE]

We note that an inequality similar to the above one, which does not include regularization, has been reported as (LW, 19, Theorem 11). Whether the lower bound in (88) is also an upper bound remains an open question. However, the following upper bound holds as a consequence of definitions and the fact that the channel max-relative entropy is single-letter:

[TABLE]

Furthermore, from this upper bound and (BHKW, 18, Remark 19), we conclude that the asymptotic parallel distinguishability cost is finite if and only if $D_{\max}(\mathcal{N}\|\mathcal{M})$ is.

Although we have not been able to solve the asymptotic parallel distinguishability cost in general, we can do so for some interesting special cases. First, for any channel box $(\mathcal{N},\mathcal{M})$ that is environment seizable, in the sense of (12), an immediate conclusion is the following equality:

[TABLE]

Then as a consequence of the asymptotic equipartition property for states TCR (09), by taking the limit $n\rightarrow\infty$ of (90), it follows that

[TABLE]

thus demonstrating a complete understanding of the asymptotic cost for these channel boxes. As in WW (19), one can make refined statements (for second-order expansions) of $\frac{1}{n}D_{c}^{\varepsilon}(\mathcal{N}^{\otimes n},\mathcal{M}^{\otimes n})$ for such channels.

Another important case for which we have a handle on the distinguishability cost is that of classical–quantum channel boxes $(\mathcal{N}_{X\rightarrow B},\mathcal{M}_{X\rightarrow B})$, with a common classical input alphabet and output Hilbert space, defined as in (82)–(83):

Proposition 1

Let $(\mathcal{N}_{X\rightarrow B},\mathcal{M}_{X\rightarrow B})$ be a classical–quantum channel box as in (82)–(83). Then the asymptotic parallel distinguishability cost is equal to the channel relative entropy:

[TABLE]

Proof. It is known from BHKW (18) that the following identity holds for classical–quantum channel boxes:

[TABLE]

Thus, the lower bound

[TABLE]

is a direct consequence of (88) and (93).

To establish the upper bound, we make use of Proposition 4 from Appendix D, which states that the following inequality holds for all $\alpha>1$ and $\varepsilon\in(0,1)$ :

[TABLE]

As such, we apply this inequality to the channel box $(\mathcal{N}_{X\rightarrow B}^{\otimes n},\mathcal{M}_{X\rightarrow B}^{\otimes n})$ , as well as (BHKW, 18, Lemma 25), to find that the following inequality holds for all $\alpha>1$ and $\varepsilon\in(0,1)$ :

[TABLE]

Taking the limit as $n\rightarrow\infty$ , we find that the following inequality holds for all $\alpha>1$ :

[TABLE]

Now taking the limit as $\alpha\rightarrow 1$ , we conclude that

[TABLE]

This concludes the proof.

Proposition 1 indicates that the asymptotic parallel distinguishability cost of a classical–quantum channel box depends only on the maximum quantum relative entropy that can be realized by the input of a single classical state to the channels. As such, when combined with the result from (84), we conclude that the resource theory of asymmetric distinguishability is reversible in the asymptotic setting of parallel channel box transformations when restricted to classical–quantum channel boxes, meaning that one can convert between such channel boxes without any loss. We provide further related remarks about this observation in the next section.

VII General channel box transformation: Parallel case

We can now address the general channel box transformation problem for the parallel case. Before doing so, let us formalize the problem. Let $n,m\in\mathbb{Z}^{+}$ and $\varepsilon\in[0,1]$ . An $(n,m,\varepsilon)$ parallel channel box transformation protocol for the channel boxes $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ consists of a superchannel $\Theta^{(n)}$ such that

[TABLE]

A rate $R$ is achievable if for all $\varepsilon\in(0,1]$ , $\delta>0$ , and sufficiently large $n$ , there exists an $(n,n\left[R-\delta\right],\varepsilon)$ parallel channel box transformation protocol. The optimal parallel channel box transformation rate $R^{p}((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L}))$ is equal to the supremum of all achievable rates.

On the other hand, a rate $R$ is a strong converse rate if for all $\varepsilon\in[0,1)$ , $\delta>0$ , and sufficiently large $n$ , there does not exist an $(n,n\left[R+\delta\right],\varepsilon)$ parallel channel box transformation protocol. The strong converse parallel channel box transformation rate $\widetilde{R}^{p}((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L}))$ is equal to the infimum of all strong converse rates.

Note that the following inequality is a consequence of the definitions:

[TABLE]

An important result is that if the channel boxes $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ are either classical–quantum or environment-seizable, then the following equality holds

[TABLE]

indicating that the channel relative entropy plays a central role as the optimal conversion rate between these kinds of channel boxes. Appendix E provides detailed proofs of converse bounds that justify the claim in (104), by starting with converse bounds for generic one-shot channel box transformation protocols and then applying them to the parallel case of interest (see also Appendix F for how to translate some of these bounds to lower bounds on the smooth channel max-relative entropy). The achievability part follows from combining a distillation protocol with a dilution protocol (as was done for states in WW (19)) and the fact that these tasks have simple characterizations for these channel boxes.

VIII General box transformation: Sequential channels and quantum strategies

We now move on to consider another variant of the general channel box transformation problem corresponding to the sequential case. This case is more involved than the parallel case considered above because it cannot be reduced to the one-shot case. That is, it is fundamentally a multi-shot problem, and the theory relies upon key developments from CDP (09). As such, we develop the theory more generally for quantum strategies GW (07) or quantum combs CDP (09) and then apply it to sequential channel boxes, which are a special case of quantum strategies. A quantum strategy consists of a sequence of quantum channels, each of which has an accessible input and output, while passing along an internal memory system that can vary in size GW (07). We remark here that there are various terms to refer to this same physical object, including quantum memory channels KW (05); CGLM (14), quantum strategies GW (07); Gut (09, 12); GRS (18), and quantum combs CDP (09), and there are even earlier works where similar notions appear BGNP (01); ESW (02). Here we adopt the terminology “quantum strategy” to refer to such an object.

The main reason for considering the more complicated quantum strategies is that doing so leads to a better understanding and simplification of the analysis of sequential channel boxes, while at the same time providing a significant generalization of the theory. Indeed, regarding this latter point, one might think of generalizing the theory even further by considering physical transformations of quantum strategies and even an infinite hierarchy of this sort, just as we generalized the resource theory of states to channels by considering physical transformations of channels in the form of superchannels. However, a key insight of CDP (09) is that quantum strategies are the end of the line: physical transformations of quantum strategies are simply quantum strategies, so that the hierarchy ends with quantum strategies. Thus, the theory developed here in this sense is a rather general resource theory of asymmetric distinguishability.

VIII.1 Quantum strategies and sequential channel boxes

The basic object to manipulate in this setting is a quantum strategy box or a sequential channel box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ . A sequential channel box is a special case of a quantum strategy box, and since it is simpler, we discuss it briefly first. For a sequential channel box, the notation $\mathcal{N}^{(n)}$ indicates $n$ sequential uses of the channel $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}^{(n)}$ indicates $n$ sequential uses of the channel $\mathcal{M}_{A\rightarrow B}$ . Sequential channel boxes have been considered implicitly in previous work on sequential quantum channel discrimination CDP08a ; CDP (09); DFY (09); HHLW (10); CMW (16); BHKW (18).

More generally, a quantum strategy $\mathcal{N}^{(n)}$ consists of a sequence of channels $\mathcal{N}_{A_{1}\rightarrow M_{1}B_{1}}^{1}$ , $\mathcal{N}_{M_{1}A_{2}\rightarrow M_{2}B_{2}}^{2}$ , …, $\mathcal{N}_{M_{n-2}A_{n-1}\rightarrow M_{n-1}B_{n-1}}^{n-1}$ , and $\mathcal{N}_{M_{n-1}A_{n}\rightarrow B_{n}}^{n}$ , and the quantum strategy $\mathcal{M}^{(n)}$ consists of a sequence of channels $\mathcal{M}_{A_{1}\rightarrow M_{1}B_{1}}^{1}$ , $\mathcal{M}_{M_{1}A_{2}\rightarrow M_{2}B_{2}}^{2}$ , …, $\mathcal{M}_{M_{n-2}A_{n-1}\rightarrow M_{n-1}B_{n-1}}^{n-1}$ , and $\mathcal{M}_{M_{n-1}A_{n}\rightarrow B_{n}}^{n}$ . As indicated above, quantum strategies are in one-to-one correspondence with quantum combs GW (07); CDP08a ; CDP (09). In order to have a uniform notation, we sometimes write

[TABLE]

where $M_{0}$ and $M_{n}$ are trivial registers.

It is straightforward to see that a quantum strategy box generalizes a sequential channel box discussed above, with each element of a sequential channel box being a sequence of the same channel without any memory. That is, the sequential channel box is a special case of (105) and (106) with $\mathcal{N}_{M_{i-1}A_{i}\rightarrow M_{i}B_{i}}^{i}=\mathcal{N}_{A_{i}\rightarrow B_{i}}$ and $\mathcal{M}_{M_{i-1}A_{i}\rightarrow M_{i}B_{i}}^{i}=\mathcal{M}_{A_{i}\rightarrow B_{i}}$ for all $i\in\left\{1,\ldots,n\right\}$ .

A quantum co-strategy GW (07) (or tester CDP08a ; CDP (09)) for distinguishing two quantum strategies consists of an input state $\rho_{R_{1}A_{1}}$ and a set of testing channels $\{\mathcal{A}_{R_{i}B_{i}\rightarrow R_{i+1}A_{i+1}}^{i}\}_{i=1}^{n-1}$ , such that the final state when processing the first quantum strategy $\mathcal{N}^{(n)}$ is given by

[TABLE]

and the final state when processing the second quantum strategy $\mathcal{M}^{(n)}$ is given by

[TABLE]

Figure 2 depicts the state $\rho_{R_{n}B_{n}}$ in (107) when $n=3$ .

For our developments in this and the next section, it is helpful to define a generalized quantum strategy divergence as an abstract measure of how distinguishable two quantum strategies are.

Definition 1 (Generalized q. strategy divergence)

The generalized quantum strategy divergence of a quantum strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ is defined as

[TABLE]

where the generalized divergence $\mathbf{D}$ for states is defined by (184), the states $\rho_{R_{n}B_{n}}$ and $\sigma_{R_{n}B_{n}}$ are defined in (107) and (108), respectively, and the optimization is with respect to all quantum co-strategies or testers that could be used to distinguish the quantum strategies $\mathcal{N}^{(n)}$ and $\mathcal{M}^{(n)}$ .

Note that this quantity generalizes the quantum strategy distance and quantum strategy fidelity of CDP08a ; CDP (09); Gut (12); GRS (18), as well as the strategy max-relative entropy of CE (16), to arbitrary divergences. Those quantities employ trace distance, fidelity, and max-relative entropy as the underlying divergences, respectively, but in what follows, we make extensive use of the generality afforded by Definition 1.

VIII.2 Physical transformations of quantum strategy boxes and data processing

Just as quantum channels model physical transformations of quantum states and superchannels model physical transformations of quantum channels, we can also consider physical transformations of quantum strategies. Given a quantum strategy $\mathcal{N}^{(n)}$ , we consider a general linear and completely positive transformation $\Theta^{(n\rightarrow m)}$ of it, which takes as input an $n$ -round quantum strategy and outputs an $m$ -round quantum strategy. A fundamental result of CDP (09) is that such a physical transformation $\Theta^{(n\rightarrow m)}$ of a quantum strategy $\mathcal{N}^{(n)}$ is in turn described by an $(n+m)$ -round quantum strategy that interconnects with $\mathcal{N}^{(n)}$ to generate an output, $m$ -round quantum strategy.

Due to various choices of time ordering involved, there is not a unique way to describe this physical transformation CDP (09), but here we adopt the choice that the physical transformation $\Theta^{(n\rightarrow m)}$ first processes all channels involved in the quantum strategy $\mathcal{N}^{(n)}$ , and then it generates the output $m$ -round strategy $\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})$ . As such, the physical transformation $\Theta^{(n\rightarrow m)}$ consists of $n+m$ channels $\mathcal{F}^{i}$ for $i\in\{1,\ldots,n+m\}$ , and the output quantum strategy $\mathcal{K}^{(m)}=\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})$ then consists of the following $m$ channels:

[TABLE]

for $j\in\left\{2,\ldots,m-1\right\}$ , where we identify the memory systems for the output strategy $\mathcal{K}^{(m)}$ as $M_{k}^{\prime}\equiv R_{n+k}$ for $k\in\left\{1,\ldots,m\right\}$ . Figure 3 depicts the transformation of a three-round quantum strategy $\mathcal{N}^{(3)}$ to a three-round quantum strategy $\mathcal{K}^{(3)}$ by a physical transformation $\Theta^{(3\rightarrow 3)}$ consisting of the channels $\mathcal{F}^{1}$ , …, $\mathcal{F}^{6}$ , along with the pairing of the transformed strategy with a quantum co-strategy.

The following data processing inequality for the generalized strategy divergence is a direct consequence of the definition and the fact that the underlying generalized divergence $\mathbf{D}$ obeys data processing. This key property allows for establishing bounds on the general strategy box transformation problem. Also, it generalizes the data processing inequality for strategy distance and strategy fidelity from GRS (18), but we require physical transformations in order to establish it.

Theorem 4

Let $\mathcal{N}^{(n)}$ and $\mathcal{M}^{(n)}$ be $n$ -round quantum strategies, and let $\Theta^{(n\rightarrow m)}$ be a physical transformation of them, of the form discussed above, that leads to $m$ -round quantum strategies $\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})$ and $\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})$ . Then the following data processing inequality holds for the generalized quantum strategy divergence:

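$$\mathbf{D}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})\geq\mathbf{D}(\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})\|\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})). \tag{113}$$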

Proof. The physical transformation $\Theta^{(n\rightarrow m)}$ consists of the channels $\mathcal{F}^{i}$ for $i\in\{1,\ldots,n+m\}$ . Set $\mathcal{K}^{(m)}=\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})$ and $\mathcal{L}^{(m)}=\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})$ . Also, let us consider a quantum co-strategy for $\mathcal{K}^{(m)}$ and $\mathcal{L}^{(m)}$ , which consists of a state $\rho_{R_{1}^{\prime}C_{1}}$ and a set $\{\mathcal{A}_{R_{i}^{\prime}D_{i}\rightarrow R_{i+1}^{\prime}C_{i+1}}^{i}\}_{i=1}^{m-1}$ of channels:

[TABLE]

Suppose first that the physical transformation $\Theta^{(n\rightarrow m)}$ acts on the quantum strategy $\mathcal{N}^{(n)}$ . In this case, the first channel $\mathcal{F}_{C_{1}\rightarrow R_{1}A_{1}}^{1}$ acts on the state $\rho_{R_{1}^{\prime}C_{1}}$ and outputs systems $A_{1}$ and $R_{1}$ . Then the channel $\mathcal{N}^{1}_{A_{1}\rightarrow M_{1}B_{1}}$ is applied, and the second channel $\mathcal{F}_{R_{1}B_{1}\rightarrow R_{2}A_{2}}^{2}$ is applied. This repeats $n-1$ more times, and the resulting state is as follows:

[TABLE]

At this point, the other elements of the co-strategy and the remainder of the transformation $\Theta^{(n\rightarrow m)}$ are applied; these consist of the co-strategy channels $\{\mathcal{A}_{R_{i}^{\prime}D_{i}\rightarrow R_{i+1}^{\prime}C_{i+1}}^{i}\}_{i=1}^{m-1}$ interleaved with the transformation channels $\mathcal{F}^{n+2}$ , …, $\mathcal{F}^{n+m}$ . The resulting state is then

[TABLE]

where

[TABLE]

We also define the following states for the quantum strategy $\mathcal{M}^{(n)}$ :

[TABLE]

Then consider that

[TABLE]

The first inequality follows because the state $\rho_{R_{1}^{\prime}C_{1}}$ and the channels $\mathcal{F}^{i}$ for $i\in\{1,\ldots,n\}$ constitute a particular co-strategy for discriminating $\mathcal{N}^{(n)}$ from $\mathcal{M}^{(n)}$ . The next inequality is a consequence of quantum data processing for the underlying generalized divergence, given that $\mathcal{P}_{R_{1}^{\prime}L_{1}D_{1}\rightarrow R_{m}^{\prime}D_{m}}$ is a quantum channel. Since the inequality holds for all possible co-strategies $\mathcal{T}$ that could be used to distinguish $\mathcal{K}^{(m)}$ from $\mathcal{L}^{(m)}$ , we conclude (113).

Remark 1

We note that the data processing inequality in (113) holds more generally for physical transformations of quantum strategy boxes that do not necessarily proceed in the order that we have fixed (i.e., it holds for other time orderings of physical transformations of strategy boxes). The main idea for establishing it is to use the data processing inequality for the underlying generalized divergence and that a co-strategy for a physically transformed strategy is a special kind of co-strategy for the original strategy.

VIII.3 Quantum strategy box transformation problem

The goal of this setting is to convert the quantum strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ to the strategy box $(\mathcal{K}^{(m)},\mathcal{L}^{(m)})$ by means of a common physical transformation $\Theta^{(n\rightarrow m)}$ , subject to the constraint that $\mathcal{K}^{(m)}$ is realized approximately from $\mathcal{N}^{(n)}$ , i.e.,

[TABLE]

while $\mathcal{L}^{(m)}$ is realized perfectly from $\mathcal{M}^{(n)}$ by the protocol $\Theta^{(n\rightarrow m)}$ , i.e.,

[TABLE]

just as is the case with all of the other transformations that we have considered in the resource theory of asymmetric distinguishability. The common physical transformation $\Theta^{(n\rightarrow m)}$ that we consider is as we discussed in Section VIII.2 and is depicted in Figures 4 and 5. It consists of a general physical processing of the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ to convert it approximately to the strategy box $(\mathcal{K}^{(m)},\mathcal{L}^{(m)})$ , in the sense given in (124)–(125).

The notion of approximation that we employ in (124) is the normalized strategy distance of CDP08a ; CDP (09); Gut (12), which generalizes the normalized diamond distance to the setting of interest here. This quantity is a special case of the generalized strategy divergence from Definition 1, with the underlying divergence set to be the normalized trace distance $\frac{1}{2}\left\|\cdot\right\|_{1}$ . The motivation for employing the normalized strategy distance is the same as that which we gave for normalized diamond distance: it quantifies the worst-case statistical error (absolute deviation) that one could make when trying to distinguish the simulation $\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})$ from the ideal output strategy $\mathcal{K}^{(m)}$ by any quantum-physical experiment.
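As a small numerical illustration of the underlying divergence named above, the following sketch computes the normalized trace distance $\frac{1}{2}\left\|\rho-\sigma\right\|_{1}$ between two density matrices. It only covers the state-level quantity; the optimization over co-strategies in the strategy distance is not attempted here. The function name and the example states are illustrative choices, not taken from the paper.

```python
import numpy as np


def normalized_trace_distance(rho: np.ndarray, sigma: np.ndarray) -> float:
    """Return (1/2) * ||rho - sigma||_1 for Hermitian inputs."""
    diff = rho - sigma
    # rho - sigma is Hermitian, so its trace norm is the sum of the
    # absolute values of its (real) eigenvalues.
    eigenvalues = np.linalg.eigvalsh(diff)
    return 0.5 * float(np.sum(np.abs(eigenvalues)))


# Illustrative single-qubit states (hypothetical example inputs).
ket0 = np.array([[1.0, 0.0], [0.0, 0.0]])
maximally_mixed = np.eye(2) / 2

distance = normalized_trace_distance(ket0, maximally_mixed)
print(distance)  # 0.5 up to floating-point error
```

The value lies in $[0,1]$, which is why $\varepsilon\in[0,1]$ is the natural range for the error parameter.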

We now describe the above in more detail. The general physical transformation $\Theta^{(n\rightarrow m)}$ of the first strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ consists of $n+m$ channels, denoted by $\mathcal{F}^{i}$ for $i\in\left\{1,\ldots,n+m\right\}$ . To assess the performance of the transformation

[TABLE]

in simulating the strategy $\mathcal{K}^{(m)}$ , the resulting quantum strategy $\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})$ is paired up with a quantum co-strategy $\mathcal{T}$ GW (07) (or tester CDP08a ; CDP (09)), which consists of a state $\rho_{R_{1}^{\prime}C_{1}}$ , a set $\{\mathcal{A}_{R_{i}^{\prime}D_{i}\rightarrow R_{i+1}^{\prime}C_{i+1}}^{i}\}_{i=1}^{m-1}$ of channels, and a final measurement $\{Q_{R_{m}^{\prime}D_{m}},I_{R_{m}^{\prime}D_{m}}-Q_{R_{m}^{\prime}D_{m}}\}$ :

[TABLE]

Suppose first that the transformation $\Theta^{(n\rightarrow m)}$ acts on the quantum strategy $\mathcal{N}^{(n)}$ . In this case, the first channel $\mathcal{F}_{C_{1}\rightarrow R_{1}A_{1}}^{1}$ acts on the state $\rho_{R_{1}^{\prime}C_{1}}$ and outputs systems $A_{1}$ and $R_{1}$ . Then the channel $\mathcal{N}^{1}_{A_{1}\rightarrow M_{1}B_{1}}$ is applied, and the second channel $\mathcal{F}_{R_{1}B_{1}\rightarrow R_{2}A_{2}}^{2}$ is applied. This repeats $n-1$ more times, and the resulting state is as follows:

[TABLE]

At this point, the other elements of the co-strategy and the remainder of the simulation are applied; these consist of the testing channels $\{\mathcal{A}_{R_{i}^{\prime}D_{i}\rightarrow R_{i+1}^{\prime}C_{i+1}}^{i}\}_{i=1}^{m-1}$ interleaved with the simulation channels $\mathcal{F}^{n+2}$ , …, $\mathcal{F}^{n+m}$ . The resulting state is then

[TABLE]

where

[TABLE]

The final state above is then compared with the following state, which results from the application of the quantum co-strategy $\mathcal{T}$ to the ideal strategy $\mathcal{K}^{(m)}$ :

[TABLE]

See Figure 4 for a depiction of these two scenarios.

The simulation has $\varepsilon$ error if the following inequality holds

[TABLE]

where the optimization is with respect to all quantum co-strategies $\mathcal{T}$ as defined in (127). The expression on the left-hand side above is in fact equal to the $m$ -round normalized quantum strategy distance considered in CDP08a ; CDP (09); Gut (12); GRS (18), so that we can write (132) equivalently as

[TABLE]

As a shorthand for the inequality in (133), we employ the notation

[TABLE]

It is also demanded that the transformation $\Theta^{(n\rightarrow m)}$ be such that $\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})=\mathcal{L}^{(m)}$ , which is equivalent Gut (12) to demanding that

[TABLE]

This is consistent with our prior error criteria in the simpler scenarios for the resource theory of asymmetric distinguishability.

Thus, the general strategy box transformation problem can be phrased as the following optimization problem, which is a function of $n,m\in\mathbb{Z}^{+}$ and channels $\mathcal{N}$ , $\mathcal{M}$ , $\mathcal{K}$ , and $\mathcal{L}$ :

[TABLE]

where the infimum is with respect to physical transformations $\Theta^{(n\rightarrow m)}$ .

We assert here that the optimization problem in (136) can be cast as a semi-definite program, by employing the facts that the quantum strategy distance can be calculated by a semi-definite program and one can write down Choi operators for $\Theta^{(n\rightarrow m)}$ , $\mathcal{N}^{(n)}$ , $\mathcal{M}^{(n)}$ , $\mathcal{K}^{(m)}$ , and $\mathcal{L}^{(m)}$ CDP08a ; CDP (09); Gut (12) along with various non-signaling constraints to denote the time-orderings involved. However, we do not elaborate on the details here.

In Appendix G, Proposition 14 states converse bounds that apply to arbitrary protocols that transform the $n$ -round strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ to the $m$ -round strategy box $(\mathcal{K}^{(m)},\mathcal{L}^{(m)})$ while satisfying $\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})\approx_{\varepsilon}\mathcal{K}^{(m)}$ and $\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})=\mathcal{L}^{(m)}$ . The bounds are expressed in terms of strategy Rényi divergences, which are defined as special cases of Definition 1 with the underlying divergence fixed to be the Rényi divergences.

VIII.4 Asymptotic setting for sequential channel box transformations

It does not seem sensible to consider an asymptotic version of the general strategy box transformation problem, as in general there is no regular structure associated with arbitrary strategy boxes. However, if we impose some structure, then it is sensible to do so.

The simplest structure that we can impose is that each strategy box is actually a sequential channel box, involving sequential uses of the same quantum channels. Then we can phrase the sequential channel box transformation problem in an asymptotic, Shannon-theoretic way, similar to how we did for the parallel channel box transformation problem in Section VII.

Let $n,m\in\mathbb{Z}^{+}$ and $\varepsilon\in\left[0,1\right]$ . An $(n,m,\varepsilon)$ sequential channel box transformation protocol for the channel boxes $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ consists of a physical transformation $\Theta^{(n\rightarrow m)}$ , as described in Section VIII.2, such that

[TABLE]

where $\mathcal{N}^{(n)}$ , $\mathcal{M}^{(n)}$ , $\mathcal{K}^{(m)}$ , and $\mathcal{L}^{(m)}$ are the sequential channels corresponding to the channels $\mathcal{N}$ , $\mathcal{M}$ , $\mathcal{K}$ , and $\mathcal{L}$ , respectively. For clarity, Figure 6 depicts an example of a sequential channel box transformation protocol.

A rate $R$ is achievable if for all $\varepsilon\in(0,1]$ , $\delta>0$ , and sufficiently large $n$ , there exists an $\left(n,n\left[R-\delta\right],\varepsilon\right)$ sequential channel box transformation protocol. The optimal sequential channel box transformation rate $R((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L}))$ is equal to the supremum of all achievable rates.

On the other hand, a rate $R$ is a strong converse rate if for all $\varepsilon\in[0,1)$ , $\delta>0$ , and sufficiently large $n$ , there does not exist an $(n,n\left[R+\delta\right],\varepsilon)$ sequential channel box transformation protocol. The strong converse sequential channel box transformation rate $\widetilde{R}((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L}))$ is equal to the infimum of all strong converse rates.

The following inequality is a direct consequence of definitions:

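$$R((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L}))\leq\widetilde{R}((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L})). \tag{139}$$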

Although it is a challenging question in general to determine the optimal rates in (139) for arbitrary channel boxes, there are some special cases for which it is possible to determine them.

1. If the channel boxes $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ are environment-seizable, then our prior results from WW (19) and Corollary 2 from Appendix G.1 imply that

[TABLE]

The main reason that this simplification occurs is that the channels involved for environment-seizable pairs are equivalent to states, so that the prior achievability results for states WW (19) apply. Also, the converse bounds from Appendix G.1 simplify for the same reason.

2. If the channel boxes $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ are classical–quantum, then the following strong converse bound holds

[TABLE]

as a consequence of (BHKW, 18, Lemma 26) and the discussion in Appendix G.2. It is reasonable to conjecture that this bound is saturated; what remains is to show that $D(\mathcal{N}\|\mathcal{M})$ is the optimal rate of distinguishability dilution for classical–quantum channels.

3. If the channel box $(\mathcal{N},\mathcal{M})$ is classical–quantum and $(\mathcal{K},\mathcal{L})$ is environment-seizable, then the equalities in (140)–(141) hold. This is a consequence of the upper bound in (142) holding in this case, while the lower bound $R((\mathcal{N},\mathcal{M})\rightarrow(\mathcal{K},\mathcal{L}))\geq\frac{D(\mathcal{N}\|\mathcal{M})}{D(\mathcal{K}\|\mathcal{L})}$ follows because one can first distill bits of asymmetric distinguishability from $(\mathcal{N},\mathcal{M})$ at the rate $D(\mathcal{N}\|\mathcal{M})$ and then dilute them to $(\mathcal{K},\mathcal{L})$ in a sequential simulation. The latter simulation is straightforward: one prepares the environment states for $(\mathcal{K},\mathcal{L})$ and then acts with the relevant common channels on demand.

IX Distillation and dilution of quantum strategy and sequential channel boxes

In this section, we present distillation and dilution of quantum strategy boxes. A special case of this theory involves distillation and dilution of sequential channel boxes. Here we are interested not only in the optimal number but also in the optimal rates at which one can distill or dilute bits of asymmetric distinguishability from or to a strategy or sequential channel box, respectively, in both the exact and approximate cases.

All of the basic definitions in this case represent generalizations of what we have presented previously for one-shot tasks regarding quantum channels. As such, we do not delve into as many details as we did before but mainly state the results and provide brief justifications.

IX.1 Exact case: distillable distinguishability

Given a strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ , the exact distillable distinguishability is equal to the largest $M$ such that we can transform $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ to the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ exactly by means of a physical transformation $\Theta^{(n\rightarrow 1)}$ . Note that the physical transformation $\Theta^{(n\rightarrow 1)}$ is a special case of those that we discussed previously in Section VIII.2, taking an $n$ -round quantum strategy box to a channel box. Mathematically, the exact distillable distinguishability is defined as the following optimization problem:

[TABLE]

Note that this problem is essentially equivalent to $\left\lfloor D_{d}^{0}(\mathcal{N}^{(n)},\mathcal{M}^{(n)})\right\rfloor$ , which is the largest $m$ for which a physical transformation $\Theta^{(n\rightarrow m)}$ exists such that

[TABLE]

where the superscript $(m)$ indicates $m$ sequential channel uses. This is because the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{2^{m}}})$ and the sequential channel box $((\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|})^{(m)},(\mathcal{R}_{C\rightarrow D}^{\pi})^{(m)})$ are equivalent to each other by means of common quantum strategies, due to the fact that the underlying channel pairs are environment seizable and thus equivalent to state boxes.

By employing reasoning similar to that which we employed previously to justify (37), we conclude that

[TABLE]

where $D_{\min}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ is the quantum strategy divergence from Definition 1, with $\mathbf{D}$ therein set to $D_{\min}$ . The main reasons that this equality holds are that 1) the optimal co-strategy for $D_{\min}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ leads to a protocol for distilling bits of asymmetric distinguishability and 2) its optimality follows from the data processing inequality (Theorem 4) for $D_{\min}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ with respect to an arbitrary physical transformation $\Theta^{(n\rightarrow 1)}$ that produces the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ exactly.
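To make the underlying state divergence concrete, here is a minimal sketch of the min-relative entropy for states, assuming the standard definition $D_{\min}(\rho\|\sigma)=-\log_{2}\operatorname{Tr}[\Pi_{\rho}\sigma]$ with $\Pi_{\rho}$ the projector onto the support of $\rho$. The strategy quantity additionally optimizes over co-strategies, which this sketch does not do. The function name, tolerance, and example inputs are illustrative assumptions.

```python
import numpy as np


def d_min(rho: np.ndarray, sigma: np.ndarray, tol: float = 1e-12) -> float:
    """State min-relative entropy -log2 Tr[Pi_rho sigma] (assumed definition)."""
    eigenvalues, eigenvectors = np.linalg.eigh(rho)
    # Columns belonging to strictly positive eigenvalues span supp(rho).
    support = eigenvectors[:, eigenvalues > tol]
    projector = support @ support.conj().T
    overlap = float(np.real(np.trace(projector @ sigma)))
    if overlap <= tol:
        return float("inf")  # supports are orthogonal
    return float(-np.log2(overlap))


# Example: the state box (|0><0|, pi_M) with M = 4.
M = 4
ket0 = np.zeros((M, M))
ket0[0, 0] = 1.0
pi_M = np.eye(M) / M

bits = d_min(ket0, pi_M)
print(bits)  # approximately 2.0, i.e. log2(M)
```

The example reproduces the expected count of $\log_{2}M$ bits of asymmetric distinguishability for the box $(|0\rangle\langle 0|,\pi_{M})$.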

If the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ is in fact a sequential channel box for all $n$ , with corresponding channels $\mathcal{N}$ and $\mathcal{M}$ , then we define the exact sequential distillable distinguishability as

[TABLE]

Just as with the parallel case discussed in Section IX.1, the underlying quantity $D_{d}^{0}(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ can jump from zero to $\infty$ as $n$ increases. In fact, this jump can occur in the simplest case when $n$ goes from one to two HHLW (10). By the general bound from BHKW (18), we have that

[TABLE]

IX.2 Exact case: distinguishability cost

Given a strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ , the exact distinguishability cost is equal to the smallest $M$ such that we can transform the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ to $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ exactly by means of a physical transformation $\Theta^{(1\rightarrow n)}$ . Note that the physical transformation $\Theta^{(1\rightarrow n)}$ is a special case of those that we discussed previously in Section VIII.2, taking a channel box to an $n$ -round quantum strategy box. Mathematically, the exact distinguishability cost is defined as the following optimization problem:

[TABLE]

For reasons similar to those stated in the previous section, this problem is essentially equivalent to $\left\lceil D_{c}^{0}(\mathcal{N}^{(n)},\mathcal{M}^{(n)})\right\rceil$ , which is the smallest $m$ for which a physical transformation $\Theta^{(m\rightarrow n)}$ exists such that

[TABLE]

where the superscript $(m)$ again indicates $m$ sequential channel uses.

By employing reasoning similar to that which we used previously to justify (38), we conclude that

[TABLE]

where $D_{\max}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ is the quantum strategy divergence from Definition 1, with $\mathbf{D}$ therein set to $D_{\max}$ . The quantity $D_{\max}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ has already been defined and studied in CE (16), wherein it was shown that it is equal to the max-relative entropy of the Choi operators of the strategies. Eq. (152) gives $D_{\max}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ its fundamental operational meaning in terms of the exact distinguishability cost of the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ .
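Since the strategy max-relative entropy reduces to a max-relative entropy of Choi operators, it can be evaluated numerically once those operators are available. The sketch below assumes the standard definition $D_{\max}(X\|Y)=\log_{2}\min\{\lambda:X\leq\lambda Y\}$ and, for simplicity, a full-rank second argument, in which case the minimum equals the largest eigenvalue of $Y^{-1/2}XY^{-1/2}$. Constructing the Choi operators of the strategies is not shown; the function name and inputs are illustrative.

```python
import numpy as np


def d_max(x: np.ndarray, y: np.ndarray, tol: float = 1e-12) -> float:
    """log2 of the smallest lambda with x <= lambda * y; assumes y is full rank."""
    y_vals, y_vecs = np.linalg.eigh(y)
    if np.min(y_vals) <= tol:
        raise ValueError("this sketch only handles a full-rank second argument")
    # Build y^{-1/2} from the spectral decomposition of y.
    inv_sqrt = y_vecs @ np.diag(y_vals ** -0.5) @ y_vecs.conj().T
    sandwiched = inv_sqrt @ x @ inv_sqrt
    # Symmetrize to suppress floating-point asymmetry before diagonalizing.
    sandwiched = (sandwiched + sandwiched.conj().T) / 2
    return float(np.log2(np.max(np.linalg.eigvalsh(sandwiched))))


# Example 1: the box (|0><0|, pi_M) with M = 4 costs log2(M) = 2 bits.
M = 4
ket0 = np.zeros((M, M))
ket0[0, 0] = 1.0
pi_M = np.eye(M) / M
cost_bits = d_max(ket0, pi_M)

# Example 2: diagonal qubit states, where D_max = log2 max_i p_i / q_i.
rho = np.diag([0.75, 0.25])
sigma = np.eye(2) / 2
cost_qubit = d_max(rho, sigma)  # log2(1.5)
```

For rank-deficient second arguments one would instead solve the small semi-definite program $\min\{\lambda:X\leq\lambda Y\}$ directly.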

The main reasons that the equality in (152) holds are that 1) an optimal dilution protocol, generalizing that from Appendix C.2, results from a strategy that outputs strategy $\mathcal{N}^{(n)}$ if $|0\rangle\langle 0|$ is input and outputs the strategy

[TABLE]

if $|1\rangle\langle 1|$ is input and 2) its optimality follows from the data processing inequality (Theorem 4) for $D_{\max}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ with respect to an arbitrary physical transformation $\Theta^{(1\rightarrow n)}$ that produces the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ exactly from the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ .

If the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ is in fact a sequential channel box for all $n$ , with corresponding channels $\mathcal{N}$ and $\mathcal{M}$ , then we define the exact sequential distinguishability cost as

[TABLE]

Note that the following inequality holds

[TABLE]

because a sequential simulation is more stringent than a parallel simulation. That is, any sequential simulation works as a parallel simulation.

A key result that we have for this problem, strengthening our earlier finding from (64), is expressed by the following theorem.

Theorem 5

For channels $\mathcal{N}$ and $\mathcal{M}$ , the exact sequential distinguishability cost is equal to the channel max-relative entropy:

$$D_{c}^{0}(\mathcal{N},\mathcal{M})=D_{\max}(\mathcal{N}\|\mathcal{M}). \tag{156}$$

Proof. The inequality $D_{c}^{0}(\mathcal{N},\mathcal{M})\geq D_{\max}(\mathcal{N}\|\mathcal{M})$ is a consequence of (155) and (64). The other inequality is a consequence of the following scheme for simulating the sequential channel box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ , similar to that employed in GFW+ (18); WW (18). In the first round of the sequential simulation, one starts from the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ and simulates the tensor product channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|}\otimes\mathcal{N},\mathcal{R}_{C\rightarrow D}^{\pi_{M_{1}}}\otimes\mathcal{M})$ . Employing (38), the cost for doing so is

[TABLE]

In the next round, one uses the leftover channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M_{1}}})$ to simulate the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|}\otimes\mathcal{N},\mathcal{R}_{C\rightarrow D}^{\pi_{M_{2}}}\otimes\mathcal{M})$ . Again employing (38) and an analysis similar to the above, the cost for doing so is

[TABLE]

This continues until the last round, and adding everything up, the total cost for the simulation of the sequential channel box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ is $nD_{\max}(\mathcal{N}\|\mathcal{M})$ . Since this holds for every $n$ , we conclude that $D_{c}^{0}(\mathcal{N},\mathcal{M})\leq D_{\max}(\mathcal{N}\|\mathcal{M})$ , and in turn, we conclude (156).

IX.3 Approximate case: distillable distinguishability

Given a strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ , the approximate distillable distinguishability is equal to the largest $M$ such that we can transform the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ to the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ approximately by means of a physical transformation $\Theta^{(n\rightarrow 1)}$ . Mathematically, it is defined as the following optimization problem:

[TABLE]

where the shorthand $\approx_{\varepsilon}$ is defined in (133)–(134) in terms of the normalized strategy distance.

For reasons similar to those stated in the previous section, this problem is essentially equivalent to $\left\lfloor D_{d}^{\varepsilon}(\mathcal{N}^{(n)},\mathcal{M}^{(n)})\right\rfloor$ , which is the largest $m$ for which a physical transformation $\Theta^{(n\rightarrow m)}$ exists such that

[TABLE]

By employing reasoning similar to that which we used previously to justify (44), we conclude that

[TABLE]

where $D_{\min}^{\varepsilon}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ is the quantum strategy divergence from Definition 1, with $\mathbf{D}$ therein set to $D_{\min}^{\varepsilon}$ . The main reasons that this equality holds are that 1) the optimal co-strategy for $D_{\min}^{\varepsilon}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ leads to a protocol for distilling bits of asymmetric distinguishability approximately and 2) its optimality follows from the data processing inequality for $D_{\min}^{\varepsilon}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ with respect to an arbitrary physical transformation $\Theta^{(n\rightarrow 1)}$ that produces the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ approximately.

If the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ is in fact a sequential channel box for all $n$ , with corresponding channels $\mathcal{N}$ and $\mathcal{M}$ , then we define the sequential distillable distinguishability as

[TABLE]

A key result of our paper is the following formal expression for $D_{d}(\mathcal{N},\mathcal{M})$ in terms of the amortized channel relative entropy from BHKW (18):

Theorem 6

For channels $\mathcal{N}$ and $\mathcal{M}$ , the sequential distillable distinguishability is equal to the amortized channel relative entropy of BHKW (18):

$$D_{d}(\mathcal{N},\mathcal{M})=D^{\mathcal{A}}(\mathcal{N}\|\mathcal{M}),$$

where

[TABLE]

Proof. The bound

$$D_{d}(\mathcal{N},\mathcal{M})\leq D^{\mathcal{A}}(\mathcal{N}\|\mathcal{M})$$

follows from (BHKW, 18, Proposition 16), due to the equivalence between sequential distillable distinguishability and the optimal rate of the quantum hypothesis testing problem considered in BHKW (18). So it remains to establish the opposite inequality.

To do so, here we employ a technique used in the resource theory of coherence (GFW+, 18, Theorem 17), which was used therein to show that the amortized relative entropy of coherence is equal to the distillable coherence of a quantum channel. A similar technique was also discussed previously in (BHLS, 03, Section 2.4).

Let $\rho_{RA}$ and $\sigma_{RA}$ be arbitrary quantum states. Let $\psi_{RA}$ be a state such that

[TABLE]

(If such a state does not exist, then $D_{d}(\mathcal{N},\mathcal{M})$ is trivially equal to zero.) The first step is to send in the tensor-power state $\psi_{RA}^{\otimes m}$ to $m$ parallel calls of the unknown channel, where

[TABLE]

for $\delta>0$ , and distill bits of asymmetric distinguishability at the rate $D(\mathcal{N}_{A\rightarrow B}(\psi_{RA})\|\mathcal{M}_{A\rightarrow B}(\psi_{RA}))$ . Second, we dilute these bits of asymmetric distinguishability to the state box $(\rho_{RA}^{\otimes n},\sigma_{RA}^{\otimes n})$ . Third, we then send this state box into $n$ uses of the unknown channel, producing the state box $([\mathcal{N}_{A\rightarrow B}(\rho_{RA})]^{\otimes n},[\mathcal{M}_{A\rightarrow B}(\sigma_{RA})]^{\otimes n})$ . Fourth, from this state box, we distill bits of asymmetric distinguishability at the rate $D(\mathcal{N}_{A\rightarrow B}(\rho_{RA})\|\mathcal{M}_{A\rightarrow B}(\sigma_{RA}))-\delta$ . We output a fraction $R-2\delta$ of these bits, where

[TABLE]

and then reinvest a fraction $D(\rho_{RA}\|\sigma_{RA})+\delta$ for the next round. We then repeat steps 2) through 4) $k$ times. In the last round, a fraction $R_{f}-\delta$ bits of asymmetric distinguishability are output, where

[TABLE]

and no reinvestment is made (because it is the last round). Counting up everything, this protocol calls the unknown channel $kn+m$ times, while outputting

[TABLE]

bits of asymmetric distinguishability. Thus, the rate of the protocol is given by

[TABLE]

In the limit as $k\rightarrow\infty$ , this rate converges to $R-2\delta$ . Since $\delta>0$ is arbitrary, the rate $R$ is achievable. Note that all of the conversions stated above are approximate, but for large enough $n$ and by employing the triangle inequality, the error vanishes. Finally, since the states $\rho_{RA}$ and $\sigma_{RA}$ are arbitrary, we can take a supremum over all of them and conclude the inequality

$$D_{d}(\mathcal{N},\mathcal{M})\geq D^{\mathcal{A}}(\mathcal{N}\|\mathcal{M}),$$

thus completing the proof.
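The bookkeeping in the proof above can be checked numerically. The sketch below assumes one particular reading of the protocol: each of the first $k-1$ rounds nets $n(R-2\delta)$ bits (distilled bits minus the reinvested ones), the last round outputs $n(R_{f}-\delta)$ bits, and the channel is called $kn+m$ times in total. The function name and all numerical values are hypothetical and serve only to illustrate the limit $k\rightarrow\infty$.

```python
def protocol_rate(k: int, n: int, m: int, R: float, R_f: float, delta: float) -> float:
    """Bits of asymmetric distinguishability output per channel use.

    Assumed accounting (a reading of the proof, not a formula quoted from it):
      - rounds 1..k-1 each output n * (R - 2*delta) bits,
      - the final round outputs n * (R_f - delta) bits,
      - the channel is called k*n + m times in total.
    """
    bits_out = (k - 1) * n * (R - 2 * delta) + n * (R_f - delta)
    channel_uses = k * n + m
    return bits_out / channel_uses


# Hypothetical parameters: R is the per-round net gain, R_f the last-round gain.
R, R_f, delta = 0.5, 1.2, 0.01
n, m = 100, 150

for k in (1, 10, 1000, 10**7):
    print(k, protocol_rate(k, n, m, R, R_f, delta))

limit_estimate = protocol_rate(10**7, n, m, R, R_f, delta)
```

As $k$ grows, the one-time start-up cost $m$ and the larger final payout become negligible, and the printed rate approaches $R-2\delta$.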

Theorem 6 establishes an operational meaning for the amortized channel relative entropy of BHKW (18), thus giving it some distinction in the resource theory of asymmetric distinguishability for quantum channels. Theorem 6 can alternatively be understood as a formal solution to Stein’s lemma for quantum channels in the sequential setting, thus completing the line of reasoning put forward in BHKW (18).

More generally, this result can be used to determine whether a sequential protocol is truly necessary to attain the optimal distillable distinguishability. If an amortization collapse occurs for a pair of channels, so that $D^{\mathcal{A}}(\mathcal{N}\|\mathcal{M})=D(\mathcal{N}\|\mathcal{M})$ , then one can conclude that a sequential protocol is not necessary and one can simply input a tensor-power state $\psi_{RA}^{\otimes n}$ to distinguish the channels optimally in the asymptotic regime BHKW (18). This collapse occurs for both environment-seizable and classical–quantum channel boxes. It also occurs for channel boxes in which the first channel is arbitrary and the second is a replacer channel CMW (16); BHKW (18). What Theorems 3 and 6 add to this story is that the condition

$$D^{\mathcal{A}}(\mathcal{N}\|\mathcal{M})>D(\mathcal{N}\|\mathcal{M}) \tag{176}$$

is necessary and sufficient for an adaptive strategy to have an advantage over a parallel strategy in the setting of asymmetric channel discrimination, or equivalently, when distilling bits of asymmetric distinguishability. Determining whether (176) holds for a pair of quantum channels is an interesting and challenging open problem.
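One way to probe this condition numerically is to evaluate the amortization objective $D(\mathcal{N}(\rho)\|\mathcal{M}(\sigma))-D(\rho\|\sigma)$ for trial input states. The sketch below does this for channels given by Kraus operators, under simplifying assumptions: there is no reference system $R$, all second arguments are full rank, and depolarizing channels are used purely as an example. Evaluating the objective at sample points only gives lower bounds on the supremum; it does not decide whether (176) holds.

```python
import numpy as np


def relative_entropy(rho: np.ndarray, sigma: np.ndarray, tol: float = 1e-12) -> float:
    """D(rho||sigma) = Tr[rho (log2 rho - log2 sigma)]; assumes sigma is full rank."""
    r_vals = np.linalg.eigvalsh(rho)
    s_vals, s_vecs = np.linalg.eigh(sigma)
    if np.min(s_vals) <= tol:
        raise ValueError("this sketch only handles a full-rank second argument")
    positive = r_vals[r_vals > tol]  # convention: 0 * log 0 = 0
    tr_rho_log_rho = float(np.sum(positive * np.log2(positive)))
    log_sigma = s_vecs @ np.diag(np.log2(s_vals)) @ s_vecs.conj().T
    tr_rho_log_sigma = float(np.real(np.trace(rho @ log_sigma)))
    return tr_rho_log_rho - tr_rho_log_sigma


def apply_channel(kraus_ops, rho: np.ndarray) -> np.ndarray:
    """Apply a channel given by its Kraus operators."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)


def depolarizing_kraus(p: float):
    """Kraus operators of the qubit map rho -> (1 - p) rho + p I/2 (example channel)."""
    I = np.eye(2, dtype=complex)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    return [np.sqrt(1 - 3 * p / 4) * I] + [np.sqrt(p / 4) * P for P in (X, Y, Z)]


def amortization_objective(kraus_n, kraus_m, rho, sigma) -> float:
    """D(N(rho)||M(sigma)) - D(rho||sigma) at one trial pair of inputs."""
    out_n = apply_channel(kraus_n, rho)
    out_m = apply_channel(kraus_m, sigma)
    return relative_entropy(out_n, out_m) - relative_entropy(rho, sigma)


rho = np.diag([0.9, 0.1]).astype(complex)
sigma = np.diag([0.6, 0.4]).astype(complex)

# With rho == sigma the objective reduces to D(N(rho)||M(rho)).
same_input = amortization_objective(depolarizing_kraus(0.1), depolarizing_kraus(0.5), rho, rho)
# With different inputs the second term penalizes the invested distinguishability.
different_inputs = amortization_objective(depolarizing_kraus(0.1), depolarizing_kraus(0.5), rho, sigma)
```

A full test of the condition would require optimizing over all pairs of input states with a reference system, which is the hard part of the open problem mentioned above.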

It seems that the main idea of (GFW+, 18, Theorem 17) (also (BHLS, 03, Section 2.4)), as used in the proof of Theorem 6, can be employed for a sequential distillation task in any quantum resource theory for which the static version of the theory (for quantum states) is asymptotically reversible. This is because the interleaving of distillation and dilution plays an essential role in the given protocol, and for an asymptotically reversible resource theory, there is no loss when going back and forth like this.

IX.4 Approximate case: distinguishability cost

Given a strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ , the approximate distinguishability cost is equal to the smallest $M$ such that we can transform the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ to $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ approximately by means of a physical transformation $\Theta^{(1\rightarrow n)}$ . Mathematically, it is defined as the following optimization problem:

[TABLE]

For reasons similar to those stated previously, this problem is essentially equivalent to $\left\lceil D_{c}^{\varepsilon}(\mathcal{N}^{(n)},\mathcal{M}^{(n)})\right\rceil$ , which is the smallest $m$ for which a physical transformation $\Theta^{(m\rightarrow n)}$ exists such that

[TABLE]

where the superscript $(m)$ again indicates $m$ sequential channel uses.

By employing reasoning similar to that which we used previously to justify (45), we conclude that

[TABLE]

where the smooth strategy max-relative entropy is defined as

[TABLE]

and $D_{\max}(\widetilde{\mathcal{N}}^{(n)}\|\mathcal{M}^{(n)})$ is defined in (152). The infimum in (181) is with respect to $n$ -round strategies $\widetilde{\mathcal{N}}^{(n)}$ that are $\varepsilon$ -close in normalized strategy distance to the strategy $\mathcal{N}^{(n)}$ . The main reasons that this equality holds are that 1) an optimal approximate dilution protocol results from applying an optimal exact dilution protocol to $\widetilde{\mathcal{N}}^{(n)}$ and $\mathcal{M}^{(n)}$ , where $\widetilde{\mathcal{N}}^{(n)}$ is $\varepsilon$ -close to $\mathcal{N}^{(n)}$ with respect to the normalized strategy distance and 2) its optimality follows from the data processing inequality for $D_{\max}^{\varepsilon}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})$ with respect to an arbitrary physical transformation $\Theta^{(1\rightarrow n)}$ that produces the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ approximately from the channel box $(\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|},\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ .

If the strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ is in fact a sequential channel box for all $n$ , with corresponding channels $\mathcal{N}$ and $\mathcal{M}$ , then we define the sequential distinguishability cost as

[TABLE]

Note that the following inequality holds

[TABLE]

because a sequential simulation is more stringent than a parallel simulation. That is, any sequential simulation works as a parallel simulation.

As occurred for all other tasks in this paper, the sequential distinguishability cost simplifies for environment-seizable channel boxes. It remains an interesting open question to understand the sequential distinguishability cost of quantum channel boxes other than environment-seizable ones.

X Conclusion

In this paper, we generalized the resource theory of asymmetric distinguishability from states [Mat10, Mat11, WW19] to channels. In this resource theory, the main constituents are quantum channel boxes that can be manipulated by means of a quantum superchannel, the most general physical transformation that sends quantum channels to quantum channels. Furthermore, the basic units of currency are bits of asymmetric distinguishability [WW19].

In the one-shot scenario, we considered the approximate channel box transformation problem and proved that it is characterized by a semi-definite program. As special cases of this, we considered exact and approximate one-shot distillation and dilution of channel boxes, arriving at the following conclusions:

1. The exact one-shot distillable distinguishability of a channel box is equal to the channel min-relative entropy.

2. The exact one-shot distinguishability cost of a channel box is equal to the channel max-relative entropy.

3. The approximate one-shot distillable distinguishability of a channel box is equal to the smooth channel min-relative entropy.

4. The approximate one-shot distinguishability cost of a channel box is equal to the smooth channel max-relative entropy.

These results endow these fundamental channel measures of distinguishability with operational interpretations.

We then moved on to consider asymptotic parallel versions of the above tasks, with our key findings here being that the parallel distillable distinguishability is equal to the regularized channel relative entropy and the parallel exact distinguishability cost is equal to the channel max-relative entropy. We solved the asymptotic version of the parallel channel box transformation problem for environment-seizable and classical–quantum channel boxes.

We finally considered the approximate strategy box transformation problem and asserted that it is characterized by a semi-definite program. We introduced the generalized strategy divergence as a way of quantifying distinguishability of quantum strategies and used instantiations of this concept to provide bounds on how well one can convert one strategy box to another. In particular, transformations of sequential channel boxes are a special case of transformations of strategy boxes, so that many of the results for strategy boxes apply directly, and all of the results simplify for environment-seizable or classical–quantum sequential channel boxes.

By focusing on distillation and dilution tasks, we proved that the asymptotic sequential distillable distinguishability of a sequential channel box is equal to the amortized channel relative entropy of [BHKW18], thus endowing this quantity with a fundamental operational meaning. We also proved that the exact sequential distinguishability cost is equal to the channel max-relative entropy.

Going forward from here, there are many open questions for future work. Are there other channel boxes, besides environment-seizable and classical–quantum ones, for which the theory simplifies significantly? Based on the distillation results of [CMW16], and other findings of [FWTB19], it seems plausible that the channel relative entropy should be the optimal rate for dilution protocols of channel boxes in which the first channel is arbitrary and the second is a replacer channel. Are there examples of channel boxes for which the regularization in the regularized channel relative entropy is necessary? Are there examples of channel boxes for which the amortized channel relative entropy does not collapse to the ordinary channel relative entropy? Answers to these questions would provide insights as to whether general parallel or sequential strategies are helpful in distinguishability distillation. Can we characterize the asymptotic parallel or sequential distinguishability cost, in the case in which the simulation need not be exact but with vanishing error in the asymptotic limit? Is it possible to give a more general theory beyond independent and identically distributed channels, i.e., for memory channels with some structure? These and other questions remain the subject of future investigations.

Note: After our paper appeared online, the preprint [FFRS19] was posted, which has addressed some of the open questions from our paper.

Acknowledgements.

We are grateful to Vishal Katariya for pointing out a problem with our previous formulation of the semi-definite program for the smooth channel max-relative entropy. We are grateful to Andreas Winter for pointing out [BHLS03, Section 2.4] in the context of Theorem 6. XW acknowledges support from the Department of Defense, and MMW acknowledges support from the National Science Foundation under Grant No. 1907615.

Appendix A Background

A.1 Generalized divergences

A generalized divergence is a function $\mathbf{D}(\rho\|\sigma)$ taking arbitrary quantum states $\rho$ and $\sigma$ to the non-negative reals and such that the data processing inequality holds for an arbitrary quantum channel $\mathcal{N}$ [88]:

$$\mathbf{D}(\rho\|\sigma)\geq\mathbf{D}(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)). \tag{184}$$

Generalized divergences of interest include the trace distance, the negative logarithm of the fidelity [94], the quantum relative entropy [95], the Petz–Rényi relative entropy [82, 83], and the sandwiched Rényi relative entropy [77, 103].

For completeness, we define the last three quantities now and refer to our companion paper [101] for further details of their properties. The quantum relative entropy $D(\rho\|\sigma)$ is defined for states $\rho$ and $\sigma$ as

$$D(\rho\|\sigma):=\operatorname{Tr}[\rho\left(\log_{2}\rho-\log_{2}\sigma\right)] \tag{185}$$

if $\operatorname{supp}(\rho)\subseteq\operatorname{supp}(\sigma)$ and it is set to $\infty$ otherwise. The Petz–Rényi relative entropy is defined for states $\rho$ and $\sigma$ as [83]

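$$D_{\alpha}(\rho\|\sigma):=\frac{1}{\alpha-1}\log_{2}\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}] \tag{186}$$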

if $\alpha\in(0,1)$ or $\alpha\in(1,\infty)$ and $\operatorname{supp}(\rho)\subseteq\operatorname{supp}(\sigma)$ . If $\alpha\in(1,\infty)$ and $\operatorname{supp}(\rho)\not\subseteq\operatorname{supp}(\sigma)$ , then $D_{\alpha}(\rho\|\sigma):=\infty$ [89]. The sandwiched Rényi relative entropy is defined for states $\rho$ and $\sigma$ as [77, 103]

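$$\widetilde{D}_{\alpha}(\rho\|\sigma):=\frac{1}{\alpha-1}\log_{2}\operatorname{Tr}\!\left[\left(\sigma^{(1-\alpha)/2\alpha}\rho\,\sigma^{(1-\alpha)/2\alpha}\right)^{\alpha}\right] \tag{187}$$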

if $\alpha\in(0,1)$ or $\alpha\in(1,\infty)$ and $\operatorname{supp}(\rho)\subseteq\operatorname{supp}(\sigma)$ . If $\alpha\in(1,\infty)$ and $\operatorname{supp}(\rho)\not\subseteq\operatorname{supp}(\sigma)$ , then $\widetilde{D}_{\alpha}(\rho\|\sigma):=\infty$ .
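The three entropic quantities just defined can be evaluated numerically. The following is a minimal sketch, not taken from the paper, that assumes the standard base-2 formulas for these quantities (restated in the docstrings) and restricts to full-rank density matrices so that the support conditions hold automatically. The helper names are hypothetical.

```python
import numpy as np


def _apply_fn(op, fn):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(op)
    return (vecs * fn(vals)) @ vecs.conj().T


def relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr[rho (log2 rho - log2 sigma)]; assumes full-rank inputs."""
    diff = _apply_fn(rho, np.log2) - _apply_fn(sigma, np.log2)
    return float(np.real(np.trace(rho @ diff)))


def petz_renyi(rho, sigma, alpha):
    """(1/(alpha-1)) log2 Tr[rho^alpha sigma^(1-alpha)]; assumes full rank, alpha != 1."""
    rho_a = _apply_fn(rho, lambda v: v**alpha)
    sigma_b = _apply_fn(sigma, lambda v: v**(1 - alpha))
    q = np.real(np.trace(rho_a @ sigma_b))
    return float(np.log2(q) / (alpha - 1))


def sandwiched_renyi(rho, sigma, alpha):
    """(1/(alpha-1)) log2 Tr[(s rho s)^alpha] with s = sigma^((1-alpha)/(2 alpha))."""
    s = _apply_fn(sigma, lambda v: v**((1 - alpha) / (2 * alpha)))
    inner = s @ rho @ s
    inner = (inner + inner.conj().T) / 2  # symmetrize against round-off
    q = np.sum(np.linalg.eigvalsh(inner)**alpha)
    return float(np.log2(q) / (alpha - 1))


# Illustrative full-rank qubit states (arbitrary choices).
rho = np.array([[0.7, 0.2], [0.2, 0.3]])
sigma = np.diag([0.6, 0.4])
```

For commuting states the Petz and sandwiched quantities coincide, and in general the sandwiched quantity does not exceed the Petz quantity, which gives a simple consistency check for the sketch.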

A generalized channel divergence is defined from that for states, as presented above, given by the following function of quantum channels $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ [72]:

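$$\mathbf{D}(\mathcal{N}\|\mathcal{M}):=\sup_{\rho_{RA}}\mathbf{D}(\mathcal{N}_{A\rightarrow B}(\rho_{RA})\|\mathcal{M}_{A\rightarrow B}(\rho_{RA})), \tag{188}$$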

where the optimization is with respect to a quantum state $\rho_{RA}$ such that the reference system is arbitrary. As observed in [72], the following simplification holds

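$$\mathbf{D}(\mathcal{N}\|\mathcal{M})=\sup_{\psi_{RA}}\mathbf{D}(\mathcal{N}_{A\rightarrow B}(\psi_{RA})\|\mathcal{M}_{A\rightarrow B}(\psi_{RA})), \tag{189}$$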

where the optimization is with respect to pure states $\psi_{RA}$ such that the reference system $R$ is isomorphic to the channel input system $A$ .

The data processing inequality holds for the generalized channel divergence, with respect to a superchannel $\Theta$ :

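$$\mathbf{D}(\mathcal{N}\|\mathcal{M})\geq\mathbf{D}(\Theta(\mathcal{N})\|\Theta(\mathcal{M})), \tag{190}$$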

as proved in [49]. The inequality in (190) follows from the definition in (188) and the fact that the underlying generalized divergence $\mathbf{D}$ obeys the data processing inequality in (184). Other applications and interpretations of channel divergences were considered in [105].

For an environment-parametrized channel box $(\mathcal{N},\mathcal{M})$ with environment states $\rho_{E}$ and $\sigma_{E}$ , the following inequality holds [93]

$$\mathbf{D}(\mathcal{N}\|\mathcal{M})\leq\mathbf{D}(\rho_{E}\|\sigma_{E}). \tag{191}$$

If the channel box is also environment seizable (see Section III.1), then the opposite inequality $\mathbf{D}(\mathcal{N}\|\mathcal{M})\geq\mathbf{D}(\rho_{E}\|\sigma_{E})$ holds as well (as a consequence of (190)), from which we conclude the following equality in this case:

$$\mathbf{D}(\mathcal{N}\|\mathcal{M})=\mathbf{D}(\rho_{E}\|\sigma_{E}). \tag{192}$$

Particular examples of generalized channel divergences are the channel min-relative entropy in (31), the smooth channel min-relative entropy in (41), and the channel max-relative entropy in (33). Other examples include those built from the relative entropy, the Petz–Rényi relative entropy, and the sandwiched Rényi relative entropy, as defined in [33]. As such, the inequality in (190) holds for all of these channel divergences, a property that we make extensive use of in what follows.

It is not clear how to write the smooth channel max-relative entropy in (43) as a generalized channel divergence. However, it does obey the data processing inequality in (190), as the following simple argument demonstrates. Let $\mathcal{N}$ and $\mathcal{M}$ be arbitrary channels, and let $\Theta$ be a superchannel. Let $\widetilde{\mathcal{N}}$ be a channel satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ . Then, from the data processing inequality for the diamond distance with respect to superchannels, it follows that $\Theta(\widetilde{\mathcal{N}})\approx_{\varepsilon}\Theta(\mathcal{N})$ . We then have that

[TABLE]

The first inequality follows from the data processing inequality for $D_{\max}$ of channels, and the second follows from the definition of the smooth channel max-relative entropy and the fact that $\Theta(\widetilde{\mathcal{N}})\approx_{\varepsilon}\Theta(\mathcal{N})$ . Since the inequality holds for all $\widetilde{\mathcal{N}}$ satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ , we conclude the desired data processing inequality:

[TABLE]

A.2 Choi isomorphism for quantum channels

The Choi isomorphism is a way of characterizing quantum channels that is suitable for optimizing over them in semi-definite programs. For a quantum channel $\mathcal{N}_{A\rightarrow B}$ , its Choi operator is given by

[TABLE]

where $\Gamma_{RA}=|\Gamma\rangle\langle\Gamma|_{RA}$ and

[TABLE]

with $\{|i\rangle_{R}\}_{i}$ and $\{|i\rangle_{A}\}$ orthonormal bases. The Choi operator is positive semi-definite $\Gamma_{RB}^{\mathcal{N}}\geq 0$ , corresponding to $\mathcal{N}_{A\rightarrow B}$ being completely positive, and satisfies $\operatorname{Tr}_{B}[\Gamma_{RB}^{\mathcal{N}}]=I_{R}$ , the latter corresponding to $\mathcal{N}_{A\rightarrow B}$ being trace preserving. On the other hand, given an operator $\Gamma_{RB}^{\mathcal{M}}$ satisfying $\Gamma_{RB}^{\mathcal{M}}\geq 0$ and $\operatorname{Tr}_{B}[\Gamma_{RB}^{\mathcal{M}}]=I_{R}$ , one realizes via postselected teleportation [14] the following quantum channel:

[TABLE]

where systems $S$ , $R$ , and $A$ are isomorphic and the last line employs the facts that $\left(M_{S}\otimes I_{R}\right)|\Gamma\rangle_{SR}=\left(I_{S}\otimes T_{R}(M_{R})\right)|\Gamma\rangle_{SR}$ for $T_{R}$ the transpose map, defined as

[TABLE]

and $\langle\Gamma|_{SR}\left(I_{S}\otimes X_{RB}\right)|\Gamma\rangle_{SR}=\operatorname{Tr}_{R}[X_{RB}]$ . We often abbreviate the transpose map simply as

[TABLE]

Since the constraints $\Gamma_{RB}^{\mathcal{M}}\geq 0$ and $\operatorname{Tr}_{B}[\Gamma_{RB}^{\mathcal{M}}]=I_{R}$ are semi-definite, this is a useful way of incorporating optimizations over quantum channels into semi-definite programs.
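To make the Choi correspondence concrete, the following sketch builds the unnormalized Choi operator of a channel from a Kraus representation, checks the two semi-definite constraints, and recovers the channel action from the Choi operator through the transpose identity. It is only an illustration: the function names are hypothetical, and the amplitude damping channel is an arbitrary example not taken from the paper.

```python
import numpy as np


def choi_operator(kraus, d_in):
    """Gamma^N_{RB} = sum_{i,j} |i><j|_R (x) N(|i><j|_A), with R ordered before B."""
    d_out = kraus[0].shape[0]
    gamma = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1.0
            out = sum(k @ e_ij @ k.conj().T for k in kraus)
            gamma += np.kron(e_ij, out)
    return gamma


def partial_trace_B(gamma, d_r, d_b):
    """Trace out the second tensor factor B of an operator on R (x) B."""
    return np.einsum('ibjb->ij', gamma.reshape(d_r, d_b, d_r, d_b))


def apply_via_choi(gamma, rho, d_in, d_out):
    """Recover N(rho) = Tr_R[(rho^T (x) I_B) Gamma^N_{RB}]."""
    x = np.kron(rho.T, np.eye(d_out)) @ gamma
    return np.einsum('ibic->bc', x.reshape(d_in, d_out, d_in, d_out))


# Example: qubit amplitude damping channel with damping parameter g (arbitrary value).
g = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - g)]]),
         np.array([[0, np.sqrt(g)], [0, 0]])]
gamma = choi_operator(kraus, 2)
```

In this sketch, complete positivity shows up as `gamma` having no negative eigenvalues, and trace preservation as `partial_trace_B(gamma, 2, 2)` being the identity on $R$.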

A.3 Semi-definite programs for diamond distance

The normalized diamond distance between quantum channels $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ is given by the following primal and dual semi-definite programs [96]:

[TABLE]

The latter expression is equal to

[TABLE]

A.4 Choi isomorphism for quantum superchannels

Just as there is a Choi isomorphism for quantum channels, as reviewed in Appendix A.2, there is a Choi isomorphism for quantum superchannels [28, 49]. To define it, we can exploit the known result that a quantum superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ is in one-to-one correspondence with a bipartite channel $\mathcal{L}_{CB\rightarrow AD}$ that has no-signaling constraints [28, 49]. That is, as stated in (15), every superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ can be physically realized by means of pre- and post-processing channels $\mathcal{E}_{C\rightarrow AM}$ and $\mathcal{D}_{BM\rightarrow D}$ , respectively, such that (15) holds. The bipartite channel corresponding to $\mathcal{E}_{C\rightarrow AM}$ and $\mathcal{D}_{BM\rightarrow D}$ is then given by

[TABLE]

i.e., where we do not “plug in” the channel $\mathcal{N}_{A\rightarrow B}$ to the ports $A$ and $B$ , and instead system $B$ is available as input and system $A$ is available as output. On the other hand, suppose that $\mathcal{L}_{CB\rightarrow AD}$ is a bipartite channel with the constraint that it is no-signaling from input system $B$ to output system $A$ . Then there exist channels $\mathcal{E}_{C\rightarrow AM}$ and $\mathcal{D}_{BM\rightarrow D}$ such that $\mathcal{L}_{CB\rightarrow AD}$ can be realized as in (209), as proved in [40], placing superchannels in one-to-one correspondence with bipartite channels that have a no-signaling constraint.

Using this correspondence, we define the Choi operator of a superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ with corresponding $B\not\rightarrow A$ no-signaling bipartite channel $\mathcal{L}_{CB\rightarrow AD}$ as

[TABLE]

The fact that $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ preserves complete positivity corresponds to the condition $\Gamma_{R_{C}R_{B}AD}^{\Theta}\geq 0$ , and the fact that $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ preserves trace preservation corresponds to the condition $\Gamma_{R_{C}R_{B}}^{\Theta}=I_{R_{C}R_{B}}$ . The no-signaling condition corresponds to $\Gamma_{R_{C}R_{B}A}^{\Theta}=\Gamma_{R_{C}A}^{\Theta}\otimes\pi_{R_{B}}$ , where $\pi_{R_{B}}$ is the maximally mixed state. Furthermore, as an extension of (199), the Choi operator $\Gamma_{R_{C}D}^{\mathcal{K}}$ for the output channel $\mathcal{K}_{C\rightarrow D}$ of the superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ , when the input is a channel $\mathcal{N}_{A\rightarrow B}$ with Choi operator $\Gamma_{R_{A}B}^{\mathcal{N}}$ , is as follows:

[TABLE]

This kind of formulation of a superchannel allows for incorporating optimizations over superchannels into semi-definite programs, as we do in Appendix B.

Appendix B General channel box transformation problem as a semi-definite program

Here we prove the statement claimed at the end of Section IV, that the general channel box transformation problem stated in (19) can be solved by means of a semi-definite program. By employing the Choi representation of superchannels from Appendix A.4, as well as the semi-definite program for the diamond distance in Appendix A.3, we find that (19), as a function of channels $\mathcal{N}_{A\rightarrow B}$ , $\mathcal{M}_{A\rightarrow B}$ , $\mathcal{K}_{C\rightarrow D}$ , and $\mathcal{L}_{C\rightarrow D}$ , can be written as the following semi-definite program:

[TABLE]

subject to

[TABLE]

where we employ the shorthand

[TABLE]

with system $C^{\prime}$ isomorphic to system $C$ and system $A^{\prime}$ isomorphic to system $A$ .

The dual of the semi-definite program in (212)–(213) is given by

[TABLE]

subject to

[TABLE]

By employing strong duality, it follows that the optimal value of (212)–(213) is equal to the optimal value of (217).

Appendix C One-shot distillation and dilution of channel boxes

C.1 Channel min-relative entropy as exact one-shot distillable distinguishability

To establish (37), we first prove the inequality

[TABLE]

Let $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ be the superchannel that traces out the input $C$ , prepares the pure state $\psi_{RA}$ , transmits $A$ through the unknown channel $\mathcal{N}$ or $\mathcal{M}$ , and then applies the following channel to systems $R$ and $B$ , where $B$ is the output of the unknown channel:

[TABLE]

and $\Pi_{RB}^{\mathcal{N}(\psi)}$ is the projection onto the support of the state $\mathcal{N}_{A\rightarrow B}(\psi_{RA})$ . By construction, if the unknown channel is $\mathcal{N}_{A\rightarrow B}$ , then the channel realized by the superchannel delineated above is the replacer channel $\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|}$ . On the other hand, note that if $\omega_{RB}=\mathcal{M}_{A\rightarrow B}(\psi_{RA})$ is the input to the channel in (223), then the output is the state $\pi_{M}$ , where

[TABLE]

So the output in this latter case is the replacer channel $\mathcal{R}_{C\rightarrow D}^{\pi_{M}}$ . Taking a supremum over all input states $\psi_{RA}$ then establishes the inequality in (222).
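The core of the distillation argument above can be illustrated at the level of the two output states for one fixed input $\psi_{RA}$. The sketch below is an assumption-laden illustration rather than the paper's construction: it uses the standard expression $-\log_{2}\operatorname{Tr}[\Pi\sigma]$ for the min-relative entropy of states, where $\Pi$ projects onto the support of the first state, and it does not reproduce the post-processing channel in (223). The function names and example states are hypothetical.

```python
import numpy as np


def support_projector(rho, tol=1e-10):
    """Projection onto the support of a positive semi-definite operator."""
    vals, vecs = np.linalg.eigh(rho)
    kept = vecs[:, vals > tol]
    return kept @ kept.conj().T


def d_min(rho, sigma):
    """-log2 Tr[Pi_rho sigma]; diverges when the supports are orthogonal."""
    overlap = np.real(np.trace(support_projector(rho) @ sigma))
    return float(-np.log2(overlap))


# Stand-ins for N(psi) and M(psi) with one fixed input psi (arbitrary example).
rho_out = np.diag([1.0, 0.0])      # pure output state |0><0|
sigma_out = np.diag([0.25, 0.75])  # output state under the alternative channel

pi = support_projector(rho_out)
accept_rho = np.real(np.trace(pi @ rho_out))      # test always accepts the first state
accept_sigma = np.real(np.trace(pi @ sigma_out))  # equals 1/M for the second state
M = 1.0 / accept_sigma
```

Here the projective test accepts the first state with certainty and the second with probability $1/M$, so $\log_{2}M$ bits of asymmetric distinguishability are distilled for this particular input; the channel quantity then follows by optimizing over inputs.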

The opposite inequality

[TABLE]

follows from the data processing inequality for $D_{\min}(\mathcal{N}\|\mathcal{M})$ under the action of a superchannel. Let $\Theta$ be an arbitrary superchannel satisfying

[TABLE]

Then it follows from (190) that

[TABLE]

where the second-to-last equality follows from (192), given that pairs of replacer channels are environment seizable, and the last equality follows by direct evaluation. Since the exact distillable distinguishability involves an optimization over all superchannels, the inequality in (225) follows, and combined with (222), we conclude (37).

C.2 Channel max-relative entropy as exact one-shot distinguishability cost

To establish (38), we first prove the inequality

[TABLE]

Recall the characterization of $D_{\max}(\mathcal{N}\|\mathcal{M})$ from (34). Let $\lambda$ be such that

[TABLE]

Then this means that $2^{\lambda}\mathcal{M}_{A\rightarrow B}(\Phi_{RA})-\mathcal{N}_{A\rightarrow B}(\Phi_{RA})\geq 0$ , so that

[TABLE]

is a quantum state. Furthermore, since

[TABLE]

where $\pi_{R}$ is the maximally mixed state on system $R$ , it follows that $\omega_{RB}$ is the Choi state of a quantum channel $\mathcal{N}_{A\rightarrow B}^{\prime}$ , so that

[TABLE]

Furthermore, by linearity, we have that

[TABLE]

Then we construct the superchannel $\Theta_{\left(C\rightarrow D\right)\rightarrow\left(A\rightarrow B\right)}$ as follows. Let $\tau_{C}$ be a fixed state that is input to the unknown replacer channel $\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|}$ or $\mathcal{R}_{C\rightarrow D}^{\pi_{M}}$ , where $M=2^{\lambda}$ . Then we perform the following channel on the output system $D$ and the input system $A$ :

[TABLE]

In the case that the unknown channel is $\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|}$ , the channel realized by this process is $\mathcal{N}_{A\rightarrow B}$ . In the case that the unknown channel is $\mathcal{R}_{C\rightarrow D}^{\pi_{M}}$ , the channel realized by this process is

[TABLE]

demonstrating that

[TABLE]

Now taking an infimum over all $\lambda$ such that (233) holds, we conclude the inequality in (232).
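The dilution construction rests on a convex decomposition of the second operator into the first operator plus a complementary state. The sketch below checks this numerically for a pair of density matrices standing in for the Choi states of $\mathcal{N}$ and $\mathcal{M}$. It assumes that the complementary operator has the form $(2^{\lambda}\sigma-\rho)/(2^{\lambda}-1)$, which is inferred from the positivity condition stated above and is not copied from the elided equations. It also assumes that $\sigma$ is full rank and that $\rho\neq\sigma$, so that $2^{\lambda}>1$. All names are hypothetical.

```python
import numpy as np


def d_max(rho, sigma):
    """log2 of the smallest t such that t * sigma - rho >= 0 (sigma full rank)."""
    vals, vecs = np.linalg.eigh(sigma)
    inv_sqrt = (vecs * vals**-0.5) @ vecs.conj().T
    t = np.max(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt))
    return float(np.log2(t))


def complementary_operator(rho, sigma, lam):
    """omega = (2^lam sigma - rho) / (2^lam - 1); requires 2^lam > 1."""
    m = 2.0**lam
    return (m * sigma - rho) / (m - 1.0)


# Stand-ins for the Choi states of the two channels (arbitrary example).
rho = np.array([[0.7, 0.2], [0.2, 0.3]])
sigma = np.diag([0.5, 0.5])

lam = d_max(rho, sigma)
m = 2.0**lam
omega = complementary_operator(rho, sigma, lam)

# Mixing rho with weight 1/M and omega with weight 1 - 1/M should return sigma.
mixture = rho / m + (1.0 - 1.0 / m) * omega
```

If the assumptions hold, `omega` is a valid state and `mixture` equals `sigma`, mirroring the step in which a replacer channel outputting $\pi_{M}$ selects between $\mathcal{N}$ and the complementary channel $\mathcal{N}^{\prime}$.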

The opposite inequality

[TABLE]

follows from the data processing inequality for $D_{\max}(\mathcal{N}\|\mathcal{M})$ under the action of a superchannel. Let $\Theta$ be an arbitrary superchannel satisfying

[TABLE]

Then it follows from (190) that

[TABLE]

The first equality follows by direct evaluation, and the second follows from (192), given that pairs of replacer channels are environment seizable. Since the exact distinguishability cost involves an optimization over all superchannels, the inequality in (241) follows, and combined with (232), we conclude (38).

C.3 Semi-definite programs for smooth channel min- and max-relative entropies

In this appendix, we prove that the smooth channel min- and max-relative entropies are characterized by semi-definite programs, starting with the former. We note that Proposition 2 below was also found in [41].

Proposition 2

Let $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ be quantum channels and $\varepsilon\in\left[0,1\right]$ . The smooth channel min-relative entropy is given by the following primal semi-definite program:

[TABLE]

The dual semi-definite program is given by

[TABLE]

Proof. By definition, we have that

[TABLE]

where

[TABLE]

This then means that

[TABLE]

Consider that we can restrict the infimum above to being over all pure states $\psi_{RA}$ such that the reduced state $\psi_{R}$ is positive definite, i.e., $\psi_{R}>0$ , due to the fact that the set of all such states is dense in the set of all pure bipartite states. Note that we can write all such states as $\psi_{RA}=X_{R}\Gamma_{RA}X_{R}^{{\dagger}}$ for some operator $X_{R}$ such that $\left|X_{R}\right|>0$ and $\operatorname{Tr}[X_{R}^{{\dagger}}X_{R}]=1$ . Then it follows that

[TABLE]

where we have defined $\Omega_{RB}:=X_{R}^{{\dagger}}\Lambda_{RB}X_{R}$ and $\rho_{R}:=X_{R}^{{\dagger}}X_{R}$ . Then we can rewrite as

[TABLE]

Again using the fact that the set of positive-definite density operators is dense in the set of all density operators, we conclude (248).

The dual SDP is given by (249), and its optimal value is equal to the optimal value of the primal SDP in (248) by strong duality.

Semi-definite programs for the channel min-relative entropy $D_{\min}(\mathcal{N}\|\mathcal{M})$ are recovered by setting $\varepsilon=0$ in (248) and (249).

Proposition 3

Let $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ be quantum channels and $\varepsilon\in\left[0,1\right]$ . The smooth channel max-relative entropy is given by the following primal semi-definite program:

[TABLE]

The dual semi-definite program is given by

[TABLE]

Proof. The primal form in (257) follows from the SDP formulation of the max-relative entropy and the SDP formulation of the diamond distance of two channels in (207). By definition, we have that

[TABLE]

Considering that

[TABLE]

the optimization in (264) below follows by combining these, with $Y_{RB}$ understood as the Choi operator for the channel $\widetilde{\mathcal{N}}$ being optimized:

[TABLE]

The dual program is given by (258), and its optimal value is equal to the optimal value of (257) by strong duality.

Semi-definite programs for the channel max-relative entropy $D_{\max}(\mathcal{N}\|\mathcal{M})$ are recovered by setting $\varepsilon=0$ in (257) and (258).

C.4 Smooth channel min-relative entropy as approximate one-shot distillable distinguishability

In order to establish the equality in (44), we first prove the following inequality:

[TABLE]

Let $\psi_{RA}$ be an arbitrary pure state and $\Lambda_{RB}$ a corresponding measurement operator satisfying $0\leq\Lambda_{RB}\leq I_{RB}$ and

[TABLE]

Let $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ be the superchannel that traces out the input $C$ , prepares the pure state $\psi_{RA}$ , transmits system $A$ through the unknown channel $\mathcal{N}$ or $\mathcal{M}$ , and then applies the following channel $\mathcal{P}_{RB\rightarrow X}$ to systems $R$ and $B$ , where $B$ is the output of the unknown channel:

[TABLE]

With this construction, it follows that both $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}(\mathcal{N}_{A\rightarrow B})$ and $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}(\mathcal{M}_{A\rightarrow B})$ are replacer channels, and we find that

[TABLE]

Furthermore, the following equality holds

[TABLE]

for

[TABLE]

The equality in (268) follows from the reasoning in [101, Appendix F-1]. It then follows that

[TABLE]

Optimizing over all such $\psi_{RA}$ and $\Lambda_{RB}$ satisfying the constraints above, we conclude that

[TABLE]

We now prove the opposite inequality:

[TABLE]

Now let $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ be an arbitrary superchannel satisfying

[TABLE]

Then we find that

[TABLE]

The first inequality follows from (190) and the second equality from (275). The last inequality follows from reasoning similar to that in [101, Appendix F-1]. Let $\Delta(\cdot)=|0\rangle\langle 0|(\cdot)|0\rangle\langle 0|+|1\rangle\langle 1|(\cdot)|1\rangle\langle 1|$ denote the completely dephasing channel. Since $\Theta(\mathcal{N})\approx_{\varepsilon}\mathcal{R}_{C\rightarrow D}^{|0\rangle\langle 0|}$ , we find from the data processing inequality for normalized trace distance and an arbitrary input state $\psi_{RC}$ that

[TABLE]

which implies that $\langle 0|\Theta(\mathcal{N})(\psi_{C})|0\rangle\geq 1-\varepsilon$ for all input states $\psi_{RC}$ . Thus, we can take $\Lambda_{RD}=I_{R}\otimes|0\rangle\langle 0|_{D}$ in the definition of $D_{\min}^{\varepsilon}(\Theta(\mathcal{N})\|\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ , and we have that $\operatorname{Tr}[\Lambda_{RD}\Theta(\mathcal{N})(\psi_{RC})]\geq 1-\varepsilon$ while $\operatorname{Tr}[\Lambda_{RD}\mathcal{R}_{C\rightarrow D}^{\pi_{M}}(\psi_{RC})]=1/M$ for all input states $\psi_{RC}$ . Since $D_{\min}^{\varepsilon}(\Theta(\mathcal{N})\|\mathcal{R}_{C\rightarrow D}^{\pi_{M}})$ involves an optimization over all measurement operators $\Lambda_{RD}$ and states $\psi_{RC}$ satisfying $\operatorname{Tr}[\Lambda_{RD}\Theta(\mathcal{N})(\psi_{RC})]\geq 1-\varepsilon$ , we conclude the inequality in (278). Since the inequality holds for an arbitrary superchannel $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ satisfying (274)–(275), we conclude (273).

Putting together (265) and (273), we conclude the equality in (44), i.e., $D_{\min}^{\varepsilon}(\mathcal{N}\|\mathcal{M})=D_{d}^{\varepsilon}(\mathcal{N},\mathcal{M})$ .

C.5 Smooth channel max-relative entropy as approximate one-shot distinguishability cost

In order to establish the equality in (45), we first prove the following inequality:

[TABLE]

Let $\widetilde{\mathcal{N}}_{A\rightarrow B}$ be a quantum channel satisfying $\widetilde{\mathcal{N}}_{A\rightarrow B}\approx_{\varepsilon}\mathcal{N}_{A\rightarrow B}$ . Then by constructing a superchannel as we did in Appendix C.2, but for $\widetilde{\mathcal{N}}_{A\rightarrow B}$ instead of $\mathcal{N}_{A\rightarrow B}$ , we conclude the following inequality:

[TABLE]

Then taking the infimum over all such channels $\widetilde{\mathcal{N}}_{A\rightarrow B}$ satisfying $\widetilde{\mathcal{N}}_{A\rightarrow B}\approx_{\varepsilon}\mathcal{N}_{A\rightarrow B}$ , we conclude the inequality in (284).

For the opposite inequality

[TABLE]

let $\Theta$ be an arbitrary superchannel satisfying

[TABLE]

Then consider that

[TABLE]

The second equality follows from (192), given that pairs of replacer channels are environment seizable. The first inequality follows from (190). The last inequality follows from the definition in (43). Since the chain of inequalities holds for all superchannels $\Theta$ satisfying (287)–(288), we conclude (286).

Putting together (284) and (286), we conclude the equality in (45), i.e., $D_{c}^{\varepsilon}(\mathcal{N},\mathcal{M})=D_{\max}^{\varepsilon}(\mathcal{N}\|\mathcal{M})$ .

C.6 Limits of smooth channel min- and max-relative entropy

Here we provide an alternate proof of the limits stated in (48)–(49), starting with (48). These proofs use some of the results from [101, Appendix A-3] as a starting point. Let $\psi_{RA}$ be an arbitrary bipartite state. By the inequality $D_{\min}^{\varepsilon}(\rho\|\sigma)\geq D_{\min}(\rho\|\sigma)$ , which holds for all states $\rho$ and $\sigma$ and $\varepsilon\in(0,1)$ , we conclude that

[TABLE]

for all $\varepsilon\in(0,1)$ . Now taking a supremum over all $\psi_{RA}$ , we find that

[TABLE]

for all $\varepsilon\in(0,1)$ . Taking the limit, we conclude that

[TABLE]

For the other limit, recall the following inequality from [101, Appendix A-3], holding for all states $\rho$ and $\sigma$ , for $\varepsilon\in(0,1)$ , and $\alpha\in(0,1)$ :

[TABLE]

Taking an optimization over all input states $\psi_{RA}$ to the channels $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ , we conclude that

[TABLE]

Taking the limit as $\varepsilon\rightarrow 0$ , we conclude that

[TABLE]

for all $\alpha\in(0,1)$ . Now taking the limit of the left-hand side as $\alpha\rightarrow 0$ , and applying arguments similar to those needed for [33, Lemma 10], we conclude that

[TABLE]

Combining (296) and (300), we conclude the limit stated in (48).

Another proof for the limit in (49) goes as follows. By taking $\widetilde{\mathcal{N}}=\mathcal{N}$ , we conclude that $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ , so that applying definitions gives

[TABLE]

for all $\varepsilon\in(0,1)$ . Then applying a limit gives

[TABLE]

Now suppose that $\widetilde{\mathcal{N}}$ is a channel satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ for $\varepsilon\in(0,1)$ . Then this implies that

[TABLE]

and applying an inequality from [101, Appendix A-3], we find that

[TABLE]

Since this bound holds uniformly for all channels $\widetilde{\mathcal{N}}$ satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ , we conclude that

[TABLE]

Now taking the limit $\varepsilon\rightarrow 0$ , we find that

[TABLE]

Combining (302) and (306), we conclude the limit in (49).

Appendix D Upper bound on smooth max-relative entropy of classical–quantum channels

The main purpose of this appendix is to prove Proposition 4, which establishes an upper bound on the smooth max-relative entropy of classical–quantum channels. We begin by noting a simple lemma:

Lemma 1

Let $\{\rho_{B}^{x}\}_{x\in\mathcal{X}}$ and $\{\sigma_{B}^{x}\}_{x\in\mathcal{X}}$ be the output states of classical–quantum channels $\mathcal{N}_{X\rightarrow B}$ and $\mathcal{M}_{X\rightarrow B}$ , respectively, as defined in (82)–(83). Then we have that

[TABLE]

where the latter equality holds for all $\alpha\in[1/2,1)\cup(1,\infty)$ .

Proof. The second equality follows from [19, Lemma 26]. The proof of the first equality is similar to the proof of the second one. For completeness, we provide a proof. Let $\psi_{RX}$ be an arbitrary pure bipartite quantum state ( $X$ is a quantum system here). Then the states resulting from the action of the classical–quantum channels on this state are as follows:

[TABLE]

where

[TABLE]

Then it follows that

[TABLE]

So we have established a uniform upper bound for any possible bipartite input state. The upper bound is achieved by calculating the value of $x$ that achieves the optimum and inputting $|x\rangle\langle x|_{X}$ to the channel box.
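The reduction to a maximum over classical inputs can be checked numerically in a small case. The sketch below assumes that the first equality of Lemma 1 concerns the max-relative entropy, so that the channel quantity equals $\max_{x}D_{\max}(\rho_{B}^{x}\|\sigma_{B}^{x})$; this reading is an assumption, since the displayed equation is not reproduced here. It compares that maximum with the max-relative entropy of the block-diagonal Choi operators $\sum_{x}|x\rangle\langle x|\otimes\rho_{B}^{x}$ and $\sum_{x}|x\rangle\langle x|\otimes\sigma_{B}^{x}$. All names and example states are hypothetical, and every $\sigma_{B}^{x}$ is taken to be full rank.

```python
import numpy as np


def d_max(rho, sigma):
    """log2 of the smallest t such that t * sigma - rho >= 0 (sigma full rank)."""
    vals, vecs = np.linalg.eigh(sigma)
    inv_sqrt = (vecs * vals**-0.5) @ vecs.conj().T
    t = np.max(np.linalg.eigvalsh(inv_sqrt @ rho @ inv_sqrt))
    return float(np.log2(t))


def cq_choi(states):
    """Block-diagonal operator sum_x |x><x| (x) states[x]."""
    d = states[0].shape[0]
    n = len(states)
    out = np.zeros((n * d, n * d), dtype=complex)
    for x, s in enumerate(states):
        out[x * d:(x + 1) * d, x * d:(x + 1) * d] = s
    return out


# Output states of two hypothetical classical-quantum channels on two inputs.
rhos = [np.array([[0.9, 0.0], [0.0, 0.1]]), np.array([[0.6, 0.1], [0.1, 0.4]])]
sigmas = [np.array([[0.5, 0.0], [0.0, 0.5]]), np.array([[0.5, 0.2], [0.2, 0.5]])]

per_input = [d_max(r, s) for r, s in zip(rhos, sigmas)]
best_x = int(np.argmax(per_input))  # classical input achieving the maximum
via_choi = d_max(cq_choi(rhos), cq_choi(sigmas))
```

Because the Choi operators are block diagonal, the largest eigenvalue of the sandwiched operator is attained on a single block, so `via_choi` coincides with `max(per_input)` and the optimal input is the classical symbol `best_x`.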

The following proposition generalizes [101, Proposition 11] and [1, Theorem 3]. The main proof idea ultimately still has its roots in [63].

Proposition 4

Let $\{\rho_{B}^{x}\}_{x\in\mathcal{X}}$ and $\{\sigma_{B}^{x}\}_{x\in\mathcal{X}}$ be the output states of classical–quantum channels $\mathcal{N}_{X\rightarrow B}$ and $\mathcal{M}_{X\rightarrow B}$ , respectively. Then the following bound holds for all $\alpha>1$ and $\varepsilon\in(0,1)$ :

[TABLE]

Proof. Consider that

[TABLE]

The first inequality follows by restricting the optimization to be over classical–quantum channels. The last equality follows because the objective function $\sum_{x}p(x)\operatorname{Tr}[\Lambda_{B}^{x}\widetilde{\rho}_{B}^{x}]$ is jointly concave with respect to $\{p(x)\}_{x\in\mathcal{X}}$ and $\{\Lambda_{B}^{x}\}_{x\in\mathcal{X}}$ , and it is convex with respect to the states $\{\widetilde{\rho}_{B}^{x}\}_{x\in\mathcal{X}}$ . Also, the sets over which we are optimizing are convex and compact. Thus, the Sion minimax theorem applies [87]. For each operator $\Lambda_{B}^{x}$ , let its spectral decomposition be given as

[TABLE]

Then define the set $\mathcal{S}_{x}$ and the projection $\Pi_{B}^{x}$ as

[TABLE]

The above implies for all $x\in\mathcal{X}$ that

[TABLE]

Then from the data processing inequality of the sandwiched Rényi relative entropy for $\alpha>1$ [44, 13] and by dropping terms, we find that

[TABLE]

which in turn implies that

[TABLE]

for all $x\in\mathcal{X}$ . Then we find that

[TABLE]

for all $x\in\mathcal{X}$ , where

[TABLE]

We define the states

[TABLE]

and we note that $F(\widetilde{\rho}_{B}^{x},\rho_{B}^{x})\geq 1-\varepsilon^{2}$ implies

[TABLE]

for all $x\in\mathcal{X}$ . Then we find that

[TABLE]

So this means that we have the following bound holding for all $x\in\mathcal{X}$ :

[TABLE]

From this, we conclude (317).

Appendix E Bounds for general one-shot or $n$-shot parallel channel box transformations

In this appendix, we establish some bounds for general channel box transformations, by generalizing the results of [101] from states to channels. We begin with the following proposition:

Proposition 5

Let $\mathcal{N}_{A\rightarrow B}^{0}$ , $\mathcal{N}_{A\rightarrow B}^{1}$ , and $\mathcal{M}_{A\rightarrow B}$ be channels such that $D_{\max}(\mathcal{N}^{0}\|\mathcal{M})<\infty$ . Then for $\alpha\in(1/2,1)$ and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ , the following inequality holds

[TABLE]

where $\widetilde{D}_{\beta}(\mathcal{N}^{0}\|\mathcal{M})$ and $\widetilde{D}_{\alpha}(\mathcal{N}^{1}\|\mathcal{M})$ are channel sandwiched Rényi relative entropies and $F(\mathcal{N}^{0},\mathcal{N}^{1})$ is the channel fidelity, each of which is defined from (188) and the underlying functions of states.

Proof. Recall the following inequality from [101, Lemma 1] for states $\rho_{0}$ , $\rho_{1}$ , and $\sigma$ :

[TABLE]

Let $\psi_{RA}$ be a pure bipartite state. Applying the above inequality for states, we find that

[TABLE]

Taking a supremum over all input states $\psi_{RA}$ on the left-hand side, and an infimum on the right-hand side, we find that

[TABLE]

Since the above inequality holds for all input states $\psi_{RA}$ , we finally take another supremum to conclude (345).
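The state inequality that drives this proof can be checked numerically. The sketch below assumes that the inequality recalled from [101, Lemma 1] has the form $\widetilde{D}_{\beta}(\rho_{0}\|\sigma)-\widetilde{D}_{\alpha}(\rho_{1}\|\sigma)\geq\frac{\alpha}{1-\alpha}\log F(\rho_{0},\rho_{1})$ with $F(\rho_{0},\rho_{1})=\|\sqrt{\rho_{0}}\sqrt{\rho_{1}}\|_{1}^{2}$ (the displayed equation is not reproduced here), and it uses base-2 logarithms and full-rank random states. All function names are hypothetical and serve only this illustration.

```python
import numpy as np

def psd_power(a, p):
    """Return a**p for a Hermitian positive-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def random_state(dim, rng):
    """Random density matrix, mixed with the maximally mixed state to stay full rank."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.9 * rho + 0.1 * np.eye(dim) / dim

def sandwiched_renyi(rho, sigma, alpha):
    """Sandwiched Renyi relative entropy in bits, for full-rank sigma and alpha != 1."""
    s = psd_power(sigma, (1 - alpha) / (2 * alpha))
    m = s @ rho @ s
    m = (m + m.conj().T) / 2  # symmetrize against rounding errors
    q = np.sum(np.clip(np.linalg.eigvalsh(m), 0, None) ** alpha)
    return float(np.log2(q) / (alpha - 1))

def fidelity(rho0, rho1):
    """Fidelity as the squared trace norm of sqrt(rho0) sqrt(rho1)."""
    sv = np.linalg.svd(psd_power(rho0, 0.5) @ psd_power(rho1, 0.5), compute_uv=False)
    return float(np.sum(sv) ** 2)

def lemma_gap(rho0, rho1, sigma, alpha):
    """Left-hand side minus right-hand side of the assumed inequality; should be >= 0."""
    beta = alpha / (2 * alpha - 1)
    lhs = sandwiched_renyi(rho0, sigma, beta) - sandwiched_renyi(rho1, sigma, alpha)
    rhs = alpha / (1 - alpha) * np.log2(fidelity(rho0, rho1))
    return lhs - rhs

rng = np.random.default_rng(0)
gaps = [
    lemma_gap(random_state(3, rng), random_state(3, rng), random_state(3, rng), a)
    for a in (0.6, 0.75, 0.9)
    for _ in range(5)
]
```

The channel statement is then obtained by applying this state inequality to the channel outputs for a fixed input $\psi_{RA}$ and optimizing over inputs, as described in the proof.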

Proposition 6

Let $\mathcal{N}_{A\rightarrow B}^{0}$ , $\mathcal{N}_{A\rightarrow B}^{1}$ , and $\mathcal{M}_{A\rightarrow B}$ be channels such that $D_{\max}(\mathcal{N}^{0}\|\mathcal{M})<\infty$ . Then for $\alpha\in(0,1)$ and $\beta:=\beta(\alpha)=2-\alpha>1$ , the following inequality holds

[TABLE]

where $D_{\beta}(\mathcal{N}^{0}\|\mathcal{M})$ and $D_{\alpha}(\mathcal{N}^{1}\|\mathcal{M})$ are channel Petz–Rényi relative entropies, each of which is defined from (188) and the underlying functions of states.

Proof. This is a consequence of the following inequality from [101, Lemma 4], for states $\rho_{0}$ , $\rho_{1}$ , and $\sigma$ , and the same reasoning as in the proof of Proposition 5:

[TABLE]

concluding the proof.

We can then use the above bounds for channels to establish converse bounds for general channel box transformation protocols.

Proposition 7

Let $\mathcal{N}_{A\rightarrow B}$ , $\mathcal{M}_{A\rightarrow B}$ , $\mathcal{K}_{C\rightarrow D}$ , $\mathcal{L}_{C\rightarrow D}$ be quantum channels, and let $\Theta_{\left(A\rightarrow B\right)\rightarrow\left(C\rightarrow D\right)}$ be a superchannel such that $\Theta(\mathcal{M})=\mathcal{L}$ . For $\alpha\in(1/2,1)$ and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ , the following inequality holds

[TABLE]

and for $\alpha^{\prime}\in(0,1)$ and $\beta^{\prime}:=\beta^{\prime}(\alpha^{\prime}):=2-\alpha^{\prime}\in(1,2)$ , the following inequality holds

[TABLE]

Proof. As a consequence of the data processing inequality for channel divergences with respect to superchannels, we find that

[TABLE]

where the last inequality follows from Proposition 5.

The inequality in (352) follows similarly from data processing but then using Proposition 6.
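The two parameter maps used in these bounds are easy to tabulate. The following sketch only illustrates the relations stated above, namely $\beta=\alpha/(2\alpha-1)$ for $\alpha\in(1/2,1)$ and $\beta^{\prime}=2-\alpha^{\prime}$ for $\alpha^{\prime}\in(0,1)$; the function names are hypothetical. Exact rational arithmetic makes the identity $1/\alpha+1/\beta=2$, which follows directly from the definition of $\beta$, visible without rounding.

```python
from fractions import Fraction

def beta_sandwiched(alpha):
    """Map alpha in (1/2, 1) to beta = alpha / (2 alpha - 1) > 1."""
    if not (Fraction(1, 2) < alpha < 1):
        raise ValueError("alpha must lie in the open interval (1/2, 1)")
    return alpha / (2 * alpha - 1)

def beta_petz(alpha_prime):
    """Map alpha' in (0, 1) to beta' = 2 - alpha' in (1, 2)."""
    if not (0 < alpha_prime < 1):
        raise ValueError("alpha' must lie in the open interval (0, 1)")
    return 2 - alpha_prime

alpha = Fraction(3, 4)
beta = beta_sandwiched(alpha)             # Fraction(3, 2)
conjugacy = 1 / alpha + 1 / beta          # equals 2 for every admissible alpha
beta_prime = beta_petz(Fraction(1, 4))    # Fraction(7, 4)
```

As $\alpha\to1/2$ the first map sends $\beta\to\infty$, and as $\alpha\to1$ it sends $\beta\to1$, whereas the second map always stays inside $(1,2)$.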

We can now use these one-shot bounds to establish converse bounds on the rate at which it is possible to convert the $n$-fold channel box $(\mathcal{N}^{\otimes n},\mathcal{M}^{\otimes n})$ to the $m$-fold channel box $(\mathcal{K}^{\otimes m},\mathcal{L}^{\otimes m})$.

Proposition 8

Let channels $\mathcal{N}_{A\rightarrow B}$ , $\mathcal{M}_{A\rightarrow B}$ , $\mathcal{K}_{C\rightarrow D}$ , $\mathcal{L}_{C\rightarrow D}$ be given and suppose that there exists an $(n,m,\varepsilon)$ channel box transformation protocol (i.e., a superchannel $\Theta^{(n)}$ such that $\Theta^{(n)}(\mathcal{N}^{\otimes n})\approx_{\varepsilon}\mathcal{K}^{\otimes m}$ and $\Theta^{(n)}(\mathcal{M}^{\otimes n})=\mathcal{L}^{\otimes m}$ ). Then for $\alpha\in(1/2,1)$ and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ , the following bound holds

[TABLE]

For $\alpha^{\prime}\in(0,1)$ and $\beta^{\prime}:=\beta^{\prime}(\alpha^{\prime}):=2-\alpha^{\prime}\in(1,2)$ , the following bound holds

[TABLE]

In the above,

[TABLE]

with a similar definition for the other quantities.

Proof. Applying Proposition 7, we conclude that

[TABLE]

The second inequality follows from the fact that

[TABLE]

The other inequality follows from similar reasoning but instead using data processing and (352).

Corollary 1

Let $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ be channel boxes such that

[TABLE]

for $n,m\geq 1$ , $\alpha\in(1/2,1)$ , and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ . Then the following bound applies to an $(n,m,\varepsilon)$ general channel box transformation protocol:

[TABLE]

Alternatively, suppose that $(\mathcal{N},\mathcal{M})$ and $(\mathcal{K},\mathcal{L})$ satisfy

[TABLE]

for $n,m\geq 1$ , $\alpha^{\prime}\in(0,1)$ , and $\beta^{\prime}:=\beta^{\prime}(\alpha^{\prime}):=2-\alpha^{\prime}\in(1,2)$ . Then the following bound holds

[TABLE]

Proof. This is a direct consequence of Proposition 8 and the additivity relations assumed in (362)–(363) and (365)–(366).

Remark 2

The desired additivity relations in (362)–(363) and (365)–(366) hold for channel boxes that are classical–quantum or environment seizable [19]. Thus, by applying reasoning similar to that given in [101, Appendix J], we conclude the following strong converse bound for these channel boxes:

[TABLE]

The lower bound (achievability)

[TABLE]

follows from combining a distillation protocol with a dilution protocol for these channel boxes, as well as reasoning similar to that given in [101, Appendix J], and along with the fact that the rates $D(\mathcal{N}\|\mathcal{M})$ and $D(\mathcal{K}\|\mathcal{L})$ are achievable for these tasks and these channel boxes. Thus, the asymptotic parallel box transformation problem has a simple solution for these channel boxes.

Appendix F Bounding the smooth channel max-relative entropy in terms of channel relative entropies

In this appendix, we provide lower bounds for the smooth channel max-relative entropy in terms of the channel sandwiched and Petz–Rényi relative entropies.

Proposition 9

Let $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ be quantum channels. Then the following bound holds for all $\alpha\in[1/2,1)$ and $\varepsilon\in[0,1)$ :

[TABLE]

Proof. First fix $\alpha\in(1/2,1)$ . Then pick $\widetilde{\mathcal{N}}$ to be a channel such that $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ . We find for $\beta:=\alpha/(2\alpha-1)$ that

[TABLE]

The first inequality follows from the fact that the sandwiched Rényi relative entropies are monotone non-decreasing in $\alpha$ [77] and $\lim_{\alpha\to\infty}\widetilde{D}_{\alpha}=D_{\max}$ [77]. The second inequality follows from Proposition 5. The final inequality follows because

[TABLE]

Since the bound holds for an arbitrary channel $\widetilde{\mathcal{N}}$ satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ , we conclude (370). The inequality for $\alpha=1/2$ follows by taking a limit.

Another lower bound on the smooth channel max-relative entropy is as follows:

Proposition 10

Let $\mathcal{N}_{A\rightarrow B}$ and $\mathcal{M}_{A\rightarrow B}$ be quantum channels. Then the following bound holds for all $\alpha\in[0,1)$ and $\varepsilon\in[0,1)$ :

[TABLE]

Proof. First fix $\alpha\in(0,1)$ . Then pick $\widetilde{\mathcal{N}}$ to be a channel such that $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ . We find for $\beta:=2-\alpha$ that

[TABLE]

The first inequality follows from the fact that $D_{\max}\geq D_{2}$ [64, 101] and the Petz–Rényi relative entropies are monotone with respect to $\beta$ [89]. The second inequality follows from Proposition 6. Since the bound holds for an arbitrary channel $\widetilde{\mathcal{N}}$ satisfying $\widetilde{\mathcal{N}}\approx_{\varepsilon}\mathcal{N}$ , we conclude (376). The inequality for $\alpha=0$ follows by taking a limit.

Appendix G Quantum strategy and sequential channel box transformations

G.1 Bounds for general $n$-round strategy box transformations

In this appendix, we provide bounds for general $n$-round strategy box transformations. These bounds are similar in some regards to those given in Appendix E, following essentially the same line of reasoning to establish them.

Proposition 11

Let $\mathcal{N}^{(n)}$ , $\mathcal{L}^{(n)}$ , and $\mathcal{M}^{(n)}$ be quantum strategies such that $D_{\max}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})<\infty$ . Then for $\alpha\in(1/2,1)$ and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ , the following inequality holds

[TABLE]

where $F(\mathcal{N}^{(n)},\mathcal{L}^{(n)})$ is the strategy fidelity of [50].

Proof. Recall the following inequality from [101, Lemma 1] for states $\rho_{0}$ , $\rho_{1}$ , and $\sigma$ :

[TABLE]

Applying the above inequality for states, and defining $\tau_{R_{n}B_{n}}$ from $\mathcal{L}^{(n)}$ in an analogous fashion to $\rho_{R_{n}B_{n}}$ and $\sigma_{R_{n}B_{n}}$ in (107) and (108), respectively, we find that

[TABLE]

Taking a supremum over all co-strategies $\{\rho_{R_{1}A_{1}},\{\mathcal{A}_{R_{i}B_{i}\rightarrow R_{i+1}A_{i+1}}^{i}\}_{i=1}^{n-1}\}$ on the left-hand side, and an infimum on the right-hand side, we find that

[TABLE]

Since the above inequality holds for all co-strategies $\{\rho_{R_{1}A_{1}},\{\mathcal{A}_{R_{i}B_{i}\rightarrow R_{i+1}A_{i+1}}^{i}\}_{i=1}^{n-1}\}$ , we finally take another supremum to conclude (380).

Proposition 12

Let $\mathcal{N}^{(n)}$ , $\mathcal{L}^{(n)}$ , and $\mathcal{M}^{(n)}$ be quantum strategies such that $D_{\max}(\mathcal{N}^{(n)}\|\mathcal{M}^{(n)})<\infty$ . Then for $\alpha\in(0,1)$ and $\beta:=\beta(\alpha)=2-\alpha>1$ , the following inequality holds

[TABLE]

where $\left\|\mathcal{N}^{(n)}-\mathcal{L}^{(n)}\right\|_{\diamond n}$ denotes the quantum strategy distance of [27, 29, 52].

Proof. This is a consequence of the following inequality from [101, Lemma 4], for states $\rho_{0}$ , $\rho_{1}$ , and $\sigma$ , and the same reasoning as in the proof of Proposition 11:

[TABLE]

concluding the proof.

We can then use the above bounds for quantum strategies to establish converse bounds for general strategy box transformation protocols.

Proposition 13

Let $\mathcal{N}^{(n)}$ , $\mathcal{M}^{(n)}$ , $\mathcal{K}^{(m)}$ , $\mathcal{L}^{(m)}$ be quantum strategies, and let $\Theta^{(n\rightarrow m)}$ be a physical transformation such that $\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})=\mathcal{L}^{(m)}$ . For $\alpha\in(1/2,1)$ and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ , the following inequality holds

[TABLE]

For $\alpha^{\prime}\in(0,1)$ and $\beta^{\prime}:=\beta^{\prime}(\alpha^{\prime}):=2-\alpha^{\prime}\in(1,2)$ , the following inequality holds

[TABLE]

Proof. As a consequence of the data processing inequality for the quantum strategy divergence with respect to physical transformations (Theorem 4), we find that

[TABLE]

where the last inequality follows from Proposition 11.

The inequality in (387) follows similarly from data processing but then using Proposition 12.

We can now use these bounds to establish converse bounds on the rate at which it is possible to convert the quantum strategy box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ to the strategy box $(\mathcal{K}^{(m)},\mathcal{L}^{(m)})$ .

Proposition 14

Let quantum strategies $\mathcal{N}^{(n)}$ , $\mathcal{M}^{(n)}$ , $\mathcal{K}^{(m)}$ , $\mathcal{L}^{(m)}$ be given and suppose that there exists an $(n,m,\varepsilon)$ strategy box transformation protocol (i.e., a physical transformation $\Theta^{(n\rightarrow m)}$ such that $\Theta^{(n\rightarrow m)}(\mathcal{N}^{(n)})\approx_{\varepsilon}\mathcal{K}^{(m)}$ and $\Theta^{(n\rightarrow m)}(\mathcal{M}^{(n)})=\mathcal{L}^{(m)}$ ). Then for $\alpha\in(1/2,1)$ and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$ , the following bound holds

[TABLE]

For $\alpha^{\prime}\in(0,1)$ and $\beta^{\prime}:=\beta^{\prime}(\alpha^{\prime}):=2-\alpha^{\prime}\in(1,2)$ , the following bound holds

[TABLE]

Proof. Applying Proposition 13, we conclude that

[TABLE]

The second inequality follows from the fact that [50]

[TABLE]

The other inequality follows from similar reasoning but instead using data processing and (387).

Corollary 2

Let $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ and $(\mathcal{K}^{(m)},\mathcal{L}^{(m)})$ be sequential channel boxes such that

[TABLE]

for $n,m\geq 1$, $\alpha\in(1/2,1)$, and $\beta:=\beta(\alpha)=\alpha/\left(2\alpha-1\right)>1$. Then the following bound applies to an $(n,m,\varepsilon)$ strategy box transformation protocol:

[TABLE]

Alternatively, suppose that $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ and $(\mathcal{K}^{(m)},\mathcal{L}^{(m)})$ satisfy

[TABLE]

for $n,m\geq 1$ , $\alpha^{\prime}\in(0,1)$ , and $\beta^{\prime}:=\beta^{\prime}(\alpha^{\prime}):=2-\alpha^{\prime}\in(1,2)$ . Then the following bound holds

[TABLE]

Proof. This is a direct consequence of Proposition 14 and the relations assumed in (397)–(398) and (400)–(401).

G.2 Sequential channel box transformations and amortized channel divergence

In [19], the notion of amortized channel divergence of a channel box $(\mathcal{N},\mathcal{M})$ was introduced as

[TABLE]

where the optimization is with respect to input states $\rho_{RA}$ and $\sigma_{RA}$ , and the system $R$ has unbounded dimension. The intuition behind this quantity is that it represents the largest net distinguishability that can be generated by the channels $\mathcal{N}$ and $\mathcal{M}$ if we are allowed to start with some distinguishability to begin with, in the form of the state box $(\rho_{RA},\sigma_{RA})$ .
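For classical channels, the idea of net distinguishability can be made concrete. The sketch below assumes that the amortized quantity has the standard form $\sup_{\rho_{RA},\sigma_{RA}}[D(\mathcal{N}(\rho_{RA})\|\mathcal{M}(\sigma_{RA}))-D(\rho_{RA}\|\sigma_{RA})]$ (the displayed equation is not reproduced here). It takes the divergence to be the classical relative entropy in bits, models the channels as column-stochastic matrices, and omits the reference system $R$. The names `kl`, `amortized_objective`, and `channel_divergence` are hypothetical.

```python
import numpy as np

def kl(p, q):
    """Relative entropy D(p||q) in bits; assumes the support of p lies inside that of q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def amortized_objective(N, M, p, q):
    """Output distinguishability minus the distinguishability invested in the inputs."""
    return kl(N @ p, M @ q) - kl(p, q)

def channel_divergence(N, M):
    """Unamortized value for classical channels: best single input symbol."""
    return max(kl(N[:, x], M[:, x]) for x in range(N.shape[1]))

# Columns are the conditional output distributions for each input symbol (illustrative data).
N = np.array([[0.9, 0.2], [0.1, 0.8]])
M = np.array([[0.5, 0.5], [0.5, 0.5]])

point = np.array([1.0, 0.0])                      # identical point-mass inputs cost nothing
at_point = amortized_objective(N, M, point, point)
bound = channel_divergence(N, M)

rng = np.random.default_rng(0)
samples = [
    amortized_objective(N, M, rng.dirichlet(np.ones(2)), rng.dirichlet(np.ones(2)))
    for _ in range(50)
]
```

In this classical toy case no sampled input pair exceeds `bound`, which reflects the amortization collapse for classical channels: starting with distinguishable inputs does not increase the net distinguishability generated.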

Suppose now that we have a sequential channel box $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ , where $\mathcal{N}^{(n)}$ consists of a sequence of $n$ uses of $\mathcal{N}$ and $\mathcal{M}^{(n)}$ consists of a sequence of $n$ uses of $\mathcal{M}$ . As stated earlier, this sequential channel box is a special kind of strategy box. Then by employing the same reasoning as in the proof of [19, Lemma 14], we conclude that the amortized channel divergence is an upper bound on the normalized strategy divergence of $(\mathcal{N}^{(n)},\mathcal{M}^{(n)})$ :

[TABLE]

For some channel boxes and choices of divergences, the inequality in (404) is saturated as a consequence of the amortized channel divergence collapsing to the usual channel divergence [19]. This occurs for all classical–quantum or environment-seizable channel boxes paired up with the Petz–Rényi relative entropy, the sandwiched Rényi relative entropy, or the quantum relative entropy [19]. Thus, for these channels, the desired relations in (397)–(398) and (400)–(401) hold, so that the bounds in (399) and (402) hold for these channels.

Bibliography

[1] Anurag Anshu, Mario Berta, Rahul Jain, and Marco Tomamichel. A minimax approach to one-shot entropy inequalities. June 2019. arXiv:1906.00333v1.
[2] Antonio Acin. Statistical distinguishability between unitary operations. Physical Review Letters, 87(17):177901, October 2001. arXiv:quant-ph/0102064.
[3] P. M. Alberti and A. Uhlmann. A problem relating to positive linear maps on matrix algebras. Reports on Mathematical Physics, 18(2):163–176, October 1980.
[4] Fernando G. S. L. Brandão, Michal Horodecki, Nelly Ng, Jonathan Oppenheim, and Stephanie Wehner. The second laws of quantum thermodynamics. Proceedings of the National Academy of Sciences, 112(11):3275–3279, March 2015. arXiv:1305.5278.
[5] Mario Berta, Fernando G. S. L. Brandão, Matthias Christandl, and Stephanie Wehner. Entanglement cost of quantum channels. IEEE Transactions on Information Theory, 59(10):6779–6795, October 2013. arXiv:1108.5357.
[6] Mario Berta, Matthias Christandl, and Renato Renner. The quantum reverse Shannon theorem based on one-shot information theory. Communications in Mathematical Physics, 306(3):579–615, August 2011. arXiv:0912.3805.
[7] Francesco Buscemi and Nilanjana Datta. The quantum capacity of channels with arbitrarily correlated noise. IEEE Transactions on Information Theory, 56(3):1447–1460, March 2010. arXiv:0902.0158.
[8] Fernando G. S. L. Brandão and Nilanjana Datta. One-shot rates for entanglement manipulation under non-entangling maps. IEEE Transactions on Information Theory, 57(3):1754–1760, March 2011. arXiv:0905.2673.
