Efficient accessible bounds to the classical capacity of quantum channels
Chiara Macchiavello, Massimiliano F. Sacchi

TL;DR
This paper introduces a practical method to estimate lower bounds on the classical capacity of quantum channels using minimal measurements and classical optimization, without prior channel knowledge.
Contribution
It provides a novel approach to assess quantum channel capacity efficiently without full process tomography or prior information.
Findings
Effective for various noisy quantum channels
Requires only local measurements and classical optimization
Does not depend on prior channel information
Abstract
We present a method to detect lower bounds to the classical capacity of quantum communication channels by means of few local measurements (i.e. without complete process tomography), reconstruction of sets of conditional probabilities, and classical optimisation. The method does not require any a priori information about the channel. We illustrate its performance for significant forms of noisy channels.
Chiara Macchiavello
Quit group, Dipartimento di Fisica, Università di Pavia, via A. Bassi 6, I-27100 Pavia, Italy
Istituto Nazionale di Fisica Nucleare, Gruppo IV, via A. Bassi 6, I-27100 Pavia, Italy
Massimiliano F. Sacchi
Istituto di Fotonica e Nanotecnologie - CNR, Piazza Leonardo da Vinci 32, I-20133, Milano, Italy
Quit group, Dipartimento di Fisica, Università di Pavia, via A. Bassi 6, I-27100 Pavia, Italy
The classical capacity of a noisy quantum communication channel quantifies the maximum amount of classical information per channel use that can be reliably transmitted NC00 . In general, its computation is a hard task, since it requires a regularisation procedure over an infinite number of channel uses hol0 ; sw ; hol , and it is therefore by itself not directly accessible experimentally. Its analytical value is known mainly for some channels that have the property of additivity, since regularisation is not needed in this case. In fact, in such case the problem is recast to the evaluation of the Holevo capacity hol0 ; sw ; hol , which is a single-letter expression quantifying the maximum information when only product states are sent through the uses of the channel.
When complete knowledge of the channel is available, several methods can be used to calculate the Holevo capacity eff ; eff2 ; eff3 ; eff4 ; eff5 , which is always a lower bound to the ultimate capacity of the channel. In many practical situations, however, complete knowledge of the kind of noise present along the channel is not available, and sometimes the noise can be completely unknown. It is then important to develop efficient means to establish whether in these situations the channel can still be profitably employed for information transmission. A standard method to establish this relies on quantum process tomography nielsen97 ; pcz ; mls ; dlp ; alt ; cnot ; vibr ; ion ; mohseni ; irene ; atom , where a complete reconstruction of the completely positive map describing the action of the channel can be achieved, but it is a demanding procedure in terms of the number of different measurement settings needed, since this number scales as $d^4$ for a finite $d$-dimensional quantum system.
When one is not interested in reconstructing the complete form of the noise but only in detecting lower bounds to the classical capacity, a novel and less demanding procedure in terms of measurements is presented in this Letter. In the same spirit as is done, for example, for the detection of the entanglement-breaking property qchanndet ; qchanndet2 or non-Markovianity nomadet of quantum channels, or for the detection of lower bounds to the quantum capacity ms16 , the method we present allows one to detect lower bounds to the classical capacity experimentally by means of a number of local measurements that scales at most as $d^2$, with $d$ the dimension of the system. The method proposed in Ref. ms16 for detecting bounds to the quantum capacity can be applied to generally unknown noisy channels and has proved very successful for many examples of single-qubit channels, for generalized Pauli channels in arbitrary dimension, and for two-qubit memory Pauli and amplitude damping channels ms-corr . Moreover, its first experimental demonstration has recently been reported in Ref. exp , based on a quantum optical implementation for various forms of noisy channels.
In principle, since the procedure of Ref. ms16 allows one to experimentally reconstruct bounds on the quantum capacity $Q$, private capacity $P$, and entanglement-assisted capacity $C_E$, and since for the classical capacity $C$ generally one has mixed ; chain the chain of inequalities $Q \le P \le C \le C_E$, the method to measure a lower bound to $Q$ also gives a lower bound to the classical capacity. Unfortunately, such a bound is in general very loose. In the present Letter we will show a more effective method constructed specifically to measure a lower bound to the classical capacity.
Quantum channels are described by completely positive and trace preserving maps $\mathcal{E}$, which can be expressed in the Kraus form NC00 as $\mathcal{E}(\rho)=\sum_k A_k\rho A_k^\dagger$, where $\rho$ is the density operator of the quantum system on which the channel acts and the Kraus operators $\{A_k\}$ fulfill the constraint $\sum_k A_k^\dagger A_k = I$. The classical capacity $C$ of the channel quantifies the maximum number of bits per channel use that can be reliably transmitted through the noisy quantum channel. It is defined hol0 ; sw ; hol by the regularized expression $C=\lim_{n\to\infty}\chi(\mathcal{E}^{\otimes n})/n$, in terms of the Holevo capacity
$$\chi(\Phi)=\max_{\{p_i,\rho_i\}}\Big\{S\Big[\Phi\Big(\sum_i p_i\rho_i\Big)\Big]-\sum_i p_i\,S[\Phi(\rho_i)]\Big\}, \tag{1}$$
where the maximum is over all possible ensembles $\{p_i,\rho_i\}$ of quantum states, and $S(\rho)=-\mathrm{Tr}[\rho\log_2\rho]$ denotes the von Neumann entropy (we use logarithm to the base 2). The Holevo capacity $\chi\equiv\chi(\mathcal{E})$ is a lower bound for the channel capacity, and corresponds to the maximum information when only product states are sent through the uses of the channel, whereas joint (entangled) measurements are allowed at the output. Then, clearly, the Holevo capacity is also an upper bound for any expression of the mutual information mutu ; mutu2 ; mutu3
$$I(X;Y)=\sum_{x,y}p_x\,p(y|x)\log_2\frac{p(y|x)}{\sum_{x'}p_{x'}\,p(y|x')}, \tag{2}$$
where the probability transition matrix $p(y|x)$ corresponds to the conditional probability of an arbitrary measurement with outcome $y$ at the output for a single use of the channel with input $\rho_x$, and $p_x$ denotes an arbitrary prior probability, which describes the distribution of the encoded alphabet.
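As a minimal sketch of how the mutual information of Eq. (2) can be evaluated from a prior and a transition matrix, the following Python function may help. The function name and the storage convention `transition[x][y]` for $p(y|x)$ (one row per input) are illustrative assumptions, not part of the original text.

```python
import math

def mutual_information(prior, transition):
    """Mutual information in bits for prior[x] = p_x and transition[x][y] = p(y|x)."""
    n_in, n_out = len(transition), len(transition[0])
    # Output marginal: p(y) = sum_x p_x p(y|x)
    marginal = [sum(prior[x] * transition[x][y] for x in range(n_in))
                for y in range(n_out)]
    info = 0.0
    for x in range(n_in):
        for y in range(n_out):
            joint = prior[x] * transition[x][y]
            if joint > 0.0:  # convention 0 log 0 = 0
                info += joint * math.log2(transition[x][y] / marginal[y])
    return info

# Illustrative data: binary symmetric channel with crossover probability 0.1
bsc = [[0.9, 0.1], [0.1, 0.9]]
uniform_info = mutual_information([0.5, 0.5], bsc)
skewed_info = mutual_information([0.8, 0.2], bsc)
```

For this symmetric example the uniform prior gives the larger value, which is the kind of comparison the maximisation over priors formalises.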
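Here, we are interested in detecting a lower bound to the capacity when the number of measurement settings is smaller than the one needed for complete process tomography, going from a single one to $d^2$. The scenario we will focus on to achieve our detection strategy consists of the following steps: prepare a bipartite maximally entangled state $|\phi^+\rangle=\frac{1}{\sqrt d}\sum_{k=0}^{d-1}|k\rangle|k\rangle$ of a system qudit and a noiseless reference qudit, and send it through the channel $\mathcal{E}\otimes\mathcal{I}_R$, where the unknown channel $\mathcal{E}$ acts on the system qudit alone. Then measure locally a number of observables of the form $X_i\otimes X_i^\tau$, where $\tau$ represents the transposition with respect to the fixed basis defined by $|\phi^+\rangle$.
By denoting the eigenvectors of $X_i$ as $\{|\phi^{(i)}_n\rangle\}$ and using the identity pla
$$\mathrm{Tr}\big[(O_1\otimes O_2^\tau)\,(\mathcal{E}\otimes\mathcal{I}_R)(|\phi^+\rangle\langle\phi^+|)\big]=\frac{1}{d}\,\mathrm{Tr}\big[O_1\,\mathcal{E}(O_2)\big], \tag{3}$$
it is straightforward to see that the measurement protocol allows us to reconstruct the conditional probabilities $Q^{(i)}(m|n)=\langle\phi^{(i)}_m|\mathcal{E}(|\phi^{(i)}_n\rangle\langle\phi^{(i)}_n|)|\phi^{(i)}_m\rangle$. We can then write the optimal mutual information for the encoding-decoding scheme based on the basis $\{|\phi^{(i)}_n\rangle\}$ as
$$I^{(i)}=\max_{\{p^{(i)}_n\}}\sum_{n,m}p^{(i)}_n\,Q^{(i)}(m|n)\log_2\frac{Q^{(i)}(m|n)}{\sum_l p^{(i)}_l\,Q^{(i)}(m|l)}. \tag{4}$$
Then, one has the following chain of inequalities
$$C\ \ge\ \chi\ \ge\ C_{DET}\equiv\max_i\{I^{(i)}\}, \tag{5}$$
where $C_{DET}$ is the experimentally accessible bound to the classical capacity, which depends on the chosen set of measured observables labeled by $i$. It is then clear that the bound can only improve when the number of performed measurements is increased.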
We want to point out that such a detection method based on the measurements of the local operators $X_i\otimes X_i^\tau$ does not necessarily require the use of an entangled bipartite state at the input. Actually, each conditional probability $Q^{(i)}(m|n)$ can be obtained equivalently by considering only the system qudit, preparing it in the eigenstates of $X_i$ with equal probabilities, and measuring $X_i$ at the output of the channel. We notice also that if one has the possibility of optimizing over all input ensembles, whereas the output measurements $X_i$ are fixed, the problem is recast as the evaluation of the informational power of the noisy quantum measurements dalla corresponding to the Heisenberg evolution of the projectors $|\phi^{(i)}_m\rangle\langle\phi^{(i)}_m|$ under the dual map of $\mathcal{E}$, followed by a maximisation over $i$.
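The prepare-and-measure version of the protocol can be sketched numerically as follows, assuming the channel is given by a list of Kraus operators (something that is not available in an actual experiment, where the probabilities are estimated from counts). The sketch uses the fact that $\langle\phi_m|\mathcal{E}(|\phi_n\rangle\langle\phi_n|)|\phi_m\rangle=\sum_k|\langle\phi_m|A_k|\phi_n\rangle|^2$. The function name, the row-per-input storage `q[n][m]`, and the amplitude damping example with parameter `g` are illustrative assumptions.

```python
import math

def transition_matrix(kraus_ops, basis):
    """Return q with q[n][m] = <phi_m| E(|phi_n><phi_n|) |phi_m>.

    kraus_ops : list of d x d nested lists (Kraus operators A_k)
    basis     : list of d orthonormal vectors (eigenvectors of the observable)
    """
    d = len(basis)

    def amplitude(bra, op, ket):
        # <bra| op |ket>
        return sum(bra[r].conjugate() * op[r][c] * ket[c]
                   for r in range(d) for c in range(d))

    return [[sum(abs(amplitude(basis[m], a, basis[n])) ** 2 for a in kraus_ops)
             for m in range(d)]
            for n in range(d)]

# Illustrative channel: qubit amplitude damping with damping parameter g
g = 0.36
kraus = [[[1, 0], [0, math.sqrt(1 - g)]],
         [[0, math.sqrt(g)], [0, 0]]]
s = 1 / math.sqrt(2)
z_basis = [[1, 0], [0, 1]]    # eigenvectors of sigma_z
x_basis = [[s, s], [s, -s]]   # eigenvectors of sigma_x
q_z = transition_matrix(kraus, z_basis)
q_x = transition_matrix(kraus, x_basis)
```

Each row of the result is a probability distribution over outcomes, so `q_z` and `q_x` can be passed directly to a classical capacity routine.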
The maximisation over the set of prior probabilities $\{p^{(i)}_n\}$ in Eq. (4) for each $i$ can be achieved by means of the Blahut-Arimoto recursive algorithm bga1 ; bga2 ; bga3 , given by
$$c_n[r]=\exp\Big(\sum_m Q^{(i)}(m|n)\ln\frac{Q^{(i)}(m|n)}{\sum_l p_l[r]\,Q^{(i)}(m|l)}\Big),\qquad p_n[r+1]=\frac{p_n[r]\,c_n[r]}{\sum_l p_l[r]\,c_l[r]}. \tag{6}$$
Starting from an arbitrary prior $\{p_n[0]\}$ with all entries strictly positive, this guarantees convergence to an optimal prior, thus providing the value of $I^{(i)}$ for each $i$ with the desired accuracy. For completeness we mention that a slight modification of the recursive algorithm (6) can also accommodate possible constraints of the channel, e.g. the allowed maximum energy in lossy Bosonic channels loss .
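A minimal sketch of the Blahut-Arimoto recursion in Python is given below. The stopping rule uses the standard bounds $\log_2\sum_n p_n c_n\le C\le\log_2\max_n c_n$; the function name, the tolerance, the iteration cap, and the row-per-input storage `transition[n][m]` are illustrative assumptions rather than choices made in the paper.

```python
import math

def blahut_arimoto(transition, tol=1e-12, max_iter=10000):
    """Return (capacity_in_bits, prior) for transition[n][m] = Q(m|n)."""
    n_in, n_out = len(transition), len(transition[0])
    prior = [1.0 / n_in] * n_in  # strictly positive starting prior
    lower = 0.0
    for _ in range(max_iter):
        marginal = [sum(prior[n] * transition[n][m] for n in range(n_in))
                    for m in range(n_out)]
        # c_n = exp( sum_m Q(m|n) ln[ Q(m|n) / marginal_m ] )
        c = [math.exp(sum(q * math.log(q / marginal[m])
                          for m, q in enumerate(row) if q > 0.0))
             for row in transition]
        norm = sum(p * cn for p, cn in zip(prior, c))
        lower, upper = math.log2(norm), math.log2(max(c))
        prior = [p * cn / norm for p, cn in zip(prior, c)]  # multiplicative update
        if upper - lower < tol:
            break
    return lower, prior

# Illustrative checks: a binary symmetric channel and a Z-channel
bsc_capacity, bsc_prior = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
z_capacity, z_prior = blahut_arimoto([[1.0, 0.0], [0.5, 0.5]])
```

The returned value is the lower bound at the last iteration, so it never overestimates the capacity of the reconstructed transition matrix.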
We recall that for some special forms of transition matrices there is no need for numerical maximisation, since the optimal prior is known. This is the case of a conditional probability $Q(m|n)$ corresponding to a weakly symmetric channel CT , where every column $Q(\cdot|n)$ is a permutation of each other and all sums $\sum_n Q(m|n)$ are equal. In fact, in such a case the corresponding optimal prior is the uniform one, $p_n=1/d$, and the mutual information is given by $I=\log_2 d-H[Q(\cdot|n)]$, where $H$ denotes the Shannon entropy and therefore $H[Q(\cdot|n)]$ is the Shannon entropy of an arbitrary column (since all columns have the same entropy).
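The weakly symmetric case can be checked and evaluated without any optimisation, as in the following sketch. Note the storage convention assumed here: `transition[n][m]` holds $Q(m|n)$ with one row per input, so a row of the code corresponds to a column in the text. The function names and the numerical example (a standard textbook-style weakly symmetric matrix) are illustrative.

```python
import math

def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def is_weakly_symmetric(transition, tol=1e-12):
    """True if all input distributions are permutations of each other
    and every outcome m receives the same total weight sum_n Q(m|n)."""
    reference = sorted(transition[0])
    rows_ok = all(
        all(abs(a - b) < tol for a, b in zip(sorted(row), reference))
        for row in transition)
    totals = [sum(row[m] for row in transition) for m in range(len(reference))]
    return rows_ok and all(abs(t - totals[0]) < tol for t in totals)

def weakly_symmetric_capacity(transition):
    """log2(number of outcomes) minus the entropy of any input distribution."""
    return math.log2(len(transition[0])) - shannon_entropy(transition[0])

# Two inputs, three outcomes
example = [[1 / 3, 1 / 6, 1 / 2],
           [1 / 3, 1 / 2, 1 / 6]]
symmetric = is_weakly_symmetric(example)
capacity = weakly_symmetric_capacity(example)
```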
A further example is the case of the transition matrices $Q^{(i)}$ obtained by preparation and measurements on the eigenstates of the generalized Pauli (or Weyl) matrices $U_{mn}$, with $m,n=0,\dots,d-1$, when the channel is a generalized Pauli channel bellobs . In fact, when $d$ is prime, for each $i$ all columns (and rows) of $Q^{(i)}$ are permutations of each other weyl , i.e. one has a symmetric channel, and then each optimal prior is uniform CT . In this case, then, the detected capacity will be given by
$$C_{DET}=\log_2 d-\min_i H[Q^{(i)}(\cdot|n)]. \tag{7}$$
Actually, the present result has recently been used to provide efficient detectable bounds to the Holevo capacity of generalized Pauli channels weyl2 .
As a final example we mention the case of binary channels, where the $Q^{(i)}$ are $2\times 2$ transition matrices, and the maximisation in Eq. (4) can be analytically found (see the Supplemental Material).
In order to study in more detail the case of qubit channels, we use their representation on the Bloch sphere. A qubit channel maps the sphere of possible input states to an ellipsoid, and may be expressed as ellip ; ellip2
$$\mathcal{E}(\rho)=\tfrac{1}{2}\big[I+(A\vec r+\vec t\,)\cdot\vec\sigma\big], \tag{8}$$
where $\vec r$ denotes the Bloch vector of the input state $\rho=\tfrac12(I+\vec r\cdot\vec\sigma)$, $A$ is a real $3\times 3$ matrix, and $\vec r\,'=A\vec r+\vec t$ is the output Bloch vector for input $\vec r$. That is, the channel produces the affine mapping $\vec r\mapsto A\vec r+\vec t$. Via local unitary operations acting before and after the map, the transformation matrices $A$ and $\vec t$ may be brought to the form
$$A=\mathrm{diag}(\lambda_1,\lambda_2,\lambda_3),\qquad \vec t=(t_1,t_2,t_3)^T, \tag{9}$$
namely, an arbitrary qubit channel may be expressed as $\mathcal{E}=\mathcal{U}\circ\Gamma\circ\mathcal{V}$, where $\mathcal{U}$ and $\mathcal{V}$ are unitary channels (thus not varying the capacity), and $\Gamma$ is the channel with $A$ and $\vec t$ given by Eq. (9). So the Bloch sphere is mapped to an ellipsoid with principal axes parallel to the Cartesian axes, and center shifted by the vector $\vec t$. We recall that the condition of complete positivity puts some constraints on the admissible values of the $\lambda_i$'s and $t_i$'s algoe ; ellip ; ellip2 ; braun .
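Let us consider our detection scheme for channels of the canonical form of Eq. (9), with diagonal entries $\lambda_i$ and shift components $t_i$, and measurements of the Pauli operators. The conditional probabilities of the outcomes $\pm$ for the measurement of $\sigma_i$ at the output of the channel, given as input the eigenstates of $\sigma_i$ with eigenvalue $+1$ and $-1$, are given respectively by
$$Q^{(i)}(\pm|+)=\tfrac12\big[1\pm(t_i+\lambda_i)\big],\qquad Q^{(i)}(\pm|-)=\tfrac12\big[1\pm(t_i-\lambda_i)\big]. \tag{10}$$
Upon defining $\epsilon^{(i)}_0=\tfrac12(1-|\lambda_i|-|t_i|)$ and $\epsilon^{(i)}_1=\tfrac12(1-|\lambda_i|+|t_i|)$, each of the three measurements then provides, up to a relabeling of inputs and outputs, a transition matrix of a binary channel with error probabilities $\epsilon^{(i)}_0$ and $\epsilon^{(i)}_1$, and the detected bound for the classical capacity takes the form
$$C_{DET}=\max_{i}\,C\big(\epsilon^{(i)}_0,\epsilon^{(i)}_1\big), \tag{11}$$
with $C(\epsilon_0,\epsilon_1)$ given by Eq. (A.4) in the Supplemental Material.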
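The following sketch evaluates the detected bound for a qubit channel in canonical Bloch form from three binary channels, one per Pauli measurement. It assumes the affine Bloch map $r_i\mapsto\lambda_i r_i+t_i$ and uses the standard closed-form capacity of a binary asymmetric channel; the function names and the numerical parameters (which correspond to a damping-like channel chosen only for illustration) are assumptions.

```python
import math

def h2(x):
    """Binary Shannon entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def binary_capacity(e0, e1):
    """Capacity of a binary channel with error probabilities e0 (for one input)
    and e1 (for the other input)."""
    denom = 1.0 - e0 - e1
    if abs(denom) < 1e-15:
        return 0.0  # the output does not depend on the input
    z = 2.0 ** ((h2(e0) - h2(e1)) / denom)
    return e0 / denom * h2(e1) - (1 - e1) / denom * h2(e0) + math.log2(1 + z)

def qubit_detected_capacity(lam, t):
    """Return (max over the three Pauli settings, list of the three capacities)."""
    caps = []
    for l_i, t_i in zip(lam, t):
        e0 = 0.5 * (1 - t_i - l_i)  # P(outcome -1 | input eigenvalue +1)
        e1 = 0.5 * (1 + t_i - l_i)  # P(outcome +1 | input eigenvalue -1)
        caps.append(binary_capacity(e0, e1))
    return max(caps), caps

# Illustrative non-unital example with shift along z
best, caps = qubit_detected_capacity(
    (math.sqrt(0.5), math.sqrt(0.5), 0.5), (0.0, 0.0, 0.5))
# Illustrative unital (Pauli-type) example
pauli_best, pauli_caps = qubit_detected_capacity((-0.6, 0.2, 0.1), (0.0, 0.0, 0.0))
```

In the non-unital example the $\sigma_x$ setting gives the largest value, while the $\sigma_z$ setting reduces to a Z-channel.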
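The case of Pauli channels, where the Pauli operator $\sigma_i$ acts with probability $p_i$, corresponds to the unital case, i.e. $\vec t=0$, for which $\lambda_1=1-2(p_2+p_3)$, $\lambda_2=1-2(p_1+p_3)$, $\lambda_3=1-2(p_1+p_2)$, and $\epsilon^{(i)}_0=\epsilon^{(i)}_1=\tfrac12(1-|\lambda_i|)$. Hence, the detected capacity in this case is given by
$$C_{DET}=1-\min_i H_2\Big(\frac{1-|\lambda_i|}{2}\Big), \tag{12}$$
where $H_2(x)=-x\log_2 x-(1-x)\log_2(1-x)$ is the binary Shannon entropy. This equals the Holevo and the classical capacity (since the additivity hypothesis holds true for unital qubit channels unital ), i.e. $C_{DET}=\chi=C$, and just corresponds to the result of Eq. (7) for $d=2$.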
It is useful to recall the following results from pseudo , where the notion of pseudoclassical channel is introduced. In a pseudoclassical channel the Holevo capacity is achieved without quantum correlations in the measurements between different uses of the channel. In other words, for pseudoclassical channels the Holevo capacity can be attained when the optimal measurement with a single transmission is performed on each system. A channel is pseudoclassical iff the capacity is achieved by an ensemble of input states such that the corresponding output states are mutually commuting. All unital qubit channels are pseudoclassical pseudo and, as we have seen, . In the nonunital case, a channel (8) is pseudoclassical iff pseudo ; hay
i) is an eigenvector of with eigenvalue , and
ii) by denoting as the maximum of the other two eigenvalues, one has
[TABLE]
where . Geometrically, condition i) requires that the direction of the shift must be parallel to one of the principal axes, and condition ii) requires that the ellipsoid must be sufficiently thin around this direction. For qubit nonunital pseudoclassical channels the Holevo capacity is given by pseudo
[TABLE]
namely it corresponds to the capacity of a classical binary asymmetric channel.
In the following we consider the case of Eq. (9) with and , which are the most studied non-unital channels in the literature eff3 ; nata ; berry ; daems . In this case the condition of complete positivity is equivalent algoe ; ellip ; ellip2 to the constraints . Note that the condition of pseudoclassicality (13) rewrites as
[TABLE]
When this condition is satisfied the detected capacity provides the exact value of the Holevo capacity, namely
[TABLE]
which is achieved by the two orthogonal eigenstates of as input states. This is also due to the fact that the corresponding output states are commuting, and hence the Holevo information is saturated by the projective measurement on eigenstates fuchs . On the other hand, when condition (15) is not satisfied, one has , since typically the capacity is achieved by an ensemble of two nonorthogonal input states, or even a three- or four-state ensemble nata ; eff3 , while only input orthogonal states are tested by our detection method. In the following we test our method on further explicit examples.
Example: generalized amplitude damping channel, for which , , and , where both and . This channel describes qubit dynamics with exchanges of excitations with the thermal environment at finite temperature NC00 ; turch ; turch2 . Then one has , and , . It follows that the detected classical capacity is given by
[TABLE]
We numerically checked that the maximum is always achieved by the first term in Eq. (17), which is independent of the value of . In fact, one can also check that condition (15) is never satisfied, and so the channel is never pseudoclassical (except the degenerate case , when the channel becomes unital). Hence, the detected capacity is strictly lower than the Holevo capacity. For , corresponding to the customary amplitude damping channel at zero temperature, one can compute the Holevo capacity according to the following equation gf ; gf2
[TABLE]
with . Clearly, provides a known lower bound to the classical capacity. In Fig. 1 we plot the detected capacity versus the damping parameter , along with the Holevo capacity . Notice that , which corresponds to the capacity of a Pauli channel.
Further examples corresponding to a stretched damping channel nata and to extremal qubit channels ellip2 ; braun are reported in the Supplemental Material.
Our method can also be successfully applied to other forms of amplitude damping processes. For instance, when $d=3$, we can consider the decay processes of a three-level system in $V$-shaped configuration qutrit . The Kraus operators take the form
[TABLE]
with both and . We consider the case where only two projective measurements are used to detect the classical capacity, namely the two mutually unbiased bases
[TABLE]
with . The corresponding transition matrices and are reported in the Supplemental Material. Notice that corresponds to a symmetric channel, and hence the maximal mutual information is given by the analytical expression , with as in Eq. (A.12). The maximisation of the mutual information pertaining to can be obtained by the algorithm of Eq. (6), thus giving . In Fig. 2 we plot the detected capacity, corresponding to , versus the damping parameters and .
The present example shows that our method allows one to derive lower bounds to the classical capacity of quantum channels that have received little theoretical study. In addition, our method also provides the explicit form of the encoding corresponding to the detected lower bound. Before concluding, we notice that when the measurement bases are badly matched with respect to the structure of the unknown channel, the detected capacity may give a looser bound to the classical capacity. An example is provided in the Supplemental Material, for the case of a dephasing channel for qubits on an unknown basis. Moreover, we want to point out that there are situations where a priori information is available on the form of the channel, but quantum process tomography cannot be performed because the number of allowed measurements is not sufficient. A simple example is the case of a channel whose structure is known to be of the form given in Eq. (9), but for which only two measurements are available. In the same scenario, our method still provides a significant detected capacity. A further example is given by a Pauli channel followed by a phase rotation, which is explicitly reported in the Supplemental Material, along with a quantitative analysis of the tradeoff between the available a priori information on the probability distribution for the value of the unknown phase and the performance of our detection method.
In summary, we presented an efficient method to detect lower bounds to the classical capacity of noisy quantum channels with few local measurements, by testing orthogonal ensembles and output measurements, along with classical optimisation algorithms. The method can be applied to completely unknown quantum channels and to all situations where quantum process tomography is not available. The scheme we presented can be easily implemented in the lab with present day technology, e.g. as in Ref. exp .
Appendix A SUPPLEMENTAL MATERIAL
Capacity for classical binary asymmetric channels
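Without loss of generality we can fix the labeling of zeros and ones in a classical binary channel such that $\epsilon_0\le\epsilon_1$, $\epsilon_0\le 1-\epsilon_0$, and $\epsilon_0\le 1-\epsilon_1$, where $\epsilon_0$ denotes the error probability of receiving 1 for input 0 and $\epsilon_1$ denotes the error probability of receiving 0 for input 1. The Shannon capacity of the binary channel is given by the maximum over the prior probability $\{p,1-p\}$ of the mutual information
$$I(p)=H_2\big[p(1-\epsilon_0)+(1-p)\epsilon_1\big]-p\,H_2(\epsilon_0)-(1-p)\,H_2(\epsilon_1), \tag{A.1}$$
with $p$ the prior probability of input 0 and $H_2(x)=-x\log_2 x-(1-x)\log_2(1-x)$ the binary Shannon entropy. From the condition $\partial I/\partial p=0$ and straightforward algebra one obtains
$$p=\frac{1-\epsilon_1(1+z)}{(1-\epsilon_0-\epsilon_1)(1+z)}, \tag{A.2}$$
where
$$z=2^{\frac{H_2(\epsilon_0)-H_2(\epsilon_1)}{1-\epsilon_0-\epsilon_1}}. \tag{A.3}$$
By substituting the optimal value (A.2) of $p$ in Eq. (A.1) and simplifying, one obtains the capacity
$$C(\epsilon_0,\epsilon_1)=\frac{\epsilon_0}{1-\epsilon_0-\epsilon_1}H_2(\epsilon_1)-\frac{1-\epsilon_1}{1-\epsilon_0-\epsilon_1}H_2(\epsilon_0)+\log_2(1+z). \tag{A.4}$$
In the limiting case $\epsilon_0=\epsilon_1=\epsilon$ one recovers the classical capacity for the binary symmetric channel
$$C=1-H_2(\epsilon). \tag{A.5}$$
For $\epsilon_0=0$, only input 1 is affected by error, and one obtains the capacity of the so-called $Z$-channel
$$C=\log_2\Big[1+(1-\epsilon_1)\,\epsilon_1^{\,\epsilon_1/(1-\epsilon_1)}\Big]. \tag{A.6}$$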
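As a hedged numerical cross-check of the closed-form binary-channel capacity, the sketch below compares the stationary-point solution with a brute-force scan over the prior. The function names, the grid size, and the error probabilities 0.1 and 0.3 are illustrative assumptions.

```python
import math

def h2(x):
    """Binary Shannon entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def bac_mutual_information(p0, e0, e1):
    """Mutual information for prior p0 on input 0; e0 = P(1|0), e1 = P(0|1)."""
    y0 = p0 * (1 - e0) + (1 - p0) * e1  # probability of receiving 0
    return h2(y0) - p0 * h2(e0) - (1 - p0) * h2(e1)

def bac_optimum(e0, e1):
    """Return (optimal prior of input 0, capacity); assumes e0 + e1 != 1."""
    denom = 1 - e0 - e1
    z = 2 ** ((h2(e0) - h2(e1)) / denom)
    p0 = (1 - e1 * (1 + z)) / (denom * (1 + z))
    cap = e0 / denom * h2(e1) - (1 - e1) / denom * h2(e0) + math.log2(1 + z)
    return p0, cap

e0, e1 = 0.1, 0.3
p_opt, cap = bac_optimum(e0, e1)
# Brute-force maximum of the mutual information over a grid of priors
grid_max = max(bac_mutual_information(k / 10000, e0, e1) for k in range(10001))
# Z-channel limit (e0 = 0) with e1 = 0.5
_, z_cap = bac_optimum(0.0, 0.5)
```

The grid maximum should agree with the closed form up to the grid resolution, and the mutual information evaluated at `p_opt` should reproduce `cap`.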
Detected capacity for a stretched damping channel
A stretched damping channel can be specified by the parameters , , and , with and . Applying Eq. (11) of the main text, one obtains
[TABLE]
In Fig. 1. we plot the detected capacity for damping parameter , versus the respective allowed values of the stretching parameter , along with the Holevo capacity (obtained numerically). In this case , and then from Eqs. (15) and (16) of the main text it follows that the channel is pseudoclassical for , for which .
Extremal qubit channels
We consider here extremal qubit channels, which, up to rotations, can be parametrized as , , , and , with . It follows that , , , and . Since is monotonically increasing in , one easily finds that the detected capacity is given by
[TABLE]
We numerically checked that the first term is always the maximizer in Eq. (8), for all values of and . In fact, numerically one also can check that these channels never satisfy the pseudoclassicality condition in Eq. (15) of the main text, except for the degenerate unital case of or , otherwise one would have for some values of and .
Transition matrices for a V-shaped qutrit channel
A decay process for a three-level system in $V$-shaped configuration is depicted in Fig. 2.
The transition matrices and for the channel with Kraus operators as in Eq. (19) of the main text correspond to the conditional probabilities pertaining to the bases and in Eq. (20), namely
[TABLE]
A straightforward calculation gives
[TABLE]
and
[TABLE]
where
[TABLE]
Notice that corresponds to a symmetric channel, i.e. all rows (and columns) are permutation of each other. Hence, there is no need of numerical optimisation over the prior probability , since the optimal is the uniform distribution. The corresponding maximal mutual information is given by the analytical expression . The maximisation of the mutual information pertaining to can be obtained by the Blahut-Arimoto algorithm in Eq. (6) of the main text, which provides the optimal prior probability and the corresponding value of . The detected capacity is then given by .
Detected capacity for qubit dephasing channels on an unknown basis
In the main text we noticed that when the measurement bases are badly matched with respect to the structure of the unknown channel, then the detected capacity may give a bound to the classical capacity which becomes looser. This is the price to pay if one has to avoid complete process tomography and the channel is completely unknown. We give here an example for the case of a dephasing channel for qubits with probability on an unknown basis. This channel can be written as
[TABLE]
where is the vector of Pauli operators , and is a unit vector on the Bloch sphere ( and ). The classical capacity is one bit and is clearly achieved by encoding on the eigenstates of which are noise-free.
Consider now the detection method with measurements of the Pauli matrices. The conditional probabilities of outcomes for the measurement of (with ) at the output of the dephasing channel in Eq. (13) with input state are given by
[TABLE]
Hence, these three conditional probabilities correspond to three classical binary symmetric channels, and the detected capacity is given by
[TABLE]
Notice that the detected capacity in Eq. (15) is invariant for . The effect of basis-mismatch can be detrimental, and in the present example, the worst-case scenario corresponds to and , for which .
In Fig. 3 we plot the detected capacity for versus the angles and .
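The effect of basis mismatch can be illustrated with a short sketch. It assumes the parametrisation $\rho\mapsto(1-p)\rho+p\,(\vec n\cdot\vec\sigma)\rho(\vec n\cdot\vec\sigma)$ with $\vec n=(\sin\theta\cos\phi,\sin\theta\sin\phi,\cos\theta)$, which is one common convention and may differ from the one used in the equations of this section. Under this assumption the Bloch vector transforms as $\vec r\mapsto(1-2p)\vec r+2p(\vec n\cdot\vec r)\vec n$, so each Pauli setting $\sigma_i$ yields a binary symmetric channel with error probability $p(1-n_i^2)$. The function names are illustrative.

```python
import math

def h2(x):
    """Binary Shannon entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def dephasing_detected_capacity(p, theta, phi):
    """Detected capacity from sigma_x, sigma_y, sigma_z measurements
    (assumed parametrisation: dephasing with probability p along n)."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    # One binary symmetric channel per setting, error probability p (1 - n_i^2)
    return 1.0 - min(h2(p * (1.0 - n_i ** 2)) for n_i in n)

p = 0.3
aligned = dephasing_detected_capacity(p, 0.0, 0.0)  # n along z
mismatched = dephasing_detected_capacity(
    p, math.acos(1 / math.sqrt(3)), math.pi / 4)    # n along (1,1,1)/sqrt(3)
```

When $\vec n$ coincides with a measured axis the full bit is detected, while for the diagonal direction the detected value drops to that of a binary symmetric channel with error probability $2p/3$.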
Detected capacity for a Pauli channel followed by a phase rotation
Let us consider a Pauli channel with unknown probabilities , , and , followed by a rotation around the -axis by an unknown angle , i.e.
[TABLE]
One can easily verify that our detection method based just on the measurements of , , and provides a significant detected capacity. The conditional probabilities of outcomes for the measurement of (with ) at the output of the channel with input state are given by
[TABLE]
and hence Eq. (12) of the main text is replaced by
[TABLE]
We notice that even if the structure of the channel were known, the measurement results would not allow to perform quantum process tomography, since the expectations and are not available. This can also be recognized by the fact that Eqs. (17) obtained by the measurements cannot be solved to determine the four variables , and .
When some prior information is available about the quantum channel, one can also quantify a sort of tradeoff between such information and the performance of our detection method. For example, let us formalize our uncertainty about the rotation angle $\phi$ by a von Mises probability density
$$f(\phi|\kappa)=\frac{e^{\kappa\cos\phi}}{2\pi I_0(\kappa)},\qquad \phi\in[-\pi,\pi], \tag{A.19}$$
where $I_0(\kappa)$ denotes the modified Bessel function of order zero, and $\kappa\ge 0$ is a parameter which measures the concentration (i.e., it is analogous to the reciprocal of the variance for normal distributions). In other words, Eq. (A.19) encapsulates our knowledge that the most likely value of $\phi$ is zero, with increasing confidence for increasing values of $\kappa$ (clearly, if the expected value of $\phi$ is different from zero, say $\phi_0$, one could change the set of measured observables accordingly and achieve the same performance).
The expected detected capacity can then be evaluated by the weighted average of Eq. (A.18) with the function (A.19), namely
$$\overline{C}_{DET}(\kappa)=\int_{-\pi}^{\pi}d\phi\,f(\phi|\kappa)\,C_{DET}(\phi). \tag{A.20}$$
In Fig. 4 we plot a specific result for channel parameters , , and , versus the concentration parameter . The average detected capacity grows from (corresponding to , i.e. a flat distribution and hence total ignorance on the phase ) to the theoretical classical capacity given by for increasing values of , i.e. of the available information on .
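The tradeoff between prior knowledge of the phase and the detected capacity can be sketched numerically. The model below is an assumption made for illustration: in the Bloch picture a Pauli channel contracts the axes by factors $(\lambda_x,\lambda_y,\lambda_z)$, and a subsequent rotation by $\phi$ around $z$ multiplies the correlations seen by the $\sigma_x$ and $\sigma_y$ settings by $\cos\phi$. The von Mises weights are normalised numerically (avoiding the Bessel function), and the function names and the values of `lams` are hypothetical.

```python
import math

def h2(x):
    """Binary Shannon entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def rotated_pauli_detected_capacity(lams, phi):
    """Detected capacity for contraction factors lams and rotation angle phi
    (assumed model: x and y correlations are scaled by cos(phi))."""
    lx, ly, lz = lams
    best = max(abs(lx * math.cos(phi)), abs(ly * math.cos(phi)), abs(lz))
    return 1.0 - h2(0.5 * (1.0 - best))

def expected_detected_capacity(lams, kappa, steps=4000):
    """Von Mises average over phi in [-pi, pi] via the midpoint rule."""
    width = 2 * math.pi / steps
    num = den = 0.0
    for k in range(steps):
        phi = -math.pi + (k + 0.5) * width
        # exp(kappa (cos(phi) - 1)) avoids overflow; the normalisation cancels
        weight = math.exp(kappa * (math.cos(phi) - 1.0))
        num += weight * rotated_pauli_detected_capacity(lams, phi)
        den += weight
    return num / den

lams = (0.8, 0.1, 0.2)  # illustrative contraction factors
flat = expected_detected_capacity(lams, 0.0)      # total ignorance of the phase
medium = expected_detected_capacity(lams, 5.0)
sharp = expected_detected_capacity(lams, 200.0)   # phase almost known
ideal = rotated_pauli_detected_capacity(lams, 0.0)
```

For these illustrative numbers the average grows with the concentration parameter and approaches the value obtained at $\phi=0$.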
References
- (1) M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, 2000).
- (2) A. S. Holevo, Prob. Inf. Transm. 9, 177 (1973).
- (3) B. Schumacher and M. D. Westmoreland, Phys. Rev. A 56, 131 (1997).
- (4) A. S. Holevo, IEEE Trans. Inf. Theory 44, 269 (1998).
- (5) H. Imai, M. Hachimori, M. Hamada, H. Kobayashi and K. Matsumoto, Proceedings of the 2nd Japanese-Hungarian Symposium on Discrete Mathematics and Its Applications, p. 60 (Budapest, 2001).
- (6) S. Osawa and H. Nagaoka, IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences E 84, 2583 (2001).
- (7) M. Hayashi, H. Imai, K. Matsumoto, M. B. Ruskai, and T. Shimono, Quantum Inf. Comput. 5, 13 (2005).
- (8) T. Sutter, D. Sutter, P. M. Esfahani, and J. Lygeros, IEEE Trans. Inf. Theory 61, 1649 (2016).
