The capacity of coherent-state adaptive decoders with interferometry and single-mode detectors
Matteo Rosati, Andrea Mari, Vittorio Giovannetti

TL;DR
This paper investigates the limits of adaptive decoders using interferometry and single-mode detectors for coherent states, showing they cannot surpass the capacity of single-mode decoders in certain quantum channels.
Contribution
It demonstrates that a broad class of adaptive decoders cannot outperform single-mode decoders in classical communication over quantum phase-insensitive Gaussian channels.
Findings
The optimal information rate of ADs is not greater than that of single-mode decoders.
Ultimate capacity of these channels is unlikely to be achieved with the considered ADs.
Adaptive procedures based on passive multi-mode Gaussian unitaries do not improve capacity.
Abstract
A class of Adaptive Decoders (AD's) for coherent-state sequences is studied, including in particular the most common technology for optical-signal processing, e.g., interferometers, coherent displacements and photon-counting detectors. More generally we consider AD's comprising adaptive procedures based on passive multi-mode Gaussian unitaries and arbitrary single-mode destructive measurements. For classical communication on quantum phase-insensitive Gaussian channels with a coherent-state encoding, we show that the AD's optimal information transmission rate is not greater than that of a single-mode decoder. Our result also implies that the ultimate classical capacity of quantum phase-insensitive Gaussian channels is unlikely to be achieved with the considered class of AD's.
The capacity of coherent-state adaptive decoders
with interferometry and single-mode detectors
Matteo Rosati
NEST, Scuola Normale Superiore and Istituto Nanoscienze-CNR, I-56127 Pisa, Italy.
Andrea Mari
NEST, Scuola Normale Superiore and Istituto Nanoscienze-CNR, I-56127 Pisa, Italy.
Vittorio Giovannetti
NEST, Scuola Normale Superiore and Istituto Nanoscienze-CNR, I-56127 Pisa, Italy.
I Introduction
Quantum communication theory is a promising field for the application of quantum technology, since its predictions could be applied in the short-term in several settings of practical relevance. An important example is communication on free-space or optical-fiber links, which are well described theoretically by quantum phase-insensitive Gaussian channels HolevoBOOK ; HolGiovaRev ; CAVES , e.g., the lossy bosonic channel EXACT .
The maximum transmission rate of classical information on a quantum channel, known as its capacity, is provided by the Holevo-Schumacher-Westmoreland (HSW) theorem schumawest ; holevo1 ; holevo2 ; holevo3 ; winter . In particular for quantum phase-insensitive Gaussian channels the capacity at constrained average input energy can be achieved gaussOpt ; gaussOpt2 ; maj1 ; maj2 by a simple separable encoding, i.e., sending sequences of coherent states RevGauss , each of them constituting a letter for a single use of the channel or communication mode. This fact may seem surprising at first, since coherent states are among the simplest states of the electromagnetic field and are often regarded as fundamentally classical. Nevertheless they are sufficient to achieve the maximum communication rate allowed by quantum mechanics on a broad class of channels of considerable practical relevance. Unfortunately the truly quantum challenge posed by these systems seems to reside in the decoding procedures, since all known capacity-achieving measurements require joint decoding operations hauswoot ; schumawest ; winter ; oga ; oganaga ; hayanaga ; hayashi ; seq1 ; seq2 ; sen ; Arikan ; wildeHayden ; polarWildeGuha ; NOSTRO , i.e., reading out entire blocks of letters at once by projecting onto arbitrary entangled superpositions of the codewords. Hence even the classical coherent-state encoding requires a highly non-trivial quantum decoding to achieve capacity. Such joint quantum measurements are difficult to design with current technology wildeguha1 ; wildeguha2 ; takeokaGuha ; takeokaGuha2 ; lee ; Guha1 ; Guha2 ; Banaszek ; NOSTROHad , so that the quest for an optimal decoder of separable coherent-state codewords that would finally trigger practical applications is still open. Given the difficulty of implementing truly joint quantum measurements, research has then mainly focused on decoding coherent states with the general class of Adaptive Decoders (AD) depicted in Fig. 1a. 
The latter combines the available single-mode technology, e.g., photodetectors and local transformations, with multi-mode passive interferometers and classical feedforward control. The rationale behind this choice is that introducing correlations between modes during the decoding procedure may increase the transmission rate of simple separable measurements, getting closer to the structure of joint quantum measurements that seems to be ultimately necessary to achieve the capacity of phase-insensitive Gaussian channels.
On the contrary, in this Letter we prove that the maximum information transmission rate of such channels with coherent-state encoding and AD is equal to that obtained with a Separable Decoder (SD) employing the same measurement on each mode, as shown in Fig. 1b. The general idea behind our proof is to map the quantum AD into an effective classical programmable channel with feedback to the encoder. We then obtain our results by extending Shannon’s feedback theorem ShanFeed ; covThom to this kind of channel.
Our work gives several major contributions: i) it implies the conjecture by Chung et al. guhaDet ; guhaDet1 , namely that adaptive passive Gaussian interactions, single-mode displacements and photodetectors do not increase the optimal transmission rate; ii) if the HSW capacity of phase-insensitive Gaussian channels is achieved only by joint measurements, as the evidence suggests so far, then it cannot be achieved with our AD scheme; iii) it extends the results of Takeoka and Guha takeokaGuha2 , who considered only Gaussian measurements; iv) it extends the analysis made by Shor ShorAd in the context of trine states to coherent states and passive interactions. Our results, though already envisaged in previous works on the subject, have strong relevance for future research on practical decoders: i) they extend the study of decoders by considering arbitrary single-mode manipulations before measurement, including non-Gaussian and non-unitary ones; ii) they exclude a decoding advantage of adaptive passive Gaussian interactions, which are the easiest to realize in practice, suggesting that more difficult interactions are necessary to achieve capacity. Furthermore the possibility of employing ancillary states is partially included in our AD scheme: this is the case if each ancilla is allowed to interact just with one mode before being measured; otherwise, i.e., if the ancillae can interact with several modes, the problem of determining the decoder’s optimal rate remains open and could give a practical advantage over SD’s nota .
The article is structured as follows: in Sec. II we describe in detail the communication protocol and the class of decoders considered; in Sec. III we demonstrate that the AD’s optimal rate is equal to the SD’s one; in Sec. IV we discuss implications and draw our conclusions.
II The adaptive decoder
Let us suppose that the sender, Alice, wants to transmit a classical message on $n$ independent communication modes, employing coherent states of the electromagnetic field. The latter are defined in terms of the field’s annihilation and creation operators $\hat a$, $\hat a^{\dagger}$ as displaced vacuum states of phase-space amplitude $\alpha\in\mathbb{C}$, i.e., $|\alpha\rangle=\hat D(\alpha)|0\rangle$, with $\hat D(\alpha)=\exp(\alpha\hat a^{\dagger}-\alpha^{*}\hat a)$ the displacement operator. The messages, represented by the sequence of classical input random variables $X_{(1,n)}$ with letters $x_j\in\mathbb{C}$ for each $j=1,\dots,n$, are encoded into a separable sequence of optical coherent states $|x_{(1,n)}\rangle=\bigotimes_{j=1}^{n}|x_j\rangle$, one for each mode $j$, where we have used the compact notation $X_{(j,k)}$, $x_{(j,k)}$, to indicate a sequence of quantities on different modes, from the $j$-th to the $k$-th one. Each message is chosen according to a joint probability distribution $P(x_{(1,n)})$ at constrained average input energy per mode $E$, i.e.,
$$\frac{1}{n}\sum_{j=1}^{n}\int d^{2n}x_{(1,n)}\,P(x_{(1,n)})\,|x_j|^{2}\leq E. \tag{1}$$
Let us also suppose that the transmission medium is well described by a quantum phase-insensitive Gaussian channel, represented by a linear Completely Positive and Trace Preserving (CPTP) map $\Phi$ on the Hilbert space of a single mode and completely defined by its action on the displacement operator, written here through the dual map $\Phi^{\dagger}$, i.e.,
$$\Phi^{\dagger}\big(\hat D(\mu)\big)=\hat D(\sqrt{\tau}\,\mu)\,e^{-y|\mu|^{2}/2}, \tag{2}$$
in terms of two parameters $\tau$, $y$ satisfying the constraint $y\geq|\tau-1|$ HolevoBOOK . As shown in gaussOpt ; gaussOpt2 ; maj1 ; maj2 , the separable coherent-state encoding discussed above achieves the classical capacity of $\Phi$, when its probability distribution is i.i.d. and Gaussian on each mode.
The receiver, Bob, has an AD that outputs the sequence of classical random variables $Y_{(1,n)}$, where $y_j\in\mathcal{Y}$ for all modes $j$ and $\mathcal{Y}$ is the set of possible single-mode outcomes, which can be discrete or continuous, e.g., $\mathcal{Y}=\mathbb{R}$ for homodyne detection. The probability distribution of the output variables can be computed from the conditional probability of obtaining an outcome sequence $y_{(1,n)}$ if the input sequence $x_{(1,n)}$ was sent, i.e., $P(y_{(1,n)}|x_{(1,n)})$. The latter is determined by the specific decoding operations of the AD, Fig. 1a, comprising for all $j=1,\dots,n$:
- a multi-mode passive Gaussian unitary $\hat U_j$, i.e., a network of beam-splitters and phase-shifters conditioned on the outcomes $y_{(1,j-1)}$ of previous measurements, acting on the set of modes from the $j$-th to the $n$-th as
$$\hat U_j\,\hat D(\mu_{(j,n)})\,\hat U_j^{\dagger}=\hat D(U_j\,\mu_{(j,n)}),\qquad \hat D(\mu_{(j,n)})=\bigotimes_{k=j}^{n}\hat D(\mu_k), \tag{3}$$
where $U_j$ is the $(n-j+1)$-dimensional unitary matrix representing $\hat U_j$ in phase-space, applied directly to $\mu_{(j,n)}$ as a phase-space vector;
- single-mode operations and a final destructive measurement, altogether represented by a local Positive Operator-Valued Measure (POVM) $\mathcal{M}^{(k_j)}$ chosen among a set of possible POVM’s that are labeled by the (discrete or continuous) index $k_j$ conditioned on the outcomes $y_{(1,j-1)}$ of previous modes. Each POVM is defined by a collection of positive operators corresponding to the possible single-mode outcomes,
$$\mathcal{M}^{(k)}=\big\{\hat E^{(k)}_{y}\big\}_{y\in\mathcal{Y}},\qquad \hat E^{(k)}_{y}\geq 0, \tag{4}$$
where the operators $\hat E^{(k)}_{y}$ sum up to the identity on the Hilbert space of a single mode.
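As an illustration of the passive multi-mode unitaries in the list above, the following minimal sketch builds a small beam-splitter network as a unitary matrix and applies it to a vector of coherent-state amplitudes. The helper `beam_splitter`, the chosen angles and the amplitude values are hypothetical and serve only to show that such a transformation preserves the total energy $\sum_j |x_j|^2$.

```python
import numpy as np

def beam_splitter(n_modes, i, j, theta, phi=0.0):
    """Hypothetical helper: n_modes x n_modes unitary mixing modes i and j."""
    U = np.eye(n_modes, dtype=complex)
    U[i, i] = np.cos(theta)
    U[i, j] = -np.exp(1j * phi) * np.sin(theta)
    U[j, i] = np.exp(-1j * phi) * np.sin(theta)
    U[j, j] = np.cos(theta)
    return U

# Illustrative coherent-state amplitudes on three modes (assumed values).
amplitudes = np.array([1.0 + 0.5j, -0.3j, 0.8])

# A two-element interferometer: the product of unitaries is again unitary.
U = beam_splitter(3, 0, 1, np.pi / 4) @ beam_splitter(3, 1, 2, 0.3, phi=0.7)

# A passive unitary acts linearly on the amplitude vector.
rotated = U @ amplitudes

energy_before = np.sum(np.abs(amplitudes) ** 2)
energy_after = np.sum(np.abs(rotated) ** 2)
print(np.isclose(energy_before, energy_after))
```

Because the amplitudes are only rotated among the modes, any energy constraint on the whole sequence is unaffected by the interferometer.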
For our results to hold, a crucial assumption is that the single-mode POVM’s completely destroy the measured state before any information is sent to the rest of the system; if instead Bob can perform partial measurements the AD’s rate may increase, see ShorAd . Let us also note that the generic set of allowed POVM’s described above can be restricted case by case by properly choosing the operators $\hat E^{(k)}_{y}$. For example the simplest toolbox for optical-signal processing is that of the Kennedy receiver Ken with POVM’s of the form
$$\hat E^{(k)}_{0}=\hat D^{\dagger}(k)\,|0\rangle\langle 0|\,\hat D(k), \tag{5}$$
$$\hat E^{(k)}_{1}=\mathbb{1}-\hat E^{(k)}_{0}, \tag{6}$$
where the index $k$ is the amplitude of a phase-space displacement in this case. Since the latter depends adaptively on previous outcomes, the AD with a single-mode Kennedy structure behaves similarly to a Dolinar receiver Dol .
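The Kennedy structure mentioned above can be sketched numerically in the simplest setting. The example assumes a noiseless channel, an ideal on-off photodetector, and a binary alphabet of coherent amplitudes $+a$ and $-a$ with equal priors; none of these choices comes from the text, and the function names are hypothetical. A displacement $k=a$ nulls the state $-a$, so that a click can only come from $+a$.

```python
import math

def no_click_probability(x, k):
    """P(no click) for a coherent amplitude x displaced by k (ideal on-off detector)."""
    return math.exp(-abs(x + k) ** 2)

def kennedy_bpsk_error(a):
    """Average error probability for equiprobable amplitudes +a and -a.

    Assumed decision rule: no click -> "-a", click -> "+a",
    with the nulling displacement k = a.
    """
    k = a
    p_noclick_given_minus = no_click_probability(-a, k)  # equals 1: this state is nulled
    p_noclick_given_plus = no_click_probability(+a, k)   # equals exp(-4|a|^2)
    return 0.5 * (1.0 - p_noclick_given_minus) + 0.5 * p_noclick_given_plus

print(kennedy_bpsk_error(1.0))
```

In this toy setting the only errors are missed clicks on $+a$, which gives the familiar $\tfrac{1}{2}e^{-4|a|^{2}}$ behaviour.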
III The optimal rate
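The performance of a quantum decoder for the transmission of classical information can be evaluated by computing the mutual information of its classical input and output random variables, $X_{(1,n)}$ and $Y_{(1,n)}$. The latter is defined for our AD as
$$I\big(X_{(1,n)}:Y_{(1,n)}\big)=H\big(Y_{(1,n)}\big)-H\big(Y_{(1,n)}\big|X_{(1,n)}\big), \tag{7}$$
i.e., the difference of the Shannon entropy covThom of $Y_{(1,n)}$ and the Shannon conditional entropy of $Y_{(1,n)}$ given $X_{(1,n)}$. The AD’s optimal information transmission rate then is obtained by maximizing the mutual information (7) over the input distribution with energy constraint $E$ and over the decoding operations $\hat U_j$ and $\mathcal{M}^{(k_j)}$, and regularizing it as a function of the number of uses $n$, i.e.,
$$R_{AD}(E)=\lim_{n\to\infty}\frac{1}{n}\max_{P(x_{(1,n)}),\,\{\hat U_j,\,k_j\}}I\big(X_{(1,n)}:Y_{(1,n)}\big). \tag{8}$$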
We want to compare the AD with the SD of Fig. 1b, comprising for each use of the channel only a single-mode POVM chosen from the same set of those in the AD parametrized by $k$, Eq. (4), but without any interaction or classical communication between modes. The optimal rate of this SD is obtained by maximizing the mutual information of the single-mode input and output variables $X$ and $Y$ over the input distribution $P(x)$ at constrained energy $E$ and the POVM’s parameter $k$, i.e.,
$$R_{SD}(E)=\max_{P(x),\,k}I(X:Y). \tag{9}$$
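To make the single-mode optimization concrete, the sketch below evaluates the mutual information for a deliberately restricted family of inputs and measurements. It assumes on-off keying (vacuum or one coherent pulse), a noiseless channel, and an ideal on-off detector without displacement. These restrictions, and all function names, are assumptions made for illustration, so the result is only a lower bound on the rate achievable with such a receiver, not the full optimization over input distributions and POVM parameters.

```python
import numpy as np

def mutual_information(prior, transition):
    """I(X:Y) in bits for a discrete channel; transition[x, y] = P(y | x)."""
    joint = prior[:, None] * transition
    p_y = joint.sum(axis=0)
    product = prior[:, None] * p_y[None, :]
    mask = joint > 0  # skip zero-probability terms (0 log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))

def ook_rate(energy, p_on):
    """Mutual information of on-off keying with mean energy `energy`.

    The pulse is sent with probability p_on and carries energy / p_on photons
    on average, so the average-energy constraint is met with equality.
    """
    p_click = 1.0 - np.exp(-energy / p_on)
    prior = np.array([1.0 - p_on, p_on])
    transition = np.array([[1.0, 0.0],              # vacuum never clicks
                           [1.0 - p_click, p_click]])
    return mutual_information(prior, transition)

def best_ook_rate(energy, n_grid=999):
    """Grid search over the pulse probability (a crude optimization)."""
    grid = np.linspace(0.001, 0.999, n_grid)
    return max(ook_rate(energy, p) for p in grid)

print(best_ook_rate(0.1))
```

A grid search is used instead of a closed-form optimum to keep the example short; a finer grid or a scalar optimizer could replace it.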
In order to show that the optimization (8) reduces to (9), we find it useful to consider a more general decoder comprising the AD and a classical feedback link from Bob to Alice, which certainly cannot decrease the optimal rate (8). Exploiting this feedback and the phase-insensitive property of $\Phi$, which implies that $n$ identical uses of the channel commute with any passive Gaussian unitary, Alice can always perform the $\hat U_j$ instead of Bob. Hence all the AD’s interactions are represented by a classical feedback to the encoder, that rearranges the remaining sequences $x_{(j,n)}$ into new sequences $b_{(j,n)}$, with $b_j$ depending on $x_{(1,n)}$ and $y_{(1,j-1)}$ for all modes $j$, before transmission on the channel. Crucially, each choice of $\hat U_j$ corresponds to a different rearrangement performed by the encoder in such a way that the total average-energy constraint (1) is still respected by the joint probability distribution of the new messages $b_{(1,n)}$, since unitary matrices preserve the sum of the squared amplitudes.
As a function of the encoded variables $B_{(1,n)}$, the rest of the AD scheme can be rewritten as a single-mode classical programmable channel, i.e., a channel with memory that can be chosen adaptively depending on previous outcomes. The corresponding conditional probability at the $j$-th use then is
$$P\big(y_j\big|b_j,k_j\big)=\mathrm{Tr}\Big[\Phi\big(|b_j\rangle\langle b_j|\big)\,\hat E^{(k_j)}_{y_j}\Big], \tag{10}$$
where $\hat E^{(k_j)}_{y_j}$ are the elements of the POVM $\mathcal{M}^{(k_j)}$ as in Eq. (4).
In light of the previous observations we can conclude that the AD of Fig. 1a, with additional classical communication from Bob to Alice, is equivalent to the classical programmable channel (10) with feedback, as shown in Fig. 2. Hence the AD’s optimal rate, Eq. (8), is upper bounded by the feedback capacity of (10). Similarly, the capacity of the programmable channel without feedback for a single use is equal to the SD’s optimal rate, Eq. (9). Finally, the two classical capacities just defined are related via the following theorem, which is a generalization of Shannon’s feedback theorem ShanFeed ; covThom to the class of programmable channels considered:
Theorem 1.
The feedback capacity of a classical programmable channel is equal to its capacity without feedback and it is additive.
Proof.
Suppose we employ the channel to transmit a classical message $W$ with probability distribution $P(w)$, outputting $Y_j$ for each use $j$; the most general technique allows a feedback to the sender, who encodes the input message into a sequence of letters $b_{(1,n)}$ through an encoding function $b_j=f_j\big(w,y_{(1,j-1)}\big)$ for each use $j$. If $b_j$ represents the complex amplitude of a signal we must impose a total average-energy constraint as in Eq. (1). The feedback capacity of this classical programmable channel at constrained total average-energy per mode $E$ is obtained by maximizing the mutual information over the input distribution, the encoding functions and the programmable parameters for each use:
$$C_{FB}(E)=\lim_{n\to\infty}\frac{1}{n}\max_{P(w),\,\{f_j,\,k_j\}}I\big(W:Y_{(1,n)}\big). \tag{11}$$
Similarly, for independent uses of the channel without feedback, the capacity at constrained average-energy $E$ can be defined as
$$C(E)=\max_{P(b),\,k}I(B:Y). \tag{12}$$
Now let us note that $C_{FB}(E)\geq C(E)$, since among all adaptive schemes involved in the optimization (11) there is one which employs no feedback and the same single-mode measurements that are optimal for Eq. (12). To prove the opposite consider the following:
$$\begin{aligned}I\big(W:Y_{(1,n)}\big)&=\sum_{j=1}^{n}\Big[H\big(Y_j\big|Y_{(1,j-1)}\big)-H\big(Y_j\big|B_j,Y_{(1,j-1)}\big)\Big]\\&\leq\sum_{j=1}^{n}\sum_{y_{(1,j-1)}}P\big(y_{(1,j-1)}\big)\,C\big(E_j(y_{(1,j-1)})\big)\\&\leq n\,C(E),\end{aligned} \tag{13}$$
where $E_j(y_{(1,j-1)})$ is the average energy of the letter $B_j$ conditioned on the previous outcomes, and sums over outcomes are replaced by integrals for continuous $\mathcal{Y}$. The first equality follows from the chain rule of mutual information and the fact that conditioning over $W$ and $Y_{(1,j-1)}$ is equivalent to conditioning over $B_j$ and $Y_{(1,j-1)}$ thanks to the encoding functions, i.e., $H\big(Y_{j}\big|W,Y_{(1,j-1)}\big)=H\big(Y_{j}\big|B_{j},Y_{(1,j-1)}\big)$. The first inequality instead is obtained by employing the definition of Eq. (12) as an upper bound on each mutual information term in the sum and writing explicitly the average over the output distribution; the last inequality follows from concavity of the classical capacity as a function of the energy and the total average-energy per mode constraint, i.e., $\frac{1}{n}\sum_{j=1}^{n}\sum_{y_{(1,j-1)}}P\big(y_{(1,j-1)}\big)\,E_j(y_{(1,j-1)})\leq E$. Finally, by plugging Eq. (13) into the definition (11) we obtain the upper bound $C_{FB}(E)\leq C(E)$. ∎
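The concavity step at the end of the proof can be checked numerically on a toy example. The sketch below uses a stand-in concave, increasing rate function $\log_2(1+E)$, which is an assumption made purely for illustration and is not the capacity of any specific receiver. It draws random energy allocations over outcome histories that meet the average-energy budget and verifies that the averaged rate never exceeds the rate at the average energy (Jensen's inequality).

```python
import numpy as np

def stand_in_capacity(energy):
    """Hypothetical concave, increasing rate function used only for illustration."""
    return np.log2(1.0 + energy)

def average_rate(probabilities, energies):
    """Average of the rate over outcome histories with weights `probabilities`."""
    return float(np.sum(probabilities * stand_in_capacity(energies)))

rng = np.random.default_rng(0)
E = 0.5                 # average-energy budget per use (assumed value)
worst_gap = np.inf      # smallest observed value of C(E) - averaged rate

for _ in range(1000):
    p = rng.dirichlet(np.ones(8))      # weights of 8 outcome histories
    e = rng.exponential(1.0, size=8)   # an arbitrary adaptive energy allocation
    e *= E / np.sum(p * e)             # rescale so the average energy equals E
    gap = float(stand_in_capacity(E)) - average_rate(p, e)
    worst_gap = min(worst_gap, gap)

print(worst_gap >= -1e-12)
```

Any other concave function could replace the stand-in; the inequality relies only on concavity and on the average-energy constraint.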
This implies that the AD’s optimal rate is not greater than the SD’s one. Since the former is certainly not smaller than the latter, we conclude that the two rates are equal, $R_{AD}(E)=R_{SD}(E)$.
IV Implications and conclusions
Our analysis implies that a broad class of adaptive decoders for coherent communication on phase-insensitive Gaussian channels, including a majority of those most easily realizable with current technology, cannot beat the optimal single-mode-measurement rate of information transmission. This in turn seems to suggest that such decoders cannot achieve the HSW capacity of phase-insensitive Gaussian channels; however there is no actual proof that joint decoders are really necessary for the task, so that this possibility remains open. In any case our result does not mean that block-coding techniques and adaptive receivers are completely useless for practical applications; indeed in general there may exist specific AD schemes that are more convenient to implement than SD ones and perform equally well, e.g., see Hadamard codes Guha1 ; Guha2 ; Banaszek ; NOSTROHad .
Let us also note that, although our result is very powerful in decoupling the AD’s multi-mode structure for any kind of single-mode POVM, the difficult optimization of the SD rate of Eq. (9) remains if one wants an explicit expression of the rate for a given set of POVM’s. For example we can simplify this calculation for the set of single-mode receivers comprising a coherent displacement followed by any other kind of single-mode operation (the Kennedy receiver of Eq. (5) belongs to this set). Indeed let us define the variance of a single-mode input probability distribution $P(x)$ over coherent states as $V=\int d^{2}x\,P(x)\,|x|^{2}-\big|\int d^{2}x\,P(x)\,x\big|^{2}$; the energy is instead $E=\int d^{2}x\,P(x)\,|x|^{2}$. One can decide to put a constraint either on the energy or on the variance of the input signals, and the former is stricter than the latter since $V\leq E$. It can then be shown that the net effect of the displacement in a coherent-state receiver is simply to enlarge the family of allowed input distributions from the energy- to the variance-constrained ones, so that the optimal rate (9) can be computed on a reduced set of allowed POVM’s.
A particularly useful kind of single-mode receivers is that of Kennedy, defined by Eqs. (5,6), employing a coherent displacement and an on-off photodetector. The SD’s optimal rate for this receiver has been computed in the low-energy limit in guhaDet ; guhaDet1 , showing that it equals
[TABLE]
Moreover the same authors have shown that an AD scheme without unitaries has the same optimal rate and conjectured that also adaptive unitaries do not help. Our result exactly implies the validity of this conjecture for the particular choice of POVM’s (5,6).
Eventually our result intersects with those of takeokaGuha2 ; ShorAd , expanding the set of adaptive receivers whose optimal rate is equal to that of separable ones. Indeed takeokaGuha2 compute the capacity of coherent communication with arbitrary adaptive Gaussian measurements, showing it is separable; here instead we considered a restricted interaction set, i.e., passive Gaussians, but an extended single-mode measurement one, i.e., arbitrary POVM’s. As for ShorAd , it is stated there that adaptive schemes based on partial single-mode measurements of all the modes may increase the optimal rate; here we considered only destructive single-mode measurements but included the simplest kind of interactions and still could not surpass separable decoding rates. In particular, as stated in Sec. I, our AD includes the use of ancillary systems if they interact with just one of the received modes, since it can be thought of as a part of the single-mode destructive measurements. Unfortunately the interaction of ancillary systems with multiple modes is not included, since it results in non-destructive measurements that could provide an advantage over SD’s. Future lines of research could be: studying the lesser-known, interesting class of non-destructive adaptive decoders, computing explicitly the optimal rate for other classes of POVM’s, exploring the potential of squeezing and non-Gaussian interactions.
References
- (1) C. M. Caves and P. D. Drummond, Rev. Mod. Phys. 66, 481 (1994).
- (2) A. S. Holevo, Quantum Systems, Channels, Information (de Gruyter Studies in Mathematical Physics, 2012).
- (3) A. S. Holevo and V. Giovannetti, Rep. Prog. Phys. 75, 046001 (2012).
- (4) V. Giovannetti, S. Guha, S. Lloyd, L. Maccone, J. H. Shapiro and H. P. Yuen, Phys. Rev. Lett. 92, 027902 (2004).
- (5) A. S. Holevo, Probl. Peredachi Inf. 9, 3 (1973); Probl. Inf. Transm. (Engl. Transl.) 9, 110 (1973).
- (6) A. S. Holevo, IEEE Trans. Inf. Theory 44, 269 (1998).
- (7) A. S. Holevo, e-print arXiv:quant-ph/9809023 [see also Tamagawa University Research Review, no. 4] (1998).
- (8) B. Schumacher and M. D. Westmoreland, Phys. Rev. A 56, 131 (1997); P. Hausladen, R. Jozsa, B. W. Schumacher, M. Westmoreland, and W. K. Wootters, ibid. 54, 1869 (1996).
