Gaussian maximizers for quantum Gaussian observables and ensembles
A. S. Holevo

TL;DR
This paper proves that for multimode bosonic systems with gauge symmetry, the classical capacity of Gaussian observables and the accessible information of Gaussian ensembles are achieved by Gaussian states and measurements, extending known single-mode results.
Contribution
It establishes that Gaussian ensembles optimize classical capacity and accessible information in multimode bosonic systems, generalizing single-mode findings.
Findings
Classical capacity of Gaussian observables is attained on Gaussian ensembles.
Accessible information of Gaussian ensembles is achieved by multimode heterodyne measurement.
Results extend single-mode Gaussian optimization to multimode systems.
Abstract
In this paper we prove two results related to the Gaussian optimizers conjecture for multimode bosonic systems with gauge symmetry. First, we argue that the classical capacity of a Gaussian observable is attained on a Gaussian ensemble of coherent states. This generalizes results previously known for the heterodyne measurement in one mode. By using this fact and a continuous-variable version of ensemble-observable duality, we prove an old conjecture that the accessible information of a Gaussian ensemble is attained on the multimode generalization of the heterodyne measurement.
Steklov Mathematical Institute
Gubkina 8, 119991 Moscow, Russia
1 Introduction
In this paper we prove two results related to the Gaussian optimizers conjecture for multimode bosonic systems with global gauge symmetry (in the quantum communication literature such systems are called phase-insensitive). In theorem 1 of sec. 3 we argue that the classical capacity of an arbitrary gauge-covariant Gaussian observable, considered as a communication channel with quantum input and classical output, is attained on a Gaussian ensemble of coherent states. This generalizes a result previously known for the heterodyne measurement [1]. In the difficult part of the argument, the minimization of the output differential entropy, we rely upon our previous result [2] obtained as a limiting case of the general solution of the Gaussian optimizers conjecture for quantum Gaussian channels [3]. Let us stress that it is not possible to apply that solution directly to a Gaussian observable because there is no way to embed a continuously-valued observable (as distinct from discretely-valued observables) into a channel with quantum output [4]. The classical capacity of an observable is the most important quantity characterizing the ultimate information-processing performance of the measurement (see e.g. [1], [5], [4]).
By using theorem 1 and the infinite-dimensional version of ensemble-observable duality developed in sec. 4, we prove the main result of this work: theorem 2 concerning the accessible information of a Gaussian ensemble. In particular, it settles an old conjecture [6], [7], [8] that the accessible information of a Gaussian ensemble is attained by the multimode generalization of the heterodyne measurement. As in the other Gaussian optimizer problems, the difficulty here lies in finding the *global maximum* of a convex functional, when the optimal solution turns out to be highly non-unique and the standard tools of convex analysis become inefficient.
2 Preliminaries
Let be an observable (POVM) in a separable Hilbert space with the outcome space which is a complete separable metric space. A corresponding measurement channel is defined as the transformation of density operators (d.o.) to probability distributions on . In [9] the existence of a finite measure was shown such that for any d.o. the probability measure is absolutely continuous w.r.t. thus having the probability density (p.d.) Therefore the *measurement channel* can be defined as the transformation
[TABLE]
mapping affinely d.o. on into p.d. on . Notice that is defined uniquely only up to the class of mutually absolutely continuous measures.
A (generalized) ensemble consists of probability measure on the input space and a measurable family of d.o. on . The average state of the ensemble is the barycenter of this measure
[TABLE]
the integral existing in the strong sense in the Banach space of trace class operators. Let be an observable with the outcome space and the corresponding measurement channel. The joint probability distribution of on is uniquely defined by the relation
[TABLE]
where is an arbitrary Borel subset of and is that of
The classical Shannon information between is equal to (cf. [10])
[TABLE]
We will use the differential entropy
[TABLE]
of a p.d. There is a special class of p.d.’s we will be using for which the differential entropy is well-defined. Let be a -dimensional vector space and a bounded p.d. on such that (mod ) for some . Then is well-defined with values in because in this case is a p.d. satisfying (mod ), hence . Thus is well-defined with values in and by change of variable ,
[TABLE]
is also well-defined with values in .
If observable is such that for any d.o. and ensemble is such that , then the Shannon information between is equal to
[TABLE]
This quantity is well-defined with values in due to Jensen’s inequality.
In what follows will be the space of a strongly continuous irreducible projective unitary representation of the canonical commutation relations (CCR) (see e.g. [11], [12] for a detailed account) describing the quantization of a linear classical system with degrees of freedom, such as a finite number of physically relevant electromagnetic modes in a receiver’s cavity.
The classical linear system with the preferred complex structure (gauge) is described by the phase space equipped with the symplectic form where
[TABLE]
We will use the symplectic Fourier transform
[TABLE]
Notice that i.e. inverse transform has the same form.
The quantization gives a bosonic system described by the collection of annihilation-creation operators, in the vector form
[TABLE]
where the lower index of a component refers to the number of the mode. The CCR including the nonvanishing commutator
[TABLE]
are conveniently written in terms of displacement operators namely
[TABLE]
The (global) gauge group acts as ( is real phase) in the classical space, and via the unitary group in ( is the total number operator), so that
[TABLE]
The quantum Fourier transform of a trace class operator is defined as
[TABLE]
The quantum Parseval formula holds:
[TABLE]
An operator is gauge-invariant if for all values of the phase . A gauge-invariant Gaussian d.o. is defined by the quantum characteristic function (we denote by the unit -matrix, as distinct from the unit operator in a Hilbert space)
[TABLE]
where is the complex covariance matrix satisfying . Notice that where ⊤ denotes the transposition operation defined in (54) (see Appendix A).
In the Hilbert space of an irreducible representation of CCR there is a unique unit vacuum vector such that The case in (6) corresponds to the vacuum d.o. The coherent state vectors are
We will use the P-representation in the case of nondegenerate
[TABLE]
Another important Gaussian d.o. is obtained by action of the displacement operators
[TABLE]
it has the quantum characteristic function
[TABLE]
3 Gaussian observables and ensembles
In this Section we will consider Gaussian observables with the outcome space described by POVM
[TABLE]
where a nondegenerate complex matrix, and d.o. is defined by (6) with This is a special (gauge-covariant) case of general Gaussian observables considered in [12]. Particularly important is the case where
[TABLE]
In Appendix A we recall an alternative description of such observables via extension to a spectral measure in a composite system including an ancillary system (going back to [6]). By taking so that is the vacuum state, we obtain the multimode version of the “heterodyne measurement”
[TABLE]
see [13]. Thus the POVM (11) corresponds to a noisy generalization of the multimode heterodyne measurement.
Let be an input d.o. then by using (5), (9) and real-valuedness of the quadratic form under the exponent, the output p.d. of the observable (11) is
[TABLE]
and that of observable (10) is Notice that all these p.d.’s belong to the class because for any two d.o. . Thus the differential entropy of the output p.d. is well-defined and
[TABLE]
Let be a nonnegative definite complex Hermitian matrix. By we denote the set of all d.o. with the complex covariance matrix
[TABLE]
There is a unique gauge-invariant Gaussian d.o. in . By using (6) with and (3), one obtains the output p.d.
[TABLE]
which is a complex Gaussian p.d. with the covariance matrix
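As a numerical illustration of the Gaussian form of the output p.d., the following minimal sketch checks the single-mode case in a truncated Fock basis. It assumes the ideal heterodyne measurement, for which the output density of a state is its diagonal coherent-state matrix element divided by π, and a thermal input with mean photon number `mean_photons`. The function names and the truncation level `n_max` are illustrative and are not taken from the paper.

```python
import math


def coherent_amplitudes(z, n_max):
    """Fock-basis coefficients <n|z> of a coherent state, truncated at n_max."""
    amplitudes = []
    term = math.exp(-abs(z) ** 2 / 2)  # coefficient for n = 0
    for n in range(n_max + 1):
        amplitudes.append(term)
        term = term * z / math.sqrt(n + 1)  # recursion c_{n+1} = c_n * z / sqrt(n + 1)
    return amplitudes


def heterodyne_density_thermal(z, mean_photons, n_max=200):
    """Output density <z|rho|z>/pi for a single-mode thermal state (assumed convention)."""
    ratio = mean_photons / (mean_photons + 1.0)
    weight = 1.0 / (mean_photons + 1.0)  # thermal probability of n = 0 photons
    total = 0.0
    for amplitude in coherent_amplitudes(z, n_max):
        total += weight * abs(amplitude) ** 2
        weight *= ratio  # geometric photon-number distribution
    return total / math.pi


def complex_gaussian_density(z, variance):
    """Circular complex Gaussian density with E|z|^2 = variance."""
    return math.exp(-abs(z) ** 2 / variance) / (math.pi * variance)


z = 0.7 + 0.4j
print(heterodyne_density_thermal(z, 1.5), complex_gaussian_density(z, 2.5))
```

Under this convention the vacuum input yields a Gaussian with unit variance, and a thermal input with mean photon number N yields variance N + 1.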
We have
[TABLE]
Indeed, for any define the gauge-invariant d.o.
[TABLE]
By the concavity of the differential entropy and Jensen’s inequality, the output differential entropy of $\rho_{gi}$ is not less than that of the original d.o. By using (4) it is not difficult to check that $\rho_{gi}$ has zero first moments, finite second moments given by (15), and other second moments such as vanishing. Thus all its first and second moments are the same as those of the Gaussian d.o. By the classical maximum entropy principle [16], we have , which proves (17).
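The maximum entropy principle invoked here can be illustrated in one complex dimension. The sketch below compares, in closed form, the differential entropy of a circular complex Gaussian with that of a uniform density on a disk having the same second moment. The uniform competitor and the function names are illustrative choices, not taken from the paper.

```python
import math


def entropy_complex_gaussian(second_moment):
    """Differential entropy (nats) of a circular complex Gaussian with E|z|^2 = second_moment."""
    return math.log(math.pi * math.e * second_moment)


def entropy_uniform_disk(second_moment):
    """Differential entropy (nats) of the uniform density on a disk with E|z|^2 = second_moment."""
    radius_squared = 2.0 * second_moment  # a uniform disk of radius R has E|z|^2 = R^2 / 2
    return math.log(math.pi * radius_squared)  # log of the disk area


for second_moment in (0.5, 1.0, 4.0):
    gap = entropy_complex_gaussian(second_moment) - entropy_uniform_disk(second_moment)
    print(f"E|z|^2 = {second_moment}: Gaussian exceeds uniform by {gap:.4f} nats")
```

The gap equals 1 − log 2 (about 0.307 nats) at every scale, so the Gaussian density has the larger entropy for each fixed second moment in this example.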
We will be interested in the following constrained capacity of the channel
[TABLE]
where is the classical information quantity defined in (2).
Theorem 1
Let be the measurement channel corresponding to the Gaussian observable (10), then the supremum in (18) is equal to
[TABLE]
and is attained on the Gaussian ensemble of coherent states
[TABLE]
The relation (19) can be considered as a multimode version of a formula obtained in [1] by “information exclusion” argument.
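The structure of the capacity, a maximal output entropy minus a minimal one, can be sketched numerically. The code below assumes the convention in which the heterodyne outcome for the vacuum input is a circular complex Gaussian with unit covariance, so that a Gaussian input with covariance `signal_cov` measured with Gaussian noise `noise_cov` gives output covariance `signal_cov + noise_cov + I`. These names, and the resulting log-determinant expression, are assumptions of the sketch and may differ in notation from (19).

```python
import numpy as np


def complex_gaussian_entropy(covariance):
    """Differential entropy (nats) of a circular complex Gaussian vector."""
    modes = covariance.shape[0]
    _, logdet = np.linalg.slogdet(covariance)
    return float(modes * np.log(np.pi * np.e) + logdet)


def gaussian_measurement_information(signal_cov, noise_cov):
    """Entropy of the average output minus entropy of the vacuum output (assumed convention)."""
    identity = np.eye(signal_cov.shape[0])
    h_average = complex_gaussian_entropy(signal_cov + noise_cov + identity)
    h_vacuum = complex_gaussian_entropy(noise_cov + identity)
    return h_average - h_vacuum


signal = np.array([[2.0, 0.5j], [-0.5j, 1.0]])  # hypothetical Hermitian input covariance
noise = np.diag([0.3, 0.0])                     # hypothetical measurement noise covariance
value = gaussian_measurement_information(signal, noise)
# Equivalent form: log det(I + (I + noise)^{-1} signal)
closed_form = np.linalg.slogdet(np.eye(2) + np.linalg.solve(noise + np.eye(2), signal))[1]
print(value, closed_form)
```

For a single mode with no measurement noise and mean photon number N, the function returns log(1 + N), the familiar value for the heterodyne measurement.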
Proof. The channel defined by (11) is covariant with respect to the irreducible action of the displacement operators which means
[TABLE]
for any Borel subset or, equivalently,
[TABLE]
By adapting the argument from [14] for irreducibly covariant quantum channel to our case of quantum-classical channel, we obtain
[TABLE]
where
[TABLE]
is the minimal output differential entropy. Indeed, from (2) it follows that the right-hand side (with ‘sup’ in place of ‘max’ and ‘inf’ in place of ‘min’) is an upper bound for . Its achievability follows from Proposition 2 of the recent paper [15], since all the assumptions of that result are fulfilled in the Gaussian gauge-invariant case under consideration here.
To be explicit, first, by the generalization of the Wehrl conjecture to measurements of the form (11) proved in [2], the minimum (22) is attained on the vacuum state to which corresponds the output p.d.
[TABLE]
so that
[TABLE]
Second, take an ensemble such that Then has the covariance matrix (see Appendix A). By the maximum entropy principle (17), the maximum of is attained on the Gaussian d.o. and is equal to
[TABLE]
And finally, by (7) the d.o. is the average state of the Gaussian ensemble of coherent states obtained from the vacuum state by the action of the displacement operators , which is thus the optimal ensemble achieving the upper bound for (18).
Thus we obtain the value
[TABLE]
By taking into account (14) we also obtain that
[TABLE]
i.e. rescaling the observable by nondegenerate has no effect on the capacity of the measurement (11) and the optimal ensemble.
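The invariance under rescaling can be checked numerically for Gaussian output densities. A linear change of the outcome variable by a nondegenerate matrix maps each covariance Σ to KΣK*, which shifts both the unconditional and the conditional entropy by the same amount, so their difference is unchanged. The matrices below are hypothetical test data, and the function name is illustrative.

```python
import numpy as np


def entropy_difference(cov_average, cov_conditional, scaling):
    """Difference of Gaussian entropies after rescaling the outcome by `scaling`."""
    scaled_average = scaling @ cov_average @ scaling.conj().T
    scaled_conditional = scaling @ cov_conditional @ scaling.conj().T
    # The additive constant s * log(pi * e) cancels in the difference.
    return float(np.linalg.slogdet(scaled_average)[1]
                 - np.linalg.slogdet(scaled_conditional)[1])


cov_conditional = np.array([[1.3, 0.0], [0.0, 1.0]], dtype=complex)
cov_average = cov_conditional + np.array([[2.0, 0.5j], [-0.5j, 1.0]])
scaling = np.array([[2.0, 1.0j], [0.5, -1.0]])  # nondegenerate: det = -2 - 0.5j

unscaled = entropy_difference(cov_average, cov_conditional, np.eye(2))
rescaled = entropy_difference(cov_average, cov_conditional, scaling)
print(unscaled, rescaled)
```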
The importance of the quantity (18) is apparent: it is the key to computing the energy-constrained classical capacity of the channel which is an important quantity characterizing the information-processing performance of the measurement. Indeed, let
[TABLE]
be a quadratic gauge-invariant Hamiltonian, where is a positive definite Hermitian matrix, so that the mean energy of the input d.o. is equal to
[TABLE]
where denotes trace of matrices as distinct from the trace of operators. Then the energy constraint has the form where is a positive number, and the energy-constrained classical capacity of the channel is equal to
[TABLE]
Notice that the additivity issue does not arise here because measurement channels are entanglement breaking [9], [12]. Given an explicit expression for such as (19), computation of the last supremum is a separate optimization problem which can be solved analytically in some special cases. For example, if so that is diagonal, and then the optimal is also diagonal and its entries can be found with a simple generalization of the “water-filling solution”, cf. [16], namely
[TABLE]
where is found from the equation and
[TABLE]
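A water-filling computation of this kind can be sketched as follows. The code assumes the diagonal case with per-mode information log(1 + a_j / (1 + b_j)), where `a_j` and `b_j` are hypothetical names for the diagonal entries of the input and noise covariance matrices and `e_j > 0` are the mode energies. The allocation rule is derived here from the Karush-Kuhn-Tucker conditions of this assumed problem and is not transcribed from the paper's formula.

```python
def water_filling(mode_energies, noise_levels, energy_budget, iterations=200):
    """Per-mode levels a_j >= 0 maximizing sum_j log(1 + a_j / (1 + b_j))
    subject to sum_j e_j * a_j = energy_budget (diagonal case, assumed form)."""
    # Stationarity gives e_j * a_j = level - e_j * (1 + b_j) whenever a_j > 0.
    floors = [e * (1.0 + b) for e, b in zip(mode_energies, noise_levels)]

    def spent(level):
        return sum(max(0.0, level - floor) for floor in floors)

    low, high = 0.0, energy_budget + max(floors)  # spent(high) >= energy_budget
    for _ in range(iterations):  # bisection on the water level
        middle = 0.5 * (low + high)
        if spent(middle) < energy_budget:
            low = middle
        else:
            high = middle
    level = 0.5 * (low + high)
    return [max(0.0, level - floor) / e for e, floor in zip(mode_energies, floors)]


print(water_filling([1.0, 1.0], [0.0, 0.0], 2.0))  # symmetric modes share the budget
print(water_filling([1.0, 2.0], [0.0, 1.0], 1.0))  # the costly, noisy mode stays empty
```

Bisection is used because the spent energy is a nondecreasing function of the water level; sorting the floors would give the same allocation in closed form.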
The following result (for observable (12)) was conjectured in the early seventies. In [6] it was observed that the measurement (12) for the Gaussian ensemble (25), (26) below gives the information amount (24) which is thus the lower bound for the accessible information of the ensemble defined as
[TABLE]
where the supremum is over all observables . The conjecture was that the observable (12) gives the global maximum. In [7] the authors verified the necessary local extremality condition for information based on the first variation derived in [17], and in [8] the second variation was shown to be nonpositive (the English versions of these articles were also posted as arXiv:quant-ph/0511042 and arXiv:quant-ph/0511043). However, to our knowledge, the question of the global maximum was open until now.
Theorem 2
Let be the Gaussian ensemble where (for clarity of the proofs we assume that the covariance matrices are nondegenerate, although this restriction can be relaxed by using more abstract computations with the quantum characteristic functions)
[TABLE]
is d.o. (8) with
Then the accessible information of this ensemble is equal to (24) and is attained on any Gaussian observable of the form
[TABLE]
where in particular, on the observable (12).
Proof. By using (8) and convolution of Gaussian densities, we obtain the average state of the ensemble (25), (26)
[TABLE]
Computation using (2) and (23) gives
[TABLE]
for the ensemble and observable defined by (12), thus giving the lower bound for the accessible information Any observable (27) gives the same value by (14).
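The lower bound obtained from the heterodyne measurement can be illustrated by a single-mode Monte Carlo sketch. It assumes that the signal amplitude is a circular complex Gaussian with variance `signal_var`, that each ensemble state is a displaced thermal state with mean photon number `thermal_var`, and that heterodyne detection adds one unit of vacuum noise. All names and the sample size are illustrative assumptions.

```python
import numpy as np


def sample_circular_gaussian(rng, variance, size):
    """Samples of a circular complex Gaussian with E|z|^2 = variance."""
    scale = np.sqrt(variance / 2.0)
    return rng.normal(0.0, scale, size) + 1j * rng.normal(0.0, scale, size)


def log_circular_gaussian_density(z, variance):
    """Logarithm of the circular complex Gaussian density."""
    return -np.abs(z) ** 2 / variance - np.log(np.pi * variance)


def estimate_heterodyne_information(signal_var, thermal_var, samples=200_000, seed=0):
    """Monte Carlo estimate (nats) of the mutual information between amplitude and outcome."""
    rng = np.random.default_rng(seed)
    amplitude = sample_circular_gaussian(rng, signal_var, samples)
    outcome_var = thermal_var + 1.0  # thermal noise plus one unit of vacuum noise
    outcome = amplitude + sample_circular_gaussian(rng, outcome_var, samples)
    log_ratio = (log_circular_gaussian_density(outcome - amplitude, outcome_var)
                 - log_circular_gaussian_density(outcome, signal_var + outcome_var))
    return float(np.mean(log_ratio))


estimate = estimate_heterodyne_information(3.0, 0.5)
print(estimate, np.log(1.0 + 3.0 / 1.5))
```

The estimate should be close to log(1 + signal_var / (1 + thermal_var)), which is the Gaussian channel value under the stated assumptions.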
We now use the general upper bound from the next section:
[TABLE]
where is the observable dual to the ensemble (defined by Eq. (41) in proposition 3 below), and the supremum is taken over all ensembles satisfying the condition . In the case we are considering, this observable will turn out to be Gaussian, so that we can apply theorem 1 to it to compute the right-hand side of the inequality (30). According to Eq. (41) the dual observable is given by the relation
[TABLE]
By using the decomposition in the normal modes associated with the orthonormal basis of eigenvectors of the matrix
[TABLE]
we obtain (see (58) in Appendix A for details)
[TABLE]
Substituting this into (31), we get
[TABLE]
By making the change of variables
[TABLE]
and denoting
[TABLE]
we obtain, by arranging the terms in the quadratic form under the exponent in (34),
[TABLE]
which has the same Gaussian form as in theorem 1.
We now compute the supremum in the right-hand side of (30) by using theorem 1 with replaced by and replaced by from the average state given by (28). Theorem 1 then implies
[TABLE]
A computation below shows that
[TABLE]
where This gives the upper estimate for which coincides with the lower estimate (29), thus proving the theorem.
To prove (37), we obtain from (35)
[TABLE]
then
[TABLE]
Substituting
[TABLE]
into (38), we obtain
[TABLE]
where hence (37) follows.
4 Ensemble-observable duality
Duality between ensembles and observables proved to be an efficient tool in quantum information theory (see [1], [5], [18] or [4]). In this section we provide a rigorous infinite-dimensional and continuous-variables version of this duality used in the proof of theorem 2.
Proposition 3
Let be an ensemble and an observable such that
[TABLE]
where is a finite measure, is weakly measurable function with values in the cone of bounded positive operators in and the integral weakly converges ( is an arbitrary Borel subset of ).
Define the dual pair ensemble-observable by the relations
[TABLE]
[TABLE]
for , where (we use the generalized inverse for ). Then the average states of both ensembles coincide
[TABLE]
Moreover, the joint distribution of is the same for both pairs and so that
[TABLE]
Proof. From (40) it follows
[TABLE]
The definition (41) implies
[TABLE]
for dense domain of implying that are bounded positive operators with . The definition via the integral also implies σ-additivity, hence is an observable.
Notice the identity
[TABLE]
Then the joint distribution of
[TABLE]
via (44) is equal to
[TABLE]
hence (43) holds.
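A finite-dimensional analogue of this duality can be checked numerically. The sketch below uses a discrete ensemble and a discrete POVM on a three-dimensional space, with a full-rank average state so that no generalized inverse is needed. The dual ensemble is built from the square root of the average state and the POVM elements, and the dual observable from its inverse square root and the weighted ensemble states. This is an illustrative discrete version with hypothetical names, not the continuous-variable construction of the proposition.

```python
import numpy as np


def hermitian_power(matrix, power):
    """Real power of a positive definite Hermitian matrix via its eigendecomposition."""
    values, vectors = np.linalg.eigh(matrix)
    return (vectors * values ** power) @ vectors.conj().T


def random_state(rng, dim):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    positive = a @ a.conj().T
    return positive / np.trace(positive).real


def random_povm(rng, dim, outcomes):
    raw = [random_state(rng, dim) for _ in range(outcomes)]
    inv_root = hermitian_power(sum(raw), -0.5)  # normalize so the elements sum to I
    return [inv_root @ effect @ inv_root for effect in raw]


def joint_distribution(weights, states, povm):
    return np.array([[w * np.trace(rho @ m).real for m in povm]
                     for w, rho in zip(weights, states)])


def dual_pair(weights, states, povm):
    average = sum(w * rho for w, rho in zip(weights, states))
    root, inv_root = hermitian_power(average, 0.5), hermitian_power(average, -0.5)
    dual_weights = [np.trace(average @ m).real for m in povm]
    dual_states = [root @ m @ root / p for m, p in zip(povm, dual_weights)]
    dual_povm = [inv_root @ (w * rho) @ inv_root for w, rho in zip(weights, states)]
    return dual_weights, dual_states, dual_povm


rng = np.random.default_rng(0)
weights = [0.1, 0.2, 0.3, 0.4]
states = [random_state(rng, 3) for _ in weights]
povm = random_povm(rng, 3, 5)
dual_weights, dual_states, dual_povm = dual_pair(weights, states, povm)
original = joint_distribution(weights, states, povm)
dual = joint_distribution(dual_weights, dual_states, dual_povm)
print(np.max(np.abs(original - dual.T)))
```

The two joint distributions agree up to transposition, since the roles of the input and outcome variables are exchanged in the dual pair.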
The equality (43) implies an estimate for the accessible information of the ensemble
[TABLE]
where the supremum is over all observables .
Proposition 4
Let be a fixed ensemble and be the dual observable, then
[TABLE]
where the supremum in the right-hand side is taken over all ensembles satisfying the condition .
Proof. We first prove the inequality (30) which was used in the proof of theorem 2. We repeat it here for convenience:
[TABLE]
For this it is sufficient to show that
[TABLE]
where on the right the supremum is taken over observables which satisfy (39) with respect to some measure . Then by using proposition 3 we obtain
[TABLE]
where in the right-hand side the supremum is taken over ensembles that can be written in the form for suitable , whence (46) will follow.
Proof of the equality (47) is based on two facts. First, we show that any observable can be approximated by a sequence of observables satisfying (39) for some measures . Second, we observe that the information quantity is lower semicontinuous in this approximation.
Let be an observable, and let be a nondecreasing sequence of projections in such that as Define the measure and the sequence of observables
[TABLE]
Then satisfies (39) with the measure Indeed, and the second term in the direct sum (48) is dominated by Hence, by an operator version of the Radon-Nikodym theorem, with (mod ).
For arbitrary d.o. and arbitrary Borel
[TABLE]
as
Now let be a finite decomposition of the space into Borel subsets Define the finitely valued “coarse-grained” observable A general result of classical information theory (cf. [10]) implies
[TABLE]
where the supremum is taken over all the decompositions . We will prove that for a fixed the functional is continuous with respect to the approximation (49); it will then follow that is lower semicontinuous. Denoting we have
[TABLE]
When we approximate by the first term converges by (49) and by continuity of the Shannon entropy. In the second term the integrand converges pointwise by (49) and it is uniformly bounded because for This finishes the proof of (47) and hence of (46).
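The role of coarse-graining in this argument can be illustrated on a small discrete example: merging outcomes never increases the mutual information, so the full information is approached from below as the decomposition is refined. The joint distribution and the function names below are hypothetical test data.

```python
import math


def mutual_information(joint):
    """Mutual information (nats) of a joint distribution given as a list of rows."""
    row_sums = [sum(row) for row in joint]
    column_sums = [sum(column) for column in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:
                total += p * math.log(p / (row_sums[i] * column_sums[j]))
    return total


def coarse_grain_outcomes(joint, blocks):
    """Merge outcome columns according to a decomposition into index blocks."""
    return [[sum(row[j] for j in block) for block in blocks] for row in joint]


joint = [[0.30, 0.10, 0.05, 0.05],
         [0.05, 0.05, 0.10, 0.30]]
fine = mutual_information(joint)
coarse = mutual_information(coarse_grain_outcomes(joint, [[0, 1], [2, 3]]))
trivial = mutual_information(coarse_grain_outcomes(joint, [[0, 1, 2, 3]]))
print(fine, coarse, trivial)
```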
Let us now prove the stronger result: the equality (45), by showing that any ensemble with fixed average state can be approximated by ensembles of the form (40). First, if has finite rank, the problem reduces to a finite-dimensional one which is easily solved. Therefore assume that the rank of is infinite (for simplicity we can assume that is nondegenerate). Let be the projection onto the eigenspace of corresponding to largest eigenvalues. Let
[TABLE]
then
[TABLE]
where is the smallest eigenvalue for eigenvectors in the range of Then
[TABLE]
hence is an observable. Moreover, (mod ). Define ensemble by taking and
[TABLE]
then by (50) the average state of is Ensemble has the required form (40). Moreover, for any observable the joint probability
[TABLE]
Indeed,
[TABLE]
pointwise, remaining uniformly bounded by 1. For any finite decomposition of the space and of the space , the “coarse-grained” mutual information is continuous and the mutual information is lower semicontinuous by the argument in the proof above, hence
[TABLE]
It implies finally the equality (45).
5 Conclusion
We have considered a quantum Gaussian multimode system with global gauge symmetry and obtained an explicit formula for the classical capacity of a Gaussian observable which describes the statistics of a noisy heterodyne measurement in such a system. We have shown that the capacity is attained by a Gaussian ensemble of coherent states. The condition of gauge covariance was relaxed in our recent paper [15], where the formula was generalized to the case where only a certain “threshold condition” is fulfilled. Our second result gives an explicit expression for the accessible information of a gauge-invariant Gaussian ensemble, and shows that it is attained by the multimode generalization of the (ideal) heterodyne measurement, settling a conjecture going back to the seventies. Moreover, the same value is attained by any multimode scaling of the measurement, illustrating the high degeneracy of the maximum characteristic of this kind of “quantum Gaussian optimizer” problem. The natural question of extending this result to quantum Gaussian systems without gauge symmetry, or even without any “threshold condition”, remains open for investigation.
6 Appendix A
Let be two d.o., then, generalizing (3), the relation
[TABLE]
defines a p.d. on Its classical characteristic function expressed via the symplectic Fourier transform is
[TABLE]
where in (51) we used (3) and the Parseval identity (5), and in (52) the transposition is defined by the relation
[TABLE]
The expression (53) can be rewritten as
[TABLE]
where act in is the Hilbert space of the ancillary system. The vectors have the components
[TABLE]
where are annihilation-creation operators in These components are commuting normal operators, so that they have joint probability distribution with the classical characteristic function (55). Assuming that let us find the complex covariance matrix of this distribution. It has the components
[TABLE]
Thus the complex covariance matrix is
[TABLE]
Let be an orthonormal basis in and let be a decomposition of the vector in this basis. Then where are the new creation operators, corresponding to the modes associated with the basis Let be the eigenvector of the th mode number operator , corresponding to the eigenvalue Then one has tensor product decomposition of a coherent state vector
[TABLE]
If is the basis of eigenvectors of the covariance matrix of the Gaussian d.o. with the corresponding eigenvalues then
[TABLE]
It follows that is given by the expression
[TABLE]
[TABLE]
The formula (33) is obtained by choosing the basis of eigenvectors of the covariance matrix and then using this expression.
Acknowledgment. The work was supported by the grant of the Russian Scientific Foundation (project No 19-11-00086). The author is grateful to M.E. Shirokov, G.G. Amosov, S.N. Filippov and anonymous referees for useful remarks.
- [1] M. J. W. Hall, “Quantum information and correlation bounds,” Phys. Rev. A, vol. 55, pp. 100–113, 1997.
- [2] V. Giovannetti, A. S. Holevo, A. Mari, “Majorization and additivity for multimode bosonic Gaussian channels,” Theor. Math. Phys., vol. 182, pp. 284–293, 2015.
- [3] V. Giovannetti, A. S. Holevo, R. Garcia-Patron, “A Solution of Gaussian Optimizer Conjecture for Quantum Channels,” Commun. Math. Phys., vol. 334, pp. 1553–1571, 2015.
- [4] A. S. Holevo, “Information capacity of quantum observable,” Probl. Inform. Transmission, vol. 48, pp. 1–10, 2012.
- [5] M. Dall’Arno, G. M. D’Ariano, M. F. Sacchi, “Informational power of quantum measurements,” Phys. Rev. A, vol. 83, 062304, 2011.
- [6] A. S. Holevo, “On the Mathematical Theory of Quantum Communication Channels,” Probl. Inform. Transmission, vol. 8, pp. 47–54, 1972.
- [7] V. P. Belavkin, R. L. Stratonovich, “Optimization of Quantum Information Processing Maximizing Mutual Information,” Radio Eng. Electron. Phys., vol. 19, p. 1349, 1973 [trans. from Radiotekhnika i Electronika, vol. 19, pp. 1839–1844, 1973].
- [8] V. P. Belavkin, A. G. Vantsyan, “On the Sufficient Optimality Condition for Quantum Information Processing,” Radio Eng. Electron. Phys., vol. 19, p. 39, 1974 [trans. from Radiotekhnika i Electronika, vol. 19, pp. 1391–1395, 1974].
