A Proof of Vivo-Pato-Oshanin's Conjecture on the Fluctuation of von Neumann Entropy
Lu Wei

TL;DR
This paper provides a rigorous proof for a conjecture regarding the variance of von Neumann entropy in bipartite quantum systems, confirming a specific formula involving special functions.
Contribution
The paper offers the first complete proof of Vivo, Pato, and Oshanin's conjecture on the fluctuation of von Neumann entropy for quantum subsystems.
Findings
Confirmed the conjectured variance formula for von Neumann entropy
Validated the specific mathematical expression involving trigamma functions
Contributed to the theoretical understanding of quantum entropy fluctuations
Abstract
It was recently conjectured by Vivo, Pato, and Oshanin [Phys. Rev. E 93, 052106 (2016)] that for a quantum system of Hilbert dimension $mn$ in a pure state, the variance of the von Neumann entropy of a subsystem of dimension $m\leq n$ is given by \begin{equation*} -\psi_{1}\left(mn+1\right)+\frac{m+n}{mn+1}\psi_{1}\left(n\right)-\frac{(m+1)(m+2n+1)}{4n^{2}(mn+1)}, \end{equation*} where $\psi_{1}(\cdot)$ is the trigamma function. We give a proof of this formula.
A Proof of Vivo-Pato-Oshanin’s Conjecture
on the Fluctuation of von Neumann Entropy
Lu Wei
Department of Electrical and Computer Engineering
University of Michigan-Dearborn, MI 48128, USA
I Background and the Conjecture
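Consider a composite quantum system that consists of two subsystems $A$ and $B$ of Hilbert space dimensions $m$ and $n$. The Hilbert space $\mathcal{H}_{A+B}$ of the composite system is given by the tensor product of the Hilbert spaces of the subsystems, $\mathcal{H}_{A+B}=\mathcal{H}_{A}\otimes\mathcal{H}_{B}$. The random pure state $|\psi\rangle$ of the composite system is written as a linear combination of the random coefficients $x_{i,j}$ and the complete bases $\{|i^{A}\rangle\}$ and $\{|j^{B}\rangle\}$ of $\mathcal{H}_{A}$ and $\mathcal{H}_{B}$, $|\psi\rangle=\sum_{i=1}^{m}\sum_{j=1}^{n}x_{i,j}\,|i^{A}\rangle\otimes|j^{B}\rangle$. The corresponding density matrix $\rho=|\psi\rangle\langle\psi|$ has the natural constraint $\mathrm{tr}(\rho)=1$. This implies that the $m\times n$ random coefficient matrix $\mathbf{X}=(x_{i,j})$ satisfies
\begin{equation*} \mathrm{tr}\left(\mathbf{X}\mathbf{X}^{\dagger}\right)=1. \tag{1} \end{equation*}
Without loss of generality, it is assumed that $m\leq n$. The reduced density matrix $\rho_{A}$ of the smaller subsystem $A$ admits the Schmidt decomposition $\rho_{A}=\sum_{i=1}^{m}\lambda_{i}\,|\phi_{i}^{A}\rangle\langle\phi_{i}^{A}|$, where $\lambda_{i}$ is the $i$-th largest eigenvalue of $\mathbf{X}\mathbf{X}^{\dagger}$. The conservation of probability (1) now implies the constraint $\sum_{i=1}^{m}\lambda_{i}=1$. The probability measure of the random coefficient matrix $\mathbf{X}$ is the Haar measure, where the entries are uniformly distributed over all the possible values satisfying the constraint (1). The resulting eigenvalue density of $\mathbf{X}\mathbf{X}^{\dagger}$ is well known (see, e.g., Page (1993)),
\begin{equation*} f\left(\boldsymbol{\lambda}\right)=\frac{\Gamma(mn)}{c}\,\delta\left(1-\sum_{i=1}^{m}\lambda_{i}\right)\prod_{1\leq i \lt j\leq m}\left(\lambda_{i}-\lambda_{j}\right)^{2}\prod_{i=1}^{m}\lambda_{i}^{n-m}, \tag{2} \end{equation*}
where $\delta(\cdot)$ is the Dirac delta function and the constant
\begin{equation*} c=\prod_{i=1}^{m}\Gamma(n-i+1)\Gamma(i). \tag{3} \end{equation*}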
The random matrix ensemble (2) is also known as the (unitary) fixed-trace ensemble. The considered bipartite quantum system is a fundamental model that describes the interaction between a physical object and its environment. For example, in Page (1993), the subsystem $A$ is the black hole and the subsystem $B$ is the associated radiation field. In another example (Majumdar), the subsystem $A$ is a set of spins and the subsystem $B$ represents the environment of a heat bath.
A measure of the entanglement of the considered bipartite quantum system is the von Neumann entropy
\begin{equation*} S=-\mathrm{tr}\left(\rho_{A}\ln\rho_{A}\right)=-\sum_{i=1}^{m}\lambda_{i}\ln\lambda_{i}, \tag{4} \end{equation*}
where $S\in[0,\ln m]$. Its mean value was conjectured by Page (1993) as
\begin{equation*} \mathbb{E}_{f}\left[S\right]=\psi_{0}(mn+1)-\psi_{0}(n)-\frac{m+1}{2n}, \tag{5} \end{equation*}
where $\mathbb{E}_{f}[\cdot]$ denotes that the expectation is taken over the fixed-trace ensemble (2). Here, $\psi_{0}(\cdot)$ is the digamma function (Psi function) Luke (1969) and for a positive integer $l$,
\begin{equation*} \psi_{0}(l)=-\gamma+\sum_{k=1}^{l-1}\frac{1}{k}, \tag{6} \end{equation*}
where $\gamma\approx 0.5772$ is Euler’s constant. The mean value formula (5) was proved independently by Foong and Kanno (1994), Sánchez-Ruiz (1995), Sen (1996), and Adachi, Toda, and Kubotani (Adachi et al., 2009). For the orthogonal and symplectic fixed-trace ensembles, the mean formulas of the von Neumann entropy were derived by Kumar and Pandey (2011).
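As a quick sanity check of the mean formula (5), the following minimal sketch evaluates it exactly for integer dimensions. It relies on the fact that, by (6), the difference $\psi_{0}(mn+1)-\psi_{0}(n)$ is the finite sum $\sum_{k=n}^{mn}1/k$, so Euler's constant cancels. The function name `page_mean_rational` is an illustrative choice and does not come from the paper.

```python
from fractions import Fraction


def page_mean_rational(m: int, n: int) -> Fraction:
    """Mean entropy psi_0(mn+1) - psi_0(n) - (m+1)/(2n) as an exact rational.

    For positive integers, psi_0(mn+1) - psi_0(n) = sum_{k=n}^{mn} 1/k,
    so Euler's constant drops out of the formula.
    """
    if not (1 <= m <= n):
        raise ValueError("expected 1 <= m <= n")
    harmonic_part = sum((Fraction(1, k) for k in range(n, m * n + 1)), Fraction(0))
    return harmonic_part - Fraction(m + 1, 2 * n)


print(page_mean_rational(2, 2))  # 1/3
print(page_mean_rational(1, 5))  # 0
```

The case $m=1$ gives zero, as it should: a one-dimensional subsystem has a single eigenvalue $\lambda_{1}=1$ and hence no entropy.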
To gain more insights, one needs to know the fluctuation of the von Neumann entropy. In fact, its mean value turns out to be a poor representative that has led to an incorrect conclusion on the full distribution Page (1993). Recently, Vivo, Pato, and Oshanin conjectured (Vivo et al., 2016, eq. (57)), based on small $m$ and $n$ calculations from some complicated representations (Vivo et al., 2016, eqs. (54)–(56), (A3), (A9)), that the variance of the von Neumann entropy equals
\begin{equation*} \mathbb{V}_{f}\left[S\right]=-\psi_{1}\left(mn+1\right)+\frac{m+n}{mn+1}\psi_{1}\left(n\right)-\frac{(m+1)(m+2n+1)}{4n^{2}(mn+1)}, \tag{7} \end{equation*}
where $\psi_{1}(\cdot)$ is the trigamma function Luke (1969) (the digamma and trigamma functions are the polygamma functions of order zero and one, respectively) and for a positive integer $l$,
\begin{equation*} \psi_{1}(l)=\frac{\pi^{2}}{6}-\sum_{k=1}^{l-1}\frac{1}{k^{2}}. \tag{8} \end{equation*}
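To make the variance formula concrete, the sketch below evaluates it numerically and compares it with a small Monte Carlo estimate for $m=2$. The sampler is an illustration rather than the paper's method: it normalizes a $2\times n$ complex Gaussian matrix to unit trace and uses the closed-form eigenvalues of a $2\times2$ Hermitian matrix, so it needs only the standard library. All function names are hypothetical.

```python
import math
import random


def trigamma_int(l: int) -> float:
    """psi_1(l) for a positive integer l, via pi^2/6 - sum_{k<l} 1/k^2."""
    return math.pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, l))


def vpo_variance(m: int, n: int) -> float:
    """Variance formula of Vivo, Pato, and Oshanin for 1 <= m <= n."""
    mn = m * n
    return (-trigamma_int(mn + 1)
            + (m + n) / (mn + 1) * trigamma_int(n)
            - (m + 1) * (m + 2 * n + 1) / (4 * n ** 2 * (mn + 1)))


def sample_entropy_m2(n: int, rng: random.Random) -> float:
    """One entropy sample for m = 2 from a trace-normalized 2 x n Gaussian matrix."""
    s = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    row1 = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
    row2 = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
    a = sum(abs(z) ** 2 for z in row1)                      # (G G^dagger)_{11}
    d = sum(abs(z) ** 2 for z in row2)                      # (G G^dagger)_{22}
    b = sum(z * w.conjugate() for z, w in zip(row1, row2))  # (G G^dagger)_{12}
    trace = a + d
    half_gap = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    lams = [(trace / 2 + half_gap) / trace, (trace / 2 - half_gap) / trace]
    return -sum(x * math.log(x) for x in lams if x > 0)


def monte_carlo_variance_m2(n: int, samples: int = 20000, seed: int = 0) -> float:
    """Sample variance of the entropy for m = 2 and subsystem dimension n."""
    rng = random.Random(seed)
    vals = [sample_entropy_m2(n, rng) for _ in range(samples)]
    mean = sum(vals) / samples
    return sum((v - mean) ** 2 for v in vals) / (samples - 1)


print(vpo_variance(2, 2), monte_carlo_variance_m2(2))
```

For $m=1$ the formula evaluates to zero up to rounding, consistent with a subsystem that carries no entropy.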
In this paper, we show that the conjecture (7) of Vivo-Pato-Oshanin (VPO) is indeed correct. The presentation of the proof is organized as follows. In Sec. II, we relate the variance of the von Neumann entropy to that of an induced entropy over the Laguerre ensemble, which is calculated explicitly. In Sec. III, the derived induced variance is simplified to an expression involving digamma and trigamma functions, which leads to a proof of the conjecture. Most of the technical tools for the simplification are presented in the Appendix. Finally, we point out that even though the exact distribution of the von Neumann entropy is unknown, its asymptotic distribution was obtained via the Coulomb gas approach by Nadal, Majumdar, and Vergassola (Nadal et al., 2011).
II Variance of an Induced Entropy in Laguerre Ensemble
II.1 Variance Relation
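By the construction (1), the random coefficient matrix $\mathbf{X}$ has a natural relation with a Wishart matrix $\mathbf{G}\mathbf{G}^{\dagger}$ as
\begin{equation*} \mathbf{X}\mathbf{X}^{\dagger}=\frac{\mathbf{G}\mathbf{G}^{\dagger}}{\mathrm{tr}\left(\mathbf{G}\mathbf{G}^{\dagger}\right)}, \tag{9} \end{equation*}
where $\mathbf{G}$ is an $m\times n$ ($m\leq n$) matrix of independently and identically distributed complex Gaussian entries. The density of the ordered eigenvalues $\theta_{1}\geq\theta_{2}\geq\dots\geq\theta_{m}\geq 0$ of $\mathbf{G}\mathbf{G}^{\dagger}$ equals Forrester (2010)
\begin{equation*} g\left(\boldsymbol{\theta}\right)=\frac{1}{c}\prod_{1\leq i \lt j\leq m}\left(\theta_{i}-\theta_{j}\right)^{2}\prod_{i=1}^{m}\theta_{i}^{n-m}\mathrm{e}^{-\theta_{i}}, \tag{10} \end{equation*}
where $c$ is given by (3) and the above ensemble is known as the Laguerre ensemble. The trace of the Wishart matrix
\begin{equation*} r=\mathrm{tr}\left(\mathbf{G}\mathbf{G}^{\dagger}\right)=\sum_{i=1}^{m}\theta_{i} \tag{11} \end{equation*}
follows a gamma distribution with the density Vivo et al. (2016)
\begin{equation*} h_{mn}(r)=\frac{1}{\Gamma(mn)}\mathrm{e}^{-r}r^{mn-1},\qquad r\in[0,\infty). \tag{12} \end{equation*}
The relation (9) induces the change of variables
\begin{equation*} \lambda_{i}=\frac{\theta_{i}}{r},\qquad i=1,\dots,m, \tag{13} \end{equation*}
that leads to a well-known relation (see, e.g., Page (1993)) among the densities (2), (10), and (12) as
\begin{equation*} f\left(\boldsymbol{\lambda}\right)h_{mn}(r)\,\mathrm{d}r\prod_{i=1}^{m}\mathrm{d}\lambda_{i}=g\left(\boldsymbol{\theta}\right)\prod_{i=1}^{m}\mathrm{d}\theta_{i}. \tag{14} \end{equation*}
This relation implies that $r$ is independent of each $\lambda_{i}$, $i=1,\dots,m$, since their densities factorize.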
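The factorization of the trace and the normalized eigenvalues can be probed numerically. The sketch below assumes complex Gaussian entries of unit variance (so that each squared modulus is a unit-mean exponential and the trace is a gamma variable with shape $mn$), restricts to $m=2$ so that eigenvalues have a closed form, and estimates the mean and variance of the trace together with its sample correlation with the largest normalized eigenvalue. Zero correlation is only a necessary symptom of independence, not a proof of it. All names are illustrative.

```python
import math
import random


def sample_trace_and_top_eigenvalue(n: int, rng: random.Random):
    """Return (trace of G G^dagger, largest normalized eigenvalue) for a 2 x n matrix G."""
    s = math.sqrt(0.5)  # unit-variance complex Gaussian entries (assumption)
    row1 = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
    row2 = [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
    a = sum(abs(z) ** 2 for z in row1)
    d = sum(abs(z) ** 2 for z in row2)
    b = sum(z * w.conjugate() for z, w in zip(row1, row2))
    r = a + d
    top = 0.5 * r + math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    return r, top / r


def trace_statistics(n: int, samples: int = 20000, seed: int = 1):
    """Sample mean and variance of the trace, and its correlation with the top eigenvalue."""
    rng = random.Random(seed)
    pairs = [sample_trace_and_top_eigenvalue(n, rng) for _ in range(samples)]
    rs = [p[0] for p in pairs]
    ls = [p[1] for p in pairs]
    mean_r = sum(rs) / samples
    mean_l = sum(ls) / samples
    var_r = sum((x - mean_r) ** 2 for x in rs) / (samples - 1)
    var_l = sum((y - mean_l) ** 2 for y in ls) / (samples - 1)
    cov = sum((x - mean_r) * (y - mean_l) for x, y in zip(rs, ls)) / (samples - 1)
    return mean_r, var_r, cov / math.sqrt(var_r * var_l)


# For m = 2, n = 3 a gamma variable of shape mn = 6 has mean 6 and variance 6.
print(trace_statistics(3))
```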
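Page (1993) exploited the relation (14) by relating the first moment of the von Neumann entropy over the fixed-trace ensemble (2) to that of an induced entropy (for convenience of the discussion, we refer to the random variable $T$ as an induced entropy, although it may not have the physical meaning of an entropy)
\begin{equation*} T=\sum_{i=1}^{m}\theta_{i}\ln\theta_{i} \tag{15} \end{equation*}
over the Laguerre ensemble (10) as follows. First, by using the change of variables (13), one has
\begin{equation*} S=\ln r-r^{-1}T. \tag{16} \end{equation*}
Then, the expected value of $S$ is evaluated as
\begin{align*} \mathbb{E}_{f}\left[S\right] &= \int_{\boldsymbol{\lambda}}S\,f\left(\boldsymbol{\lambda}\right)\prod_{i=1}^{m}\mathrm{d}\lambda_{i} \\ &= \int_{0}^{\infty}\!\!\int_{\boldsymbol{\lambda}}\left(\ln r-r^{-1}T\right)f\left(\boldsymbol{\lambda}\right)h_{mn+1}(r)\,\mathrm{d}r\prod_{i=1}^{m}\mathrm{d}\lambda_{i} \tag{17} \\ &= \psi_{0}(mn+1)-\frac{1}{mn}\mathbb{E}_{g}\left[T\right], \tag{18} \end{align*}
where the expectation $\mathbb{E}_{g}[\cdot]$ is taken over the Laguerre ensemble (10). Here, (17) is obtained by the identity $\int_{0}^{\infty}h_{mn+1}(r)\,\mathrm{d}r=1$ and the fact that $r$ is independent of $\boldsymbol{\lambda}$, and (18) is established by the relation $h_{mn+1}(r)=r\,h_{mn}(r)/(mn)$, the change of measures (14), and the identity
\begin{equation*} \int_{0}^{\infty}\mathrm{e}^{-r}r^{a-1}\ln r\,\mathrm{d}r=\Gamma(a)\psi_{0}(a),\qquad \mathrm{Re}(a)>0. \tag{19} \end{equation*}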
Sánchez-Ruiz (1995) and Sen (1996) have calculated that
\begin{equation*} \mathbb{E}_{g}\left[T\right]=mn\,\psi_{0}(n)+\frac{1}{2}m(m+1), \tag{20} \end{equation*}
which, together with the relation (18), leads to their proofs of Page’s conjecture on the mean entropy (5).
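We now show that the idea of Page (1993) can be generalized to find a relation between the second moments (hence the variances, since the first moments are known) of $S$ and $T$, which is the starting point of our calculations. First, using the result (16) we have
\begin{align*} S^{2} &= \ln^{2}r-2r^{-1}T\ln r+r^{-2}T^{2} \tag{21} \\ &= -\ln^{2}r+2S\ln r+r^{-2}T^{2}. \tag{22} \end{align*}
The expression (22) is obtained by replacing only the first power of $T$ in (21) by the identity (16), that is, $r^{-1}T=\ln r-S$, and the reason for this replacement will become clear. The second moment of $S$ can now be written as
\begin{equation*} \mathbb{E}_{f}\left[S^{2}\right]=\int_{\boldsymbol{\lambda}}\left(-\ln^{2}r+2S\ln r+r^{-2}T^{2}\right)f\left(\boldsymbol{\lambda}\right)\prod_{i=1}^{m}\mathrm{d}\lambda_{i}. \tag{23} \end{equation*}
To utilize the independence between $r$ and $\boldsymbol{\lambda}$, we multiply (23) by the constant $1=\int_{0}^{\infty}h_{mn+2}(r)\,\mathrm{d}r$, which, with the fact that $h_{mn+2}(r)=r^{2}h_{mn}(r)/\left(mn(mn+1)\right)$, leads to
\begin{align*} \mathbb{E}_{f}\left[S^{2}\right] &= \int_{0}^{\infty}\!\!\int_{\boldsymbol{\lambda}}\left(-\ln^{2}r+2S\ln r+r^{-2}T^{2}\right)f\left(\boldsymbol{\lambda}\right)h_{mn+2}(r)\,\mathrm{d}r\prod_{i=1}^{m}\mathrm{d}\lambda_{i} \\ &= -\int_{0}^{\infty}\ln^{2}r\,h_{mn+2}(r)\,\mathrm{d}r+2\,\mathbb{E}_{f}\left[S\right]\int_{0}^{\infty}\ln r\,h_{mn+2}(r)\,\mathrm{d}r+\frac{1}{mn(mn+1)}\int_{0}^{\infty}\!\!\int_{\boldsymbol{\lambda}}T^{2}f\left(\boldsymbol{\lambda}\right)h_{mn}(r)\,\mathrm{d}r\prod_{i=1}^{m}\mathrm{d}\lambda_{i}. \end{align*}
From the second line of the above equation, we see that the replacement of the first power of $T$ by $S$ in (21) makes it possible to evaluate the integrals over $r$ and $\boldsymbol{\lambda}$ separately. Finally, using the change of measures (14) as well as the identities (19) and
\begin{equation*} \int_{0}^{\infty}\mathrm{e}^{-r}r^{a-1}\ln^{2}r\,\mathrm{d}r=\Gamma(a)\left(\psi_{0}^{2}(a)+\psi_{1}(a)\right),\qquad \mathrm{Re}(a)>0, \tag{24} \end{equation*}
we arrive at
\begin{equation*} \mathbb{E}_{f}\left[S^{2}\right]=-\psi_{0}^{2}(mn+2)-\psi_{1}(mn+2)+2\psi_{0}(mn+2)\,\mathbb{E}_{f}\left[S\right]+\frac{1}{mn(mn+1)}\mathbb{E}_{g}\left[T^{2}\right]. \tag{25} \end{equation*}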
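The derivation above rests on two standard logarithmic gamma integrals: the integral of $\mathrm{e}^{-r}r^{a-1}\ln r$ over $[0,\infty)$ equals $\Gamma(a)\psi_{0}(a)$, and with $\ln^{2}r$ in place of $\ln r$ it equals $\Gamma(a)\left(\psi_{0}^{2}(a)+\psi_{1}(a)\right)$. A minimal numerical check for integer $a$ is sketched below, using the substitution $r=\mathrm{e}^{t}$ and a plain trapezoid rule. The integration window and step size are ad hoc choices and the helper names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(l: int) -> float:
    """psi_0(l) for a positive integer l."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, l))


def trigamma_int(l: int) -> float:
    """psi_1(l) for a positive integer l."""
    return math.pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, l))


def log_moment(a: int, power: int, step: float = 1e-3,
               lo: float = -40.0, hi: float = 6.0) -> float:
    """Trapezoid estimate of int_0^inf e^{-r} r^{a-1} (ln r)^power dr.

    With r = e^t the integrand becomes exp(a*t - e^t) * t^power, which decays
    rapidly at both ends of the window [lo, hi].
    """
    count = int(round((hi - lo) / step))
    terms = []
    for i in range(count + 1):
        t = lo + i * step
        weight = 0.5 if i in (0, count) else 1.0
        terms.append(weight * math.exp(a * t - math.exp(t)) * t ** power)
    return step * math.fsum(terms)


a = 4
gamma_a = math.factorial(a - 1)
print(log_moment(a, 1), gamma_a * digamma_int(a))
print(log_moment(a, 2), gamma_a * (digamma_int(a) ** 2 + trigamma_int(a)))
```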
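Inserting the mean formula (5) and the VPO’s conjecture (7) into the definition $\mathbb{V}_{f}[S]=\mathbb{E}_{f}[S^{2}]-\mathbb{E}_{f}^{2}[S]$, and equating it to the derived relation (25), the VPO’s conjecture boils down to showing that the induced variance $\mathbb{V}_{g}[T]=\mathbb{E}_{g}[T^{2}]-\mathbb{E}_{g}^{2}[T]$ is given by
\begin{equation*} \mathbb{V}_{g}\left[T\right]=mn(m+n)\psi_{1}(n)+mn\,\psi_{0}^{2}(n)+m(m+2n+1)\psi_{0}(n)+\frac{1}{2}m(m+1), \tag{26} \end{equation*}
where we have used the identities (cf. (6) and (8))
\begin{align*} \psi_{0}(l+j) &= \psi_{0}(l)+\sum_{k=0}^{j-1}\frac{1}{l+k}, \tag{27a} \\ \psi_{1}(l+j) &= \psi_{1}(l)-\sum_{k=0}^{j-1}\frac{1}{(l+k)^{2}} \tag{27b} \end{align*}
for the case $l=mn+1$, $j=1$.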
We have so far converted the VPO’s conjecture (7) evaluated over the fixed-trace ensemble (2) to an equivalent conjecture (26) evaluated over the Laguerre ensemble (10). Instead of working directly with the complicated correlation functions of the fixed-trace ensemble as in Adachi et al. (2009); Kumar and Pandey (2011); Vivo et al. (2016), the induced variance over the well-investigated correlation functions of the Laguerre ensemble can be explicitly calculated as will be shown in Sec. II.2. The proposed ‘moments conversion’ approach may be generalized to study the higher moments of the von Neumann entropy as well as other entanglement measures such as the Tsallis entropy and the Rényi entropy.
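The equivalence between the two forms of the conjecture can be checked numerically. The sketch below assumes the induced target takes the form $\mathbb{V}_{g}[T]=mn(m+n)\psi_{1}(n)+mn\,\psi_{0}^{2}(n)+m(m+2n+1)\psi_{0}(n)+m(m+1)/2$, obtained here by direct algebra from the moment relation and not quoted from the paper. It maps this target back through $\mathbb{E}_{f}[S^{2}]=-\psi_{0}^{2}(mn+2)-\psi_{1}(mn+2)+2\psi_{0}(mn+2)\mathbb{E}_{f}[S]+\mathbb{E}_{g}[T^{2}]/(mn(mn+1))$ and compares the result with the conjectured variance. All function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329


def psi0(l: int) -> float:
    """Digamma function at a positive integer."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, l))


def psi1(l: int) -> float:
    """Trigamma function at a positive integer."""
    return math.pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, l))


def mean_entropy(m: int, n: int) -> float:
    return psi0(m * n + 1) - psi0(n) - (m + 1) / (2 * n)


def conjectured_variance(m: int, n: int) -> float:
    mn = m * n
    return (-psi1(mn + 1) + (m + n) / (mn + 1) * psi1(n)
            - (m + 1) * (m + 2 * n + 1) / (4 * n ** 2 * (mn + 1)))


def induced_variance_target(m: int, n: int) -> float:
    """Assumed target for the variance of T over the Laguerre ensemble."""
    p0, p1 = psi0(n), psi1(n)
    return (m * n * (m + n) * p1 + m * n * p0 ** 2
            + m * (m + 2 * n + 1) * p0 + m * (m + 1) / 2)


def variance_from_induced(m: int, n: int) -> float:
    """Map the induced target back to the entropy variance via the moment relation."""
    mn = m * n
    mean_t = mn * psi0(n) + m * (m + 1) / 2
    second_t = induced_variance_target(m, n) + mean_t ** 2
    mu = mean_entropy(m, n)
    second_s = (-psi0(mn + 2) ** 2 - psi1(mn + 2)
                + 2 * psi0(mn + 2) * mu + second_t / (mn * (mn + 1)))
    return second_s - mu ** 2


for m, n in [(1, 1), (2, 2), (2, 5), (4, 7)]:
    print(m, n, conjectured_variance(m, n), variance_from_induced(m, n))
```

The two columns agree to rounding error, which is a consistency check on the algebra and not a substitute for the proof that follows.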
II.2 Calculations of the Induced Variance
Since , the calculation of involves one and two arbitrary eigenvalue densities, denoted respectively by and , of the Laguerre ensemble as
[TABLE]
In general, the joint density of arbitrary eigenvalues is related to the -point correlation function
[TABLE]
as Forrester (2010) , where is the matrix determinant and the symmetric function is the correlation kernel. In particular, we have
[TABLE]
As a result, one can represent (II.2) as
[TABLE]
where
[TABLE]
and we have used the result (20) and the definition
[TABLE]
Before computing the integrals and , the following results on the correlation functions (29) are needed. The correlation kernel of the Laguerre ensemble can be explicitly written as Forrester (2010)
[TABLE]
where
[TABLE]
with
[TABLE]
being the (generalized) Laguerre polynomial of degree . The Laguerre polynomials satisfy the well-known orthogonality relation Forrester (2010)
[TABLE]
where is the Kronecker delta function. It is known that the one-point correlation function (cf. (29)) admits a more convenient representation as Sánchez-Ruiz (1995); Forrester (2010)
[TABLE]
We also need the following identity, due to Schrödinger Schrödinger (1926), that generalizes the integral (36) to
[TABLE]
By taking the first and second derivative on both sides of (II.2) with respect to , we obtain two more integral identities as shown in (II.2.1) (see also Sánchez-Ruiz (1995)) and (II.2.1), which are respectively denoted by and . With the above preparations, we now proceed to the calculations of in (31) and in (32).
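The Laguerre orthogonality relation used above can be verified exactly for integer parameters. The sketch below assumes the standard convention $L_{k}^{(\alpha)}(x)=\sum_{i=0}^{k}(-1)^{i}\binom{k+\alpha}{k-i}x^{i}/i!$ with norm $\Gamma(\alpha+k+1)/k!$, which may differ from the paper's normalization by constant factors. It also confirms that the density summed over the first $m$ orthonormal functions has first moment $mn$ when $\alpha=n-m$, matching the mean of the trace. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre_coeffs(k: int, alpha: int):
    """Coefficients c_i of L_k^{(alpha)}(x) = sum_i c_i x^i for integer alpha >= 0."""
    return [Fraction((-1) ** i * comb(k + alpha, k - i), factorial(i))
            for i in range(k + 1)]


def weighted_inner(p, q, alpha: int) -> Fraction:
    """Exact integral of x^alpha e^{-x} p(x) q(x) over [0, inf).

    Uses int_0^inf x^s e^{-x} dx = s! for nonnegative integers s.
    """
    return sum((pi * qj * factorial(alpha + i + j)
                for i, pi in enumerate(p) for j, qj in enumerate(q)), Fraction(0))


def check_orthogonality(max_degree: int, alpha: int) -> bool:
    """True if the first max_degree + 1 polynomials are orthogonal with the stated norm."""
    for k in range(max_degree + 1):
        for l in range(max_degree + 1):
            value = weighted_inner(laguerre_coeffs(k, alpha), laguerre_coeffs(l, alpha), alpha)
            expected = Fraction(factorial(alpha + k), factorial(k)) if k == l else Fraction(0)
            if value != expected:
                return False
    return True


def mean_trace(m: int, n: int) -> Fraction:
    """First moment of the one-point density built from the first m polynomials."""
    alpha = n - m
    total = Fraction(0)
    for k in range(m):
        p = laguerre_coeffs(k, alpha)
        x_times_p = [Fraction(0)] + p  # multiply the polynomial by x
        total += Fraction(factorial(k), factorial(k + alpha)) * weighted_inner(x_times_p, p, alpha)
    return total


print(check_orthogonality(5, 2))  # True
print(mean_trace(3, 5))           # 15
```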
II.2.1 Calculating
By the fact that (cf. (29))
[TABLE]
one inserts (37) into (31) to obtain
[TABLE]
where for convenience we have further defined (cf. (II.2.1))
[TABLE]
We now use (II.2.1), and the contribution to the sum
[TABLE]
consists of the cases when the binomial terms are zero () with the polygamma functions being infinity and are nonzero () with the polygamma functions being finite. Namely, we have
[TABLE]
[TABLE]
[TABLE]
which by interpreting the gamma and polygamma functions of negative integer arguments as the limit of
[TABLE]
leads to a well-defined limit
[TABLE]
In the same manner that has led to , we obtain
[TABLE]
Finally, we insert (43), (II.2.1), (II.2.1) into (40) and simplify the expression by rearranging the sums as well as using (27) to obtain
[TABLE]
II.2.2 Calculating
Inserting (33) into (32) and using the symmetry of the correlation kernel, the integral can be represented as
[TABLE]
where we have further defined (cf. (II.2.1))
[TABLE]
The identity (II.2.1) gives
[TABLE]
where provides the nonzero contribution to the sum and we have used (27a) for the simplification. In the same manner, one obtains
[TABLE]
and the cases are computed to be
[TABLE]
Inserting (52), (53), and (54) into (50), we arrive at
[TABLE]
III Simplification of Summations
The remaining task is to simplify the sums appearing in and to polygamma functions. This is a straightforward but tedious task, for which we need several finite sum identities listed in the Appendix, where some remarks on these identities are also provided. Though in (49) and in (55) are valid for any positive integers and with , as will be seen it is convenient to assume in the following simplification. For this reason, we will first simplify and in the case . The remaining special cases will be considered at the end of this section.
For ease of presentation, we cite the identities used in each step on top of the equality symbol. The argument of each of the resulting polygamma functions is shifted to one of the following , , , , with the help of (27). In addition, simplification by combining like terms is also performed in each step without being explicitly mentioned. We start with in (49), where by using partial fraction decomposition the first sum is simplified as
[TABLE]
Similarly, the second sum in (49) is simplified as
[TABLE]
Inserting (56) and (57) into (49), is simplified to
[TABLE]
[TABLE]
We now simplify in (55), where the first two sums are
[TABLE]
The remaining double sums in need some preprocessing before sums of the types in the Appendix appear. Specifically, by shifting the inner sum , changing the summation order, and using partial fraction decomposition, we have
[TABLE]
where and collect terms involving and , respectively, as (the terms involving cancel)
[TABLE]
The sums in are further simplified as
[TABLE]
The sums in are further simplified as
[TABLE]
where we also changed the summation order between and to arrive at the last equality, and , , are
[TABLE]
With and being simplified as in (66) and (67), respectively, we now insert (62) and (63) into (55) to obtain
[TABLE]
[TABLE]
We observe that in (58) and in (71) share many common terms, where by inserting (58) and (71) into (30) the remaining terms of the induced variance are
[TABLE]
where we have used the results
[TABLE]
obtained by comparing (59)–(61) to (72)–(74). This completes the proof of the induced conjecture (26) in the case and hence the VPO’s conjecture (7) for the same case.
Since , the remaining cases to be shown are , , and , where in (49) and in (55) can be directly computed. We list the simplified expressions for , , and the induced variance in Table 1. Each of the special cases is proven by comparing the expression of in Table 1 with that of the corresponding induced conjecture (26). This completes the proof of the VPO’s conjecture (7).
Acknowledgements.
The author wishes to thank Michael Milgram, Gregory Schehr, and Yu Xiang for the inspiring discussion.
Appendix A Finite Sum Identities Useful in Section III
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Some Remarks on the Identities in the Appendix
The formulas of finite sums of polygamma functions of the types (75)–(83) are straightforward to show. The proofs essentially involve changing the order of the sums and making use of the lower order sums already obtained in a recursive manner. In particular, the formulas (75)–(78) are available in (Brychkov, 2008, ch. 5.1). The formulas (79)–(83) can be read off from the expressions in (Spieß, 1990, p. 861) by keeping in mind the difference between polygamma functions (6), (8) and harmonic numbers.
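To illustrate the order-swapping argument mentioned above, consider the elementary identity $\sum_{k=1}^{n}\psi_{0}(k)=n\psi_{0}(n)-n+1$. It is chosen here as a representative of this family of sums and is not claimed to be one of the numbered formulas. Writing $\psi_{0}(k)=-\gamma+H_{k-1}$ with $H_{j}$ the $j$-th harmonic number, the constant $\gamma$ cancels and the identity reduces to $\sum_{k=1}^{n}H_{k-1}=nH_{n-1}-(n-1)$, which follows from exchanging the two sums. A minimal exact check:

```python
from fractions import Fraction


def harmonic(j: int) -> Fraction:
    """j-th harmonic number H_j, with H_0 = 0."""
    return sum((Fraction(1, k) for k in range(1, j + 1)), Fraction(0))


def sum_direct(n: int) -> Fraction:
    """sum_{k=1}^{n} H_{k-1}, evaluated as written."""
    return sum((harmonic(k - 1) for k in range(1, n + 1)), Fraction(0))


def sum_swapped(n: int) -> Fraction:
    """Same double sum after exchanging the order: each 1/j appears n - j times."""
    return sum((Fraction(n - j, j) for j in range(1, n)), Fraction(0))


def closed_form(n: int) -> Fraction:
    """n * H_{n-1} - (n - 1)."""
    return n * harmonic(n - 1) - (n - 1)


for n in (1, 2, 3, 10):
    print(n, sum_direct(n), sum_swapped(n), closed_form(n))
```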
The last three formulas (84)–(86) play a crucial role in the simplification in Sec. III as they connect some of the sums in (49) and (55) to polygamma functions. The first of them, (84), is known as the Chu-Vandermonde identity (Luke, 1969, p. 99). The next formula (85) can be established as follows. First, the identity (27a) implies that
[TABLE]
By using the definition of the digamma function (6), changing the order of sums, and invoking the Chu-Vandermonde identity (84), the first term in (87) is represented as
[TABLE]
Similarly, we have
[TABLE]
Inserting (88) and (89) into (87), we obtain a recurrence relation of the sum (85) as
[TABLE]
where we denote
[TABLE]
Finally, by iterating times the relation (90), we arrive at
[TABLE]
where we have used the fact that . Note that the formula (85) can be also obtained via its connection to a hypergeometric function of unit argument as (Luke, 1969, p. 111)
[TABLE]
To prove the last formula (86), we first observe from (27b) that
[TABLE]
Following the same idea that has led to (90), we also obtain a recurrence relation in this case as
[TABLE]
where we denote
[TABLE]
Iterating times the relation (94), we arrive at
[TABLE]
where by using the identity (Milgram, 2004, eq. (23))
[TABLE]
we obtain the claimed formula (86). Though the expression (86) still contains a sum of digamma functions that may not be further simplified, it is sufficient for the simplification purpose. As shown in Sec. III, the terms involving this remaining sum cancel each other. Finally, we note that as a result of the relation to the hypergeometric function
[TABLE]
the formula (86) implies a byproduct that generalizes a result of Luke (Luke, 1969, p. 111) as
[TABLE]
which may be of independent interest.
- Page (1993): D. N. Page, Phys. Rev. Lett. 71, 1291 (1993).
- Majumdar: S. N. Majumdar, in The Oxford Handbook of Random Matrix Theory, edited by G. Akemann, J. Baik, and P. Di Francesco, Chap. 37.
- Luke (1969): Y. L. Luke, The Special Functions and Their Approximations, Vol. 1 (Academic Press, New York, 1969).
- Foong and Kanno (1994): S. K. Foong and S. Kanno, Phys. Rev. Lett. 72, 1148 (1994).
- Sánchez-Ruiz (1995): J. Sánchez-Ruiz, Phys. Rev. E 52, 5653 (1995).
- Sen (1996): S. Sen, Phys. Rev. Lett. 77, 1 (1996).
- Adachi et al. (2009): S. Adachi, M. Toda, and H. Kubotani, Ann. Phys. 324, 2278 (2009).
- Kumar and Pandey (2011): S. Kumar and A. Pandey, J. Phys. A: Math. Theor. 44, 445301 (2011).
