Analysing causal structures using Tsallis entropies

V. Vilasini; Roger Colbeck

arXiv:1907.02551·quant-ph·February 9, 2021

TL;DR

This paper explores the use of Tsallis entropies to analyze causal structures, revealing new constraints and mathematical properties, but also highlighting significant computational challenges in distinguishing classical and quantum correlations.

Contribution

It introduces Tsallis entropy-based constraints for causal structures and generalizes known Shannon constraints, advancing understanding of classical and quantum correlations.

Findings

1. Derived Tsallis entropy constraints for causal structures.

2. Identified computational limitations of the entropy vector method.

3. Discovered new mathematical properties of Tsallis entropies.

Tables (1)

Table 1: Values of $d^{i}$ for the chained Bell and magic square correlations embedded in the triangle causal structure. The values of $N$ correspond to the number of inputs per party in the chained Bell inequality, which always has two outputs per party (the $N=2$ case corresponds to Fritz’s distribution Fritz2012 ). When embedded in the triangle, the numbers of outcomes of the observed nodes are $(d_{X},d_{Y},d_{Z})=(2N,2N,N^{2})$. The last column of the table gives the minimum of the observed node dimensions $(d_{X},d_{Y},d_{Z})$ for each $N$, which is simply $2N$. For the magic square, the dimensions $(d_{X},d_{Y},d_{Z})$ are $(12,12,9)$. In all cases, the minimum value of $d^{i}$ such that Inequalities (13a)–(13c) with bounds $B_{i}(q,d_{o}=d^{i},d_{u}=2)$ are not violated for any $q\geq 1$ is less than the minimum observed dimension $d_{o}^{\min}$, and hence no violations of (13a)–(13c) could be found for the relevant case with $d_{o}=d_{o}^{\min}$.

Scenario	$d^{1}$ for Ineq. (13a)	$d^{2}$ for Ineq. (13b)	$d^{3}$ for Ineq. (13c)	Smallest observed dim. $d_{o}^{\min}$
$N = 2$	2	2	2	4
$N = 3$	3	2	3	6
$N = 4$	4	2	4	8
$N = 5$	5	2	5	10
$N = 6$	6	2	6	12
$N = 7$	7	2	7	14
$N = 8$	8	2	8	16
$N = 9$	9	3	9	18
$N = 10$	10	3	10	20
Magic Sq.	4	2	4	9

Equations (146)

$$p_{X_{1},\ldots,X_{n}}=\prod_{i=1}^{n}p_{X_{i}|X_{i}^{\downarrow_{1}}},$$

$$I(X_{i}:X_{i}^{\nuparrow}|X_{i}^{\downarrow_{1}})=0\quad\forall i\in\{1,\ldots,n\}.$$

$$S_{q}(X)=\begin{cases}-\sum_{\{x:p_{x}>0\}}p_{x}^{q}\ln_{q}p_{x}&\text{if }q\neq 1\\ H(X)&\text{if }q=1\end{cases}$$

$$S_{q}(X|Y):=\begin{cases}-\sum_{x,y}p_{xy}^{q}\ln_{q}p_{x|y}&\text{if }q\neq 1\\ H(X|Y)&\text{if }q=1\end{cases}$$

$$I_{q}(X:Y)=S_{q}(X)-S_{q}(X|Y),\qquad I_{q}(X:Y|Z)=S_{q}(X|Z)-S_{q}(X|YZ).$$

$$S_{q}(XY)=S_{q}(X)+S_{q}(Y)+(1-q)S_{q}(X)S_{q}(Y).$$

$$S_{q}(X)\leq S_{q}(XY).$$

$$S_{q}(XYZ)+S_{q}(Z)\leq S_{q}(XZ)+S_{q}(YZ).$$

$$S_{q}(X_{1},X_{2},\ldots,X_{n}|Y)=\sum_{i=1}^{n}S_{q}(X_{i}|X_{i-1},\ldots,X_{1},Y).$$

$$\begin{aligned}I_{q}(X:Y)&=S_{q}(X)+S_{q}(Y)-S_{q}(XY),\\ I_{q}(X:Y|Z)&=S_{q}(XZ)+S_{q}(YZ)-S_{q}(Z)-S_{q}(XYZ).\end{aligned}$$

$$I_{q}(X:Y)\leq f(q,d_{X},d_{Y}),$$

$$f(q,d_{X},d_{Y})=\frac{1}{q-1}\left(1-\frac{1}{d_{X}^{q-1}}\right)\left(1-\frac{1}{d_{Y}^{q-1}}\right)=(q-1)\ln_{q}d_{X}\ln_{q}d_{Y}.$$

$$I_{q}(X:Y)=S_{q}(X)+S_{q}(Y)-S_{q}(XY)=(q-1)S_{q}(X)S_{q}(Y)\leq\frac{\left(1-\frac{1}{d_{X}^{q-1}}\right)\left(1-\frac{1}{d_{Y}^{q-1}}\right)}{q-1}=f(q,d_{X},d_{Y}).$$

$$I_{q}(X:Y|Z)\leq f(q,d_{X},d_{Y}).$$

$$\begin{aligned}I_{q}(X:Y|Z)&=\frac{1}{q-1}\sum_{z}p^{q}(z)\Big[\sum_{xy}p^{q}(xy|z)+1-\sum_{x}p^{q}(x|z)-\sum_{y}p^{q}(y|z)\Big]\\ &=\sum_{z}p^{q}(z)\,I_{q}(X:Y)_{p_{XY|Z=z}}.\end{aligned}$$

$$\begin{aligned}\max_{p_{XYZ}=p_{Z}p_{X|Z}p_{Y|Z}}I_{q}(X:Y|Z)&\leq\max_{p_{Z}}\sum_{z}p_{z}^{q}\max_{p_{X|Z}p_{Y|Z}}I_{q}(X:Y)_{p_{XY|Z=z}}\\ &=\max_{p_{Z}}\sum_{z}p_{z}^{q}f(q,d_{X},d_{Y})=f(q,d_{X},d_{Y}).\end{aligned}$$

$$\max_{\substack{p_{XYZ}\\ p_{XY|Z}=p_{X|Z}p_{Y|Z}}}I_{q}(X:Y|Z)=\max_{\substack{p_{XY}\\ p_{XY}=p_{X}p_{Y}}}I_{q}(X:Y).$$

$$I_{q}(X:Y|Z)\leq f(q,d_{X},d_{Y}),$$

$$-H(X)-H(Y)-H(Z)+H(XY)+H(XZ)\geq 0,$$

$$-5H(X)-5H(Y)-5H(Z)+4H(XY)+4H(XZ)+4H(YZ)-2H(XYZ)\geq 0,$$

$$-3H(X)-3H(Y)-3H(Z)+2H(XY)+2H(XZ)+3H(YZ)-H(XYZ)\geq 0.$$

$$-S_{q}(X)-S_{q}(Y)-S_{q}(Z)+S_{q}(XY)+S_{q}(XZ)\geq B_{1}(q,d_{o},d_{u}),$$

$$\begin{aligned}-5S_{q}(X)-5S_{q}(Y)-5S_{q}(Z)+4S_{q}(XY)+4S_{q}(XZ)+4S_{q}(YZ)-2S_{q}(XYZ)\\ \geq B_{2}(q,d_{o},d_{u}):=\max\big(B_{21}(q,d_{o},d_{u}),B_{22}(q,d_{o},d_{u})\big),\end{aligned}$$

$$-3S_{q}(X)-3S_{q}(Y)-3S_{q}(Z)+2S_{q}(XY)+2S_{q}(XZ)+3S_{q}(YZ)-S_{q}(XYZ)\geq B_{3}(q,d_{o},d_{u}),$$

$$B_{1}(q,d_{o},d_{u})=-\frac{1}{q-1}\Big(1-d_{o}^{1-q}\Big)\Big(2-d_{o}^{1-q}-d_{u}^{1-q}\Big),$$

$$\begin{aligned}B_{21}(q,d_{o},d_{u})&=-\frac{1}{q-1}\Big(11+d_{u}^{3-3q}+6d_{o}^{2-2q}+3d_{o}^{1-q}d_{u}^{1-q}-6d_{u}^{1-q}-15d_{o}^{1-q}\Big),\\ B_{22}(q,d_{o},d_{u})&=-\frac{1}{q-1}\Big(10+d_{o}^{1-q}d_{u}^{3-3q}+5d_{o}^{2-2q}+2d_{o}^{1-q}d_{u}^{1-q}-5d_{u}^{1-q}-13d_{o}^{1-q}\Big),\end{aligned}$$

$$B_{3}(q,d_{o},d_{u})=-\frac{1}{q-1}\Big(6+d_{o}^{1-q}d_{u}^{2-2q}+3d_{o}^{2-2q}+d_{o}^{1-q}d_{u}^{1-q}-3d_{u}^{1-q}-8d_{o}^{1-q}\Big).$$

$$\begin{aligned}B_{1}^{*}(q,d_{o})&=B_{1}(q,d_{o},d_{o}^{3}-d_{o})\\ &=-\frac{1}{q-1}\Big(2+d_{o}^{2-2q}-3d_{o}^{1-q}+d_{o}(-d_{o}+d_{o}^{3})^{-q}-d_{o}^{3}(-d_{o}+d_{o}^{3})^{-q}-d_{o}^{2-q}(-d_{o}+d_{o}^{3})^{-q}+d_{o}^{4-q}(-d_{o}+d_{o}^{3})^{-q}\Big)\end{aligned}$$

$$\begin{aligned}B_{21}^{*}(q,d_{o})&=B_{21}(q,d_{o},d_{o}^{3}-d_{o})\\ &=-\frac{1}{q-1}\Big(11+6d_{o}^{2-2q}-15d_{o}^{1-q}+(-d_{o}+d_{o}^{3})^{3-3q}-6(-d_{o}+d_{o}^{3})^{1-q}+3d_{o}^{1-q}(-d_{o}+d_{o}^{3})^{1-q}\Big),\\ B_{22}^{*}(q,d_{o})&=B_{22}(q,d_{o},d_{o}^{3}-d_{o})\\ &=-\frac{1}{q-1}\Big(10+5d_{o}^{2-2q}-13d_{o}^{1-q}+d_{o}^{1-q}(-d_{o}+d_{o}^{3})^{3-3q}-5(-d_{o}+d_{o}^{3})^{1-q}+2d_{o}^{1-q}(-d_{o}+d_{o}^{3})^{1-q}\Big),\end{aligned}$$

Full text

Analysing causal structures using Tsallis entropies

V. Vilasini

Department of Mathematics, University of York, Heslington, York YO10 5DD.

Roger Colbeck

Department of Mathematics, University of York, Heslington, York YO10 5DD.

Abstract

Understanding cause-effect relationships is a crucial part of the scientific process. As Bell’s theorem shows, within a given causal structure, classical and quantum physics impose different constraints on the correlations that are realisable, a fundamental feature that has technological applications. However, in general it is difficult to distinguish the set of classical and quantum correlations within a causal structure. Here we investigate a method to do this based on using entropy vectors for Tsallis entropies. We derive constraints on the Tsallis entropies that are implied by (conditional) independence between classical random variables and apply these to causal structures. We find that the number of independent constraints needed to characterise the causal structure is prohibitively high such that the computations required for the standard entropy vector method cannot be employed even for small causal structures. Instead, without solving the whole problem, we find new Tsallis entropic constraints for the triangle causal structure by generalising known Shannon constraints. Our results reveal new mathematical properties of classical and quantum Tsallis entropies and highlight difficulties of using Tsallis entropies for analysing causal structures.

I Introduction

Cause-effect relationships between physical systems constrain the correlations that can arise between them. The study of causality allows us to explain observed correlations between different variables in terms of unobserved systems that cause these variables to become correlated. This has found applications in diverse fields of research such as medical testing, socio-economic surveys and physics. The foundational interest in causal structures stems from the fact that the theory that describes the unobserved systems affects the set of possible correlations over the observed variables. Bell inequalities Bell are constraints on the observed correlations in a classical causal structure (Figure 1a) and can be violated in quantum and generalised probabilistic theories (GPTs). The possibility of such violations leads to applications in device-independent cryptography Ekert91 ; MayersYao ; BHK ; RogerThesis ; CK ; Pironio2009 .

In the bipartite Bell causal structure (Figure 1a), the set of all joint conditional distributions $P_{XY|AB}$ over the observed nodes $X,Y,A,B$ that can arise when $\Lambda$ is classical is relatively well understood. For fixed input and output sizes, it forms a convex polytope and hence membership can be checked using a linear program (although the size of the linear program scales exponentially with the number of inputs and the problem is NP-complete Pitowski89 ). Because of this, the complete set of Bell inequalities characterizing these polytopes is unknown for $|X|,|Y|>3$ or $|A|,|B|>5$ Masanes2002 ; Bancal2010 ; Cope2019 .

In causal structures with more unobserved common causes (such as the triangle causal structure of Figure 1b), the set of compatible correlations is not well understood. The inflation technique wolfe2016 can in principle certify whether or not a given distribution belongs to the classical marginal entropy cone [Footnote 1: The set of possible entropy vectors over the observed nodes of the classical causal structure.] of a causal structure Navascues2017 . However, the method does not tell us how to construct a suitable inflation of the causal structure in order to achieve this, or how large this inflation needs to be. Thus, in general, using the inflation technique becomes intractable in practice. The difficulty of solving the general problem stems in part from its non-convexity. One approach to overcoming this is to analyse the problem in entropy space Yeung97 . This has proven to be useful in a number of cases (see, e.g., Fritz13 ; Chaves13 , or Weilenmann20170483 for a detailed review), since the problem is convex in entropy space and the entropic inequalities characterising the relevant sets are independent of the number of measurement outcomes. However, it was shown in Weilenmann16 that the entropy vector method with Shannon entropies cannot detect the classical-quantum gap for line-like causal structures. [Footnote 2: Note that this result holds for the entropic characterisation without post-selection. Using the post-selection technique (see e.g., Weilenmann20170483 for an explanation), one can derive quantum-violatable Shannon entropic inequalities even for line-like causal structures BraunsteinCaves88 ; however, this technique is not generalizable to causal structures that have no parentless observed nodes, such as in Figure 1b.] Further, even though new Shannon entropic inequalities have been derived using this method, no quantum violations of these have been found for a range of causal structures where non-classical correlations are known to exist Weilenmann2018 ; Chaves2014 . Due to these limitations of Shannon entropies, it is natural to ask whether other entropic quantities could do better.

Here we consider Tsallis entropies in the entropy vector method for analysing causal structures. One motivation for considering such entropies for the task is that they are a family with an additional (real) parameter. The set of entropies for all possible values of this parameter conveys more information about the underlying probability distribution than a single member of the family and hence the ability to vary a parameter may give advantages for analysing causal structures. Tsallis entropies appear to be a good candidate since they satisfy monotonicity, submodularity and the chain rule, which are desirable properties for their use in the entropy vector method. [Footnote 3: Other examples of more general entropy measures such as the Rényi entropy renyi do not satisfy one or more of these properties, making it more difficult to get entropic constraints on them using the entropy vector method.] Tsallis entropies have been considered in the context of causal structures before Wajs15 , where they were shown to give an advantage over Shannon entropy in detecting the non-classicality of certain states in the Bell scenario if one also post-selects on the values of observed parentless nodes. [Footnote 4: Note that non-classicality cannot be detected entropically in the Bell causal structure (Figure 1a) without post-selection Weilenmann16 .] Here we consider a systematic treatment that can be applied to an arbitrary causal structure in the absence of post-selection. (Note that use of post-selection is not possible in causal structures with no observed parentless nodes such as the triangle of Figure 1b.)

In Section IV, we derive the constraints on the classical Tsallis entropies that are implied by a given causal structure and in Appendix A, we generalise this result to quantum Tsallis entropies for certain cases. In Section IV.2, we use these constraints in the entropy vector method with Tsallis entropies but find that the computational procedure becomes too time consuming even for simple causal structures such as the bipartite Bell scenario. Despite this limitation, we derive new Tsallis entropic inequalities for the triangle causal structure in Section V, using known Shannon entropic inequalities of Chaves2014 and our Tsallis constraints of Section IV. In Section VI, we discuss the reasons for the computational difficulty of this method, the drawbacks of using Tsallis entropies for analysing causal structure and identify potential future directions.

II Shannon entropy and the entropy vector method

Given a random variable $X$ distributed according to the discrete probability distribution $p_{X}$ [Footnote 5: We will only be considering random variables defined on a finite set in this paper.], the Shannon entropy of $X$ is given by $H(X)=-\sum_{x}p_{X}(x)\ln p_{X}(x)$. [Footnote 6: Note that it is common to take logarithms in base 2 and measure entropy in bits; here we use base $e$ corresponding to measuring entropy in nats.] Given two random variables $X$ and $Y$, distributed according to $p_{XY}$, the conditional Shannon entropy is defined by $H(X|Y)=-\sum_{x,y}p_{XY}(xy)\ln\frac{p_{XY}(xy)}{p_{Y}(y)}$ and the Shannon mutual information by $I(X:Y)=H(X)-H(X|Y)$. For three random variables $X$, $Y$ and $Z$, we can also define the mutual information between $X$ and $Y$ conditioned on $Z$, $I(X:Y|Z)=H(X|Z)-H(X|YZ)$.
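As a minimal sketch of these definitions, the following evaluates $H(X)$, $H(X|Y)$ and $I(X:Y)$ in nats directly from a joint distribution stored as a dictionary. The function names and the two example distributions are illustrative assumptions, not taken from the paper.

```python
import math

def marginals(p_xy):
    """Return the marginals p_X and p_Y of a joint pmf {(x, y): prob}."""
    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def shannon_entropy(p):
    """H = -sum p ln p (nats); zero-probability outcomes are skipped."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)

def conditional_entropy(p_xy):
    """H(X|Y) = -sum_{x,y} p(x,y) ln( p(x,y) / p(y) )."""
    _, p_y = marginals(p_xy)
    return -sum(p * math.log(p / p_y[y]) for (x, y), p in p_xy.items() if p > 0)

def mutual_information(p_xy):
    """I(X:Y) = H(X) - H(X|Y)."""
    p_x, _ = marginals(p_xy)
    return shannon_entropy(p_x) - conditional_entropy(p_xy)

# Hypothetical examples: two perfectly correlated bits and two independent bits.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(mutual_information(correlated))   # close to ln 2
print(mutual_information(independent))  # close to 0
```

For the correlated pair the mutual information equals $\ln 2$ (one bit expressed in nats), while for the independent pair it vanishes up to floating-point error.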

We will sometimes use the shorthands $p_{x}=p_{X}(x)=p(X=x)$ and $p_{x|y}:=p_{X|Y}(X=x|Y=y)$ etc. for probability distributions.

We next provide a short overview of the entropy vector method that suffices for the purposes of this paper. For a more detailed overview of the method, see Weilenmann20170483 . Consider a joint distribution $p_{X_{1},\ldots,X_{n}}$ over $n$ random variables $X_{1},X_{2},\ldots,X_{n}$. With each such distribution, we associate a vector with $2^{n}-1$ components, each of which corresponds to the entropy of an element of the powerset of $\{X_{1},X_{2},\ldots,X_{n}\}$ (excluding the empty set). This defines the entropy vector of $p_{X_{1},\ldots,X_{n}}$. Note that this vector encodes the conditional entropies and mutual informations via the relations $H(X|Y)=H(XY)-H(Y)$, $I(X:Y)=H(X)+H(Y)-H(XY)$ and $I(X:Y|Z)=H(XZ)+H(YZ)-H(XYZ)-H(Z)$. We use $\mathbf{H}$ to denote the map that takes a probability distribution over $n$ variables to its entropy vector (with $2^{n}-1$ components) and $\Gamma_{n}^{*}$ to denote the set of all vectors that are entropy vectors of a probability distribution $p_{X_{1},\ldots,X_{n}}$, i.e., $\Gamma_{n}^{*}=\{v\in\mathbb{R}^{2^{n}-1}:\exists p_{X_{1},\ldots,X_{n}}\text{ s.t. }v=\mathbf{H}(p_{X_{1},\ldots,X_{n}})\}$. The closure of $\Gamma_{n}^{*}$, denoted by $\overline{\Gamma_{n}^{*}}$, is known to be a convex set for any $n$ Zhang1997ANC .
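The map $\mathbf{H}$ can be sketched in a few lines. The code below is an illustrative assumption about data layout (a joint distribution as a dictionary from outcome tuples to probabilities, and the entropy vector as a dictionary keyed by index subsets), not the authors' implementation.

```python
import math
from itertools import combinations, product

def entropy_vector(joint, n):
    """Map a joint pmf over n variables to {index subset: Shannon entropy in nats}."""
    vec = {}
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            marg = {}
            for outcome, p in joint.items():
                key = tuple(outcome[i] for i in subset)
                marg[key] = marg.get(key, 0.0) + p
            vec[subset] = -sum(p * math.log(p) for p in marg.values() if p > 0)
    return vec

# Hypothetical example: X0 and X1 are independent uniform bits, X2 = X0 XOR X1.
joint = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
H = entropy_vector(joint, 3)

# Mutual informations are linear in the entropy-vector components.
mi = H[(0,)] + H[(1,)] - H[(0, 1)]                     # I(X0 : X1)
cmi = H[(0, 2)] + H[(1, 2)] - H[(0, 1, 2)] - H[(2,)]   # I(X0 : X1 | X2)
print(len(H), mi, cmi)
```

The vector has $2^{3}-1=7$ components. In this example $X_0$ and $X_1$ are independent, so $I(X_0:X_1)=0$, but conditioning on their XOR makes them fully correlated, giving $I(X_0:X_1|X_2)=\ln 2$.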

II.1 The Shannon cone

Valid entropy vectors necessarily satisfy certain constraints. These include positivity of the entropies, monotonicity (i.e., $H(R)\leq H(RS)$ ) and submodularity (also known as strong-subadditivity; $H(RT)+H(ST)\geq H(RST)+H(T)$ ). Monotonicity and submodularity are equivalent to the positivity of the conditional entropy $H(S|R)$ and the conditional mutual information $I(R:S|T)$ respectively and hold for any three disjoint subsets $R$, $S$ and $T$ of $\{X_{1},\ldots,X_{n}\}$. These linear constraints are together known as the Shannon constraints, and the set of vectors $u\in\mathbb{R}^{2^{n}-1}$ obeying all the Shannon constraints forms the convex cone known as the Shannon cone, $\Gamma_{n}$. Other than positivity (which, following standard practice, we include implicitly), there are a total of $n+n(n-1)2^{n-3}$ independent Shannon constraints for $n$ variables Yeung97 . By definition, the Shannon cone is an outer approximation to $\overline{\Gamma_{n}^{*}}$, i.e., $\overline{\Gamma_{n}^{*}}\subseteq\Gamma_{n}$. [Footnote 7: For $n\leq 3$, the cones coincide, but for $n\geq 4$ they do not Zhang1997ANC .] Hence all entropy vectors derived from a probability distribution $p_{X_{1},\ldots,X_{n}}$ obey the Shannon constraints but not all vectors $u\in\mathbb{R}^{2^{n}-1}$ obeying the Shannon constraints are such that $\mathbf{H}(p_{X_{1},\ldots,X_{n}})=u$ for some joint distribution $p_{X_{1},\ldots,X_{n}}$.
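The count $n+n(n-1)2^{n-3}$ can be checked by enumeration. The sketch below assumes the standard elemental form of the Shannon constraints ($H(X_i|\text{all others})\geq 0$ for each $i$, and $I(X_i:X_j|X_K)\geq 0$ for each pair $i<j$ and each subset $K$ of the remaining variables); the function name and tuple labels are illustrative.

```python
from itertools import combinations

def elemental_shannon_constraints(n):
    """List labels for the elemental Shannon inequalities on n variables."""
    variables = range(n)
    constraints = []
    # n monotonicity constraints: H(X_i | all other variables) >= 0
    for i in variables:
        constraints.append(("H(i|rest)>=0", i))
    # submodularity constraints: I(X_i : X_j | X_K) >= 0 with K a subset of the rest
    for i, j in combinations(variables, 2):
        rest = [k for k in variables if k not in (i, j)]
        for size in range(len(rest) + 1):
            for K in combinations(rest, size):
                constraints.append(("I(i:j|K)>=0", i, j, K))
    return constraints

for n in range(2, 7):
    # n(n-1)2^(n-3) written with integer arithmetic so that n = 2 is handled
    expected = n + n * (n - 1) * 2 ** (n - 2) // 2
    print(n, len(elemental_shannon_constraints(n)), expected)
```

The number of constraints grows exponentially with $n$, which is one reason projections of the Shannon cone become expensive as causal structures gain nodes.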

In the next subsection we discuss how causal structures give additional entropic constraints.

II.2 Entropy vectors and causal structure

A causal structure can be represented as a Directed Acyclic Graph (DAG) over several nodes, some of which are labelled observed and some unobserved. Each observed node corresponds to a classical random variable [Footnote 8: These may represent inputs or outputs of an experiment.], while for each unobserved node there is an associated system whose nature depends on the theory being considered. A causal structure is called classical (denoted $\mathcal{G}^{\mathrm{C}}$ ), quantum (denoted $\mathcal{G}^{\mathrm{Q}}$ ) or GPT (denoted $\mathcal{G}^{\mathrm{GPT}}$ ) depending on the nature of the unobserved nodes. In the following, we briefly review the framework of classical causal models Pearl2000 .

A distribution $p_{X_{1},\ldots,X_{n}}$ over $n$ random variables $\{X_{1},\ldots,X_{n}\}$ is said to be compatible with a classical causal structure $\mathcal{G}^{\mathrm{C}}$ (with these variables as nodes) if it satisfies the causal Markov condition i.e., the joint distribution decomposes as

$$p_{X_{1},\ldots,X_{n}}=\prod_{i=1}^{n}p_{X_{i}|X_{i}^{\downarrow_{1}}},\tag{1}$$

where $X_{i}^{\downarrow_{1}}$ denotes the set of all parent nodes of the node $X_{i}$ in the DAG $\mathcal{G}^{\mathrm{C}}$ . The Markov condition of Equation (1) is equivalent to the conditional independence of $X_{i}$ from its non-descendants, denoted $X_{i}^{\nuparrow}$ given its parents $X_{i}^{\downarrow_{1}}$ in $\mathcal{G}$ i.e., $\forall i\in\{1,\ldots,n\}$ , $p_{X_{i}X_{i}^{\nuparrow}|X_{i}^{\downarrow_{1}}}=p_{X_{i}|X_{i}^{\downarrow_{1}}}p_{X_{i}^{\nuparrow}|X_{i}^{\downarrow_{1}}}$ Pearl2000 . All other conditional independences between different subsets of nodes are implied by these $n$ constraints and can be derived from these constraints and standard probability calculus based on Bayes’ rule. The concept of d-separation developed by Geiger Geiger1987 and Verma and Pearl Verma2013 provides a method to read off implied conditional independence relations from the graph. In other words, for arbitrary disjoint subsets $X$ , $Y$ and $Z$ of the nodes, it can be used to determine whether $X$ and $Y$ are conditionally independent given $Z$ .

Definition 1 (Blocked paths).

Let $\mathcal{G}$ be a DAG in which $X$ and $Y\neq X$ are nodes and $Z$ be a set of nodes not containing $X$ or $Y$ . A path from $X$ to $Y$ is said to be blocked by $Z$ if it contains either $A\rightarrow W\rightarrow B$ with $W\in Z$ , $A\leftarrow W\rightarrow B$ with $W\in Z$ or $A\rightarrow W\leftarrow B$ such that neither $W$ nor any descendant of $W$ belongs to $Z$ .

Definition 2 (d-separation).

Let $\mathcal{G}$ be a DAG in which $X$ , $Y$ and $Z$ are disjoint sets of nodes. $X$ and $Y$ are d-separated by $Z$ in $\mathcal{G}$ if every path from a variable in $X$ to a variable in $Y$ is blocked by $Z$ .
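Definitions 1 and 2 can be turned into a short test. Rather than enumerating paths, the sketch below uses the equivalent moralised ancestral graph criterion: restrict to the ancestors of $X\cup Y\cup Z$, connect parents that share a child, drop edge directions, delete $Z$, and check whether $X$ can still reach $Y$. The dictionary-of-parents representation and the example graphs are assumptions made for illustration.

```python
from itertools import combinations

def d_separated(parents, xs, ys, zs):
    """True if node sets xs and ys are d-separated by zs.

    parents: dict mapping a node to the set of its parents (missing key = no parents).
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    # 1. Ancestral closure of xs, ys and zs.
    relevant, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralise: link each child to its parents and co-parents to each other.
    adj = {v: set() for v in relevant}
    for child in relevant:
        pa = list(parents.get(child, ()))
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(pa, 2):
            adj[a].add(b)
            adj[b].add(a)
    # 3. Undirected search from xs that never enters zs.
    seen, stack = set(), list(xs)
    while stack:
        node = stack.pop()
        if node in seen or node in zs:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return not (seen & ys)

# Hypothetical graphs: a common cause Z -> X, Z -> Y and a collider X -> W <- Y -> ... D.
common_cause = {"X": {"Z"}, "Y": {"Z"}}
collider = {"W": {"X", "Y"}, "D": {"W"}}

print(d_separated(common_cause, {"X"}, {"Y"}, {"Z"}))  # conditioning on Z blocks the fork
print(d_separated(collider, {"X"}, {"Y"}, set()))      # an unconditioned collider blocks
print(d_separated(collider, {"X"}, {"Y"}, {"D"}))      # a conditioned descendant unblocks
```

The third call illustrates the last clause of Definition 1: conditioning on a descendant of a collider opens the path, so $X$ and $Y$ are no longer d-separated.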

The importance of d-separation is that, given a causal structure $\mathcal{G}$, $X$ and $Y$ are d-separated by $Z$ in $\mathcal{G}$ if and only if $I(X:Y|Z)=0$ for all distributions compatible with $\mathcal{G}$ Pearl2000 . [Footnote 9: Note that $I(X:Y|Z)=0$ is equivalent to $P_{XY|Z}=P_{X|Z}P_{Y|Z}$.] The complete set of d-separation conditions gives all the conditional independence relations implied by the DAG. In the case of Shannon entropy for a DAG with $n$ nodes these are all implied by the $n$ constraints

$$I(X_{i}:X_{i}^{\nuparrow}|X_{i}^{\downarrow_{1}})=0\quad\forall i\in\{1,\ldots,n\}.\tag{2}$$

In other words, a distribution over $n$ variables satisfies Equation (1) if and only if it satisfies Equation (2).
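A small numerical check illustrates this equivalence for the common-cause structure $X\leftarrow Z\rightarrow Y$. The conditional probability tables below are hypothetical values chosen only for illustration; the joint distribution is built from the Markov factorisation $p_{Z}\,p_{X|Z}\,p_{Y|Z}$ and the Shannon conditional mutual information is then evaluated from marginal entropies.

```python
import math
from itertools import product

# Hypothetical tables for the common-cause structure X <- Z -> Y.
p_z = {0: 0.3, 1: 0.7}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_y_given_z = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.4, 1: 0.6}}

# Markov factorisation; outcomes are ordered (x, y, z).
joint = {
    (x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
    for x, y, z in product((0, 1), repeat=3)
}

def H(joint, keep):
    """Shannon entropy (nats) of the variables at the positions in `keep`."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log(p) for p in marg.values() if p > 0)

cmi = H(joint, (0, 2)) + H(joint, (1, 2)) - H(joint, (0, 1, 2)) - H(joint, (2,))
mi = H(joint, (0,)) + H(joint, (1,)) - H(joint, (0, 1))
print(cmi)  # I(X:Y|Z), zero up to rounding
print(mi)   # I(X:Y), strictly positive because Z correlates X and Y
```

The conditional mutual information vanishes, as the single causal constraint for this structure requires, while the unconditional mutual information does not.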

Since we wish to contrast classical and quantum versions of causal structures we also define the latter. For the purpose of this work, it is sufficient to do so for causal structures with at most two generations and in which the first generation can be either observed classical random variables or unobserved quantum nodes, while those of the second generation are only observed classical variables (in Appendix A we also look at a case in which the second generation can be quantum). Each edge emanating from an unobserved node has an associated Hilbert space labelled by the parent and the child. For example, an edge from an unobserved node $X$ to an observed node $Y$ has the associated Hilbert space $\mathcal{H}_{X_{Y}}$. Each unobserved quantum node corresponds to a density operator on the tensor product of the Hilbert spaces corresponding to all the edges emanating from that node. For each observed node, there is a POVM that acts on the tensor product of the Hilbert spaces associated with the edges that meet at that node. The distributions over observed nodes compatible with the quantum causal structure $\mathcal{G}^{Q}$ are those that can be obtained by performing the specified POVMs (possibly specified by classical input nodes in the first generation) on the relevant quantum states and using the Born rule. For instance, a distribution $P_{ABXY}$ is compatible with the quantum analogue of Figure 1a if there exists a quantum state $\rho\in\mathcal{H}_{\Lambda_{X}}\otimes\mathcal{H}_{\Lambda_{Y}}$ and POVMs $\{E^{a}_{x}\}_{x}$ and $\{F^{b}_{y}\}_{y}$ acting on $\mathcal{H}_{\Lambda_{X}}$ and $\mathcal{H}_{\Lambda_{Y}}$ respectively, such that $P_{ABXY}(a,b,x,y)=P_{A}(a)P_{B}(b)\operatorname{Tr}(\rho(E^{a}_{x}\otimes F^{b}_{y}))$ for all values of the random variables.

Now, in the case of classical causal structures with unobserved nodes, the compatibility condition requires that there exists a joint distribution $p_{X_{1},\ldots,X_{n}}$ over the $n$ variables satisfying the causal Markov condition and having the correct marginals over the observed nodes. In quantum and more general theories, the existence of a joint state over all the nodes is not guaranteed because there may be sets of systems that do not coexist. (For example, there is no joint quantum state of a system and the outcome of a measurement on it.) Because classical information can be copied, such joint distributions always exist in the classical case. The entropy vector method aims to exploit this difference to certify the non-classicality of correlations.

The entropic constraints over all the nodes will in general imply constraints on the entropy vector over the observed nodes. These can be obtained by Fourier-Motzkin elimination Williams1986 . The procedure takes the entropy cone over all nodes, that is constrained by the $n+n(n-1)2^{n-3}$ Shannon constraints and the $n$ causal constraints (Equation (2)) and projects it onto the entropy cone of the observed nodes (eliminating all combinations of entropies involving unobserved nodes). Since non-classical causal structures do not satisfy the initial assumption of the existence of the joint distribution/entropies, they may give rise to correlations that do not satisfy the marginal constraints on the observed nodes obtained through this procedure. A violation of one of the inequalities certifies the non-classicality of that causal structure.
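To make the projection step concrete, here is a toy Fourier-Motzkin elimination for homogeneous inequalities of the form $\sum_i a_i x_i\geq 0$. The example is an assumption chosen for brevity: one observed variable $X$ and one unobserved variable $\Lambda$ with no causal constraints, so that the coordinates are $(H(X),H(\Lambda),H(X\Lambda))$ and the input rows are the three Shannon constraints. The sketch does not remove redundant inequalities, which practical implementations need to do.

```python
def fm_eliminate(rows, k):
    """Eliminate coordinate k from a system of inequalities sum_i a[i]*x[i] >= 0.

    rows: list of integer coefficient lists. Returns the projected system with
    column k removed (redundant rows are not pruned in this sketch).
    """
    pos = [r for r in rows if r[k] > 0]
    neg = [r for r in rows if r[k] < 0]
    kept = [r for r in rows if r[k] == 0]
    for p in pos:
        for n in neg:
            # Positive combination (-n[k])*p + p[k]*n has zero coefficient on x_k.
            kept.append([-n[k] * a + p[k] * b for a, b in zip(p, n)])
    return [r[:k] + r[k + 1:] for r in kept]

# Coordinates: (H(X), H(L), H(XL)), with L standing for the unobserved node.
shannon_rows = [
    [0, -1, 1],   # H(XL) - H(L) >= 0, i.e. H(X|L) >= 0
    [-1, 0, 1],   # H(XL) - H(X) >= 0, i.e. H(L|X) >= 0
    [1, 1, -1],   # H(X) + H(L) - H(XL) >= 0, i.e. I(X:L) >= 0
]

after_joint = fm_eliminate(shannon_rows, 2)   # remove H(XL)
observed_only = fm_eliminate(after_joint, 1)  # remove H(L)
print(after_joint)    # [[1, 0], [0, 1]]
print(observed_only)  # [[1]]
```

Eliminating every entropy that involves the unobserved node leaves only $H(X)\geq 0$ on the observed coordinate. Each elimination can multiply the number of inequalities by pairing every positive row with every negative row, which is the source of the method's cost on larger systems.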

For line-like causal structures (of which the bipartite Bell causal structure of Figure 1a is an instance), the classical and quantum Shannon entropy cones coincide and Shannon entropic inequalities cannot certify the non-classicality of these causal structures even though they support non-classical correlations Weilenmann16 . Further, in other scenarios such as the triangle which is also known to support non-classical correlations Fritz2012 , known Shannon entropic inequalities such as those of Chaves2014 ; Weilenmann2018 have no known quantum violations. The main question of the current work is whether using Tsallis entropies can provide tighter, quantum violatable entropic inequalities and avoid these limitations.

III Tsallis entropies

For a classical random variable $X$ distributed according to the discrete probability distribution $p_{X}$ , the order $q$ Tsallis entropy of $X$ for real parameter $q$ is defined as Tsallis1988

$$S_{q}(X)=\begin{cases}-\sum_{\{x:p_{x}>0\}}p_{x}^{q}\ln_{q}p_{x}&\text{if }q\neq 1\\ H(X)&\text{if }q=1\end{cases}\tag{3}$$

where we have used the short-hand $\ln_{q}p_{x}=\frac{p_{x}^{1-q}-1}{1-q}$. This $q$-logarithm function converges to the natural logarithm in the limit $q\rightarrow 1$ so that $\lim\limits_{q\rightarrow 1}S_{q}(X)=H(X)$ and the function is continuous in $q$. For brevity, we will henceforth write $\sum_{x}$ instead of $\sum_{\{x:p_{x}>0\}}$, keeping it implicit that probability zero events do not contribute to the sum. [Footnote 10: Note that this means the Tsallis entropy for $q<0$ is not robust in the sense that small changes in the probability distribution can lead to large changes in the Tsallis entropy.]
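The definition can be evaluated numerically as follows. This is a minimal sketch with illustrative function names; the case $q=1$ is handled by falling back to the natural logarithm, matching the stated limit.

```python
import math

def ln_q(x, q):
    """q-logarithm: (x^(1-q) - 1) / (1 - q), with the natural log at q = 1."""
    if q == 1:
        return math.log(x)
    return (x ** (1 - q) - 1) / (1 - q)

def tsallis_entropy(probs, q):
    """S_q = -sum_x p_x^q ln_q p_x over outcomes with p_x > 0."""
    return -sum(p ** q * ln_q(p, q) for p in probs if p > 0)

uniform_bit = [0.5, 0.5]
biased = [0.7, 0.2, 0.1, 0.0]   # the zero-probability outcome is ignored

print(tsallis_entropy(uniform_bit, 2))      # 1 - sum p^2 = 0.5
print(tsallis_entropy(biased, 1))           # Shannon entropy in nats
print(tsallis_entropy(biased, 1.000001))    # close to the Shannon value
```

For $q=2$ the formula reduces to $1-\sum_{x}p_{x}^{2}$, and evaluating at $q$ slightly above 1 reproduces the Shannon entropy to within numerical error, illustrating continuity in $q$.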

The conditional Tsallis entropy Furuichi04 is defined by

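$$S_{q}(X|Y):=\begin{cases}-\sum_{x,y}p_{xy}^{q}\ln_{q}p_{x|y}&\text{if }q\neq 1\\ H(X|Y)&\text{if }q=1\end{cases}\tag{4}$$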

and converges to the Shannon conditional entropy $H(X|Y)$ in the limit $q\rightarrow 1$ . Note that there are other ways to define the conditional Tsallis entropy ABE2001157 but they do not satisfy the chain rule (Equation (9)) and hence will not be considered here.

The unconditional and conditional Tsallis mutual informations are defined analogously to the Shannon case

$$I_{q}(X:Y)=S_{q}(X)-S_{q}(X|Y),\qquad I_{q}(X:Y|Z)=S_{q}(X|Z)-S_{q}(X|YZ).\tag{5}$$

III.1 Properties of Tsallis entropies

Tsallis entropies satisfy a number of properties that are desirable for their use in the entropy vector method. For any joint distribution over the random variables involved the following properties hold.

1. Pseudo-additivity Curado1991 : For two independent random variables $X$ and $Y$, i.e., $p_{XY}=p_{X}p_{Y}$, and for all $q$, the Tsallis entropies satisfy

$$S_{q}(XY)=S_{q}(X)+S_{q}(Y)+(1-q)S_{q}(X)S_{q}(Y).\tag{6}$$

Note that in the Shannon case ( $q=1$ ), we recover additivity for independent random variables.

2. Upper bound Furuichirelentropy : For $q\geq 0$ we have $S_{q}(X)\leq\ln_{q}d_{X}$. For $q>0$ equality is achieved if and only if $P_{X}(x)=1/d_{X}$ for all $x$ (i.e., if the distribution on $X$ is uniform).

3. Monotonicity Daroczy1970 : For all $q$,

$$S_{q}(X)\leq S_{q}(XY).\tag{7}$$

4. Strong subadditivity Furuichi04 : For $q\geq 1$,

$$S_{q}(XYZ)+S_{q}(Z)\leq S_{q}(XZ)+S_{q}(YZ).\tag{8}$$

5. Chain rule Furuichi04 : For all $q$,

$$S_{q}(X_{1},X_{2},\ldots,X_{n}|Y)=\sum_{i=1}^{n}S_{q}(X_{i}|X_{i-1},\ldots,X_{1},Y).\tag{9}$$

The chain rules $S_{q}(XY)=S_{q}(X)+S_{q}(Y|X)$ and $S_{q}(XY|Z)=S_{q}(X|Z)+S_{q}(Y|XZ)$ emerge as particular cases and allow the Tsallis mutual informations of Equation (5) to be written as

$$I_{q}(X:Y)=S_{q}(X)+S_{q}(Y)-S_{q}(XY),\qquad I_{q}(X:Y|Z)=S_{q}(XZ)+S_{q}(YZ)-S_{q}(Z)-S_{q}(XYZ).$$

Using the chain rule, the monotonicity and strong subadditivity relations (Equations (7) and (8)) are equivalent to the non-negativity of the unconditional and conditional Tsallis mutual informations. For $q<1$ , strong subadditivity does not hold in general Furuichi04 , hence we often restrict to the case $q\geq 1$ in what follows.
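The properties listed in this subsection can be spot-checked numerically. The sketch below is illustrative only and is not the authors' code. It assumes monotonicity means $S_{q}(X)\leq S_{q}(XY)$ and strong subadditivity means $S_{q}(XYZ)+S_{q}(Z)\leq S_{q}(XZ)+S_{q}(YZ)$, it uses the equivalent closed form $S_{q}=(1-\sum_{x}p_{x}^{q})/(q-1)$ for $q\neq 1$, and all function names are hypothetical.

```python
import itertools
import random


def tsallis(probs, q):
    """S_q for q != 1 via the closed form (1 - sum_x p_x^q)/(q - 1)."""
    return (1 - sum(p ** q for p in probs if p > 0)) / (q - 1)


def entropy_of(p, keep, q):
    """Tsallis entropy of the marginal of {tuple: prob} on positions `keep`."""
    marg = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + prob
    return tsallis(marg.values(), q)


def random_joint(n_vars, rng):
    """Random joint distribution over n_vars binary variables."""
    outcomes = list(itertools.product((0, 1), repeat=n_vars))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


def properties_hold(q, trials=200, seed=0, tol=1e-12):
    """Check monotonicity, strong subadditivity and pseudo-additivity on samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = random_joint(3, rng)  # positions 0, 1, 2 play the roles of X, Y, Z
        S = lambda keep: entropy_of(p, keep, q)
        monotone = S((0,)) <= S((0, 1)) + tol
        ssa = S((0, 1, 2)) + S((2,)) <= S((0, 2)) + S((1, 2)) + tol
        # Product distribution p_X p_Y built from the marginals of p.
        px = [sum(v for o, v in p.items() if o[0] == x) for x in (0, 1)]
        py = [sum(v for o, v in p.items() if o[1] == y) for y in (0, 1)]
        sx, sy = tsallis(px, q), tsallis(py, q)
        sxy = tsallis([a * b for a in px for b in py], q)
        pseudo = abs(sxy - (sx + sy + (1 - q) * sx * sy)) < 1e-9
        if not (monotone and ssa and pseudo):
            return False
    return True
```

A random search of this kind cannot prove the properties; it only guards against transcription errors when implementing them. The check is meant for $q>1$, the regime used in the rest of the paper.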

IV Causal constraints and Tsallis entropy vectors

In Section III.1, we discussed some of the general properties of Tsallis entropy that hold irrespective of the underlying causal structure over the variables. The causal structure imposes the causal Markov constraints on the joint probability distribution over the variables involved (Section II.2) and we wish to translate these probabilistic constraints into Tsallis entropic ones in order to use Tsallis entropies in the entropy vector method for analysing causal structures.

A first observation is that Tsallis entropy vectors do not in general satisfy the causal constraints (Equation (2)) satisfied by their Shannon counterparts. For a concrete counterexample, consider the simple, three variable causal structure where $Z$ is a common cause of $X$ and $Y$, and where there are no other causal relations. In terms of Shannon entropies, the only causal constraint in this case is $I(X:Y|Z)=0$. Taking $X$, $Y$ and $Z$ to be binary variables with possible values $0$ and $1$, the distribution with $p_{xyz}=1/4$ $\forall x\in X,y\in Y$ if $z=0$ and $p_{xyz}=0$ otherwise satisfies $p_{xy|z}=p_{x|z}p_{y|z}$ $\forall x\in X,y\in Y$ and $z\in Z$ but has a $q=2$ Tsallis conditional mutual information of $I_{2}(X:Y|Z)=\frac{1}{4}$. Hence when using Tsallis entropies (and conditional Tsallis entropy as defined in Section III), the causal constraint cannot be simply encoded by $I_{q}(X:Y|Z)=0$ for $q>1$.
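The counterexample can be reproduced with exact rational arithmetic. This is a small sketch under the assumption that the conditional mutual information is expanded as $I_{q}(X:Y|Z)=S_{q}(XZ)+S_{q}(YZ)-S_{q}(Z)-S_{q}(XYZ)$ and that $S_{2}=1-\sum_{x}p_{x}^{2}$; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def marginal(p, keep):
    """Marginal of a joint {tuple: prob} on the tuple positions in `keep`."""
    marg = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, Fraction(0)) + prob
    return marg


def S2(p, keep):
    """Order-2 Tsallis entropy 1 - sum p^2 of the marginal on `keep`."""
    return 1 - sum(v ** 2 for v in marginal(p, keep).values())


# Variables are ordered (X, Y, Z); all of the weight sits on z = 0.
p = {(x, y, z): (Fraction(1, 4) if z == 0 else Fraction(0))
     for x, y, z in product((0, 1), repeat=3)}

# Conditional independence, written as p_xyz * p_z == p_xz * p_yz.
pz, pxz, pyz = marginal(p, (2,)), marginal(p, (0, 2)), marginal(p, (1, 2))
cond_indep = all(p[(x, y, z)] * pz[(z,)] == pxz[(x, z)] * pyz[(y, z)]
                 for (x, y, z) in p)

# I_2(X:Y|Z) from joint entropies.
i2 = S2(p, (0, 2)) + S2(p, (1, 2)) - S2(p, (2,)) - S2(p, (0, 1, 2))
```

Here `cond_indep` is true while `i2` evaluates to $1/4$, matching the value stated in the text.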

Given this observation, it is natural to ask whether there are constraints for Tsallis entropies implied by the causal Markov condition (Equation (1)). We answer this question with the following Theorems.

Theorem 1.

If a joint probability distribution $p_{XY}$ over random variables $X$ and $Y$ with alphabet sizes $d_{X}$ and $d_{Y}$ factorises as $p_{XY}=p_{X}p_{Y}$ , then for all $q\in[0,\infty)$ , the Tsallis mutual information $I_{q}(X:Y)$ is upper bounded by

$$I_{q}(X:Y)\leq f(q,d_{X},d_{Y}),$$

where the function $f(q,d_{X},d_{Y})$ is given by

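$$f(q,d_{X},d_{Y})=(q-1)\ln_{q}d_{X}\,\ln_{q}d_{Y}=\frac{1}{q-1}\left(1-\frac{1}{d_{X}^{q-1}}\right)\left(1-\frac{1}{d_{Y}^{q-1}}\right).$$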

For $q\in(0,\infty)\setminus\{1\}$ , the bound is saturated if and only if $p_{XY}$ is the uniform distribution over $X$ and $Y$ .

Proof.

The proof follows from the pseudo-additivity of Tsallis entropies (Property 1) and the upper bound (Property 2). Using these, for all $q\geq 0$ and for all product distributions $p_{XY}=p_{X}p_{Y}$ , we have

[TABLE]

Whenever $q\in(0,\infty)\setminus\{1\}$ , the bound is saturated if and only if $p_{XY}$ is uniform over $X$ and $Y$ since, for these values of $q$ , $S_{q}(X)$ and $S_{q}(Y)$ both attain their maximum values if and only if this is the case. ∎
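As a numerical illustration of Theorem 1, the sketch below samples product distributions and compares $I_{q}(X:Y)=S_{q}(X)+S_{q}(Y)-S_{q}(XY)$ against the bound. It is not taken from the paper: it assumes $f(q,d_{X},d_{Y})=(q-1)\ln_{q}d_{X}\ln_{q}d_{Y}$, which is what pseudo-additivity and the upper bound yield, it is restricted to $q>1$, and the function names are hypothetical.

```python
import random


def ln_q(x, q):
    """q-logarithm for q != 1."""
    return (x ** (1 - q) - 1) / (1 - q)


def tsallis(probs, q):
    """S_q for q != 1 via the closed form (1 - sum_x p_x^q)/(q - 1)."""
    return (1 - sum(p ** q for p in probs if p > 0)) / (q - 1)


def f_bound(q, dx, dy):
    """Assumed form of the bound: (q - 1) ln_q(dx) ln_q(dy)."""
    return (q - 1) * ln_q(dx, q) * ln_q(dy, q)


def mutual_info_product(px, py, q):
    """I_q(X:Y) = S_q(X) + S_q(Y) - S_q(XY) for the product p_X p_Y."""
    joint = [a * b for a in px for b in py]
    return tsallis(px, q) + tsallis(py, q) - tsallis(joint, q)


def random_dist(d, rng):
    """Random probability vector of length d."""
    w = [rng.random() for _ in range(d)]
    s = sum(w)
    return [x / s for x in w]


rng = random.Random(0)
q, dx, dy = 2, 3, 4
sampled = [mutual_info_product(random_dist(dx, rng), random_dist(dy, rng), q)
           for _ in range(1000)]
uniform_value = mutual_info_product([1 / dx] * dx, [1 / dy] * dy, q)
```

In this sketch every sampled value stays below `f_bound(q, dx, dy)`, and the uniform product distribution reaches it, consistent with the saturation condition in the theorem.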

Theorem 2.

If a joint probability distribution $p_{XYZ}$ satisfies the conditional independence $p_{XY|Z}=p_{X|Z}p_{Y|Z}$ , then for all $q\geq 1$ the Tsallis conditional mutual information $I_{q}(X:Y|Z)$ is upper bounded by

$$I_{q}(X:Y|Z)\leq f(q,d_{X},d_{Y}).$$

For $q>1$, the bound is saturated only by distributions in which for some fixed value $k$ the joint probabilities are given by $p_{xyz}=\begin{cases}\frac{1}{d_{X}d_{Y}}\quad\text{if}\quad z=k\\ 0\quad\qquad\text{otherwise}\end{cases}$ for all $x$, $y$ and $z$. [Footnote 11: These distributions have deterministic $Z$ and there is one such distribution for each value that $Z$ can take.]

Proof.

Writing out $I_{q}(X:Y|Z)$ in terms of probabilities we have

[TABLE]

Using this and Theorem 1, we can bound $I_{q}(X:Y|Z)$ as

[TABLE]

The last step holds because for all $q>1$ , $\sum_{z}p_{z}^{q}$ is maximized by deterministic distributions over $Z$ with a maximum value of $1$ i.e., only distributions $p_{XYZ}$ that are deterministic over $Z$ saturate the upper bound of $f(q,d_{X},d_{Y})$ . This completes the proof. ∎
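The structure of this proof can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $I_{q}(X:Y|Z)=S_{q}(XZ)+S_{q}(YZ)-S_{q}(Z)-S_{q}(XYZ)$, compares it with the weighted sum $\sum_{z}p_{z}^{q}\,I_{q}(X:Y)_{p_{XY|Z=z}}$ that follows from the conditional entropy of Section III, and then compares both with $f(q,d_{X},d_{Y})$ in the assumed closed form. All names are hypothetical.

```python
import random


def tsallis(probs, q):
    """S_q for q != 1 via the closed form (1 - sum_x p_x^q)/(q - 1)."""
    return (1 - sum(p ** q for p in probs if p > 0)) / (q - 1)


def f_bound(q, dx, dy):
    """Assumed closed form of f(q, d_X, d_Y) for q != 1."""
    return (1 - dx ** (1 - q)) * (1 - dy ** (1 - q)) / (q - 1)


def sample(dx, dy, dz, rng):
    """Random factors p_z, p_{x|z}, p_{y|z} of a conditionally independent joint."""
    def rand(d):
        w = [rng.random() for _ in range(d)]
        s = sum(w)
        return [v / s for v in w]
    return rand(dz), [rand(dx) for _ in range(dz)], [rand(dy) for _ in range(dz)]


def cond_mutual_info(pz, px_z, py_z, q):
    """I_q(X:Y|Z) from joint entropies of p_z p_{x|z} p_{y|z}."""
    zs = range(len(pz))
    pxz = [pz[z] * a for z in zs for a in px_z[z]]
    pyz = [pz[z] * b for z in zs for b in py_z[z]]
    pxyz = [pz[z] * a * b for z in zs for a in px_z[z] for b in py_z[z]]
    return tsallis(pxz, q) + tsallis(pyz, q) - tsallis(pz, q) - tsallis(pxyz, q)


def decomposition(pz, px_z, py_z, q):
    """sum_z p_z^q I_q(X:Y) evaluated on each conditional product distribution."""
    total = 0.0
    for z, w in enumerate(pz):
        joint = [a * b for a in px_z[z] for b in py_z[z]]
        total += w ** q * (tsallis(px_z[z], q) + tsallis(py_z[z], q)
                           - tsallis(joint, q))
    return total
```

Since each term of the sum is bounded by Theorem 1 and $\sum_{z}p_{z}^{q}\leq 1$ for $q\geq 1$, the total cannot exceed $f(q,d_{X},d_{Y})$. A deterministic $Z$ with uniform conditionals, such as `([1.0, 0.0], [[0.5, 0.5]] * 2, [[0.5, 0.5]] * 2)` at $q=2$, reaches the bound.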

Two corollaries of Theorem 2 naturally follow.

Corollary 3.

Let $X$ , $Y$ and $Z$ be random variables with fixed alphabet sizes. Then for all $q\geq 1$ we have

$$\max_{p_{XYZ}\,:\,p_{XY|Z}=p_{X|Z}p_{Y|Z}}I_{q}(X:Y|Z)=\max_{p_{XY}\,:\,p_{XY}=p_{X}p_{Y}}I_{q}(X:Y).$$

Furthermore, for $q>1$ , the maximum on the left hand side is achieved only by distributions in which for some fixed value $k$ the joint probabilities are given by $p_{xyz}=\begin{cases}\frac{1}{d_{X}d_{Y}}\quad\text{if}\quad z=k\\ 0\quad\qquad\text{otherwise}\end{cases}$ , while the maximum on the right hand side occurs if and only if $P_{XY}$ is the uniform distribution.

The significance of these new relations for causal structures is then given by the following corollary.

Corollary 4.

Let $p_{X_{1}\ldots X_{n}}$ be a distribution compatible with the classical causal structure $\mathcal{G}^{\mathrm{C}}$ and $X$ , $Y$ and $Z$ be disjoint subsets of $\{X_{1},\ldots,X_{n}\}$ such that $X$ and $Y$ are d-separated given $Z$ . Then for all $q\geq 1$ we have

$$I_{q}(X:Y|Z)\leq f(q,d_{X},d_{Y}),$$

where $d_{X}$ is the product of $d_{X_{i}}$ for all $X_{i}\in X$ , and likewise for $d_{Y}$ .

Remark 1.

The results of this section can be generalised to the quantum case under certain assumptions, i.e., as constraints on quantum Tsallis entropies implied by certain quantum causal structures (see Appendix A for details). Note that only constraints on the classical Tsallis entropy vectors derived in this section are required to detect the classical-quantum gap. Hence, Appendix A is not pertinent to the main results of this paper but can be seen as additional results regarding the properties of quantum Tsallis entropies.

IV.1 Number of independent Tsallis entropic causal constraints

We saw previously that in the Shannon case ($q=1$), the $n$ conditions of the form $I(X_{i}:X_{i}^{\nuparrow}|X_{i}^{\downarrow_{1}})=0$ ($i=1,\ldots,n$) imply all the independence relations that follow from the causal structure. In the Tsallis case, however, the $n$ conditions of the form $I_{q}(X_{i}:X_{i}^{\nuparrow}|X_{i}^{\downarrow_{1}})\leq f(q,d_{X_{i}},d_{X_{i}^{\nuparrow}})$ do not do the same. In the bipartite Bell and triangle causal structures we find that there is no redundancy amongst the 53 and 126 distinct Tsallis entropic inequalities, respectively, that are implied by the d-separation relations in the corresponding DAGs in the case where the dimension (cardinality) of each individual node is taken to be $d$. In more detail, we used linear programming to show that each implication of d-separation yields a non-trivial entropic causal constraint for all $q>1$ and $d>2$ for the bipartite Bell and triangle causal structures. By comparison, in these causal structures five and six independent Shannon entropic constraints, respectively, imply all the others. As an illustration of the difference, in the Shannon case, $I(A:BC)=0$ implies $I(A:B)=I(A:C)=0$, whereas the analogous implication does not hold in the Tsallis case in general: although $I_{q}(A:BC)\leq f(q,d_{A},d_{BC})$ implies $I_{q}(A:B)\leq f(q,d_{A},d_{BC})$, it is not the case that $I_{q}(A:BC)\leq f(q,d_{A},d_{BC})$ implies $I_{q}(A:B)\leq f(q,d_{A},d_{B})$. [Footnote 12: For an explicit counterexample, consider $p_{ABC}=\{\frac{3}{10},0,\frac{2}{10},0,\frac{1}{10},\frac{1}{10},\frac{2}{10},\frac{1}{10}\}$ over binary $A$, $B$ and $C$, for which $I_{2}(A:BC)=9/25<3/8=f(2,2,4)$ but $I_{2}(A:B)=13/50>1/4=f(2,2,2)$.]
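The explicit counterexample quoted for this failure of implication can be verified exactly. The sketch below assumes the eight probabilities are listed in lexicographic order of $(a,b,c)$, which the text does not state, and uses the assumed closed form $f(q,d_{X},d_{Y})=\frac{1}{q-1}(1-d_{X}^{1-q})(1-d_{Y}^{1-q})$ for integer $q>1$. The names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def f(q, dx, dy):
    """Assumed closed form of the Theorem 1 bound, for integer q > 1."""
    return F(1, q - 1) * (1 - F(1, dx ** (q - 1))) * (1 - F(1, dy ** (q - 1)))


# Assumption: probabilities are ordered as (a, b, c) = 000, 001, ..., 111.
probs = [F(3, 10), F(0), F(2, 10), F(0), F(1, 10), F(1, 10), F(2, 10), F(1, 10)]
p = dict(zip(product((0, 1), repeat=3), probs))


def S2(keep):
    """Order-2 Tsallis entropy 1 - sum p^2 of the marginal on positions `keep`."""
    marg = {}
    for outcome, prob in p.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, F(0)) + prob
    return 1 - sum(v ** 2 for v in marg.values())


# Positions 0, 1, 2 label A, B, C.
i_a_bc = S2((0,)) + S2((1, 2)) - S2((0, 1, 2))
i_a_b = S2((0,)) + S2((1,)) - S2((0, 1))
```

With this ordering `i_a_bc` equals $9/25$, which is below `f(2, 2, 4)`, while `i_a_b` equals $13/50$, which exceeds `f(2, 2, 2)`, reproducing the values quoted in the text.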

The number of distinct conditional independences (and hence the number of independent Tsallis constraints that follow from d-separation) in a DAG depends on the specific graph. However, for any DAG $\mathcal{G}_{n}$ with $n$ nodes, the number of such constraints can be upper bounded by that of the $n$-node DAG where all $n$ nodes are independent, i.e., the $n$-node DAG with no edges. The number of conditions in this DAG can be thought of as the number of ways of partitioning $n$ objects into four disjoint subsets such that the first two are non-empty and where the ordering of the first two does not matter. [Footnote 13: The four subsets correspond to the three arguments of the conditional mutual information and a set of ‘leftovers’.] Therefore, there are at most $\frac{1}{2}(4^{n}-2\times 3^{n}+2^{n})$ such conditions.
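The counting argument can be cross-checked by brute force for small $n$. The following sketch (hypothetical function names) enumerates all assignments of $n$ nodes to the four subsets and compares the count with the closed-form expression.

```python
from itertools import product


def count_formula(n):
    """Closed-form upper bound (4^n - 2*3^n + 2^n)/2 from the text."""
    return (4 ** n - 2 * 3 ** n + 2 ** n) // 2


def count_bruteforce(n):
    """Count distinct statements 'A independent of B given C' over n nodes.

    Labels: 0 = first argument, 1 = second argument,
    2 = conditioning set, 3 = leftover nodes.
    """
    seen = set()
    for labels in product(range(4), repeat=n):
        a = frozenset(i for i, l in enumerate(labels) if l == 0)
        b = frozenset(i for i, l in enumerate(labels) if l == 1)
        c = frozenset(i for i, l in enumerate(labels) if l == 2)
        if a and b:
            # The unordered pair {a, b} identifies A:B with B:A.
            seen.add((frozenset((a, b)), c))
    return len(seen)
```

For example, `count_formula(5)` gives 285 and `count_formula(6)` gives 1351, to be compared with the 53 and 126 constraints found for the five-node bipartite Bell and six-node triangle causal structures.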

IV.2 Using Tsallis entropies in the entropy vector method

We used the causal constraints of Corollary 4 in the entropy vector method with the aim of deriving new quantum-violatable entropic inequalities for the triangle causal structure (Figure 1b). To do so, we started with the variables $A,B,C,X,Y,Z$ of the triangle causal structure, together with the Shannon constraints and causal constraints satisfied by the Tsallis entropy vectors over these variables (Corollary 4), and used a Fourier-Motzkin (FM) elimination algorithm (from porta) to eliminate the Tsallis entropy components involving the unobserved variables $A,B,C$ and obtain the constraints on the observed nodes $X,Y,Z$. [Footnote 14: Polyhedral Representation and Transformation Algorithm: http://porta.zib.de/.]

The Tsallis entropy vector for the six nodes has $2^{6}-1=63$ components. The required marginal scenario with the observed nodes $X,Y,Z$ has Tsallis entropy vectors with $2^{3}-1=7$ components and in this case, the Fourier-Motzkin algorithm has to run $56$ iterations, each of which eliminates one variable.

Starting with the full set of 126 Tsallis entropic causal constraints for the triangle causal structure as well as the 246 independent Shannon constraints, the Fourier-Motzkin elimination algorithm did not finish within several days on a standard desktop PC and the number of intermediate inequalities generated grew to about 90,000 after 11 steps. Because of this we instead tried starting with a subset comprising 15 of the 126 Tsallis entropic causal constraints, i.e., 261 constraints on 63 dimensional vectors. [Footnote 15: These included the 6 that follow from “each node $N_{i}$ is conditionally independent of its descendants given its parents” (denoted as $N_{i}\perp N_{i}^{\nuparrow}|N_{i}^{\downarrow_{1}}$) and 9 more chosen arbitrarily from the total of 126 independent Tsallis constraints we found for the triangle. The 6 former constraints for the triangle (Figure 1b) are $A\perp CXB$, $B\perp CYA$, $C\perp BZA$, $X\perp YAZ|CB$, $Y\perp XBZ|AC$ and $Z\perp YCX|AB$. An example of 9 more constraints for which the procedure did not work are $X\perp Y|CB$, $X\perp A|CB$, $X\perp Z|CB$, $Y\perp X|AC$, $Y\perp B|AC$, $Y\perp Z|AC$, $Z\perp Y|AC$, $Z\perp C|AB$ and $Z\perp X|AB$. We also tried some other choices and numbers of constraints but this did not lead to any improvement.] We considered the case of $q=2$ and where the six random variables are all binary. Again, in this case the algorithm did not finish after several days. We also tried starting with fewer causal constraints (for example, the six constraints analogous to the Shannon case) as well as using a modified code, optimised to deal with redundancies better, but both of these attempts made no significant difference to this outcome.

Such a rapid increase of the number of inequalities in each step is a known problem with Fourier-Motzkin elimination where an elimination step over $n$ inequalities can result in up to $n^{2}/4$ inequalities in the output and running $d$ successive elimination steps can yield a double exponential complexity of $4(n/4)^{2^{d}}$ Williams1986 . This rate of increase can be kept under control when the resulting set of inequalities has many redundancies. This happens in the Shannon case where the causal constraints are simple equalities and the system of 246 Shannon constraints plus 6 Shannon entropic causal constraints reduces to a system of just 91 independent inequalities before the FM elimination. In the Tsallis case, no reduction of the system of inequalities is possible in general due to the nature of the causal constraints. The fact that the Tsallis entropic causal constraints are inequality constraints rather than equalities also contributes to the computational difficulty since each independent equality constraint in effect reduces the dimension of the problem by 1.
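To make the source of this growth concrete, the following is a minimal sketch of a single Fourier-Motzkin elimination step in exact arithmetic. It is not the porta implementation used in the paper, it performs no redundancy removal, and the names `fm_eliminate` and `worst_case_bound` are hypothetical.

```python
from fractions import Fraction


def fm_eliminate(rows, k):
    """One Fourier-Motzkin step: eliminate variable k.

    Each row is (coeffs, b) and encodes sum_j coeffs[j] * x_j <= b.
    Rows not involving x_k are kept; every (positive, negative) pair of rows
    is combined so that x_k cancels. The column for x_k is kept with zeros.
    """
    pos = [r for r in rows if r[0][k] > 0]
    neg = [r for r in rows if r[0][k] < 0]
    out = [r for r in rows if r[0][k] == 0]
    for ap, bp in pos:
        for an, bn in neg:
            sp = Fraction(1) / ap[k]    # rescale so x_k has coefficient +1
            sn = Fraction(1) / -an[k]   # rescale so x_k has coefficient -1
            coeffs = [sp * u + sn * v for u, v in zip(ap, an)]
            out.append((coeffs, sp * bp + sn * bn))
    return out


def worst_case_bound(n, d):
    """Bound 4 (n/4)^(2^d) quoted in the text for d successive eliminations."""
    return 4 * Fraction(n, 4) ** (2 ** d)


# Example: x + y <= 4, -x + y <= 2, y <= 5; eliminating x leaves y <= 5, 2y <= 6.
system = [([1, 1], 4), ([-1, 1], 2), ([0, 1], 5)]
projected = fm_eliminate(system, 0)
```

A step with $n/2$ rows of each sign produces $n^{2}/4$ combined rows, which is the single-step growth mentioned above; practical implementations interleave each step with redundancy checks to keep this manageable.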

We also tried the same procedure on the bipartite Bell causal structure (Figure 1a), again for $q=2$ and binary variables. Here, starting with the full set of 53 causal constraints again resulted in the program running for over a week without nearing the end, and a similar result was obtained when starting with only 8–10 causal constraints. While starting with fewer causal constraints, such as the 5 conditional independence constraints (one for each node), resulted in a terminating program, no non-trivial entropic inequalities were obtained (i.e., we only obtained constraints corresponding to Shannon constraints or causal constraints that follow directly from d-separation). [Footnote 16: For example, we were able to obtain $I_{2}(A:BY)\leq\frac{7}{16}$ and $I_{2}(B:AX)\leq\frac{7}{16}$, while, in the case of binary variables and $q=2$, the independences in the DAG together with Theorem 1 imply $I_{2}(A:BY)\leq\frac{6}{16}$ and $I_{2}(B:AX)\leq\frac{6}{16}$, which are the Tsallis entropic equivalents of the two non-signalling constraints.]

V New Tsallis entropic inequalities for the triangle causal structure

Despite the limitations encountered in applying the entropy vector method to Tsallis entropies (Section IV.2), here we find new Tsallis entropic inequalities for the triangle causal structure for all $q\geq 1$ by using known inequalities for the Shannon entropy Chaves2014 and the causal constraints derived in Section IV. Using the entropy vector method for Shannon entropies, the following three classes of entropic inequalities were obtained for the triangle causal structure (Figure 1b) in Chaves2014. [Footnote 17: Note that a tighter entropic characterization was found in Weilenmann2018 based on non-Shannon inequalities, and that the techniques introduced here could also be applied to these.] Including all permutations of $X$, $Y$ and $Z$, these yield 7 inequalities.

[TABLE]

By replacing the Shannon entropy $H()$ with the Tsallis entropy $S_{q}()$ on the left hand side of these inequalities and minimizing the resultant expression over our outer approximation to the classical Tsallis entropy cone for the triangle causal structure, one can obtain valid Tsallis entropic inequalities for this causal structure. More precisely, the outer approximation to the classical Tsallis entropy cone for the triangle is characterised by the $6+6(6-1)2^{6-3}=246$ independent Shannon constraints (monotonicity and strong subadditivity constraints) and the $126$ causal constraints (one for each conditional independence implied by the causal structure). To perform this minimization we used LPAssumptions LPAssumptions, a linear program solver in Mathematica that implements the simplex method allowing for unspecified variables. In our case, we assumed that the dimensions of all the unobserved nodes ($A$, $B$ and $C$) are equal to $d_{u}$ and those of all the observed nodes ($X$, $Y$ and $Z$) are equal to $d_{o}$, and so the unspecified variables are $q\geq 1$, $d_{u}\geq 2$ and $d_{o}\geq 2$. We obtained the following Tsallis entropic inequalities for the triangle.

[TABLE]

where,

[TABLE]

Note that $\lim_{q\rightarrow 1}B_{1}=\lim_{q\rightarrow 1}B_{2}=\lim_{q\rightarrow 1}B_{3}=0$ $\forall d_{u},d_{o}\geq 2$ , recovering the original inequalities for Shannon entropies (Equations (12a)–(12c)) as a special case.
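The count of 246 independent Shannon constraints used in the outer approximation can be reproduced by direct enumeration. The sketch below assumes these are the standard elemental inequalities for $n$ variables: $n$ monotonicity constraints of the form $H(X_{i}|\text{all other variables})\geq 0$ and one constraint $I(X_{i}:X_{j}|X_{K})\geq 0$ for each pair $i<j$ and each subset $K$ of the remaining variables. The function name and the tuple labels are hypothetical.

```python
from itertools import combinations


def elemental_inequalities(n):
    """Enumerate labels for the assumed elemental Shannon inequalities."""
    nodes = range(n)
    ineqs = []
    for i in nodes:
        # Monotonicity: H(all nodes) - H(all nodes except i) >= 0.
        ineqs.append(("mono", i))
    for i, j in combinations(nodes, 2):
        rest = [k for k in nodes if k not in (i, j)]
        for r in range(len(rest) + 1):
            for cond in combinations(rest, r):
                # Strong subadditivity: I(X_i : X_j | X_cond) >= 0.
                ineqs.append(("ssa", i, j, cond))
    return ineqs


n_constraints = len(elemental_inequalities(6))   # six nodes A, B, C, X, Y, Z
n_components = 2 ** 6 - 1                        # entropy vector length
```

For six nodes this enumeration gives 246 constraints on 63-component entropy vectors, matching the expression $6+6(6-1)2^{6-3}$ in the text.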

In Rosset2017 , an upper bound on the dimensions of classical unobserved systems needed to reproduce a set of observed correlations is derived in terms of the dimensions of the observed systems. In the case of the triangle causal structure with $d_{X}=d_{Y}=d_{Z}=d_{o}$ and $d_{A}=d_{B}=d_{C}=d_{u}$ as considered here, the result of Rosset2017 implies that all classical correlations $P_{XYZ}$ can be reproduced by using hidden systems of dimension at most $d_{o}^{3}-d_{o}$ . Since the dimension of the unobserved systems is unknown, it makes sense to take the minimum of the derived bounds over all $d_{u}$ between $2$ and $d_{o}^{3}-d_{o}$ . By taking their derivative, one can verify that for $q>1$ each of the functions $B_{1}$ , $B_{21}$ , $B_{22}$ and $B_{3}$ is monotonically decreasing in $d_{o}$ and $d_{u}$ , and hence that the minimum is obtained for $d_{u}=d_{o}^{3}-d_{o}$ for any given $d_{o}\geq 2$ . It follows that for all $q>1$ and $d_{o}\geq 2$ relations of the same form as Equations (13a)–(13c) hold, with the quantities on the right hand sides replaced by

[TABLE]

A quantum violation of any of these bounds would imply that no unobserved classical systems of arbitrary dimension could reproduce those quantum correlations.

Remark 2.

Because they are monotonically decreasing, the bounds for $d_{u}=d_{o}^{3}-d_{o}$ are not as tight as the $d_{u}$ -dependent bounds for general $q>1$ . Nevertheless, as $q\to 1$ , all the bounds $B^{*}(q,d_{o})$ tend to 0, reproducing the known result of Fritz13 for the Shannon case.

Remark 3.

In some cases it may be interesting to show quantum violations of these inequalities for low values of $d_{u}$ , hence ruling out classical explanations with hidden systems of low dimensions, while possibly leaving open the case of arbitrary classical explanations. This would be interesting if it could be established that using hidden quantum systems allows for much lower dimensions than for hidden classical systems, for example.

V.1 Looking for quantum violations

It is known that the triangle causal structure (Figure 1b) admits non-classical correlations such as Fritz’s distribution Fritz2012 . The idea behind this distribution is to embed the CHSH game in the triangle causal structure such that non-locality for the triangle follows from the non-locality of the CHSH game. To do so, $C$ is replaced by the sharing of a maximally entangled pair of qubits, and $A$ and $B$ are taken to be uniformly random classical bits. The observed variables $X$ , $Y$ and $Z$ in Figure 1b are taken to be pairs of the form $X:=(\tilde{X},B)$ , $Y:=(\tilde{Y},A)$ and $Z:=(A,B)$ , where $\tilde{X}$ and $\tilde{Y}$ are generated by measurements on the halves of the entangled pair with $B$ and $A$ used to choose the settings such that the joint distribution $P_{\tilde{X}\tilde{Y}|BA}$ maximally violates a CHSH inequality. By a similar post-processing of other non-local distributions in the bipartite Bell causal structure (Figure 1a) such as the Mermin-Peres magic square game Mermin1990 ; Peres1990 and chained Bell inequalities BraunsteinCaves88 , one can obtain other non-local distributions in the triangle that cannot be reproduced using classical systems. We explore whether any of these violate any of our new inequalities.
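The embedding described above can be written out explicitly. The following sketch builds the joint distribution of $X=(\tilde{X},B)$, $Y=(\tilde{Y},A)$ and $Z=(A,B)$, assuming the measurements realise the standard Tsirelson-optimal correlations $P(\tilde{x},\tilde{y}|b,a)=\frac{1}{4}\big(1+(-1)^{\tilde{x}\oplus\tilde{y}\oplus ab}/\sqrt{2}\big)$. This particular form, the sign convention of the CHSH expression, and the function names are assumptions made for illustration.

```python
import math
from itertools import product


def chsh_box(xt, yt, b, a):
    """Assumed P(xt, yt | settings b, a) with correlators (-1)^(ab)/sqrt(2)."""
    return (1 + (-1) ** (xt ^ yt ^ (a & b)) / math.sqrt(2)) / 4


def fritz_distribution():
    """Joint distribution over X=(xt, b), Y=(yt, a), Z=(a, b), a and b uniform."""
    p = {}
    for xt, yt, a, b in product((0, 1), repeat=4):
        p[((xt, b), (yt, a), (a, b))] = chsh_box(xt, yt, b, a) / 4
    return p


def chsh_value(p):
    """sum_{a,b} (-1)^(ab) E[(-1)^(xt+yt) | a, b], recovered from the joint."""
    total = 0.0
    for ((xt, b), (yt, a), _), prob in p.items():
        # prob * 4 converts the joint probability into P(xt, yt | a, b).
        total += (-1) ** (a & b) * (-1) ** (xt ^ yt) * prob * 4
    return total


p_xyz = fritz_distribution()
```

Each of $X$, $Y$ and $Z$ takes four values in this construction, and `chsh_value(p_xyz)` evaluates to $2\sqrt{2}$, the maximal quantum value of the CHSH expression. Tsallis entropies of the marginals of `p_xyz` can then be inserted into candidate inequalities.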

Since the values of $B_{i}(q,d_{o},d_{u})$ are monotonically decreasing in $d_{o}$ and $d_{u}$, if a distribution realisable in a quantum causal structure does not violate the bounds (13a)–(13c) for any $q\geq 1$ at some fixed values of $d_{o}$ and $d_{u}$, then no violations are possible for $d_{o}^{\prime}>d_{o}$, $d_{u}^{\prime}>d_{u}$. We therefore take the smallest possible values of $d_{o}$ and $d_{u}$ when showing that a particular distribution cannot violate any of the bounds.

For Fritz’s distribution Fritz2012 , $C$ is a two-qubit maximally entangled state, $A$ and $B$ are binary random variables while $X$ , $Y$ and $Z$ are random variables of dimension 4, i.e., the actual observed dimensions are $(d_{X},d_{Y},d_{Z})=(4,4,4)$ in this case. Here we see that taking $d_{o}=4$ and the smallest possible $d_{u}$ which is $d_{u}=2$ , the left hand sides of Equations (13a)–(13c) evaluated for Fritz’s distribution do not violate the corresponding bounds $B_{i}(q,d_{o}=4,d_{u}=2)$ for any $q\geq 1$ . This means that it is not possible to detect any quantum advantage of this distribution (even over the case where the unobserved systems are classical bits) using this method, and automatically implies that it cannot violate the bounds $B_{i}(q,d_{o}=4,d_{u})$ for $d_{u}\geq 2$ .

We also considered the chained Bell and magic square correlations embedded in the triangle causal structure analogously to the case discussed above. For each of these, we define $d^{i}$ to be the smallest value of $d_{o}$ for which the bound $B_{i}(q,d_{o}=d^{i},d_{u}=2)$ cannot be violated for any $q>1$ . The values of $d^{i}$ are given in Table 1 for the different cases of the chained Bell correlations and the magic square. Since the values of $d^{i}$ are always lower than the smallest of the observed dimensions in the problem, and due to the monotonicity of the bounds it follows that none of these quantum distributions violate any of our inequalities when the observed dimension is set to $d_{o}^{\min}$ .

We further checked for violations of Inequalities (13a)–(13c) by sampling random quantum states for the systems $A$, $B$ and $C$ and random quantum measurements whose outcomes would correspond to the classical variables $X$, $Y$ and $Z$. The value of $q$ was also sampled randomly between $1$ and $100$. We considered the cases where the shared systems were pairs of qubits with 4 outcome measurements ($d_{X}=d_{Y}=d_{Z}=4$) and pairs of qutrits with 9 outcome measurements ($d_{X}=d_{Y}=d_{Z}=9$), but were unable to find violations of any of the inequalities even for the bounds with $d_{o}=4$, $d_{u}=2$ (two qubit case) and $d_{o}=9$, $d_{u}=2$ (two qutrit case), i.e., the bounds obtained when the unobserved systems are classical bits.

Remark 4.

In the derivation of Inequalities (13a)–(13c), we set the dimensions of the observed nodes $X$ , $Y$ and $Z$ to all be equal and those of the unobserved nodes $A$ , $B$ and $C$ to also all be equal. One could in principle repeat the same procedure taking different dimensions for all 6 variables but we found the computational procedure too demanding. However, Table 1 shows that even when we consider the bounds $B_{i}(q,d_{o},d_{u})$ with $d_{o}$ and $d_{u}$ much smaller than the actual dimensions, known non-local distributions in the triangle considered in Table 1 do not violate the corresponding Inequalities (13a)–(13c) for any $q\geq 1$ . Since the bounds are monotonically decreasing in $d_{u}$ and $d_{o}$ , even if we obtained the general bounds for arbitrary dimensions of $X$ , $Y$ , $Z$ , $A$ , $B$ and $C$ , they would be strictly weaker than $B_{i}(q,d^{i},d_{u}=2)$ $\forall i\in\{1,2,3\},q\geq 1$ and can certainly not be violated by these distributions.

VI Discussion

We have investigated the use of Tsallis entropies within the entropy vector method for analysing causal structures, showing how causal constraints imply bounds on the Tsallis entropies of the variables involved. Although Tsallis entropies for $q\geq 1$ possess many properties that aid their use in the entropy vector method, the nature of the causal constraints makes the problem significantly more computationally challenging than in the case of Shannon entropy. This meant that we were unable to complete the desired computations in the Tsallis case, even for some of the simplest causal structures. Nevertheless, we were able to derive new classical causal constraints expressed in terms of Tsallis entropy by analogy with known Shannon constraints, but were unable to find cases where these were violated, even using quantum distributions that are known not to be classically realisable. This mirrors an analogous result for Shannon entropies Weilenmann2018.

Tsallis entropies are known to give improvements Wajs15 in cases that involve post-selection. While post-selection cannot be used for general causal structures (including the triangle), it would be interesting to understand whether using Tsallis entropy helps in other cases for which post-selection is applicable.

One could also investigate whether other entropic quantities could be used in a similar way. The Rényi entropies of order $\alpha$ do not satisfy strong subadditivity for $\alpha\neq 0,1$, while the Rényi as well as the min and max entropies fail to obey the chain rules for conditional entropies. Thus, use of these in the entropy vector method would require an entropy vector with components for all possible conditional entropies as well as unconditional ones, considerably increasing the dimensionality of the problem, which we would expect to make the computations harder. [Footnote 18: In some cases, not having a chain rule may not be prohibitive WeilenmannGPT.]

Further, one could consider using algorithms other than Fourier-Motzkin elimination to obtain non-trivial Tsallis entropic constraints over observed nodes starting from the Tsallis cone over all the nodes (see e.g., Gl_le_2018 ). These could in principle yield solutions even in cases where FM elimination becomes intractable. However, we found that the FM elimination procedure became intractable even when starting out with only a small subset of the Tsallis entropic causal constraints for a simple causal structure such as the Bell one. This suggests that the difficulty is not only with the number of constraints, but also with their nature (in particular, that they are not equalities and depend non-trivially on the dimensions). Consequently, we bypassed FM elimination and used an alternative technique to obtain new Tsallis entropic inequalities for the Triangle causal structure (Section V).

It is also worth noting that the following alternative definition of the Tsallis conditional entropy was proposed in ABE2001157 .

$$S_{q}(X|Y)=\frac{S_{q}(XY)-S_{q}(Y)}{1+(1-q)S_{q}(Y)}.$$

Using this definition, Tsallis entropies would satisfy the same causal constraints as the Shannon entropy (Equation (2)). However, the conditional entropies defined this way do not satisfy the chain rules of Equation (9) but instead obey a non-linear chain rule, $S_{q}(XY)=S_{q}(X)+S_{q}(Y|X)+(1-q)S_{q}(X)S_{q}(Y|X)$ ABE2001157 . This would again mean that conditional entropies would need to be included in the entropy vector. Furthermore, since Fourier-Motzkin elimination only works for linear constraints, an alternative algorithm would be required to use this chain rule in conjunction with the entropy vector method.

That the inequalities for Tsallis entropy derived in this work depend on the dimensions of the systems involved could be used to certify that particular observed correlations in a classical causal structure require a certain minimal dimension of unobserved systems to be realisable. To show this would require showing that classically-realisable correlations violate one of the inequalities for some $d_{u}$. Such bounds would then complement the upper bounds of Rosset2017. However, in some cases we know our bounds are not tight enough to do this. As a simple example, within the triangle causal structure we tried taking $X=(X_{B},X_{C})$, $Y=(Y_{A},Y_{C})$ and $Z=(Z_{A},Z_{B})$ with $X_{B}=Z_{B}$, $X_{C}=Y_{C}$ and $Y_{A}=Z_{A}$, where each is uniformly distributed with cardinality $D$, for $D\in\{3,\ldots,10\}$. In this case it is clear that the correlations cannot be achieved with classical unobserved systems with $d_{u}=2$. Taking the bound with $d_{u}=2$ and $d_{o}=D^{2}$, no violations of (13a)–(13c) were seen by plotting the graphs for $q\in[1,20]$, for the range of $D$ above. Hence, our bounds are too loose to certify lower bounds on $d_{u}$ in this case.

While our analysis highlights significant drawbacks of using Tsallis entropies for analysing causal structures, it does not rule out the possibility of Tsallis entropies being able to detect the classical-quantum gap in these causal structures, or others. [Footnote 19: Proving that Tsallis entropies are unable to do this would also be difficult. For instance, the proof of Weilenmann16 that Shannon entropies are unable to detect the gap in line-like causal structures involves first characterising the marginal polytope through Fourier-Motzkin elimination, which itself proved to be computationally infeasible with Tsallis entropies even for the simplest line-like causal structure, the bipartite Bell scenario.] To overcome the difficulties we encountered we would either need increased computational power, or the use of new, alternative techniques for analysing causal structures (with or without entropies).

Acknowledgements.

We thank Mirjam Weilenmann and Elie Wolfe for useful discussions and also thank Mirjam Weilenmann for sharing some Mathematica code. VV acknowledges financial support from the Department of Mathematics, University of York. RC is supported by EPSRC’s Quantum Communications Hub (grant number EP/M013472/1) and by an EPSRC First Grant (grant number EP/P016588/1).

Appendix A Quantum generalisations of Theorems 1 and 2

In the following, for a (finite dimensional) Hilbert space $\mathcal{H}$ , we use $\mathcal{L}(\mathcal{H})$ to represent the set of linear operators on $\mathcal{H}$ , $\mathcal{P}(\mathcal{H})$ to represent the set of positive (semi-definite) operators on $\mathcal{H}$ , and $\mathcal{S}(\mathcal{H})$ to denote the set of density operators on $\mathcal{H}$ (positive and trace 1).

Tsallis entropies as defined for classical random variables in Section III are easily generalised to the quantum case by replacing the probability distribution by a density matrix Hu2006 . For a quantum system described by the density matrix $\rho\in\mathcal{S}(\mathcal{H})$ on the Hilbert space $\mathcal{H}$ and $q>0$ , the quantum Tsallis entropy is defined by

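$$S_{q}(\rho)=\begin{cases}-\operatorname{Tr}\rho^{q}\ln_{q}\rho&\text{if }q\neq 1,\\ H(\rho)&\text{if }q=1,\end{cases}$$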

where $H(\rho)=-\operatorname{Tr}\rho\ln\rho$ is the von Neumann entropy of $\rho$ and $\ln_{q}(x)=\frac{x^{1-q}-1}{1-q}$ as in Section III. [Footnote 20: Analogously to the classical case we keep it implicit that if $\rho$ has any 0 eigenvalues these do not contribute to the trace.]

Given a density operator $\rho_{AB}\in\mathcal{S}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$ , the conditional quantum Tsallis entropy of $A$ given $B$ can then be defined by $S_{q}(A|B)_{\rho}=S_{q}(AB)-S_{q}(B)$ , the mutual information between $A$ and $B$ by $I_{q}(A:B)_{\rho}=S_{q}(A)+S_{q}(B)-S_{q}(AB)$ , and for $\rho_{ABC}\in\mathcal{S}(\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C})$ the conditional Tsallis information between $A$ and $B$ given $C$ is defined by $I_{q}(A:B|C)_{\rho}=S_{q}(A|C)+S_{q}(B|C)-S_{q}(AB|C)$ . In this section we use $d_{S}$ to represent the dimensions of the Hilbert space $\mathcal{H}_{S}$ .

The following properties of quantum Tsallis entropies will be useful for what follows.

1. Pseudo-additivity Tsallis1988: If $\rho_{AB}=\rho_{A}\otimes\rho_{B}$, then

$$S_{q}(AB)=S_{q}(A)+S_{q}(B)+(1-q)S_{q}(A)S_{q}(B).$$

2. Upper bound Audenaert2007: For all $q>0$, we have $S_{q}(A)\leq\ln_{q}d_{A}$ and equality is achieved if and only if $\rho_{A}=\mathds{1}_{A}/d_{A}$.

3. Subadditivity Audenaert2007: For any density matrix $\rho_{AB}$ with marginals $\rho_{A}$ and $\rho_{B}$, the following holds for all $q\geq 1$,

$$S_{q}(AB)\leq S_{q}(A)+S_{q}(B).$$

Using these we can generalize Theorem 1 to the quantum case. This corresponds to the causal structure with two independent quantum nodes and no edges in between them.

Theorem 5.

For all bipartite density operators in product form, i.e., $\rho_{AB}=\rho_{A}\otimes\rho_{B}$ with $\rho_{A}\in\mathcal{S}(\mathcal{H}_{A})$ and $\rho_{B}\in\mathcal{S}(\mathcal{H}_{B})$ , the quantum Tsallis mutual information $I_{q}(A:B)_{\rho}$ is upper bounded as follows for all $q>0$

$$I_{q}(A:B)_{\rho}\leq f(q,d_{A},d_{B}),$$

where the function $f(q,d_{A},d_{B})$ is given by

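$$f(q,d_{A},d_{B})=(q-1)\ln_{q}d_{A}\,\ln_{q}d_{B}=\frac{1}{q-1}\left(1-\frac{1}{d_{A}^{q-1}}\right)\left(1-\frac{1}{d_{B}^{q-1}}\right).$$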

The bound is saturated if and only if $\rho_{AB}=\frac{\mathds{1}_{A}}{d_{A}}\otimes\frac{\mathds{1}_{B}}{d_{B}}$ .

Proof.

The proof goes through in the same way as the proof of Theorem 1 for the classical case (Properties 1 and 2 are analogous to those needed in the classical proof). ∎

Next, we generalise Theorem 2 and Corollaries 3 and 4. This would correspond to the causal constraints on quantum Tsallis entropies implied by the common cause causal structure with $C$ being a complete common cause of $A$ and $B$ (which share no causal relations among themselves). Here, one must be careful in precisely defining the conditional mutual information and interpreting it physically. For example, if the common cause $C$ were quantum and the nodes $A$ and $B$ were classical outcomes of measurements on $C$, then $A$, $B$ and $C$ do not coexist and there is no joint state $\rho_{ABC}$ in such a case. This is a significant difference in quantum causal modelling compared to the classical case, and there have been several proposals for how to deal with it Leifer2013; Costa2016; Allen2017; Pienaar2019. In the following we consider two cases:

1. When $C$ is classical, all three systems coexist and $\rho_{ABC}$ can be described by a classical-quantum state (see Theorem 7).

2. When $C$ is quantum, one approach is to view $\rho_{ABC}$ not as the joint state of the three systems but as being related to the Choi-Jamiołkowski representations of the quantum channels from $C$ to $A$ and $B$ (see Section A.1), as done in Allen2017 .

The following Lemma proven in Kim2016 is required for our generalization of Theorem 2 in the first case.

Lemma 6 (Kim2016 , Lemma 1).

Let $\mathcal{H}_{A}$ and $\mathcal{H}_{Z}$ be two Hilbert spaces and $\{|z\rangle\}_{z}$ be an orthonormal basis of $\mathcal{H}_{Z}$. Let $\rho_{AZ}$ be classical on $\mathcal{H}_{Z}$ with respect to this basis, i.e.,

$$\rho_{AZ}=\sum_{z}p_{z}\,\rho_{A}^{(z)}\otimes|z\rangle\!\langle z|,$$

where $\sum_{z}p_{z}=1$ and $\rho_{A}^{(z)}\in S(\mathcal{H}_{A})\ \forall z$ . Then for all $q>0$ ,

$$S_{q}(AZ)_{\rho}=S_{q}(Z)+\sum_{z}p_{z}^{q}\,S_{q}(A)_{\rho_{A}^{(z)}},$$

where $S_{q}(Z)$ is the classical Tsallis entropy of the variable $Z$ distributed according to $P_{Z}$ .

Note that the above Lemma immediately implies that

$$S_{q}(A|Z)_{\rho}=\sum_{z}p_{z}^{q}\,S_{q}(A)_{\rho_{A}^{(z)}}. \tag{20}$$
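The decomposition in Lemma 6 can be checked numerically. The sketch below assumes the Lemma takes the form $S_{q}(AZ)=S_{q}(Z)+\sum_{z}p_{z}^{q}S_{q}(\rho_{A}^{(z)})$, which follows from $\operatorname{Tr}\rho_{AZ}^{q}=\sum_{z}p_{z}^{q}\operatorname{Tr}(\rho_{A}^{(z)})^{q}$ for a block-diagonal state. The helper names (`tsallis_entropy`, `random_state`, `cq_state`) and the particular probabilities are illustrative choices, not taken from Kim2016.

```python
import numpy as np

def tsallis_entropy(rho, q):
    """S_q(rho) = (1 - Tr rho^q) / (q - 1); von Neumann entropy in the limit q -> 1."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # zero eigenvalues do not contribute
    if np.isclose(q, 1.0):
        return float(-np.sum(evals * np.log(evals)))
    return float((1.0 - np.sum(evals ** q)) / (q - 1.0))

def random_state(d, rng):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def cq_state(probs, states):
    """rho_AZ = sum_z p_z rho_A^(z) (x) |z><z|, with Z in the computational basis."""
    n = len(probs)
    d = states[0].shape[0]
    rho = np.zeros((d * n, d * n), dtype=complex)
    for z, (p, rho_z) in enumerate(zip(probs, states)):
        proj = np.zeros((n, n))
        proj[z, z] = 1.0
        rho += p * np.kron(rho_z, proj)
    return rho

rng = np.random.default_rng(1)
q = 1.5
probs = np.array([0.2, 0.5, 0.3])
states = [random_state(2, rng) for _ in probs]

lhs = tsallis_entropy(cq_state(probs, states), q)
rhs = tsallis_entropy(np.diag(probs), q) + sum(
    p ** q * tsallis_entropy(s, q) for p, s in zip(probs, states))
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point error; subtracting $S_{q}(Z)$ from both sides gives the conditional-entropy form used in the proof of Theorem 7.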

Theorem 7.

Let $\rho_{ABC}=\sum_{c}p_{c}\rho_{AB}^{(c)}\otimes|c\rangle\!\langle c|$ , where $\rho_{AB}^{(c)}=\rho_{A}^{(c)}\otimes\rho_{B}^{(c)}$ $\forall c$ , then, for all $q\geq 1$ ,

$$I_{q}(A:B|C)_{\rho}\leq f(q,d_{A},d_{B}).$$

For $q>1$ the bound is saturated if and only if $\rho_{ABC}=\frac{\mathds{1}_{A}}{d_{A}}\otimes\frac{\mathds{1}_{B}}{d_{B}}\otimes|c\rangle\langle c|_{C}$ .

Proof.

Using (20) we have,

$$I_{q}(A:B|C)_{\rho}=\sum_{c}p_{c}^{q}\,I_{q}(A:B)_{\rho_{AB}^{(c)}}.$$

The rest of the proof is analogous to Theorem 2, where using the above, Theorem 5 and defining the set $\mathcal{R}=\{\rho_{ABC}\in\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C}:\rho_{ABC}=\sum_{c}p_{c}\rho_{A}^{(c)}\otimes\rho_{B}^{(c)}\otimes|c\rangle\langle c|\}$ we have,

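$$\max_{\mathcal{R}}I_{q}(A:B|C)_{\rho}=\max_{\mathcal{R}}\sum_{c}p_{c}^{q}\,I_{q}(A:B)_{\rho_{AB}^{(c)}}=f(q,d_{A},d_{B}),$$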

where the last step follows because for all $q\geq 1$, $\sum_{c}p_{c}^{q}$ is maximised by deterministic distributions over $C$ with a maximum value of $1$ (for $q>1$, such deterministic distributions are the only way to obtain the bound), and $I_{q}(A:B)_{\rho_{AB}^{(c)}}$ for product states is maximised by the maximally mixed state over $A$ and $B$ for all $c$ (Theorem 5). Thus, for $q>1$, the bound is saturated if and only if $\rho_{ABC}=\frac{\mathds{1}_{A}}{d_{A}}\otimes\frac{\mathds{1}_{B}}{d_{B}}\otimes|c\rangle\langle c|_{C}$ for some value $c$ of $C$. ∎

A.1 A generalisation: when systems do not coexist

There is a fundamental problem with naively generalising classical conditional independences such as $p_{XY|Z}=p_{X|Z}p_{Y|Z}$ to the quantum case by replacing joint distributions by density matrices: it is not clear what is meant by a conditional quantum state, e.g., $\rho_{A|C}$, since it is not clear what it means to condition on a quantum system, especially when the (joint state of the) system under consideration and the one being conditioned upon do not coexist. There are a number of approaches for tackling this problem, from describing quantum states in space and time on an equal footing Horsman2016 to quantum analogues of Bayesian inference Leifer2013 and causal modelling Costa2016 ; Allen2017 ; Pienaar2019 . In the following, we will focus on one such approach that is motivated by the framework of Allen2017 . Central to this approach is the Choi-Jamiołkowski isomorphism Jamiolkowski1972 ; Choi1975 from which one can define conditional quantum states.

Definition 3 (Choi state).

Let $|\gamma\rangle=\sum_{i}|i\rangle_{R}|i\rangle_{R^{*}}\in\mathcal{H}_{R}\otimes\mathcal{H}_{R^{*}}$ , where $\mathcal{H}_{R^{*}}$ is the dual space to $\mathcal{H}_{R}$ and $\{|i\rangle_{R}\}_{i}$ , $\{|i\rangle_{R^{*}}\}_{i}$ are orthonormal bases of $\mathcal{H}_{R}$ and $\mathcal{H}_{R^{*}}$ respectively. Given a channel $\mathcal{E}_{R|S}:\mathcal{S}(\mathcal{H}_{R})\to\mathcal{S}(\mathcal{H}_{S})$ , the Choi state of the channel is defined by

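$$\rho_{S|R}=\sum_{i,j}\mathcal{E}_{R|S}\big(|i\rangle\!\langle j|_{R}\big)\otimes|i\rangle\!\langle j|_{R^{*}}.$$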

Thus, $\rho_{S|R}\in\mathcal{P}(\mathcal{H}_{S}\otimes\mathcal{H}_{R^{*}})$ .

Now, if a quantum system $C$ evolves through a unitary channel $\mathcal{E}_{I}(\cdot)=U^{\prime}(\cdot)U^{\prime\dagger}$ to two systems $A^{\prime}$ and $B^{\prime}$ where $U^{\prime}:\mathcal{H}_{C}\rightarrow\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ , it is reasonable to call the system $C$ a quantum common cause of the systems $A^{\prime}$ and $B^{\prime}$ . Further, this would still be reasonable if one were to then perform local completely positive trace preserving (CPTP) maps on the $A^{\prime}$ and $B^{\prime}$ systems. By the Stinespring dilation theorem, these local CPTP maps can be seen as local isometries followed by partial traces, and the local isometries can be seen as the introduction of an ancilla in a pure state followed by a joint unitary on the system and ancilla. This is illustrated in Figure 2 and is compatible with the definition of quantum common causes presented in Allen2017 . In other words, a system $C$ can be said to be a complete (quantum) common cause of systems $A$ and $B$ if the corresponding channel $\mathcal{E}:\mathcal{S}(\mathcal{H}_{C})\rightarrow\mathcal{S}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$ can be decomposed as in Figure 2 for some choice of unitaries $U^{\prime}$ , $U_{A}$ , $U_{B}$ and pure states $|\phi\rangle_{E_{A}}$ , $|\psi\rangle_{E_{B}}$ . Note that a more general set of channels fit the definition of quantum common cause in Ref. Allen2017 than we use here; whether the theorems here extend to this case we leave as an open question.

In Allen2017 it is shown that whenever a system $C$ is a complete common cause of systems $A$ and $B$ then the Shannon conditional mutual information evaluated on the state $\tau_{ABC^{*}}=\frac{1}{d_{C}}\rho_{AB|C}$ satisfies $I(A:B|C^{*})_{\tau}=0$, where $\rho_{AB|C}$ is the Choi state of the channel from $C$ to $A$ and $B$. We generalise this result to Tsallis entropies for $q\geq 1$ for certain types of channels. We present the result in three cases, each with increasing levels of generality. These are explained in Figure 2 and correspond to the cases where the map from the complete common cause $C$ to its children $A$ and $B$ is (i) unitary ( $\mathcal{E}_{\textsc{i}}=U^{\prime}$ ); (ii) unitary followed by local isometries ( $\mathcal{E}_{\textsc{ii}}$ ); (iii) unitary followed by local isometries followed by partial traces on local systems ( $\mathcal{E}_{\textsc{iii}}=\mathcal{E}$ ).

Lemma 8.

Let $\mathcal{E}_{\textsc{i}}:\mathcal{S}(\mathcal{H}_{C})\rightarrow\mathcal{S}(\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}})$ be a unitary quantum channel i.e.,

$$\mathcal{E}_{\textsc{i}}(\cdot)=U^{\prime}(\cdot)U^{\prime\dagger},$$

where $U^{\prime}:\mathcal{H}_{C}\rightarrow\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ is an arbitrary unitary operator. If $\rho_{A^{\prime}B^{\prime}|C}$ is the corresponding Choi state, then the Tsallis conditional mutual information evaluated on the state $\tau_{A^{\prime}B^{\prime}C^{*}}=\frac{1}{d_{C}}\rho_{A^{\prime}B^{\prime}|C}\in\mathcal{S}(\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{C^{*}})$ satisfies

$$I_{q}(A^{\prime}:B^{\prime}|C^{*})_{\tau}=f(q,d_{A^{\prime}},d_{B^{\prime}}).$$

Proof.

The conditional mutual information $I_{q}(A^{\prime}:B^{\prime}|C^{*})_{\tau}$ can be written as

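$$I_{q}(A^{\prime}:B^{\prime}|C^{*})_{\tau}=S_{q}(A^{\prime}C^{*})_{\tau}+S_{q}(B^{\prime}C^{*})_{\tau}-S_{q}(A^{\prime}B^{\prime}C^{*})_{\tau}-S_{q}(C^{*})_{\tau}. \tag{21}$$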

We will now evaluate every term in the above expression for the case where the channel that maps the $C$ system to the $A^{\prime}$ and $B^{\prime}$ systems is unitary. In this case, $\tau_{A^{\prime}B^{\prime}C^{*}}$ is a pure state and can be written as $\tau_{A^{\prime}B^{\prime}C^{*}}=|\tau\rangle\langle\tau|_{A^{\prime}B^{\prime}C^{*}}$ where

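$$|\tau\rangle_{A^{\prime}B^{\prime}C^{*}}=\frac{1}{\sqrt{d_{C}}}\sum_{i}U^{\prime}|i\rangle_{C}\otimes|i\rangle_{C^{*}}. \tag{22}$$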

This means that $\operatorname{Tr}_{A^{\prime}B^{\prime}C^{*}}\tau_{A^{\prime}B^{\prime}C^{*}}^{q}=\operatorname{Tr}_{A^{\prime}B^{\prime}C^{*}}\tau_{A^{\prime}B^{\prime}C^{*}}$ $\forall q>0$ . Since $\tau_{A^{\prime}B^{\prime}C^{*}}$ is a valid quantum state, it must be a trace one operator and we have

$$S_{q}(A^{\prime}B^{\prime}C^{*})_{\tau}=0. \tag{23}$$

Further, we have $\tau_{C^{*}}=\operatorname{Tr}_{A^{\prime}B^{\prime}}\tau_{A^{\prime}B^{\prime}C^{*}}=\frac{\mathds{1}_{C^{*}}}{d_{C}}$ and hence

$$S_{q}(C^{*})_{\tau}=\ln_{q}d_{C}=\ln_{q}(d_{A^{\prime}}d_{B^{\prime}}). \tag{24}$$

The second step follows from the fact that $U^{\prime}:\mathcal{H}_{C}\rightarrow\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ is unitary so $d_{C}=d_{A^{\prime}}d_{B^{\prime}}$ .

Now, the marginals over $A^{\prime}$ and $B^{\prime}$ are $\tau_{A^{\prime}}=\operatorname{Tr}_{B^{\prime}C^{*}}\tau_{A^{\prime}B^{\prime}C^{*}}=\frac{\mathds{1}_{A^{\prime}}}{d_{A^{\prime}}}$ and $\tau_{B^{\prime}}=\operatorname{Tr}_{A^{\prime}C^{*}}\tau_{A^{\prime}B^{\prime}C^{*}}=\frac{\mathds{1}_{B^{\prime}}}{d_{B^{\prime}}}$ . By the Schmidt decomposition of $\tau_{A^{\prime}B^{\prime}C^{*}}$ , the non-zero eigenvalues of $\tau_{A^{\prime}}$ are the same as those of $\tau_{B^{\prime}C^{*}}$ . Since the Tsallis entropy depends only on the non-zero eigenvalues, $S_{q}(A^{\prime})=S_{q}(B^{\prime}C^{*})$ and hence

$$S_{q}(B^{\prime}C^{*})_{\tau}=S_{q}(A^{\prime})_{\tau}=\ln_{q}d_{A^{\prime}}. \tag{25}$$

By the same argument it follows that

$$S_{q}(A^{\prime}C^{*})_{\tau}=S_{q}(B^{\prime})_{\tau}=\ln_{q}d_{B^{\prime}}. \tag{26}$$

Combining Equations (21)-(26), we have

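$$I_{q}(A^{\prime}:B^{\prime}|C^{*})_{\tau}=\ln_{q}d_{A^{\prime}}+\ln_{q}d_{B^{\prime}}-\ln_{q}(d_{A^{\prime}}d_{B^{\prime}})=f(q,d_{A^{\prime}},d_{B^{\prime}}).$$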

∎
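The steps of this proof can be mirrored numerically. The following sketch (an illustration under stated assumptions, not the authors' code) builds the normalised Choi vector $|\tau\rangle=d_{C}^{-1/2}\sum_{i}U^{\prime}|i\rangle_{C}\otimes|i\rangle_{C^{*}}$ for a random unitary, computes the four entropies in $I_{q}(A^{\prime}:B^{\prime}|C^{*})_{\tau}$ from its reduced states, and compares the result with $f(q,d_{A^{\prime}},d_{B^{\prime}})$. The closed form of $f$ and all helper names are assumptions made for this example.

```python
import numpy as np

def tsallis_entropy(rho, q):
    """S_q(rho) = (1 - Tr rho^q) / (q - 1); von Neumann entropy in the limit q -> 1."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # zero eigenvalues do not contribute
    if np.isclose(q, 1.0):
        return float(-np.sum(evals * np.log(evals)))
    return float((1.0 - np.sum(evals ** q)) / (q - 1.0))

def f_bound(q, d_a, d_b):
    """Assumed closed form of f for q != 1."""
    return (1.0 - d_a ** (1.0 - q)) * (1.0 - d_b ** (1.0 - q)) / (q - 1.0)

def reduced_state(psi, keep):
    """Reduced density matrix of the pure-state tensor `psi` on the axes in `keep`."""
    keep = list(keep)
    rest = [ax for ax in range(psi.ndim) if ax not in keep]
    dim_keep = int(np.prod([psi.shape[ax] for ax in keep]))
    m = np.transpose(psi, keep + rest).reshape(dim_keep, -1)
    return m @ m.conj().T

def choi_vector(u, d_a, d_b):
    """|tau> as a tensor with axes (A', B', C*): entry [a, b, i] is <ab|U'|i> / sqrt(d_C)."""
    d_c = d_a * d_b
    return u.reshape(d_a, d_b, d_c) / np.sqrt(d_c)

def conditional_mutual_information(psi, q):
    """I_q(A':B'|C*) = S_q(A'C*) + S_q(B'C*) - S_q(A'B'C*) - S_q(C*)."""
    s = lambda keep: tsallis_entropy(reduced_state(psi, keep), q)
    return s([0, 2]) + s([1, 2]) - s([0, 1, 2]) - s([2])

rng = np.random.default_rng(2)
d_a, d_b, q = 2, 3, 2.0
g = rng.normal(size=(d_a * d_b,) * 2) + 1j * rng.normal(size=(d_a * d_b,) * 2)
u, _ = np.linalg.qr(g)  # the Q factor of a complex square matrix is unitary
cmi = conditional_mutual_information(choi_vector(u, d_a, d_b), q)
print(cmi, f_bound(q, d_a, d_b))
```

Because the marginals on $A^{\prime}$, $B^{\prime}$ and $C^{*}$ are maximally mixed for every unitary, the value does not depend on the particular $U^{\prime}$ that is sampled.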

Lemma 9.

Let $\mathcal{E}_{\textsc{ii}}:\mathcal{S}(\mathcal{H}_{C})\rightarrow\mathcal{S}(\mathcal{H}_{\tilde{A}}\otimes\mathcal{H}_{\tilde{B}})$ be a quantum channel of the form

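$$\mathcal{E}_{\textsc{ii}}(\cdot)=(U_{A}\otimes U_{B})\big[|\phi\rangle\langle\phi|_{E_{A}}\otimes U^{\prime}(\cdot)U^{\prime\dagger}\otimes|\psi\rangle\langle\psi|_{E_{B}}\big](U_{A}\otimes U_{B})^{\dagger},$$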

where $U^{\prime}:\mathcal{H}_{C}\rightarrow\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ , $U_{A}:\mathcal{H}_{E_{A}}\otimes\mathcal{H}_{A^{\prime}}\rightarrow\mathcal{H}_{\tilde{A}}$ and $U_{B}:\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{E_{B}}\rightarrow\mathcal{H}_{\tilde{B}}$ are arbitrary unitaries and $|\phi\rangle_{E_{A}}$ and $|\psi\rangle_{E_{B}}$ are arbitrary pure states. If $\rho_{\tilde{A}\tilde{B}|C}$ is the corresponding Choi state, then the Tsallis conditional mutual information evaluated on the state $\tau_{\tilde{A}\tilde{B}C^{*}}=\frac{1}{d_{C}}\rho_{\tilde{A}\tilde{B}|C}\in\mathcal{S}(\mathcal{H}_{\tilde{A}}\otimes\mathcal{H}_{\tilde{B}}\otimes\mathcal{H}_{C^{*}})$ satisfies

$$I_{q}(\tilde{A}:\tilde{B}|C^{*})_{\tau}=f(q,d_{A^{\prime}},d_{B^{\prime}}).$$

Proof.

Note that the map $\mathcal{E}_{\textsc{ii}}$ is the unitary map $\mathcal{E}_{\textsc{i}}(\cdot)=U^{\prime}(\cdot)U^{\prime\dagger}$ followed by local isometries $V_{A}$ and $V_{B}$ on the $A^{\prime}$ and $B^{\prime}$ systems respectively. Since the expression for the conditional mutual information $I_{q}(\tilde{A}:\tilde{B}|C^{*})_{\tau}$ can be written in terms of entropies, which are functions of the eigenvalues of the relevant reduced density operators, and since the eigenvalues are unchanged by local isometries, this conditional mutual information is invariant under local isometries. The rest of the proof is identical to that of Lemma 8 resulting in

$$I_{q}(\tilde{A}:\tilde{B}|C^{*})_{\tau}=f(q,d_{A^{\prime}},d_{B^{\prime}}).$$

∎

For the last case where $\mathcal{E}_{\textsc{iii}}(\cdot)=\operatorname{Tr}_{A^{\prime\prime}B^{\prime\prime}}\Big{[}(U_{A}\otimes U_{B})\big{[}|\phi\rangle\langle\phi|_{E_{A}}\otimes U^{\prime}(\cdot)U^{\prime\dagger}\otimes|\psi\rangle\langle\psi|_{E_{B}}\big{]}(U_{A}\otimes U_{B})^{\dagger}\Big{]}$ , one could intuitively argue that tracing out systems could not increase the mutual information and one would expect that

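$$I_{q}(A:B|C^{*})_{\tau}\leq I_{q}(AA^{\prime\prime}:BB^{\prime\prime}|C^{*})_{\tau}=f(q,d_{A^{\prime}},d_{B^{\prime}}). \tag{29}$$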

Since $I_{q}(AA^{\prime\prime}:BB^{\prime\prime}|C^{*})_{\tau}=I_{q}(A:B|C^{*})_{\tau}+I_{q}(AA^{\prime\prime}:B^{\prime\prime}|BC^{*})_{\tau}+I_{q}(A^{\prime\prime}:B|AC^{*})_{\tau}$, Equation (29) would follow from strong subadditivity used twice, i.e., $I_{q}(AA^{\prime\prime}:B^{\prime\prime}|BC^{*})_{\tau}\geq 0$ and $I_{q}(A^{\prime\prime}:B|AC^{*})_{\tau}\geq 0$. However, it is known that strong subadditivity does not hold in general for Tsallis entropies for $q>1$ Petz2014 . Ref. Petz2014 also provides a sufficiency condition for strong subadditivity to hold for Tsallis entropies. In the following Lemma, we provide another, simple sufficiency condition that also helps bound the Tsallis mutual information $I_{q}(AA^{\prime\prime}:B|C^{*})_{\tau}$ (or $I_{q}(A:BB^{\prime\prime}|C^{*})_{\tau}$ ) corresponding to the map $\mathcal{E}_{\textsc{iii}}$ where only one of $A^{\prime\prime}$ or $B^{\prime\prime}$ is traced out but not both.

Lemma 10 (Sufficiency condition for strong subadditivity of Tsallis entropies).

If $\rho_{ABC}$ is a pure quantum state, then for all $q\geq 1$ we have $I_{q}(A:B|C)_{\rho}\geq 0$ .

Proof.

We have

$$I_{q}(A:B|C)_{\rho}=S_{q}(AC)+S_{q}(BC)-S_{q}(ABC)-S_{q}(C).$$

Since $\rho_{ABC}$ is pure we have $S_{q}(ABC)=0$ $\forall q>0$ and (from the Schmidt decomposition argument mentioned earlier) $S_{q}(AC)=S_{q}(B)$ , $S_{q}(BC)=S_{q}(A)$ and $S_{q}(C)=S_{q}(AB)$ . Thus,

$$I_{q}(A:B|C)_{\rho}=S_{q}(A)+S_{q}(B)-S_{q}(AB)\geq 0,$$

which follows from subadditivity of quantum Tsallis entropies for $q\geq 1$ Audenaert2007 . In other words, for pure $\rho_{ABC}$ , strong subadditivity of Tsallis entropies is equivalent to their subadditivity which holds whenever $q\geq 1$ . ∎
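A small numerical illustration of Lemma 10 is sketched below, assuming NumPy and hypothetical helper names. It draws a random pure state on $\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C}$, evaluates $I_{q}(A:B|C)$ from its definition, and compares it with $S_{q}(A)+S_{q}(B)-S_{q}(AB)$, the form obtained from the Schmidt decomposition argument. A finite number of samples is of course only a consistency check and proves nothing.

```python
import numpy as np

def tsallis_entropy(rho, q):
    """S_q(rho) = (1 - Tr rho^q) / (q - 1); von Neumann entropy in the limit q -> 1."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # zero eigenvalues do not contribute
    if np.isclose(q, 1.0):
        return float(-np.sum(evals * np.log(evals)))
    return float((1.0 - np.sum(evals ** q)) / (q - 1.0))

def reduced_state(psi, keep):
    """Reduced density matrix of the pure-state tensor `psi` on the axes in `keep`."""
    keep = list(keep)
    rest = [ax for ax in range(psi.ndim) if ax not in keep]
    dim_keep = int(np.prod([psi.shape[ax] for ax in keep]))
    m = np.transpose(psi, keep + rest).reshape(dim_keep, -1)
    return m @ m.conj().T

def cmi_from_definition(psi, q):
    """I_q(A:B|C) = S_q(AC) + S_q(BC) - S_q(ABC) - S_q(C), axes ordered (A, B, C)."""
    s = lambda keep: tsallis_entropy(reduced_state(psi, keep), q)
    return s([0, 2]) + s([1, 2]) - s([0, 1, 2]) - s([2])

def cmi_pure_form(psi, q):
    """S_q(A) + S_q(B) - S_q(AB), valid as a rewriting only when psi is pure."""
    s = lambda keep: tsallis_entropy(reduced_state(psi, keep), q)
    return s([0]) + s([1]) - s([0, 1])

rng = np.random.default_rng(3)
dims = (2, 3, 2)  # (d_A, d_B, d_C), an arbitrary illustrative choice
psi = rng.normal(size=dims) + 1j * rng.normal(size=dims)
psi /= np.linalg.norm(psi)

results = {q: (cmi_from_definition(psi, q), cmi_pure_form(psi, q))
           for q in (1.0, 1.5, 2.0, 3.0)}
print(results)
```

For each $q\geq 1$ in the dictionary, the two entries should coincide and be non-negative, in line with the reduction of strong subadditivity to subadditivity for pure states.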

Corollary 11.

Let $\mathcal{E}^{1}_{\textsc{iii}}:\mathcal{S}(\mathcal{H}_{C})\rightarrow\mathcal{S}(\mathcal{H}_{\tilde{A}}\otimes\mathcal{H}_{B})$ be a quantum channel of the form

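$$\mathcal{E}^{1}_{\textsc{iii}}(\cdot)=\operatorname{Tr}_{B^{\prime\prime}}\Big[(U_{A}\otimes U_{B})\big[|\phi\rangle\langle\phi|_{E_{A}}\otimes U^{\prime}(\cdot)U^{\prime\dagger}\otimes|\psi\rangle\langle\psi|_{E_{B}}\big](U_{A}\otimes U_{B})^{\dagger}\Big],$$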

where $U^{\prime}:\mathcal{H}_{C}\rightarrow\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ , $U_{A}:\mathcal{H}_{E_{A}}\otimes\mathcal{H}_{A^{\prime}}\rightarrow\mathcal{H}_{\tilde{A}}\cong\mathcal{H}_{A}\otimes\mathcal{H}_{A^{\prime\prime}}$ and $U_{B}:\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{E_{B}}\rightarrow\mathcal{H}_{\tilde{B}}\cong\mathcal{H}_{B}\otimes\mathcal{H}_{B^{\prime\prime}}$ are arbitrary unitaries and $|\phi\rangle_{E_{A}}$ and $|\psi\rangle_{E_{B}}$ are arbitrary pure states. If $\rho_{\tilde{A}B|C}$ is the corresponding Choi state, then the Tsallis conditional mutual information evaluated on the state $\tau_{\tilde{A}BC^{*}}=\frac{1}{d_{C}}\rho_{\tilde{A}B|C}\in\mathcal{S}(\mathcal{H}_{\tilde{A}}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C^{*}})$ satisfies

$$I_{q}(\tilde{A}:B|C^{*})_{\tau}\leq f(q,d_{A^{\prime}},d_{B^{\prime}}).$$

Proof.

Since $I_{q}(AA^{\prime\prime}:BB^{\prime\prime}|C^{*})_{\tau}=I_{q}(AA^{\prime\prime}:B|C^{*})_{\tau}+I_{q}(AA^{\prime\prime}:B^{\prime\prime}|BC^{*})_{\tau}$ , the purity of $\tau_{\tilde{A}\tilde{B}C^{*}}=\tau_{AA^{\prime\prime}BB^{\prime\prime}C^{*}}$ and Lemma 10 imply that

$$I_{q}(AA^{\prime\prime}:B|C^{*})_{\tau}\leq I_{q}(AA^{\prime\prime}:BB^{\prime\prime}|C^{*})_{\tau},$$

or (equivalently) in more concise notation,

$$I_{q}(\tilde{A}:B|C^{*})_{\tau}\leq I_{q}(\tilde{A}:\tilde{B}|C^{*})_{\tau}.$$

Finally, using Lemma 9 we obtain the required result. ∎

Now, for Equation (29) to hold, we do not necessarily need strong subadditivity. Even if $I_{q}(A^{\prime\prime}:B|AC^{*})_{\tau}\geq 0$ does not hold, Equation (29) would still hold if $I_{q}(AA^{\prime\prime}:B^{\prime\prime}|BC^{*})_{\tau}+I_{q}(A^{\prime\prime}:B|AC^{*})_{\tau}\geq 0$. This motivates the following conjecture.

Conjecture 1.

Let $\mathcal{E}_{\textsc{iii}}:\mathcal{S}(\mathcal{H}_{C})\rightarrow\mathcal{S}(\mathcal{H}_{A}\otimes\mathcal{H}_{B})$ be a quantum channel of the form

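$$\mathcal{E}_{\textsc{iii}}(\cdot)=\operatorname{Tr}_{A^{\prime\prime}B^{\prime\prime}}\Big[(U_{A}\otimes U_{B})\big[|\phi\rangle\langle\phi|_{E_{A}}\otimes U^{\prime}(\cdot)U^{\prime\dagger}\otimes|\psi\rangle\langle\psi|_{E_{B}}\big](U_{A}\otimes U_{B})^{\dagger}\Big],$$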

where $U^{\prime}:\mathcal{H}_{C}\rightarrow\mathcal{H}_{A^{\prime}}\otimes\mathcal{H}_{B^{\prime}}$ , $U_{A}:\mathcal{H}_{E_{A}}\otimes\mathcal{H}_{A^{\prime}}\rightarrow\mathcal{H}_{A}\otimes\mathcal{H}_{A^{\prime\prime}}$ and $U_{B}:\mathcal{H}_{B^{\prime}}\otimes\mathcal{H}_{E_{B}}\rightarrow\mathcal{H}_{B}\otimes\mathcal{H}_{B^{\prime\prime}}$ are arbitrary unitaries and $|\phi\rangle_{E_{A}}$ and $|\psi\rangle_{E_{B}}$ are arbitrary pure states. If $\rho_{AB|C}$ is the corresponding Choi state, then the Tsallis conditional mutual information evaluated on the state $\tau_{ABC^{*}}=\frac{1}{d_{C}}\rho_{AB|C}\in\mathcal{S}(\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C^{*}})$ satisfies

$$I_{q}(A:B|C^{*})_{\tau}\leq f(q,d_{A^{\prime}},d_{B^{\prime}}).$$

Notice that in Corollary 11 and Conjecture 1, the bounds are functions of $d_{A^{\prime}}$ and $d_{B^{\prime}}$ and not of the dimensions of the systems $A$ and $B$ (those in the quantity on the left hand side). In the case that $d_{A}\geq d_{A^{\prime}}$ and $d_{B}\geq d_{B^{\prime}}$, the fact that $f(q,d_{A},d_{B})$ is a strictly increasing function of $d_{A}$ and $d_{B}$ $\forall q\geq 0$ allows us to write $I_{q}(\tilde{A}:B|C^{*})_{\tau}\leq f(q,d_{\tilde{A}},d_{B})$ and $I_{q}(A:B|C^{*})_{\tau}\leq f(q,d_{A},d_{B})$ under the conditions of Corollary 11 and Conjecture 1 respectively. If instead $d_{A}\leq d_{A^{\prime}}$ and/or $d_{B}\leq d_{B^{\prime}}$, the bounds $f(q,d_{\tilde{A}},d_{B})$ and $f(q,d_{A},d_{B})$ are tighter than the bound $f(q,d_{A^{\prime}},d_{B^{\prime}})$ and so are not implied by it. Nevertheless, based on the several examples that we have checked, we further conjecture the following.

Conjecture 2.

Under the same conditions as Conjecture 1

$$I_{q}(A:B|C^{*})_{\tau}\leq f(q,d_{A},d_{B}).$$

Further, it is shown in Allen2017 that if $C$ is a complete common cause of $A$ and $B$ then the corresponding Choi state, $\rho_{AB|C}$, decomposes as $\rho_{AB|C}=(\rho_{A|C}\otimes\mathds{1}_{B})(\mathds{1}_{A}\otimes\rho_{B|C})$ or $\rho_{AB|C}=\rho_{A|C}\rho_{B|C}$, in analogy with the classical case where if a classical random variable $Z$ is a common cause of the random variables $X$ and $Y$, then the joint distribution over these variables factorises as $p_{XY|Z}=p_{X|Z}p_{Y|Z}$. Then we have that $\tau_{ABC^{*}}=\frac{1}{d_{C}}\rho_{AB|C}=\frac{1}{d_{C}}\rho_{A|C}\rho_{B|C}$. By further analogy with the classical results of Section IV, one may also consider instead a state of the form $\hat{\sigma}_{ABCC^{*}}=\sigma_{C}\otimes\frac{1}{d_{C}}\rho_{A|C}\rho_{B|C}=\sigma_{C}\otimes\tau_{ABC^{*}}$, where $\sigma_{C}\in\mathcal{S}(\mathcal{H}_{C})$ (this is the analogue of the statement $p_{ABC}=p_{C}p_{A|C}p_{B|C}$ for probability distributions). Note that $\hat{\sigma}_{ABCC^{*}}$ is a valid density operator on $\mathcal{H}_{A}\otimes\mathcal{H}_{B}\otimes\mathcal{H}_{C}\otimes\mathcal{H}_{C^{*}}$.

Lemma 12.

The state $\hat{\sigma}_{ABCC^{*}}=\sigma_{C}\otimes\tau_{ABC^{*}}$ defined above satisfies

$$I_{q}(A:B|CC^{*})_{\hat{\sigma}}\leq f(q,d_{A},d_{B})$$

whenever $I_{q}(A:B|C^{*})_{\tau}\leq f(q,d_{A},d_{B})$ holds for the state $\tau_{ABC^{*}}=\frac{1}{d_{C}}\rho_{AB|C}$, where $\rho_{AB|C}$ represents the quantum channel from $C$ to $A$ and $B$ and $\sigma_{C}$ is the input quantum state to this channel.

Proof.

Since $\hat{\sigma}$ is a product state between the $C$ and $ABC^{*}$ subsystems, by the pseudo-additivity of quantum Tsallis entropies and the chain rule we have

$$I_{q}(A:B|CC^{*})_{\hat{\sigma}}=\operatorname{Tr}(\sigma_{C}^{q})\,I_{q}(A:B|C^{*})_{\tau}.$$

Now let $p_{c}$ be the distribution whose entries are the eigenvalues of $\sigma_{C}$ . We have $\operatorname{Tr}(\sigma_{C}^{q})=\sum_{c}p_{c}^{q}$ . Thus if $q>1$ , $\sum_{c}p_{c}^{q}\leq 1$ with equality if and only if $p_{c}=1$ for some value of $c$ . It follows that

[TABLE]

Therefore, if $I_{q}(A:B|C^{*})_{\tau}\leq f(q,d_{A},d_{B})$ , we also have $I_{q}(A:B|CC^{*})_{\hat{\sigma}}\leq f(q,d_{A},d_{B})$ . ∎
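The effect of pseudo-additivity in this proof can be illustrated numerically. The sketch below assumes that the relation used is $I_{q}(A:B|CC^{*})_{\hat{\sigma}}=\operatorname{Tr}(\sigma_{C}^{q})\,I_{q}(A:B|C^{*})_{\tau}$, which follows from $S_{q}(\sigma\otimes\omega)=S_{q}(\sigma)+\operatorname{Tr}(\sigma^{q})S_{q}(\omega)$ applied to each entropy in the conditional mutual information. For simplicity $\tau$ is a generic random state on $ABC^{*}$ rather than a normalised Choi state, since the identity only uses the product structure; all helper names and dimensions are hypothetical.

```python
import numpy as np

def tsallis_entropy(rho, q):
    """S_q(rho) = (1 - Tr rho^q) / (q - 1); von Neumann entropy in the limit q -> 1."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # zero eigenvalues do not contribute
    if np.isclose(q, 1.0):
        return float(-np.sum(evals * np.log(evals)))
    return float((1.0 - np.sum(evals ** q)) / (q - 1.0))

def random_state(d, rng):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def partial_trace(rho, dims, keep):
    """Reduced state on the subsystems listed in ascending order in `keep`."""
    n = len(dims)
    t = rho.reshape(list(dims) + list(dims))
    # Trace out unwanted subsystems from the highest index down, so that
    # the remaining row axes keep their positions.
    for ax in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=ax, axis2=ax + t.ndim // 2)
    d = int(np.prod([dims[ax] for ax in keep]))
    return t.reshape(d, d)

def cmi(rho, dims, a, b, cond, q):
    """I_q(A:B|C) = S_q(AC) + S_q(BC) - S_q(ABC) - S_q(C) for index lists a, b, cond."""
    s = lambda keep: tsallis_entropy(partial_trace(rho, dims, sorted(keep)), q)
    return s(a + cond) + s(b + cond) - s(a + b + cond) - s(cond)

rng = np.random.default_rng(4)
q = 2.0
sigma = random_state(2, rng)          # input state sigma_C
tau = random_state(8, rng)            # generic state on (A, B, C*), each of dimension 2
sigma_hat = np.kron(sigma, tau)       # subsystem order (C, A, B, C*)

cmi_hat = cmi(sigma_hat, [2, 2, 2, 2], [1], [2], [0, 3], q)
cmi_tau = cmi(tau, [2, 2, 2], [0], [1], [2], q)
factor = float(np.sum(np.linalg.eigvalsh(sigma) ** q))  # Tr(sigma_C^q)
print(cmi_hat, factor * cmi_tau)
```

Since $\operatorname{Tr}(\sigma_{C}^{q})\leq 1$ for $q>1$, the prefactor can only shrink the magnitude of $I_{q}(A:B|C^{*})_{\tau}$, which is the mechanism the proof relies on.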

Bibliography


(1) Bell, J. S. Speakable and unspeakable in quantum mechanics (Cambridge University Press, 1987). https://doi.org/10.1017/CBO9780511815676 .
(2) Ekert, A. K. Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. 67 , 661–663 (1991). https://link.aps.org/doi/10.1103/PhysRevLett.67.661 .
(3) Mayers, D. & Yao, A. Quantum cryptography with imperfect apparatus. In Proceedings of the 39th Annual Symposium on Foundations of Computer Science (FOCS-98) , 503–509 (IEEE Computer Society, Los Alamitos, CA, USA, 1998). http://doi.ieeecomputersociety.org/10.1109/SFCS.1998.743501 .
(4) Barrett, J., Hardy, L. & Kent, A. No signalling and quantum key distribution. Phys. Rev. Lett. 95 , 010503 (2005). https://doi.org/10.1103/PhysRevLett.95.010503 .
(5) Colbeck, R. Quantum and relativistic protocols for secure multi-party computation. PhD Dissertation, University of Cambridge (2007). https://arxiv.org/abs/0911.3814 .
(6) Colbeck, R. & Kent, A. Private randomness expansion with untrusted devices. Journal of Physics A 44 , 095305 (2011). https://iopscience.iop.org/article/10.1088/1751-8113/44/9/095305 .
(7) Pironio, S. et al. Random numbers certified by Bell’s theorem. Nature 464 , 1021–4 (2010). https://www.nature.com/articles/nature09008 .
(8) Pitowski, I. Quantum probability – quantum logic. Springer-Verlag Berlin Heidelberg 321 (1989). https://www.springer.com/gp/book/9783662137352 .
