Interventionist Counterfactuals on Causal Teams

Fausto Barbero (University of Helsinki); Gabriel Sandu (University of Helsinki)

arXiv:1901.00593·cs.LO·January 4, 2019·CREST

TL;DR

This paper extends team semantics to include causal reasoning with interventions, providing a unified logical framework for both observational and interventionist causal models, accommodating various structural assumptions.

Contribution

It introduces causal teams with invariance under interventions, formal languages for causal discourse, and a unified logical approach to different causal models and notions of causation.

Findings

1. Unified treatment of observational and causal aspects

2. Formal languages for deterministic and probabilistic causality

3. Framework captures various structural assumptions

Equations

$Y:=f_Y(X_1,\dots,X_n)$

$T\models=(\mathbf{X};Y)\iff\text{for all }s,s^{\prime}\in T,\text{ if }s(\mathbf{X})=s^{\prime}(\mathbf{X})\text{ then }s(Y)=s^{\prime}(Y)$

$(X=1\land Y=1)\supset(X=0\hskip 2.0pt\Box\rightarrow Y=0).$

$T\models\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\psi\iff T_{\mathbf{X}=\mathbf{x}}\models\psi.$

$Y=y\mid Y\neq y\mid=(\mathbf{X};Y)\mid\psi\land\chi\mid\psi\lor\chi\mid\theta\supset\chi\mid\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\chi$

$\text{(S-EM)}:\text{ For any team }T,\ T\models\psi\text{ or }T\models\neg\psi$

$\text{(S-CEM)}:\text{ For every causal team }T,\ T\models\theta\hskip 2.0pt\Box\rightarrow\chi\text{ or }T\models\theta\hskip 2.0pt\Box\rightarrow\neg\chi$

$\text{(IMP)}:\frac{\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow(\mathbf{Y}=\mathbf{y}\hskip 2.0pt\Box\rightarrow\chi)}{(\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y})\hskip 2.0pt\Box\rightarrow\chi}\qquad\text{(EXP)}:\frac{(\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y})\hskip 2.0pt\Box\rightarrow\chi}{\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow(\mathbf{Y}=\mathbf{y}\hskip 2.0pt\Box\rightarrow\chi)}$

$\text{(PERM)}:\frac{\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow(\mathbf{Y}=\mathbf{y}\hskip 2.0pt\Box\rightarrow\chi)}{\mathbf{Y}=\mathbf{y}\hskip 2.0pt\Box\rightarrow(\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\chi)}$

$\text{(CF-OUT)}:\frac{\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow(\mathbf{Y}=\mathbf{y}\hskip 2.0pt\Box\rightarrow\psi)}{(\mathbf{X}^{\prime}=\mathbf{x}^{\prime}\land\mathbf{Y}=\mathbf{y})\hskip 2.0pt\Box\rightarrow\psi};$

$\text{(SEL-OUT)}:\frac{\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow(\psi\supset\chi)}{(\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\psi)\supset(\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\chi)}$

$\sim\alpha\mid Pr(\chi)\leq\epsilon\mid Pr(\chi)\geq\epsilon\mid Pr(\chi)\leq Pr(\theta)\mid Pr(\chi)\geq Pr(\theta)$

$\alpha\mid\psi\land\chi\mid\psi\lor\chi\mid\psi\sqcup\chi\mid\theta\supset\psi\mid\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\psi$

$Pr_T(\chi):=\frac{card(\{s\in T^{-}\mid\{s\}\models\chi\})}{card(T^{-})}.$

$T\models\chi_1\supset Pr(\chi_2)\leq\epsilon\iff T^{\chi_1}\models Pr(\chi_2)\leq\epsilon\iff\frac{card(\{s\in(T^{\chi_1})^{-}\mid\{s\}\models\chi_2\})}{card((T^{\chi_1})^{-})}\leq\epsilon$

$\iff\frac{card(\{s\in(T^{\chi_1})^{-}\mid\{s\}\models\chi_2\})}{card(T^{-})}\cdot\frac{card(T^{-})}{card((T^{\chi_1})^{-})}\leq\epsilon\iff\frac{Pr_T(\chi_1\land\chi_2)}{Pr_T(\chi_1)}\leq\epsilon,$

$T\models\bigsqcup_{x\neq x^{\prime},\,y\neq y^{\prime},\,z}[(Fix(z)\land X=x)\hskip 2.0pt\Box\rightarrow Y=y]\land[(Fix(z)\land X=x^{\prime})\hskip 2.0pt\Box\rightarrow Y=y^{\prime}].$

$T\models\bigsqcup_{x\neq x^{\prime},\,y,\,z}[(Fix(z)\land X=x)\hskip 2.0pt\Box\rightarrow Pr(Y=y)=0]\land[(Fix(z)\land X=x^{\prime})\hskip 2.0pt\Box\rightarrow Pr(Y=y)=1].$

$T\models\bigsqcup_{x\neq x^{\prime},\,y\neq y^{\prime},\,w}Fix^{\prime}(w)\hskip 2.0pt\Box\rightarrow[(X=x\hskip 2.0pt\Box\rightarrow Y=y)\land(X=x^{\prime}\hskip 2.0pt\Box\rightarrow Y=y^{\prime})].$

Full text

Interventionist Counterfactuals on Causal Teams

Fausto Barbero and Gabriel Sandu (University of Helsinki)

Abstract

We introduce an extension of team semantics ([14], [23]) which provides a framework for the logic of manipulationist theories of causation based on structural equation models, such as Woodward’s ([26]) and Pearl’s ([19]); our causal teams incorporate (partial or total) information about functional dependencies that are invariant under interventions. We give a unified treatment of observational and causal aspects of causal models by isolating two operators on causal teams which correspond, respectively, to conditioning and to interventionist counterfactual implication. We then introduce formal languages for deterministic and probabilistic causal discourse, and show how various notions of cause (e.g. direct and total causes) may be defined in them.

Through the tuning of various constraints on structural equations (recursivity, existence and uniqueness of solutions, full or partial definition of the functions), our framework can capture different causal models. We give an overview of the inferential aspects of the recursive, fully defined case; and we dedicate some attention to the recursive, partially defined case, which involves a shift of attention towards nonclassical truth values.

1 Introduction

Some modern accounts of causation, most eminently the framework of D. Lewis ([17]), link the notion of causality to that of counterfactual dependence. Recent approaches to the manipulationist analysis of causality (Pearl [19], Spirtes, Glymour & Scheines [21], Woodward [26], Hitchcock [13]) focus on counterfactuals whose antecedents express interventions, the key idea being that a cause can be intervened upon to determine its effect. Such theories articulate the analysis of the notion of intervention using the so-called structural equation models ([21], [19]); they will be our main concern in the present paper. Our goal is to show how the notions of counterfactual and causal dependence that arise from the manipulationist theories of causation can be expressed and incorporated in the logical framework provided by team semantics. In section 2, we briefly review the basics of structural equation modeling. In section 3, we review the notion of a team and show how to integrate it with causal structure. Sections 4 to 6 define the (causal) team semantics for (deterministic) atomic formulas, connectives, and operators corresponding to evidential and counterfactual reasoning. Section 7 briefly explores the properties of this language when evaluated in the context of recursive, fully defined systems; sections 8 and 9 sketch ideas for going beyond these restrictions. In section 10 we show how to enrich the languages with probabilistic statements. Finally, as an example, we show that our languages are adequate for expressing the notions of direct and total causation from Woodward ([26]). The reader can find a more extensive treatment of our subject (including the omitted proofs) in the preprint [2].

2 Structural equation models

The most basic objects in the structural equation modeling approach are variables, which we will denote with capital letters $X,Y,\dots$ . Each variable $V$ can assume values (typically denoted as $v,v^{\prime},v^{\prime\prime},\dots$ ) within a certain range of objects, $Ran(V)$ . Variables are related to each other by structural equations, for example

$Y:=f_Y(X_1,\dots,X_n)$

stating that $Y$ is determined as a function of $X_1,\dots,X_n$ . The use of the symbol $:=$ instead of an equality is to emphasize that the equation should be thought of as non-reversible. (Footnote 1: A structural equation is nothing else than a shorthand for a set of counterfactuals, to be taken as assumptions ([13]).) The set of arguments of the function $f_Y$ , that is $\{X_1,\dots,X_n\}$ , is usually denoted as $PA_Y$ (the set of parents of $Y$ ; $Y$ is a child of each of the $X_i$ ). For other sets or sequences of variables, we will follow a different notational convention:

Notation 2.1.

• We use boldface letters such as $\mathbf{X}$ to denote either a set $\{X_1,\dots,X_n\}$ of variables or a sequence of the same variables (in a fixed alphabetical order).

• We use $\mathbf{x}$ to denote a set or sequence of values, each of which is a value for exactly one of the variables in $\mathbf{X}$ . We leave the details of these correspondences between variables and values as non-formalized.

• $Ran(\mathbf{X})$ is an abbreviation for $\prod_{X\in\mathbf{X}}Ran(X)$ .

A structural equation model may contain an explicit description of the function $f$ (fully defined case) or not (partially defined case). In both cases, the structural equations determine a pattern of dependencies between variables, which can be represented as a graph (one arrow from each parent $X_i$ to the child $Y$ ).

An intervention $do(X=x)$ can be thought of as the act of replacing the equation for $X$ with a constant equation $X:=x$ . Correspondingly, all the arrows coming into $X$ are removed from the graph. Importantly, all the other structural equations are left untouched by the intervention. This aspect of the system of structural equations, called invariance (or modularity), will be crucial in our developments.

A structural equation model is typically further enriched, in the literature, with an assignment of values to the exogenous variables (deterministic case), or with a joint probability distribution over the exogenous variables (semi-deterministic case). If the graph underlying the model is acyclic, this assignment or probability distribution can be canonically extended to the whole variable domain. At this stage it becomes possible to evaluate counterfactual statements over the model: for example, $X=x\hskip 2.0pt\Box\rightarrow\psi$ holds under the current assignment/probability distribution if $\psi$ holds after the intervention $do(X=x)$ .

3 Causal teams

Team semantics was introduced by Hodges ([14]) to provide a compositional presentation of the (game-theoretically defined) semantics of Independence-Friendly logic ([12],[18]). In the subsequent years, team semantics has been used to extend first-order logic by database dependencies (e.g. [23], [10], [8]); and to enrich propositional logics (e.g. [27]) and modal logics ([24]). Appropriate generalizations have been used as descriptive languages for probabilistic dependencies ([6]), quantum phenomena ([15]), Bayes nets ([5]).

The basic idea of team semantics is that notions such as dependence and independence, which express properties of relations, cannot be captured by Tarskian semantics, which evaluates formulas on single assignments (Footnote 2: This can be formally proved, see [4].); the appropriate unit for semantical evaluation is instead the team, i.e., a set of assignments (all sharing a common variable domain). In our context, an assignment can be thought of as a way to encode a possible configuration for the values of variables; once a set $Dom$ of variables is fixed, each assignment will be a function $s:Dom\rightarrow\bigcup_{X\in Dom}Ran(X)$ such that $s(X)\in Ran(X)$ for each $X\in Dom$ (in the statistical literature, $s$ would be called an individual). A team $T$ of domain $dom(T)=Dom$ is a set of such assignments.

A significant example of a property that can be satisfied by a team is functional dependence (among variables). The formula $=({\mathbf{X}};{Y})$ , called a dependence atom, has the intended meaning: the values of the variable $Y$ are functionally determined by the values of the variables in $\mathbf{X}$ . Its truth in a team $T$ is defined by the following clause:

$T\models=({\mathbf{X}};{Y})\iff\text{for all }s,s^{\prime}\in T,\text{ if }s(\mathbf{X})=s^{\prime}(\mathbf{X})\text{ then }s(Y)=s^{\prime}(Y)$

where $s(\mathbf{X})=s^{\prime}(\mathbf{X})$ is an abbreviation for “ $s(X_1)=s^{\prime}(X_1)$ and … and $s(X_n)=s^{\prime}(X_n)$ ”.
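The clause for dependence atoms can be checked mechanically on a finite team. The following is a minimal sketch, assuming a hypothetical encoding in which each assignment is a Python dictionary from variable names to values; the function name `satisfies_dependence` and the toy team are illustrative and do not come from the paper.

```python
from typing import Dict, Hashable, Iterable, List

# Hypothetical encoding: an assignment maps variable names to values.
Assignment = Dict[str, Hashable]


def satisfies_dependence(team: List[Assignment], xs: Iterable[str], y: str) -> bool:
    """Return True iff the team satisfies the dependence atom =(xs; y)."""
    xs = tuple(xs)
    seen: Dict[tuple, Hashable] = {}
    for s in team:
        key = tuple(s[x] for x in xs)
        # Two assignments agreeing on xs must also agree on y.
        if key in seen and seen[key] != s[y]:
            return False
        seen[key] = s[y]
    return True


# Illustrative team (an assumption, chosen only for this sketch).
team = [
    {"U": 1, "X": 3, "Y": 3},
    {"U": 1, "X": 4, "Y": 1},
    {"U": 2, "X": 1, "Y": 2},
    {"U": 3, "X": 1, "Y": 2},
]

x_determines_y = satisfies_dependence(team, ["X"], "Y")
u_determines_y = satisfies_dependence(team, ["U"], "Y")
```

In this toy team $X$ determines $Y$, while $U$ does not, because two assignments agree on $U$ and differ on $Y$.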

Teams turned out to be a very useful framework for describing data-driven correlations. But they are not sufficient, by themselves, to handle causal dependencies. The latter require that the functional correlations be robust, i.e. invariant under interventions. We thus need to enrich teams with a set of functions, the invariant functions, which are the carriers of causal dependencies (Footnote 3: The invariant functions will uniquely associate a set of structural equations to the enriched team.); and we need to formulate the notion of intervention. We now move to technicalities.

Given a team $T^{-}$ and $X\in dom(T^{-})$ , we write $T^{-}(X)$ for the set of values that are obtained for $X$ in the team $T^{-}$ : $T^{-}(X):=\{s(X)|s\in T^{-}\}$ . As before, we say that $T^{-}$ satisfies a dependence atom $=({\mathbf{X}};{Y})$ , and we write $T^{-}\models=({\mathbf{X}};{Y})$ , if, for all $s,s^{\prime}\in T^{-}$ , $s(\mathbf{X})=s^{\prime}(\mathbf{X})$ implies $s(Y)=s^{\prime}(Y)$ .

Def 3.1.

A causal team $T$ over variable domain $dom(T)$ with endogenous variables $\mathbf{V}\subseteq dom(T)$ is a quadruple $T=(T^{-},G(T),\mathcal{R}_T,\mathcal{F}_T)$ , where:

1. $T^{-}$ is a team.

2. $G(T)=(dom(T),E)$ is a graph over the set of variables. For any $X\in dom(T)$ , we denote as $PA_X$ the set of all variables $Y\in dom(T)$ such that the arrow $(Y,X)$ is in $E$ .

3. $\mathcal{R}_T=\{(X,Ran(X))|X\in dom(T)\}$ (where the $Ran(X)$ may be arbitrary sets) is a function which assigns a range to each variable.

4. $\mathcal{F}_T$ is a function $\{(V_i,f_{V_i})|V_i\in\mathbf{V}\}$ that assigns to each endogenous variable a $|PA_{V_i}|$ -ary function $f_{V_i}:dom(f_{V_i})\rightarrow Ran(V_i)$ (for some $dom(f_{V_i})\subseteq Ran(PA_{V_i})$ )

which satisfies the further restrictions:

a) $T^{-}(X)\subseteq Ran(X)$ for each $X\in dom(T)$

b) If $PA_Y=\{X_1,\dots,X_n\}$ , then $T^{-}\models=({X_1,\dots,X_n};{Y})$

c) if $s\in T^{-}$ is such that $s(PA_Y)\in dom(f_Y)$ , then $s(Y)=f_Y(s(PA_Y))$ .

In case $dom(f_V)=Ran(PA_V)$ for each $V\in\mathbf{V}$ , we say the causal team is fully defined; otherwise it is partially defined. If the graph $G(T)$ is acyclic, we say $T$ is recursive; otherwise nonrecursive.

We will assume for the rest of the paper that $dom(T)$ , and therefore $G(T)$ , is finite.

Clause b) ensures that the team component $T^{-}$ satisfies (at least) the dependencies encoded in the graph $G(T)$ . Clause c) further ensures that the team component is consistent with the invariant functions encoded in $\mathcal{F}_T$ . The graph $G(T)$ is induced (via b) and c)) by the set of functional dependencies specified by clause 4, and provides a distinction between endogenous variables, that are determined by one of these invariant dependencies, and exogenous variables (those in $dom(T)\setminus\mathbf{V}$ ), that are not.

Example 3.2.

Consider a causal team $T$ with underlying team $T^{-}=\{\{(U,2),(X,1),(Y,2),(Z,4)\},\{(U,3),(X,1),(Y,2),(Z,4)\},\{(U,1),(X,3),(Y,3),(Z,1)\},\{(U,1),(X,4),(Y,1),(Z,1)\},\{(U,4),(X,4),(Y,1),(Z,1)\}\}$ , graph $G(T)=(\{U,X,Y,Z\},\{(U,Z),(X,Y),(X,Z),(Y,Z)\})$ , ranges $Ran(U)=Ran(X)=Ran(Y)=Ran(Z)=\{1,2,3,4\}$ , and partial description of (one value of) the invariant function for $Z$ : $\mathcal{F}_T(Z)(4,1,2):=3$ . We represent the $T^{-}$ and $G(T)$ components of $T$ by means of a decorated table:

| U | X | Y | Z |
|---|---|---|---|
| 2 | 1 | 2 | 4 |
| 3 | 1 | 2 | 4 |
| 1 | 3 | 3 | 1 |
| 1 | 4 | 1 | 1 |
| 4 | 4 | 1 | 1 |

(arrows of $G(T)$ : $U\rightarrow Z$ , $X\rightarrow Y$ , $X\rightarrow Z$ , $Y\rightarrow Z$ )
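Conditions a), b) and c) of Def 3.1 can be verified directly on the data of Example 3.2. The sketch below makes several assumptions that are not fixed by the text: assignments are dictionaries, partial invariant functions are stored as lookup tables keyed by parent values in alphabetical order, the endogenous variables are taken to be $Y$ and $Z$, and condition b) is checked for endogenous variables only. The names `parents_of` and `is_causal_team` are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple

Assignment = Dict[str, Hashable]


def parents_of(edges: Set[Tuple[str, str]], v: str) -> Tuple[str, ...]:
    """Parents of v, listed in alphabetical order."""
    return tuple(sorted(a for (a, b) in edges if b == v))


def is_causal_team(team: List[Assignment], edges, ranges, funcs) -> bool:
    """Check conditions a), b), c); the keys of funcs are the endogenous variables."""
    # a) every value lies in the range of its variable
    if any(s[v] not in ranges[v] for s in team for v in ranges):
        return False
    for y, table in funcs.items():
        pa = parents_of(edges, y)
        seen = {}
        for s in team:
            key = tuple(s[x] for x in pa)
            # b) the parents functionally determine y within the team
            if seen.setdefault(key, s[y]) != s[y]:
                return False
            # c) the team agrees with the known part of f_y
            if key in table and table[key] != s[y]:
                return False
    return True


# Data of Example 3.2 (columns U, X, Y, Z).
rows = [(2, 1, 2, 4), (3, 1, 2, 4), (1, 3, 3, 1), (1, 4, 1, 1), (4, 4, 1, 1)]
team = [dict(zip("UXYZ", r)) for r in rows]
edges = {("U", "Z"), ("X", "Y"), ("X", "Z"), ("Y", "Z")}
ranges = {v: {1, 2, 3, 4} for v in "UXYZ"}
funcs = {"Y": {}, "Z": {(4, 1, 2): 3}}  # partial tables; Y's table is empty

ok = is_causal_team(team, edges, ranges, funcs)
# A table entry that contradicts the first row violates condition c).
clash = is_causal_team(team, edges, ranges, {"Y": {}, "Z": {(2, 1, 2): 1}})
```

Under these assumptions the data of Example 3.2 passes all three checks, while the altered table for $Z$ is rejected by condition c).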

4 A basic language for causal teams

We need first of all to specify what it means for a causal team to satisfy an atomic formula, and to assign a semantics to connectives. Our language consists of formulas built using the connectives $\land$ and $\lor$ (“tensor” disjunction), dependence atoms, and atomic formulas of the forms $Y=y$ and $Y\neq y$ . The semantic clause for disjunction requires the notion of causal subteam:

Def 4.1.

Given a causal team $T$ , a causal subteam $S$ of $T$ is a causal team with the same domain and the same set of endogenous variables, which satisfies: 1) $S^{-}\subseteq T^{-}$ , 2) $\mathcal{R}_S=\mathcal{R}_T$ , 3) $G(S)=G(T)$ , 4) $\mathcal{F}_S=\mathcal{F}_T$ . (Footnote 4: Alternatively, one might consider enriching the component $\mathcal{F}_S$ with the information about invariant functions which is lost in passing from the team to the subteam.)

The semantic clause for dependence atoms was given above. The other clauses are:

• $T\models Y=y$ (resp. $T\models Y\neq y$ ) if, for all $s\in T^{-}$ , $s(Y)=y$ (resp. $s(Y)\neq y$ ).

• $T\models\psi\land\chi$ if $T\models\psi$ and $T\models\chi$ .

• $T\models\psi\lor\chi$ if there are causal subteams $T_1,T_2$ of $T$ s.t. $T_1^{-}\cup T_2^{-}=T^{-}$ , $T_1\models\psi$ and $T_2\models\chi$ . (Footnote 5: Notice that it might be impossible to define consistently the union of two causal teams.)

5 Selective implication

Our main goal is to give an exact semantics to counterfactual statements of the form “If $\psi$ had been the case, then $\chi$ would have been the case”. Very often, however, one finds examples in the literature where these statements are embedded into a larger context. Pearl ([19]) analyzes the following query: “what is the probability $Q$ that a subject who died under treatment $(X=1,Y=1)$ would have recovered $(Y=0)$ had he or she not been treated $(X=0)$ ?”

The representation of the statement whose probability Pearl is interested in seems to be:

$(X=1\land Y=1)\supset(X=0\hskip 2.0pt\Box\rightarrow Y=0).$

where the symbol $\hskip 2.0pt\Box\rightarrow$ stands for counterfactual implication, while the symbol $\supset$ , called selective implication, denotes a connective which is a generalization of material implication to teams. (Footnote 6: To the best of our knowledge, this connective has been used, with different notation, in [9], as a special case of the maximal implication introduced in [16].) It serves to restrict, in this example, the range of application of the counterfactual to the available evidence. More generally, given a causal team $T$ , and a formula $\psi$ without dependence atoms, define the causal subteam $T^{\psi}$ by the condition $(T^{\psi})^{-}=\{s\in T^{-}|s\models\psi\}$ , where the relationship $\models$ for single assignments is intended as in classical logic: $s\models Z=z$ if $s(Z)=z$ , etc. We define selective implication by the clause:

• $T\models\psi\supset\chi$ iff $T^{\psi}\models\chi$ .

The consequent $\chi$ can be any formula of the current language. Instead, we require the antecedent to be a formula which denotes properties of single assignments. It is straightforward to extend the clause above in order to allow the use of $\supset$ (and the counterfactual $\hskip 2.0pt\Box\rightarrow$ , yet to be defined) in antecedents.

Example 5.1.

The selective implication $Z=3\supset Y=2$ holds on any causal team $T$ which has the table depicted below. In order to see that the formula holds on it, we have to construct the subteam $T^{Z=3}$ :

$T:$

| Z | Y | X |
|---|---|---|
| 1 | 2 | 3 |
| 2 | 1 | 1 |
| 3 | 2 | 1 |
| 3 | 2 | 2 |

$\leadsto$ $T^{Z=3}:$

| Z | Y | X |
|---|---|---|
| 3 | 2 | 1 |
| 3 | 2 | 2 |

which is obtained by selecting the third and fourth row of $T$ (the rows that satisfy $Z=3$ ). Notice that $T^{Z=3}\models Y=2$ ; by the semantical clause, then, $T\models Z=3\supset Y=2$ .
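The computation of Example 5.1 can be replayed in a few lines. This is a minimal sketch, assuming that assignments are dictionaries, that the antecedent is given as a test on single assignments, and that the consequent is given as a test on whole teams; the helper names (`selects`, `all_have`, `selective_implication`) are hypothetical.

```python
from typing import Callable, Dict, Hashable, List

Assignment = Dict[str, Hashable]


def selects(var: str, val: Hashable) -> Callable[[Assignment], bool]:
    """Classical truth of var = val on a single assignment."""
    return lambda s: s[var] == val


def all_have(var: str, val: Hashable) -> Callable[[List[Assignment]], bool]:
    """Team-level truth of the atom var = val."""
    return lambda team: all(s[var] == val for s in team)


def selective_implication(team, antecedent, consequent) -> bool:
    """T |= psi > chi  iff  the subteam of assignments satisfying psi satisfies chi."""
    subteam = [s for s in team if antecedent(s)]
    return consequent(subteam)


# Team of Example 5.1 (columns Z, Y, X).
team = [dict(zip("ZYX", r)) for r in [(1, 2, 3), (2, 1, 1), (3, 2, 1), (3, 2, 2)]]

z3_implies_y2 = selective_implication(team, selects("Z", 3), all_have("Y", 2))
z2_implies_y2 = selective_implication(team, selects("Z", 2), all_have("Y", 2))
```

The first query reproduces the example; the second fails because the only row with $Z=2$ has $Y=1$. An antecedent that selects no rows makes the implication hold vacuously, since the consequent is then evaluated on the empty team.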

6 Intervention

We define an (interventionist) counterfactual implication. Its semantics will be determined by a notion of intervention on a causal team. We may think of a (causal) team as an incomplete description of our knowledge concerning the state of a system: each assignment represents a configuration of values for the variables that we consider possible, even though we do not know which specific assignment encodes the actual state of the system. If we perform an intervention on the system, say $do(X=1)$ , then we know that, whatever the correct assignment is, our intervention is an action that forces the variable $X$ to take the value $1$ , and removes any causal link from other variables to $X$ ; it is then reasonable to apply these changes to the whole team. The change will then propagate to the descendants of $X$ by means of the functions specified by the fourth component of the causal team.

Example 6.1.

Suppose we want to evaluate $X=1\hskip 2.0pt\Box\rightarrow Y=2$ in the causal team of Example 3.2. We need to generate a causal team $T_{X=1}$ which differs from the initial one in that variable $X$ is fixed, in all assignments, to value 1. This will affect all descendants of $X$ (in this case, the children $Y$ and $Z$ ).

| U | X | Y | Z |
|---|---|---|---|
| 2 | 1 | 2 | 4 |
| 3 | 1 | 2 | 4 |
| 1 | 3 | 3 | 1 |
| 1 | 4 | 1 | 1 |
| 4 | 4 | 1 | 1 |

$\leadsto$

| U | X | Y | Z |
|---|---|---|---|
| 2 | 1 | $\dots$ | $\dots$ |
| 3 | 1 | $\dots$ | $\dots$ |
| 1 | 1 | $\dots$ | $\dots$ |
| 4 | 1 | $\dots$ | $\dots$ |

$\leadsto$

| U | X | Y | Z |
|---|---|---|---|
| 2 | 1 | 2 | $\dots$ |
| 3 | 1 | 2 | $\dots$ |
| 1 | 1 | 2 | $\dots$ |
| 4 | 1 | 2 | $\dots$ |

$\leadsto$

| U | X | Y | Z |
|---|---|---|---|
| 2 | 1 | 2 | 4 |
| 3 | 1 | 2 | 4 |
| 1 | 1 | 2 | $\hat{f}_Z(1,1,2)$ |
| 4 | 1 | 2 | 3 |

In the first step, we changed the value of $X$ to $1$ in all rows. Next, the $Y$ column was filled using the fact that, according to the graph, $Y$ is determined by $X$ ; and observing that, in the initial team, rows that have value $1$ for $X$ have value $2$ for $Y$ . Finally, we evaluated $Z$ (which could not have been done until we knew the values for $Y$ ); the procedure is composite. In the first and second row we obtained the value 4 for $Z$ as before, by checking, on the initial team, the rows that assume values $(2,1,2)$ (resp. $(3,1,2)$ ) over $U,X$ and $Y$ . For the fourth row, we made use of the invariant functions: $\mathcal{F}_T(Z)(4,1,2)=3$ . The value that $Z$ should assume in the third row cannot be reconstructed by looking at the initial team $T^{-}$ , nor by using the information stored in $\mathcal{F}_T$ ; this can happen if the team is partially defined. Therefore, we insert a formal term to remind ourselves that the value for $Z$ in this row should be obtained by applying an appropriate function $f_Z(U,X,Y)$ to the triple $(1,1,2)$ (if only we knew what function it is). We wrote $\hat{f}_Z$ as a formal symbol distinguished from the function $f_Z$ . (Footnote 7: More generally, complex terms with composition of many formal function symbols may be generated.) Notice now that we have no uncertainties about the $Y$ column; so, it is natural to state that $T_{X=1}\models Y=2$ , and that, therefore, $T\models X=1\hskip 2.0pt\Box\rightarrow Y=2$ .

One must be careful in working out the details of the algorithm which constitutes an intervention. The order of the updates of the descendants of $X$ turns out not to be trivial, and it is not clear a priori whether the algorithm will terminate. In case the causal team is partially defined, as in the example, there is also the problem that the information encoded in the causal team may turn out to be insufficient for generating, under intervention, a proper causal team, and thereby we must admit teams which assign formal terms to some variables.

We begin by considering the simplest case of recursive, fully defined causal teams. To take care of the order of the updating of variables, we introduce a notion of distance between (sets of) variables.

Def 6.2.

Given a graph $G=(\mathbf{V},E)$ and $\mathbf{X}\subseteq\mathbf{V}$ ,

• We denote as $G_{-\mathbf{X}}=(\mathbf{V},E_{-\mathbf{X}})$ the graph obtained by removing all arrows going into some vertex of $\mathbf{X}$ (i.e., an edge $(V_1,V_2)$ is in $E_{-\mathbf{X}}$ iff it is in $E$ and $V_2\notin\mathbf{X}$ ). Notice that, in the special case that $\mathbf{X}=\{X\}$ , the set of directed paths of $G_{-\mathbf{X}}$ starting from $X$ coincides with the set of directed paths of $G$ starting from $X$ .

• Let $Y\in\mathbf{V}$ . We call (evaluation) distance between $\mathbf{X}$ and $Y$ the value $d_G(\mathbf{X},Y)=\sup\{length(P)\mid P\text{ directed path of }G_{-\mathbf{X}}\text{ going from some }X\in\mathbf{X}\text{ to }Y\}$ . In case no such path exists, $d_G(\mathbf{X},Y):=-1$ . Clearly, if the graph is finite and acyclic, $d_G(\mathbf{X},Y)\in\mathbb{N}\cup\{-1\}$ for any pair $\mathbf{X},Y$ .
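For a finite acyclic graph the evaluation distance is a longest-path computation on the pruned graph. The sketch below is one possible way to compute it, assuming the graph is given as a set of (parent, child) pairs and that the trivial path from a variable to itself counts as having length 0; the function name `evaluation_distance` is hypothetical, and the recursion terminates only on acyclic graphs.

```python
from typing import Dict, Iterable, List, Set, Tuple


def evaluation_distance(edges: Set[Tuple[str, str]], xs: Iterable[str], y: str) -> int:
    """Length of the longest directed path from some member of xs to y in the
    graph with all arrows into xs removed; -1 if there is no such path.
    Assumes the graph is finite and acyclic."""
    xs = set(xs)
    children: Dict[str, List[str]] = {}
    for a, b in edges:
        if b not in xs:  # drop arrows going into xs
            children.setdefault(a, []).append(b)

    def longest(v: str) -> int:
        # Assumption: the trivial path from y to itself has length 0.
        best = 0 if v == y else -1
        for c in children.get(v, []):
            d = longest(c)
            if d >= 0:
                best = max(best, d + 1)
        return best

    return max((longest(x) for x in xs), default=-1)


# Graph of Example 3.2.
edges = {("U", "Z"), ("X", "Y"), ("X", "Z"), ("Y", "Z")}

d_x_y = evaluation_distance(edges, {"X"}, "Y")
d_x_z = evaluation_distance(edges, {"X"}, "Z")
d_x_u = evaluation_distance(edges, {"X"}, "U")
d_xy_z = evaluation_distance(edges, {"X", "Y"}, "Z")
```

On the graph of Example 3.2, $Z$ is at distance 2 from $\{X\}$ because the longest path is $X\rightarrow Y\rightarrow Z$, which is why $Z$ must be updated after $Y$. Intervening on $\{X,Y\}$ cuts the arrow $X\rightarrow Y$, so the distance to $Z$ drops to 1.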

We write $\mathbf{X}=\mathbf{x}$ as an abbreviation for a conjunction of the form $X_1=x_1\land\dots\land X_n=x_n$ . Let $\mathbf{X}=\mathbf{x}$ be a consistent conjunction (that is, it does not contain two conjuncts of the form $X=x$ and $X=x^{\prime}$ , for $x\neq x^{\prime}$ ). Then, applying the algorithm $do(\mathbf{X}=\mathbf{x})$ to a recursive, fully defined causal team $T$ amounts to:

Stage $0$ . Delete all arrows coming into $\mathbf{X}$ , and replace each assignment $s\in T^{-}$ with $s(\mathbf{x}/\mathbf{X})$ . Denote the resulting team as $T_0$ . (Footnote 8: A warning: the teams $T_n$ produced before the last step of the algorithm may fail to form a causal team together with the other components described, because of violations of conditions b) and c) of the definition of causal team.) Replace $\mathcal{F}_T$ with its restriction $\mathcal{F}_T^{\prime}$ to $\mathbf{V}\setminus\mathbf{X}$ .

Stage $n+1$ . If $\{Z_1,\dots,Z_{k_{n+1}}\}$ is the set of all the variables $Z_j$ such that $d_{G(T)}(\mathbf{X},Z_j)=n+1$ , define a team $T_{n+1}$ by replacing each $s\in T_n$ with the assignment $s(f_{Z_1}(s(PA_{Z_1}))/Z_1,\dots,f_{Z_{k_{n+1}}}(s(PA_{Z_{k_{n+1}}}))/Z_{k_{n+1}})$ .

End the procedure after step $\hat{n}=\sup\{d_{G(T)}(\mathbf{X},Z)|Z\in dom(T)\}$ .

In case the intervention $do(\mathbf{X}=\mathbf{x})$ is a terminating algorithm on $T$ , we define the causal team $T_{\mathbf{X}=\mathbf{x}}$ (of endogenous variables $\mathbf{V}\setminus\mathbf{X}$ ) as the quadruple $(T_{\hat{n}},G(T)_{-\mathbf{X}},\mathcal{R}_T,\mathcal{F}_T^{\prime})$ which is produced when $do(\mathbf{X}=\mathbf{x})$ is applied to $T$ . It is straightforward to prove (even in case the causal team has infinite ranges for some variables) that

Theorem 6.3.

If $G(T)$ is a finite acyclic graph, then $T_{\mathbf{X}=\mathbf{x}}$ is well-defined.
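The staged $do$ algorithm for the recursive, fully defined case can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: assignments are dictionaries, each invariant function is a Python callable that reads the parent values from an assignment, the ranges component is omitted, and the structural equations $Y:=X+1$ and $Z:=U+X+Y$ are hypothetical choices made only to obtain a fully defined example. The names `distance` and `intervene` are likewise hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Assignment = Dict[str, Hashable]
Edges = Set[Tuple[str, str]]


def distance(edges: Edges, xs: Set[str], y: str) -> int:
    """Longest directed path from xs to y after cutting arrows into xs (-1 if none)."""
    children: Dict[str, List[str]] = {}
    for a, b in edges:
        if b not in xs:
            children.setdefault(a, []).append(b)

    def longest(v: str) -> int:
        best = 0 if v == y else -1
        for c in children.get(v, []):
            d = longest(c)
            if d >= 0:
                best = max(best, d + 1)
        return best

    return max((longest(x) for x in xs), default=-1)


def intervene(team: List[Assignment], edges: Edges,
              funcs: Dict[str, Callable[[Assignment], Hashable]],
              do: Assignment):
    """Apply do(X = x) to a recursive, fully defined causal team."""
    xs = set(do)
    variables = {v for e in edges for v in e} | xs
    new_edges = {(a, b) for (a, b) in edges if b not in xs}
    new_funcs = {v: f for v, f in funcs.items() if v not in xs}
    # Stage 0: overwrite the intervened variables in every assignment.
    current = [{**s, **do} for s in team]
    dist = {v: distance(edges, xs, v) for v in variables}
    # Stage n: recompute all variables at distance n from the previous stage.
    for n in range(1, max(dist.values(), default=0) + 1):
        layer = [v for v in variables if dist[v] == n]
        current = [{**s, **{z: new_funcs[z](s) for z in layer}} for s in current]
    # A team is a set of assignments, so drop duplicates.
    unique: List[Assignment] = []
    for s in current:
        if s not in unique:
            unique.append(s)
    return unique, new_edges, new_funcs


# Hypothetical fully defined system on the graph of Example 3.2.
edges = {("U", "Z"), ("X", "Y"), ("X", "Z"), ("Y", "Z")}
funcs = {"Y": lambda s: s["X"] + 1, "Z": lambda s: s["U"] + s["X"] + s["Y"]}
team = [{"U": 0, "X": 1, "Y": 2, "Z": 3}, {"U": 1, "X": 2, "Y": 3, "Z": 6}]

new_team, new_edges, new_funcs = intervene(team, edges, funcs, {"X": 0})
y_is_1_after_do = all(s["Y"] == 1 for s in new_team)
```

In this toy system every assignment of the intervened team has $Y=1$, so the atom $Y=1$ holds on $T_{X=0}$. Processing variables layer by layer according to the evaluation distance guarantees that each variable is recomputed only after all of its updated parents.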

Furthermore, our definition of intervention has a kind of internal consistency: applying $do(\mathbf{X}=\mathbf{x})$ is the same as sequentially applying interventions of the form $do(X=x)$ , for each conjunct $X=x$ of $\mathbf{X}=\mathbf{x}$ , in any order. This statement is a special case of the following two results:

Theorem 6.4.

Let $T$ be a recursive causal team, $\mathbf{X},\mathbf{Y}\subseteq dom(T)$ such that $\mathbf{X}\cap\mathbf{Y}=\emptyset$ , $\mathbf{x}\in Ran(\mathbf{X})$ , and $\mathbf{y}\in Ran(\mathbf{Y})$ . Then $T_{\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y}}=(T_{\mathbf{X}=\mathbf{x}})_{\mathbf{Y}=\mathbf{y}}$ .

Theorem 6.5.

Let $T$ be a recursive causal team, $\mathbf{X},\mathbf{Y}\subseteq dom(T)$ such that $\mathbf{X}\cap\mathbf{Y}=\emptyset$ , $\mathbf{x}\in Ran(\mathbf{X})$ , and $\mathbf{y}\in Ran(\mathbf{Y})$ . Then $(T_{\mathbf{X}=\mathbf{x}})_{\mathbf{Y}=\mathbf{y}}=(T_{\mathbf{Y}=\mathbf{y}})_{\mathbf{X}=\mathbf{x}}$ .

The first of these two theorems is proved by a somewhat complex double induction argument on the distances $d(\mathbf{X},Z)$ and $d(\mathbf{Y},Z)$ (for each variable $Z$ ). See [2] for details. The second theorem follows easily from the first: under the hypotheses, Theorem 6.4 entails the equalities $(T_{\mathbf{X}=\mathbf{x}})_{\mathbf{Y}=\mathbf{y}}=T_{\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y}}$ and $(T_{\mathbf{Y}=\mathbf{y}})_{\mathbf{X}=\mathbf{x}}=T_{\mathbf{Y}=\mathbf{y}\land\mathbf{X}=\mathbf{x}}$ ; but since the order of variables is irrelevant in the definition of the $do$ algorithm, we also have $T_{\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y}}=T_{\mathbf{Y}=\mathbf{y}\land\mathbf{X}=\mathbf{x}}$ ; transitivity yields the desired result.

Having defined the intervened team $T_{\mathbf{X}=\mathbf{x}}$ , we are immediately led to a semantical clause for counterfactuals of the form $\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\psi$ :

$T\models\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\psi\iff T_{\mathbf{X}=\mathbf{x}}\models\psi.$

In case the antecedent is inconsistent (i.e., it contains two conjuncts $X_i=x_i,X_i=x_i^{\prime}$ with $x_i\neq x_i^{\prime}$ ), the corresponding intervention is not defined; in this case, we postulate the counterfactual to be true.

7 The logic of recursive, fully defined causal teams

We call the (basic) language of causal dependence, $\mathcal{CD}$ , the language formed by the following rules:

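$Y=y\mid Y\neq y\mid=(\mathbf{X};Y)\mid\psi\land\chi\mid\psi\lor\chi\mid\theta\supset\chi\mid\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\chi$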

for $Y,\mathbf{X}$ variables, $y,\mathbf{x}$ values, $\psi,\chi$ formulae of $\mathcal{CD}$ , $\theta$ formula of $\mathcal{CD}$ without dependence atoms. The semantics for this language, evaluated over recursive fully defined causal teams, is given by the clauses presented in earlier sections. We also call $\mathcal{CO}$ (the causal-observational language) the fragment of $\mathcal{CD}$ which lacks dependence atoms. We consider also an extension $\mathcal{CO}^{neg}$ of $\mathcal{CO}$ with a “dual” negation, whose semantics is defined by:

• $T\models\neg\psi$ iff for all $s\in T^{-}$ , $\{s\}\not\models\psi$ . (Footnote 9: This atypical formulation of dual negation is justified by the flatness of the language $\mathcal{CO}$ , entailed by Theorem 7.3.)

(Here, and in the following, we abuse notation and write $\{s\}$ for the causal subteam $S$ of $T$ whose support $S^{-}$ is the singleton team $\{s\}$ .)

In this section, we will give a short overview of the logical properties of these languages; we refer the reader to the preprint [2] for a more detailed account. First of all, we underline some global properties:

Theorem 7.1.

The logic $\mathcal{CD}$ is downwards closed, that is: if $\varphi\in\mathcal{CD}$ , $T$ is a recursive fully defined causal team, $T^{\prime}$ is a causal subteam of $T$ , and $T\models\varphi$ , then also $T^{\prime}\models\varphi$ . (Footnote 10: This statement (as the next one) holds more generally for fully defined causal teams with at most unique solution, to be introduced in a later section.)

Theorem 7.2.

The logic $\mathcal{CD}$ has the empty team property, that is: for every recursive, fully defined causal team $T$ with support $T^{-}=\emptyset$ , and every $\varphi\in\mathcal{CD}$ with variables in $dom(T)$ , $T\models\varphi$ .

Theorem 7.3.

The logic $\mathcal{CO}^{neg}$ is flat, that is: for every formula $\varphi$ of $\mathcal{CO}^{neg}$ and every recursive, fully defined causal team $T$ , $T\models\varphi$ iff $\{s\}\models\varphi$ for every assignment $s\in T^{-}$ .

This last result shows that our approach is in a sense a “conservative extension” of the structural equation modeling approach: as long as the language is poor enough, the semantics of causal teams can be reduced to that of deterministic structural equation models (which can be identified with causal teams with singleton support). However, in presence of other operators (e.g. dependence atoms, or the probabilistic atoms and boolean disjunction that will be considered in the following sections) the use of causal teams is essential.

The proofs of these three theorems are routine inductions on the syntax of formulas. However, we wish to point out that the third theorem makes an essential use of the following fact: by applying an intervention to a causal team whose support is a singleton set, one obtains again a causal team with singleton support. This is a property which is guaranteed for recursive causal teams, or, more generally, for fully defined causal teams with unique solutions (see next section).

Let us consider some further logical features of our framework. Unlike in the structural equation framework, the stronger variant of the law of excluded middle

$T\models\psi$ or $T\models\neg\psi$

fails. Here is a very simple counterexample to it; the team

[TABLE]

does not satisfy $X=1$ nor its negation $X\neq 1$ . A similar example shows that the following strong form of the law of conditional excluded middle

$T\models\chi\hskip 2.0pt\Box\rightarrow\psi$ or $T\models\chi\hskip 2.0pt\Box\rightarrow\neg\psi$

fails. Within $\mathcal{CO}^{neg}$ , however, the internalized versions of these laws (i.e. the statements that, for all recursive fully defined teams $T$ , $T\models\psi\lor\neg\psi$ , resp. $T\models\chi\hskip 2.0pt\Box\rightarrow(\psi\lor\neg\psi)$ ) are valid, due to flatness.
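The failure of the strong excluded middle can be replayed on a small team. The following is a minimal sketch, assuming the usual team-semantic reading of literals (a literal holds in a team iff it holds in every assignment of the support); the two-row team and the helper names `sat_eq` and `sat_neq` are illustrative assumptions, not taken from the paper.

```python
# A team is modelled as a list of assignments (dicts). The causal structure
# plays no role for atomic formulas, so it is omitted in this sketch.

def sat_eq(team, var, val):
    """T |= X = x: every assignment in the support gives X the value x."""
    return all(s[var] == val for s in team)

def sat_neq(team, var, val):
    """T |= X != x: every assignment in the support gives X a value other than x."""
    return all(s[var] != val for s in team)

# Hypothetical two-row team (the values are assumptions chosen for illustration).
team = [{"X": 1}, {"X": 2}]

# The team as a whole satisfies neither X = 1 nor X != 1 ...
neither = not sat_eq(team, "X", 1) and not sat_neq(team, "X", 1)

# ... although every singleton subteam decides the question, as flatness predicts.
each_row_decides = all(
    sat_eq([s], "X", 1) or sat_neq([s], "X", 1) for s in team
)

print(neither, each_row_decides)
```

The second check is the assignment-wise reading that makes the internalized law $\psi\lor\neg\psi$ valid in $\mathcal{CO}^{neg}$.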

Three laws that are often considered in relation to natural language and Lewis-Stalnaker counterfactuals (see e.g. [20]) are the so-called importation, exportation and permutation laws; there are counterexamples for them in both contexts. Two results mentioned before (Theorems 6.4 and 6.5) provide sufficient conditions for the validity of these laws; their assumptions can be further relaxed as follows: assuming that the conjunction $\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y}$ is consistent, the following rules of inference

[TABLE]

are sound. More generally, the following “overwriting” rule (similar to an axiom discovered in [3]) can be applied also in case $\mathbf{X}=\mathbf{x}\land\mathbf{Y}=\mathbf{y}$ is inconsistent:

[TABLE]

here $\mathbf{X^{\prime}}=\mathbf{x^{\prime}}$ is a conjunction of all the atoms of $\mathbf{X}=\mathbf{x}$ that contain no occurrences of variables from $\mathbf{Y}$ .

Galles & Pearl ([7]) and Halpern ([11]) provide an axiom system for (a case slightly more general than) recursive structural equation models. Their system can be adapted to our language $\mathcal{CO}^{neg}$ using the trick of transforming material implications into rules of inference. The resulting system (see [2]) is sound. However, $\mathcal{CO}^{neg}$ is more general than Halpern’s language in that we allow counterfactuals and selective implications to occur in the consequents of counterfactuals. [Footnote 11: This has important consequences, such as the failure of modus ponens for $\hskip 2.0pt\Box\rightarrow$, and the failure of a version of Lewis’s weak centering axiom. See also [3].] Therefore, in order to obtain a completeness result, we need extra rules in order to extract these kinds of implications from consequents, or, vice versa, in order to insert them into consequents. The elimination and introduction of consequents can be performed by using the overwriting rule CF-OUT and an appropriate inverse, in case this consequent is a counterfactual statement; in case it is a selective implication, one can use the rule

[TABLE]

and its inverse. Similar extraction and introduction rules are available for the connectives $\land$ , $\lor$ and $\neg$ .

8 Interventions on more general classes of causal teams

We consider here the possibility of extending the notion of intervention on a causal team beyond the recursive, fully defined case.

8.1 The (recursive) partially defined case

In the case of a recursive, partially defined team $T$, we first transform $T$ into an appropriate fully defined team $T^{\prime}$, and then apply the algorithm from section 6 to $T^{\prime}$. In order to define $T^{\prime}$, we need first of all to extend the ranges of the variables of $T$ by allowing them to take as values also formal terms, as in example 6.1. We call $L_{G(T)}$ the set of function symbols $\hat{f}_X$ (of arity $card(PA_X)$), for each endogenous variable $X$. We call $G(T)$-terms the terms generated from variables in $dom(T)$ and from symbols in $L_{G(T)}$ by the obvious inductive rules; the set of all $G(T)$-terms will be denoted as $Term_{G(T)}$. We then define the range component of $T^{\prime}$ by: $\mathcal{R}_{T^{\prime}}(X)=\mathcal{R}_T(X)\cup Term_{G(T)}$ for each $X\in G(T)$. [Footnote 12: Actually, only a finite number of formal terms are needed in each intervention. It is therefore possible to give a more restrictive definition which preserves the finiteness of variable ranges.]

Secondly, for $T^{\prime}$ to be fully defined, we need the domains of the invariant functions to coincide with the ranges of the parent variables ($dom(f_X)=Ran(PA_X)$). Therefore, we have to redefine each $\mathcal{F}_T(X)$ component over the whole range $\mathcal{R}_{T^{\prime}}(PA_X)$. Let $pa_X\in\mathcal{R}_{T^{\prime}}(PA_X)$ be a sequence of values for $PA_X$. There are three possible cases: 1) $pa_X\in dom(\mathcal{F}_T(X))$; in this case we keep $\mathcal{F}_{T^{\prime}}(X)(pa_X):=\mathcal{F}_{T}(X)(pa_X)$. 2) $pa_X\notin dom(\mathcal{F}_T(X))$, but there is an assignment $s\in T^{-}$ such that $s(PA_X)=pa_X$; in this case we set $\mathcal{F}_{T^{\prime}}(X)(pa_X):=s(X)$ (i.e., we transfer information from the team component $T^{-}$ to the function component $\mathcal{F}_{T^{\prime}}$). 3) In all remaining cases, we define $\mathcal{F}_{T^{\prime}}(X)(pa_X)$ to be the formal term $\hat{f}_X(pa_X)$. (Cf. example 6.1 for a justification of the three cases.)
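The three cases can be sketched in code. This is an illustration under simplifying assumptions, not the authors' implementation: parent ranges are passed in as explicit finite lists (rather than the full set of $G(T)$-terms), formal terms are modelled as tuples `("f_hat_<var>", pa)`, and the name `complete_function` is hypothetical.

```python
from itertools import product

def complete_function(f_partial, parents, var, support, parent_ranges):
    """Extend a partial invariant function for `var` to every parent tuple.

    f_partial:     dict mapping tuples of parent values to values of `var`
    parents:       list of parent variable names, in a fixed order
    support:       list of assignments (dicts), playing the role of T^-
    parent_ranges: dict parent -> list of values to be covered (assumed finite)
    """
    completed = {}
    for pa in product(*(parent_ranges[p] for p in parents)):
        if pa in f_partial:
            # Case 1: the function is already defined on this tuple.
            completed[pa] = f_partial[pa]
            continue
        witness = next(
            (s for s in support if tuple(s[p] for p in parents) == pa), None
        )
        if witness is not None:
            # Case 2: transfer the value observed in the team.
            completed[pa] = witness[var]
        else:
            # Case 3: no information available, use a formal term.
            completed[pa] = ("f_hat_" + var, pa)
    return completed

# Hypothetical example: Y has the single parent X.
support = [{"X": 1, "Y": 2}, {"X": 2, "Y": 4}]
completed = complete_function(
    f_partial={(1,): 2},
    parents=["X"],
    var="Y",
    support=support,
    parent_ranges={"X": [1, 2, 3]},
)
print(completed)
```

In the example, the value for $X=1$ comes from the partial function, the value for $X=2$ is read off the team, and $X=3$ yields a formal term.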

At this point, the algorithm $do(\mathbf{X}=\mathbf{x})$ described in section 6 can be applied, and it will produce a causal team, some of whose entries may consist of formal terms. In the next section we will sketch some ideas for the usage of these causal teams as semantical objects for formal languages.

8.2 The (fully defined) nonrecursive case

In case a causal team is not recursive (i.e., its graph is cyclic), the algorithm above may well fail to terminate. However, if the causal team is fully defined and satisfies some further constraints, we can still find reasonable (but not necessarily computable) notions of intervention. One such constraint was isolated by Galles & Pearl ([7]): they considered the case of systems of structural equations with unique solutions, defined as follows: 1) for fixed values of the exogenous variables, the system has a unique solution, and 2) each “intervened” system of equations obtained from the initial one by replacing some equations of the form $X:=f_X(PA_X)$ with constant equations $X:=x$ still has a unique solution for each choice of values for the exogenous variables. Since causal teams encode in an obvious way a system of modifiable structural equations, we can as well define causal teams with unique solutions. In this case, the natural way to define an intervention $do(X=x)$ on the team is to replace each assignment $s\in T^{-}$ with the (unique) assignment $t$ which encodes the solution of the intervened system for the choice $s(\mathbf{U})$ of values for the exogenous variables. [Footnote 13: In case the intervention acts also on some of the exogenous variables, this idea should be modified in an obvious way.] The definition of the other components of the causal team produced by the intervention is straightforward. We do not expect any significant differences in the logical features of (fully defined) nonrecursive causal teams with unique solutions in comparison to their recursive relatives.
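For finite ranges, the unique-solution intervention can be sketched by brute-force search. Everything below is an illustrative assumption: the function names `solutions` and `do`, the representation of equations as Python callables, and the cyclic example system over the integers modulo 3 are not from the paper.

```python
from itertools import product

def solutions(equations, ranges, fixed):
    """All assignments that agree with `fixed` and satisfy every remaining equation.

    equations: dict endogenous var -> function(assignment) -> value
    ranges:    dict var -> list of possible values (assumed finite)
    fixed:     dict var -> value (exogenous values and intervened variables)
    """
    free = [v for v in ranges if v not in fixed]
    found = []
    for values in product(*(ranges[v] for v in free)):
        s = dict(fixed, **dict(zip(free, values)))
        # Equations of intervened variables are dropped (replaced by constants).
        if all(s[v] == f(s) for v, f in equations.items() if v not in fixed):
            found.append(s)
    return found

def do(support, equations, ranges, exogenous, intervention):
    """Replace each s by the unique solution of the intervened system for s(U)."""
    result = []
    for s in support:
        fixed = {u: s[u] for u in exogenous}
        fixed.update(intervention)
        sols = solutions(equations, ranges, fixed)
        if len(sols) != 1:
            raise ValueError("the intervened system has no unique solution")
        if sols[0] not in result:
            result.append(sols[0])
    return result

# Hypothetical cyclic system: X := (U + 2Y) mod 3, Y := X.
ranges = {"U": [0, 1, 2], "X": [0, 1, 2], "Y": [0, 1, 2]}
equations = {
    "X": lambda s: (s["U"] + 2 * s["Y"]) % 3,
    "Y": lambda s: s["X"],
}
support = [{"U": 1, "X": 2, "Y": 2}, {"U": 2, "X": 1, "Y": 1}]
after = do(support, equations, ranges, ["U"], {"Y": 0})
print(after)
```

The search is exponential in the number of variables; it only serves to make the definition concrete, in line with the remark that such interventions need not be computable in general.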

Analogous definitions could be given of causal teams with at most unique solutions and of interventions over them (the idea being that, whenever a modified structural equation system admits no solution for $s(\mathbf{U})$ , the assignment $s$ should be discarded). We expect the corresponding logic to differ significantly from the case of unique solutions.

The general nonrecursive, fully defined case, where multiple solutions are allowed, is problematic. One might choose to add, to the intervened team, all the assignments that correspond to solutions of the modified equations. However, there seems to be no general criterion for deciding whether all such solutions should be given equal probabilistic weight (see next sections); this reflects general problems in the interpretation of nonrecursive causal systems ([22]). A second option might be to model such an intervention as producing not one, but multiple teams, corresponding to possible different outcomes of the intervention. This set of “accessible teams” would then induce a nontrivial modality, making it reasonable to treat counterfactuals as necessity operators in a dynamic logic setting (in the spirit of [11]).

9 Falsifiability and admissibility

Interventions, when applied to a (recursive) partially defined causal team, can generate teams with formal entries. How should we evaluate statements which involve variables whose columns are not filled with proper values? Usually, we cannot ascertain their truth; e.g., we cannot assert $Y=3$ in a team whose non-formal entries for $Y$ are all equal to $3$ . Yet, in some cases we might be able to observe the falsity of such statements; i.e., to state their contradictory negation. Let us write $\downarrow s(X)$ to signify that $s(X)$ is a value, not a formal term. Let $T$ be a team, possibly with formal entries. We read $T\models^{f}\psi$ as “ $\psi$ is falsifiable in $T$ ”. We propose the clauses:

• $T\models^{f}X=x$ (resp. $X\neq x$) if there is $s\in T^{-}$ such that $\downarrow s(X)$ and $s(X)\neq x$ (resp. $s(X)=x$)

• $T\models^{f}=({\mathbf{X}};{Y})$ if there are $s,s^{\prime}\in T^{-}$ such that $s(\mathbf{X})=s^{\prime}(\mathbf{X})$, $\downarrow s(Y)$, $\downarrow s^{\prime}(Y)$ and $s(Y)\neq s^{\prime}(Y)$

• $T\models^{f}\psi\land\chi$ if $T\models^{f}\psi$ or $T\models^{f}\chi$

• $T\models^{f}\psi\lor\chi$ if for all subteams $T_1,T_2$ of $T$ with $T_1^{-}\cup T_2^{-}=T^{-}$, we have $T_1\models^{f}\psi$ or $T_2\models^{f}\chi$

• $T\models^{f}\mathbf{X}=\mathbf{x}\hskip 2.0pt\Box\rightarrow\chi$ if $T_{\mathbf{X}=\mathbf{x}}\models^{f}\chi$.
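The atomic clauses above can be tried out on a small team with formal entries. This is a minimal sketch: formal terms are modelled as tuples and proper values as anything else, which is an assumption of the example, as are the function names and the sample team.

```python
def is_value(entry):
    """True for a proper value, False for a formal term (modelled as a tuple)."""
    return not isinstance(entry, tuple)

def falsifies_eq(team, var, val):
    """T |=^f X = x: some proper entry for X differs from x."""
    return any(is_value(s[var]) and s[var] != val for s in team)

def falsifies_neq(team, var, val):
    """T |=^f X != x: some proper entry for X equals x."""
    return any(is_value(s[var]) and s[var] == val for s in team)

def falsifies_dep(team, xs, y):
    """T |=^f =(X;Y): two rows agree on X and carry distinct proper values for Y."""
    return any(
        all(s[x] == t[x] for x in xs)
        and is_value(s[y]) and is_value(t[y]) and s[y] != t[y]
        for s in team for t in team
    )

# Hypothetical team: the second row has a formal entry in the Y-column.
team = [
    {"X": 1, "Y": 3},
    {"X": 2, "Y": ("f_hat_Y", (2,))},
]

y_is_3 = falsifies_eq(team, "Y", 3)   # all proper entries equal 3
y_is_5 = falsifies_eq(team, "Y", 5)   # the proper entry 3 differs from 5
dep_before = falsifies_dep(team, ["X"], "Y")
dep_after = falsifies_dep(team + [{"X": 1, "Y": 4}], ["X"], "Y")
print(y_is_3, y_is_5, dep_before, dep_after)
```

The first result illustrates the point made in the text: $Y=3$ is not falsified in this team, yet it cannot be asserted either, since the formal entry might denote a different value.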

Coming up with a clause for selective implication is less straightforward; we propose the following. Given a $\mathcal{CO}$ formula $\psi$, let $\mathbf{V}^{\psi}$ be the set of variables occurring in $\psi$; define $T^{\psi}_{*}:=T^{\psi}\cup\{s\in T^{-}|\not\downarrow s(V)\text{ for some }V\in\mathbf{V}^{\psi}\}$. Then:

• $T\models^{f}\psi\supset\chi$ if $T^{\psi}_{*}\models^{f}\chi$

As a justification for this clause, consider the team

[TABLE]

It seems unreasonable to assert that this team falsifies the formula $Y=1\supset X=2$, because, as long as we do not have full knowledge of the function $f_Y$, we cannot decide whether $\hat{f}_Y(1)$ is meant to denote $1$ or some other number; therefore, we do not know whether the second assignment is compatible or not with our selection: if it were, then the formula would be falsified, otherwise it would not be. We opt for the more cautious alternative.

We might also want to assert that some proposition is admissible in the team, that is, consistent with the data we possess. The following seem to be reasonable clauses for the atomic formulas:

• $T\models^{a}X=x$ (resp. $X\neq x$) if, for all $s\in T^{-}$ such that $\downarrow s(X)$, we have $s(X)=x$ (resp. $s(X)\neq x$)

• $T\models^{a}=({\mathbf{X}};{Y})$ if, for all $s,s^{\prime}\in T^{-}$ such that $\downarrow s(Y)$, $\downarrow s^{\prime}(Y)$ and $s(\mathbf{X})=s^{\prime}(\mathbf{X})$, we have $s(Y)=s^{\prime}(Y)$.
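A small sketch of the atomic admissibility clauses, under the same modelling assumption as before (formal terms are tuples; all names and the sample team are hypothetical). It also makes visible that, for atoms, admissibility as defined here is exactly the absence of a falsifying row.

```python
def is_value(entry):
    """True for a proper value, False for a formal term (modelled as a tuple)."""
    return not isinstance(entry, tuple)

def admits_eq(team, var, val):
    """T |=^a X = x: every proper entry for X equals x."""
    return all(s[var] == val for s in team if is_value(s[var]))

def admits_dep(team, xs, y):
    """T |=^a =(X;Y): rows agreeing on X never carry distinct proper values for Y."""
    return all(
        s[y] == t[y]
        for s in team for t in team
        if is_value(s[y]) and is_value(t[y]) and all(s[x] == t[x] for x in xs)
    )

def falsifies_eq(team, var, val):
    """Falsifiability clause for X = x, repeated here for comparison."""
    return any(is_value(s[var]) and s[var] != val for s in team)

# Hypothetical team with one formal entry in the Y-column.
team = [
    {"X": 1, "Y": 3},
    {"X": 2, "Y": ("f_hat_Y", (2,))},
]

admissible_3 = admits_eq(team, "Y", 3)
admissible_5 = admits_eq(team, "Y", 5)
dependence_ok = admits_dep(team, ["X"], "Y")
# For atoms the two notions are complementary on every candidate value.
complementary = all(
    admits_eq(team, "Y", v) == (not falsifies_eq(team, "Y", v)) for v in range(10)
)
print(admissible_3, admissible_5, dependence_ok, complementary)
```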

We do not consider the general case; but we still give clauses for “classical” formulas in disjunctive normal form:

• $T\models^{a}\bigvee_{i=1..m}\bigwedge_{j=1..n(i)}P^{i}_{j}$ ($P^{i}_{j}$ being of the form $X^{i}_{j}=x^{i}_{j}$ or $X^{i}_{j}\neq x^{i}_{j}$) if there are subteams $T_i$ of $T$, for $i=1..m$, such that

1. $T_i\models^{a}P^{i}_{j}$, for all $j=1..n(i)$.

2. for each $j,j^{\prime}=1..n(i)$, if $j\neq j^{\prime}$, $P^{i}_{j}$ is $X^{i}_{j}=a$ and $P^{i}_{j^{\prime}}$ is $X^{i}_{j^{\prime}}=b$ (with $a\neq b$), then for all $s\in T_i^{-}$, $s(X^{i}_{j})\neq s(X^{i}_{j^{\prime}})$.

3. for each $j,j^{\prime}=1..n(i)$, if $j\neq j^{\prime}$, $P^{i}_{j}$ is $X^{i}_{j}=a$ and $P^{i}_{j^{\prime}}$ is $X^{i}_{j^{\prime}}\neq a$, then for all $s\in T_i^{-}$, $s(X^{i}_{j})\neq s(X^{i}_{j^{\prime}})$.

Clauses 2 and 3 above refer to formal inequality between terms. To get an idea of the intuition behind clause 2, the reader may think, for example, of the problem of checking the admissibility of $X=1\land Y=2$; imagine that there is a row in which both the $X$-column and the $Y$-column contain the formal term $f(3,g(2))$; then, surely, the formula is not admissible (for $X=1\land Y=2$ to hold in the team, it is necessary that the $X$- and $Y$-columns differ on each row). Clause 3 has a similar rationale.

If we restrict attention to causal teams that are generated by interventions applied to causal teams without formal entries, clauses 2 and 3 can be omitted, because in this case the same formal term cannot occur in distinct columns of the intervened causal team (since, say, all formal terms in the $X$-column are of the form $\hat{f}_X(\dots)$, while all formal terms in the $Y$-column are of the form $\hat{f}_Y(\dots)$).

10 Introducing probabilities

Probabilistic notions of causation have been extensively studied in the literature. Bayesian networks formulate causal relations in terms of conditional probabilities on (typically acyclic) graphs enriched with a joint probability distribution over the variables of the graph (Pearl [19], Spirtes, Glymour and Scheines [21]). Woodward also considers interventions on a variable that cause changes in the probability of another variable. In the context of team semantics, probabilities have been recently introduced via the notion of multiteam. A multiteam differs from a team in that it may feature multiple copies of the same assignment; it is therefore closer to a collection of experimental data than teams are. There have been at least two different approaches to the formalization of multiteams in the literature ([25], [6]). For simplicity, we simulate multiteams by means of teams. This can be accomplished by assuming that each team has an extra variable Key (never mentioned in the object languages) which assumes distinct values on distinct assignments of the same team. In this way, we can have two assignments that agree on all the significant variables and just differ on Key. With this assumption, the definition of causal multiteam can follow word by word the definition of causal team.
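The Key construction is easy to make concrete. The sketch below is an assumption-laden illustration (the helper names `add_keys` and `multiplicities` are hypothetical): repeated rows of data become distinct assignments once a fresh Key value is attached, and the multiplicities can be recovered by ignoring Key again.

```python
from collections import Counter

def add_keys(rows):
    """Turn a list of possibly repeated assignments into a team with distinct rows."""
    return [dict(row, Key=i) for i, row in enumerate(rows)]

def multiplicities(team):
    """Count how many copies of each Key-free assignment the team encodes."""
    return Counter(
        tuple(sorted((v, x) for v, x in s.items() if v != "Key")) for s in team
    )

# Hypothetical data with a repeated observation.
rows = [{"X": 1, "Y": 2}, {"X": 1, "Y": 2}, {"X": 2, "Y": 3}]
team = add_keys(rows)

# All assignments are pairwise distinct, so the team is a genuine set.
distinct = len({tuple(sorted(s.items())) for s in team}) == len(team)
counts = multiplicities(team)
print(distinct, counts[(("X", 1), ("Y", 2))])
```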

If we wish to talk about probabilities, it is natural to allow for more atomic formulas.

Def 10.1.

The set of probabilistic literals is given by:

[TABLE]

where $\alpha$ is a probabilistic literal, $\chi,\theta$ are formulas of $\mathcal{CO}$ and $\epsilon\in\mathbb{R}\cap[0,1]$ . Literals and probabilistic literals without negation will be called atomic formulas.

The (basic) probabilistic causal language ( $\mathcal{PCD}$ ) is given by the following clauses:

[TABLE]

where $\alpha$ is a literal or probabilistic literal, $\psi,\chi$ are $\mathcal{PCD}$ formulas, and $\theta$ a $\mathcal{CO}$ formula.

The symbol $\sim$ stands for contradictory negation ($T\models\sim\psi$ iff $T\not\models\psi$). We will use abbreviations such as $Pr(\chi)=\epsilon$ for $Pr(\chi)\leq\epsilon\land Pr(\chi)\geq\epsilon$, or $Pr(\chi)<\epsilon$ for $Pr(\chi)\leq\epsilon\land\sim Pr(\chi)\geq\epsilon$. The additional connective $\sqcup$ is known as boolean disjunction and its interpretation is given by the clause:

• $T\models\psi\sqcup\chi\iff T\models\psi$ or $T\models\chi$.

The statement “either $X=x$ has probability less than one third, or greater than two thirds” should be expressed as $Pr(X=x)<1/3\sqcup Pr(X=x)>2/3$ , and not by means of the earlier disjunction $\lor$ . The reader can verify this point as soon as we give semantical clauses for the probabilistic literals. For any $\mathcal{CO}$ formula $\chi$ and any causal team $T$ with nonempty finite support $T^{-}$ , define the probability of $\chi$ in $T$ as:

$Pr_T(\chi):=\frac{card((T^{\chi})^{-})}{card(T^{-})}$

It can be verified that this definition induces a probability space over the subteams of $T^{-}$ that are definable by some $\mathcal{CO}$ formula.

The semantics of probabilistic atoms can then be defined as:

• $T\models Pr(\chi)\leq\epsilon\iff T^{-}\neq\emptyset$ and $Pr_T(\chi)\leq\epsilon$

• $T\models Pr(\chi)\leq Pr(\theta)\iff T^{-}\neq\emptyset$ and $Pr_T(\chi)\leq Pr_T(\theta)$

et cetera. [Footnote 14: Notice that, by definition, causal teams with empty support do not satisfy the probabilistic atoms.] It is easy to see that such a logic is not downward closed; for example, a team in which fewer than half of the assignments satisfy $\chi$ will satisfy $Pr(\chi)\leq\frac{1}{2}$; but the subteam $T^{\chi}$ consisting only of the assignments that satisfy $\chi$ will not satisfy $Pr(\chi)\leq\frac{1}{2}$.
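The failure of downward closure can be checked numerically. The sketch assumes that $Pr_T(\chi)$ is the fraction of assignments in the support that satisfy $\chi$ (which is how the surrounding text uses it), and represents a $\mathcal{CO}$ formula by a Python predicate on single assignments; the team and the function names are illustrative assumptions.

```python
from fractions import Fraction

def pr(team, chi):
    """Fraction of assignments in a nonempty support that satisfy chi."""
    return Fraction(sum(1 for s in team if chi(s)), len(team))

def sat_pr_leq(team, chi, eps):
    """T |= Pr(chi) <= eps; teams with empty support never satisfy the atom."""
    return len(team) > 0 and pr(team, chi) <= eps

# Hypothetical team; the Key column keeps the rows distinct.
team = [
    {"Key": 0, "X": 1},
    {"Key": 1, "X": 2},
    {"Key": 2, "X": 3},
]
chi = lambda s: s["X"] == 1
half = Fraction(1, 2)

whole = sat_pr_leq(team, chi, half)                 # Pr = 1/3
subteam = [s for s in team if chi(s)]               # T^chi
restricted = sat_pr_leq(subteam, chi, half)         # Pr = 1
empty = sat_pr_leq([], chi, half)                   # empty support
print(whole, restricted, empty)
```

The team satisfies the atom while its subteam $T^{\chi}$ does not, which is exactly the pattern that downward closure forbids.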

Can we define conditional probabilities in this kind of framework? Given two $\mathcal{CO}$ formulas $\chi_1$ and $\chi_2$, we write $Pr(\chi_2|\chi_1)\leq\epsilon$ as an abbreviation for $\chi_1\supset Pr(\chi_2)\leq\epsilon$. Here is a proof that the abbreviation has the intended meaning: assuming $T^{-}\neq\emptyset$,

[TABLE]

and we observe that the left member in this last equation is the usual definition of the conditional probability $Pr_T(\chi_2|\chi_1)$. In case $T^{-}=\emptyset$, it is easily proved, instead, that $T\not\models\chi_1\supset Pr(\chi_2)\leq\epsilon$. Things work analogously for inequalities in the opposite direction, and for atoms of the form $Pr(\chi)\leq Pr(\theta)$.
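A numerical sanity check of the abbreviation, as a sketch under two assumptions: selective implication is read as evaluation on the subteam $T^{\chi_1}$ of assignments satisfying $\chi_1$, and $Pr_T$ is the fraction of satisfying assignments. The team, predicates and function names are hypothetical.

```python
from fractions import Fraction

def pr(team, chi):
    """Fraction of assignments in a nonempty support that satisfy chi."""
    return Fraction(sum(1 for s in team if chi(s)), len(team))

def restrict(team, chi):
    """T^chi: the assignments of the support that satisfy chi."""
    return [s for s in team if chi(s)]

def sat_cond_pr_leq(team, chi1, chi2, eps):
    """T |= (chi1 selectively implies Pr(chi2) <= eps), read on the subteam T^chi1."""
    sub = restrict(team, chi1)
    return len(sub) > 0 and pr(sub, chi2) <= eps

# Hypothetical multiteam; Key keeps repeated observations distinct.
team = [
    {"Key": 0, "X": 1, "Y": 1},
    {"Key": 1, "X": 1, "Y": 2},
    {"Key": 2, "X": 2, "Y": 1},
    {"Key": 3, "X": 1, "Y": 1},
]
chi1 = lambda s: s["X"] == 1
chi2 = lambda s: s["Y"] == 1
both = lambda s: chi1(s) and chi2(s)

via_restriction = pr(restrict(team, chi1), chi2)
via_ratio = pr(team, both) / pr(team, chi1)
print(via_restriction, via_ratio, sat_cond_pr_leq(team, chi1, chi2, Fraction(2, 3)))
```

Both computations return the same fraction, matching the usual ratio definition of conditional probability whenever $Pr_T(\chi_1)>0$.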

In the literature (e.g. [19]) one finds ad hoc notations that mix interventions and probabilities; for example, $P(y|do(x),z)=\epsilon$ is used for a probability which is conditional on the outcome of an intervention (post-intervention conditioning); the notation $P(Y_x|z)=\epsilon$ is used for the probability of a variable after the intervention, conditioned on pre-intervention observations. These two cases are expressed, in $\mathcal{PCD}$, as $X=x\hskip 2.0pt\Box\rightarrow(Z=z\supset Pr(Y=y)=\epsilon)$, resp. $Z=z\supset(X=x\hskip 2.0pt\Box\rightarrow Pr(Y=y)=\epsilon)$; their difference amounts to a swap in the order of application of $\hskip 2.0pt\Box\rightarrow$ and $\supset$. Our formalism immediately shows that more varied possibilities could be considered, such as conditioning simultaneously pre- and post-intervention ($W=w\supset(X=x\hskip 2.0pt\Box\rightarrow(Z=z\supset Pr(Y=y)=\epsilon))$) or between two interventions ($X=x\hskip 2.0pt\Box\rightarrow(Z=z\supset(W=w\hskip 2.0pt\Box\rightarrow Pr(Y=y)=\epsilon))$).
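That the order of $\hskip 2.0pt\Box\rightarrow$ and $\supset$ matters can be seen on a toy multiteam. The following is only a sketch: the structural equations ($X:=U$, $Z:=X$, $Y:=Z+U$), the two-row team and the simplified `do` function for recursive, fully defined teams are assumptions made for the example, and the rows are assumed to be compatible with the equations.

```python
from fractions import Fraction

def do(team, functions, order, intervention):
    """Sketch of an intervention on a recursive, fully defined (multi)team.

    Intervened variables receive constant functions; all endogenous variables
    are then recomputed in the topological order `order`.
    """
    new_functions = dict(functions)
    for var, val in intervention.items():
        new_functions[var] = lambda s, val=val: val
    new_team = []
    for s in team:
        t = dict(s, **intervention)
        for v in order:
            t[v] = new_functions[v](t)
        new_team.append(t)
    return new_team, new_functions

def restrict(team, chi):
    return [s for s in team if chi(s)]

def pr(team, chi):
    return Fraction(sum(1 for s in team if chi(s)), len(team))

functions = {
    "X": lambda s: s["U"],
    "Z": lambda s: s["X"],
    "Y": lambda s: s["Z"] + s["U"],
}
order = ["X", "Z", "Y"]
team = [
    {"Key": 0, "U": 0, "X": 0, "Z": 0, "Y": 0},
    {"Key": 1, "U": 1, "X": 1, "Z": 1, "Y": 2},
]
z_is_1 = lambda s: s["Z"] == 1
y_is_2 = lambda s: s["Y"] == 2

# Post-intervention conditioning: intervene with do(X=1), then select Z=1.
post_team, _ = do(team, functions, order, {"X": 1})
post = pr(restrict(post_team, z_is_1), y_is_2)

# Pre-intervention conditioning: select Z=1 first, then intervene with do(X=1).
pre_team, _ = do(restrict(team, z_is_1), functions, order, {"X": 1})
pre = pr(pre_team, y_is_2)
print(post, pre)
```

In this example the post-intervention reading gives probability 1/2 and the pre-intervention reading gives probability 1, because selecting $Z=1$ before the intervention also selects $U=1$.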

11 Direct and total cause

We show that the basic type-causal notions from Woodward ([26]), direct and total cause, can be expressed in our languages, over causal teams which are finite, recursive and fully defined. Quoting from Woodward:

A necessary and sufficient condition for $X$ to be a direct cause of $Y$ with respect to some variable set $\mathbf{V}$ is that there be a possible intervention on $X$ that will change $Y$ (or the probability distribution of $Y$ ) when all other variables in $\mathbf{V}$ besides $X$ and $Y$ are held fixed at some value by interventions. ([26], p.55)

This definition is ambiguous in that it talks about a change in $Y$ , but does not say with respect to what the change is made; to $Y$ ’s actual value? To some possible value of $Y$ , i.e., some $y\in Ran(Y)$ ? We resolve the ambiguity by stipulating that the values of $Y$ to be compared are generated by two distinct interventions.

The kind of intervention that is needed in order to establish whether $X$ is a direct cause of $Y$ is an intervention on all variables in the domain except for $X$ and $Y$. For example, consider the causal team $T$ in the figure below (with invariant functions $\mathcal{F}_Z(X):=X$ and $\mathcal{F}_Y(X,Z):=X+Z$). We show that $X$ is a direct cause of $Y$ in $T$. First of all we must fix all other variables (in this case, just $Z$) to an appropriate value (we choose $1$) by an intervention, which also removes the arrow that enters $Z$, and updates $Y$:

$T$: $\begin{array}{ccc}X&Z&Y\\\hline 1&1&2\\2&2&4\\3&3&6\end{array}$

$\leadsto$ $T_{Z=1}$: $\begin{array}{ccc}X&Z&Y\\\hline 1&1&2\\2&1&3\\3&1&4\end{array}$

Then we intervene in two different ways on $X$ , by $do(X=1)$ and $do(X=2)$ :

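$(T_{Z=1})_{X=1}$: $\begin{array}{ccc}X&Z&Y\\\hline 1&1&2\end{array}$

$(T_{Z=1})_{X=2}$: $\begin{array}{ccc}X&Z&Y\\\hline 2&1&3\end{array}$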

The fact that the two interventions generate distinct values for $Y$ proves that $X$ is a direct cause of $Y$ . The specific form of these kinds of interventions makes it so that, if there is an arrow from $X$ to $Y$ , the intervention enforces a team with constant columns; that is, a singleton causal team is produced.
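The example can be replayed mechanically. The sketch below assumes the invariant functions are read as $Z:=X$ and $Y:=X+Z$ (the reading under which the tables above are consistent) and uses a simplified, hypothetical `do` function for recursive, fully defined teams: intervened variables receive constant functions, and the remaining endogenous variables are recomputed in topological order.

```python
def do(team, functions, order, intervention):
    """Apply do(intervention); return the new support and the new functions."""
    new_functions = dict(functions)
    for var, val in intervention.items():
        # The arrows entering an intervened variable are removed.
        new_functions[var] = lambda s, val=val: val
    new_team = []
    for s in team:
        t = dict(s, **intervention)
        for v in order:
            t[v] = new_functions[v](t)
        if t not in new_team:          # a team is a set of assignments
            new_team.append(t)
    return new_team, new_functions

functions = {
    "Z": lambda s: s["X"],
    "Y": lambda s: s["X"] + s["Z"],
}
order = ["Z", "Y"]                     # topological order of endogenous variables
team = [
    {"X": 1, "Z": 1, "Y": 2},
    {"X": 2, "Z": 2, "Y": 4},
    {"X": 3, "Z": 3, "Y": 6},
]

# Step 1: fix every variable other than X and Y (here only Z).
t_z1, f_z1 = do(team, functions, order, {"Z": 1})

# Step 2: intervene on X in two different ways.
t_x1, _ = do(t_z1, f_z1, order, {"X": 1})
t_x2, _ = do(t_z1, f_z1, order, {"X": 2})

# Both results are singleton teams; the Y-values differ.
direct_cause = (
    len(t_x1) == 1 and len(t_x2) == 1 and t_x1[0]["Y"] != t_x2[0]["Y"]
)
print(t_z1, t_x1, t_x2, direct_cause)
```

Passing the updated function component `f_z1` to the second intervention matters: with the original functions, $Z$ would be recomputed from $X$ and the arrow removed by $do(Z=1)$ would silently reappear.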

Let $Fix(\mathbf{z})$ be an abbreviation for $\bigwedge_{Z\in Dom(T)\setminus\{X,Y\}}Z=z$. Then, the fact that $X$ is a direct cause of $Y$ in $T$ can be expressed in $\mathcal{CD}$ as follows: $T\models DC(X;Y)$ iff

[TABLE]

In the probabilistic setting, applying the intervention described by $Fix(\mathbf{z})$ does not in general shrink the multiteam to a singleton, because the resulting multiteam may still consist of multiple copies of one and the same assignment. Nevertheless, we can still define direct causation, $T\models PDC(X;Y)$ :

[TABLE]

In a sense, we have a collapse of the probabilistic case to the deterministic one.

We now consider the notion of total cause, following again Woodward:

$X$ is a total cause of $Y$ if and only if there is a possible intervention on $X$ that will change $Y$ or the probability distribution of $Y$ . ([26], p.51)

Applying the kind of intervention described by Woodward, teams do not in general shrink to singletons. However, total cause can be equivalently defined as the existence of such interventions, to be applied after all nondescendants of $X$ have been fixed to some values. We denote by $Fix^{\prime}(\mathbf{w})$ the conjunction that expresses the intervention that fixes all nondescendants $\mathbf{W}$ of $X$ to $\mathbf{w}$ . Such an intervention does shrink the causal team to a singleton, provided there is at least one directed path from $X$ to $Y$ . We can thus express that $X$ is a total cause of $Y$ in $T$ , $T\models TC(X;Y)$ , by the clause:

[TABLE]

A similar definition can be given in the probabilistic language, using the fact that only a finite number of distinct probability values can arise from a finite multiteam.

Bibliography

[1]
[2] Fausto Barbero & Gabriel Sandu (2017): Team Semantics for Interventionist Counterfactuals and Causal Dependence. Preprint, arXiv:1610.03406.
[3] Rachael Briggs (2012): Interventionist Counterfactuals. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 160(1), pp. 139–166, doi:10.1007/s11098-012-9908-5.
[4] Peter Cameron & Wilfrid Hodges (2001): Some Combinatorics of Imperfect Information. Journal of Symbolic Logic 66, pp. 673–684, doi:10.2307/2695036.
[5] Jukka Corander, Antti Hyttinen, Juha Kontinen, Johan Pensar & Jouko Väänänen (2016): A Logical Approach to Context-Specific Independence. In: Logic, Language, Information, and Computation - 23rd International Workshop, WoLLIC 2016, Puebla, Mexico, August 16-19th, 2016. Proceedings, pp. 165–182, doi:10.1007/978-3-662-52921-8_11.
[6] Arnaud Durand, Miika Hannula, Juha Kontinen, Arne Meier & Jonni Virtema (2016): Approximation and Dependence via Multiteam Semantics. In: Proceedings of the 9th International Symposium on Foundations of Information and Knowledge Systems, LNCS 9616, Springer, pp. 271–291, doi:10.1007/978-3-319-30024-5_15.
[7] David Galles & Judea Pearl (1998): An Axiomatic Characterization of Causal Counterfactuals. Foundations of Science 3(1), pp. 151–182, doi:10.1023/A:1009602825894.
[8] Pietro Galliani (2012): Inclusion and Exclusion Dependencies in Team Semantics - On Some Logics of Imperfect Information. Annals of Pure and Applied Logic 163(1), pp. 68–84, doi:10.1016/j.apal.2011.08.005.
