Saphirion AG, Zug, Switzerland
Foundations for conditional probability
Ladislav Mečíř
Abstract
This article formalizes probability as a function naturally induced
by a plausible preorder on random quantities. It shows that the probability
rules, including Bayes’ rule, are derivable from just this fundamental
characterization.
According to supplementary results, probability is invariably described
as a coherent function and every coherent function can be extended
to a plausibly complete function. Thus, probability can be, without
loss of generality, formalized as a plausibly complete function.
As an illustration, consider a plausibly complete probability P.
Then, for every event A and nonzero event C, P(A∣C)=0
if A∧C=0 and P(A∣C)=1 if A∧C=C, no matter whether
the unconditional probability P(C) is zero or whether it is defined.
In contrast, the common approach of defining the conditional probability
P(A∣C) as the ratio P(A∧C)/P(C) leaves the probability
plausibly incomplete in general, since it leaves the conditional probability
undefined whenever P(C) is zero or undefined.
Keywords:
probability axioms, conditional probability, random quantities, plausible preorder, coherence
1 Introduction
The probability foundations provided by A. N. Kolmogorov (1933)
define conditional probability as a ratio of unconditional probabilities.
A. Hájek (2003) gives several reasons why a more adequate
formalization of conditional probability is needed.
R. T. Cox (1961) contributed a theorem deriving the laws of
conditional probability from a set of postulates. According to J.
Halpern (1999), M. J. Dupré and F. J. Tipler (2009),
J. B. Paris (2006), and other authors, Cox’s approach
is non-rigorous. To be valid, it needs additional assumptions which
are complicated and nontrivial.
B. de Finetti (1975) developed the foundations of conditional
probability around the idea of a partially ordered algebra of random
quantities, on which existence of a real-valued function having the
fundamental properties of conditional expectation is postulated.
We take a more general approach. Instead of postulating the existence
of a real-valued function having the fundamental properties of conditional
expectation, we examine a relation on random quantities called a plausible
preorder. We show that a plausible preorder on random quantities naturally
induces a set of conditional preorders. The conditional preorders
naturally induce a conditional expectation that, in general, is a
partial function assigning elements of the extended real line to pairs
consisting of a random quantity and a nonzero event. The conditional
expectation is demonstrated to satisfy a generalized form of probability
rules. In the final section, we provide a formal description of the
notion of coherence and prove that all formalizations of probability
discussed in this article are coherent. Finally, we demonstrate that
a function is coherent if and only if it can be extended to a conditional
expectation naturally induced by a regular plausible preorder.
2 Random quantities
We define the notion of a random quantity using the axiomatic approach
proposed by B. de Finetti (1975).
2.1 Definition
Let T denote the set of random quantities. We
postulate that T is a unital associative commutative
algebra over real numbers, i.e. T is a set equipped with
addition, multiplication by real numbers and multiplication, such
that
if X,Y,Z∈T, then (X+Y)+Z=X+(Y+Z)
(associativity of addition)
if X,Y∈T, then X+Y=Y+X (commutativity of addition)
there exists an element 0∈T, such that X+0=X
for every X∈T (identity element of addition)
if X∈T, then there exists an element −X∈T
such that 0=X+(−X) (inverse elements of addition)
if r,s are real numbers and X∈T, then (rs)X=r(sX)
(compatibility of multiplication by real numbers with real multiplication)
if X∈T, then 1X=X (identity element of multiplication
by real numbers)
if r is a real number and X,Y∈T, then r(X+Y)=rX+rY
(distributivity of multiplication by real numbers with respect to
addition)
if r,s are real numbers and X∈T, then (r+s)X=rX+sX
(distributivity of multiplication by real numbers with respect to
real addition)
if X,Y,Z∈T, then (X.Y).Z=X.(Y.Z)
(associativity of multiplication)
if X,Y∈T, then X.Y=Y.X (commutativity of multiplication)
if X,Y,Z∈T, then (X+Y).Z=X.Z+Y.Z (distributivity)
if r,s are real numbers and X,Y∈T, then (rX).(sY)=(rs)(X.Y)
(compatibility with multiplication by real numbers)
there exists an element 1∈T, such that X.1=X
for every X∈T (identity element of multiplication)
2.2 Canonical embedding of real numbers
Per 2.1, T has an identity element of
multiplication that we can denote 1. We define a function
F from the set of real numbers R to T
such that F(r)=r1 for every real number r. Defined
this way, F is a map embedding the set of real numbers in T.
We call this embedding the canonical embedding of real numbers
in T. Using the canonical embedding of real numbers,
instead of writing r1∈T for a real r, we
simply write r∈T from now on.
2.3 Motivational example
Alice is going to throw a coin in the presence of a notary. Bob knows
that Carol shall pay him a specific amount SH if Alice throws
heads, and a specific amount ST if Alice throws tails.
Bob conceives a set T containing pairs of real numbers
(XH,XT) and defines
addition: (XH,XT)+(YH,YT)=(XH+YH,XT+YT)
multiplication by real numbers: r(XH,XT)=(rXH,rXT)
multiplication: (XH,XT).(YH,YT)=(XHYH,XTYT)
Bob’s T with these operations is a unital associative commutative
algebra over the reals; it is unital since (1,1) is its identity element
with respect to multiplication. Per 2.1, the elements
of T are random quantities. Denoting H=(1,0) and T=(0,1)
and using the canonical embedding of real numbers, H+T=1 and H.T=0.
Carol’s payment is represented by random quantity S=(SH,ST)=SHH+STT.
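The operations Bob defines can be mirrored in a few lines of Python. This is only a sketch: the function names `add`, `scale`, `mul` and `embed`, as well as the payoffs 5 and −2, are illustrative assumptions and not part of the formalism.

```python
from fractions import Fraction

# A random quantity in Bob's example is a pair (x_h, x_t):
# the amount on heads and the amount on tails.
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def scale(r, x):
    # Multiplication by a real number.
    return (r * x[0], r * x[1])

def mul(x, y):
    # Coordinatewise multiplication of random quantities.
    return (x[0] * y[0], x[1] * y[1])

def embed(r):
    # Canonical embedding of a real number r as r.1 = (r, r).
    return (r, r)

H = (1, 0)
T = (0, 1)

# Hypothetical payoffs chosen for illustration only.
s_h, s_t = Fraction(5), Fraction(-2)
S = add(scale(s_h, H), scale(s_t, T))

print(add(H, T) == embed(1))  # H + T = 1
print(mul(H, T) == embed(0))  # H.T = 0
print(S == (s_h, s_t))        # S = S_H H + S_T T
```

Exact rational arithmetic (`Fraction`) is used so that the equalities are checked without floating-point rounding.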
3 Events
3.1 Definition
Let A be a random quantity. We say that A is an event
if it is idempotent, i.e. if A.A=A. We denote the set of events
E(T) and the set of nonzero events E0(T).
On E(T) we define
negation, for event A its negation ¬A is defined as 1−A
conjunction, for events A,B their conjunction A∧B is defined
as A.B
disjunction, for events A,B their disjunction A∨B is defined
as A+B−A.B
With these operations,
1 is the unit element of conjunction, i.e. if A is an event,
then A∧1=1∧A=A
0 is the unit element of disjunction, i.e. if A is an event,
then A∨0=0∨A=A and
E(T) is a Boolean algebra.
3.2 Natural order
Let A be a Boolean algebra and A,B be its elements.
The natural order on A is defined so that A≤B
if A∧B=A.
Then
the natural order is a partial order,
the minimal element of A in the natural order is 0 and
the maximal element of A in the natural order is 1.
Since E(T) is a Boolean algebra, there is a
natural order on E(T).
3.3 Atoms
Let A be a Boolean algebra. We say that D is an atom
of A, if D∈A and D is a minimal nonzero
element of A in the natural order.
We say that A is atomic if for every nonzero element
A∈A there is an atom D∈A such that D≤A
in the natural order.
3.4 Motivational example
Consider the algebra T defined in 2.3.
Then
E(T)={0,H,T,1}
E0(T)={H,T,1}
the atoms of E(T) are H and T
and
E(T) is atomic.
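These claims can be checked by brute force. The sketch below filters idempotent pairs out of a small grid of candidate coordinates (the grid is an arbitrary assumption made for illustration), then computes the atoms from the natural order. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mul(x, y):
    return (x[0] * y[0], x[1] * y[1])

def is_event(a):
    # An event is an idempotent random quantity: A.A = A.
    return mul(a, a) == a

# Candidate coordinates: an arbitrary finite grid, for illustration only.
grid = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2)]
events = [x for x in product(grid, repeat=2) if is_event(x)]

ZERO, ONE = (0, 0), (1, 1)

def neg(a):
    return (1 - a[0], 1 - a[1])

def conj(a, b):
    return mul(a, b)

def disj(a, b):
    # A + B - A.B, computed coordinatewise.
    return (a[0] + b[0] - a[0] * b[0], a[1] + b[1] - a[1] * b[1])

def natural_leq(a, b):
    # A <= B in the natural order if A ∧ B = A.
    return conj(a, b) == a

def is_atom(d):
    # An atom is a minimal nonzero element in the natural order.
    below = [e for e in events if natural_leq(e, d)]
    return d != ZERO and all(e == ZERO or e == d for e in below)

atoms = [d for d in events if is_atom(d)]
print(len(events))  # number of events found on the grid
print(atoms)
```

Only pairs with coordinates in {0,1} survive the idempotence filter, and the atoms found are exactly H=(1,0) and T=(0,1).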
3.5 Positive combinations of nonzero events
Let n≥1, p1,…,pn be positive real numbers and C1,…,Cn
be nonzero events. Then ∑i=1npiCi≠0.
4 Plausible preorder
4.1 Definition
We say that a relation ≲ on T is a plausible
preorder if it has these properties:
4.1.1 Plausible property
If A is an event, then 0≲A.
4.1.2 Additive property
If 0≲X and 0≲Y, then 0≲X+Y.
4.1.3 Multiplicative property
If 0≲X and q is a nonnegative real number, then 0≲qX.
4.1.4 Extension property
X≲Y if and only if 0≲Y−X.
4.2 Motivational example
In the algebra T described in 2.3, Bob
defines
0≲X=(XH,XT) if 0≤XH+XT
X≲Y if 0≲Y−X
It is easy to verify that ≲ is a plausible preorder.
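The verification can be spot-checked numerically. The following sketch tests the plausible, additive and multiplicative properties of Bob's relation on a small grid of integer pairs; the grid and the helper names are assumptions, and a finite check is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def scale(r, x):
    return (r * x[0], r * x[1])

def nonneg(x):
    # 0 ≲ X in Bob's example: the coordinates have a nonnegative sum.
    return x[0] + x[1] >= 0

def leq(x, y):
    # Extension property: X ≲ Y if and only if 0 ≲ Y − X.
    return nonneg(sub(y, x))

samples = [(Fraction(a), Fraction(b)) for a, b in product(range(-2, 3), repeat=2)]
events = [(0, 0), (1, 0), (0, 1), (1, 1)]
H, T = (1, 0), (0, 1)

plausible = all(nonneg(a) for a in events)
additive = all(nonneg(add(x, y))
               for x in samples for y in samples
               if nonneg(x) and nonneg(y))
multiplicative = all(nonneg(scale(q, x))
                     for q in (Fraction(0), Fraction(1, 3), Fraction(7))
                     for x in samples if nonneg(x))

print(plausible, additive, multiplicative)
# H ≲ T and T ≲ H both hold although H and T differ,
# so the relation is a preorder rather than a partial order.
print(leq(H, T), leq(T, H))
```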
4.3 The greatest plausible preorder
The relation ≲=T×T is a plausible preorder
and the greatest relation on T with respect to
inclusion.
4.4 Properties
4.4.1 Reflexivity
A plausible preorder is reflexive.
4.4.2 Transitivity
A plausible preorder is transitive.
4.4.3 Relation to the natural order of events.
A plausible preorder contains the natural order of events as its subset.
4.4.4 Relation to the order of real numbers.
A plausible preorder contains the order of real numbers as its subset.
4.4.5 Intersection of a set of plausible preorders
If X is a nonempty set containing plausible preorders,
then ⋂X is a plausible preorder.
4.4.6 The smallest plausible preorder containing a relation
If R is a relation on T, then there is a relation
≲ that is the smallest plausible preorder with respect to
inclusion containing R.
4.4.7 Subadditivity
If A1,…,An are events and ≲ is a plausible
preorder, then
\[ \bigvee_{i=1}^{n} A_i \lesssim \sum_{i=1}^{n} A_i \]
5 Plausible equivalence
5.1 Definition
Let ≲ be a plausible preorder. We say that a relation ∼
is the equivalence part of ≲ if for any X,Y holds
that X∼Y if and only if (X≲Y)∧(Y≲X).
We say that a relation ∼ on T is a plausible
equivalence if there is a plausible preorder ≲ such that
∼ is its equivalence part.
5.2 Motivational example
Let ≲ be the plausible preorder defined in 4.2.
Then 0∼X=(XH,XT) if and only if 0=XH+XT.
5.3 Fundamental properties
5.3.1 Plausible property
Let p1,…,pn be positive real numbers, A1,…,An
be events and 0∼∑i=1npiAi. Then 0∼Ai
for every i∈{1,…,n}.
5.3.2 Reflexivity
0∼0.
5.3.3 Additive property
If 0∼X and 0∼Y, then 0∼X+Y.
5.3.4 Multiplicative property
If 0∼X and r is a real number, then 0∼rX.
5.3.5 Extension property
X∼Y if and only if 0∼Y−X.
5.4 Sufficiency of the fundamental properties
Every relation ∼ having the fundamental properties of a plausible
equivalence is a plausible equivalence.
6 Plausible strict partial order
6.1 Definition
Let ≲ be a plausible preorder. We say that a relation ⋦
is the strict part of ≲ if for any X,Y holds that
X⋦Y if and only if (X≲Y)∧¬(Y≲X).
We also say that a relation ⋦ is a plausible strict
partial order if there is a plausible preorder ≲ such that
⋦ is its strict part.
6.2 Motivational example
Let ≲ be the plausible preorder defined in 4.2.
Then 0⋦X=(XH,XT) if and only if 0<XH+XT.
6.3 Fundamental properties
6.3.1 Plausible property
If A is an event, ¬(0⋦A) and 0⋦X, then 0⋦X+A
and 0⋦X−A.
6.3.2 Antireflexivity
¬(0⋦0).
6.3.3 Additive property
If 0⋦X and 0⋦Y, then 0⋦X+Y.
6.3.4 Multiplicative property
If 0⋦X and p is a positive real number, then 0⋦pX.
6.3.5 Extension property
X⋦Y if and only if 0⋦Y−X.
6.4 Sufficiency of the fundamental properties
Every relation ⋦ having the fundamental properties of a plausible
strict partial order is a plausible strict partial order.
7 Conditional preorder
7.1 Definition
Let ≲ be a plausible preorder and C be an event. We define
the conditional preorder ≲C so that X≲CY
if X.C≲Y.C.
7.2 Properties
a conditional preorder is a plausible preorder
≲1 is identical with ≲
≲0 is the greatest plausible preorder
⋦0 is empty, i.e. there are no random quantities X,Y
such that X⋦0Y
8 Regularity of a plausible preorder
8.1 Definition
We say that a plausible preorder ≲ is
degenerate if 0∼1
regular if for every nonzero event C holds 0⋦C
8.2 Motivational example
The plausible preorder ≲ defined in 4.2
is regular.
8.3 Properties
A conditional preorder ≲C is degenerate if and only if
0∼C.
The greatest plausible preorder is degenerate.
If a plausible preorder is degenerate, then for every pair of random
quantities X,Y in the linear span of E(T)
holds X∼Y. In particular, for every pair of real numbers r,s
holds r∼s, and for every pair of events A,B holds A∼B.
A plausible preorder is nondegenerate if and only if it coincides
with the order of real numbers on R.
9 Extended real line
9.1 Definition
We define the extended real line as the set R̄=R∪{−∞,+∞},
where R is the set of real numbers.
9.2 Order
We extend the order of real numbers to R̄ so
that −∞≤x≤+∞ for every x∈R̄,
turning R̄ into a linearly ordered set. In this
order, every subset U of R̄ has both the
least upper bound (supremum) denoted sup U and the greatest
lower bound (infimum) denoted inf U. In particular,
sup ∅=−∞
inf ∅=+∞
sup R=sup R̄=+∞
inf R=inf R̄=−∞
9.3 Arithmetic
We extend the arithmetic operations on real numbers to R̄
so that
if x≠−∞, then (+∞)+x=x+(+∞)=+∞
if x≠+∞, then (−∞)+x=x+(−∞)=−∞
if x>0, then x.(+∞)=(+∞).x=+∞
if x>0, then x.(−∞)=(−∞).x=−∞
if x<0, then x.(+∞)=(+∞).x=−∞
if x<0, then x.(−∞)=(−∞).x=+∞
if x is a real number, then x/(+∞)=x/(−∞)=0
if x is a positive real number, then (+∞)/x=+∞
if x is a positive real number, then (−∞)/x=−∞
if x is a negative real number, then (+∞)/x=−∞
if x is a negative real number, then (−∞)/x=+∞
Expressions other than the above are undefined. For example, the expressions
(+∞)+(−∞)
(−∞)+(+∞)
0.(+∞)
0.(−∞)
(+∞).0
(−∞).0
x/0
(+∞)/(+∞)
(+∞)/(−∞)
(−∞)/(+∞)
(−∞)/(−∞)
are all undefined.
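The arithmetic of the extended real line can be sketched as three small Python functions that return `None` where an expression is undefined. The sketch assumes the fraction rules read x/(±∞)=0 for real x and (±∞)/x=±∞ with the sign of x taken into account; the function names are hypothetical.

```python
import math

INF = math.inf

def ext_add(x, y):
    # (+∞)+(−∞) and (−∞)+(+∞) are undefined; everything else is ordinary.
    if math.isinf(x) and math.isinf(y) and x != y:
        return None
    return x + y

def ext_mul(x, y):
    # 0.(±∞) and (±∞).0 are undefined.
    if (x == 0 and math.isinf(y)) or (y == 0 and math.isinf(x)):
        return None
    return x * y

def ext_div(x, y):
    # Division by zero and (±∞)/(±∞) are undefined.
    if y == 0:
        return None
    if math.isinf(x) and math.isinf(y):
        return None
    return x / y

print(ext_add(INF, 3))      # defined
print(ext_add(INF, -INF))   # undefined
print(ext_mul(-2, INF))     # defined
print(ext_div(5, INF))      # defined
print(ext_div(INF, INF))    # undefined
```

Returning `None` rather than relying on floating-point `nan` keeps the undefined cases explicit, which matches the role "makes sense" plays in the rules for conditional expectation.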
10 Expectation naturally induced by a plausible preorder
10.1 Definition
Let ≲ be a plausible preorder and X be a random quantity.
Denoted E(X), the expectation of X (more precisely,
the expectation of X naturally induced by ≲)
is
a real number x, if for every positive real number ϵ
holds −ϵ⋦X−x⋦ϵ
+∞, if for every real number y holds y⋦X
−∞, if for every real number y holds X⋦y
not defined, if none of the above holds
10.2 Motivational example
Consider the plausible preorder defined in example 4.2.
For every random quantity X=(XH,XT) holds that E(X)=(1/2)XH+(1/2)XT.
In particular, E(H)=E(T)=1/2.
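Definition 10.1 can be tested directly on this example. The sketch below checks, for a finite list of ϵ values, whether a candidate real number x satisfies −ϵ⋦X−x⋦ϵ under the strict part of the preorder from 4.2. It is a spot check under stated assumptions (the sample X, the ϵ list and the helper names are hypothetical), not a proof.

```python
from fractions import Fraction

def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def embed(r):
    # Canonical embedding of a real number as (r, r).
    return (r, r)

def strict_less(x, y):
    # X ⋦ Y for the preorder of 4.2: the coordinates of Y − X sum to a positive number.
    d = sub(y, x)
    return d[0] + d[1] > 0

def is_expectation(value, X, epsilons):
    # Checks −ϵ ⋦ X − value ⋦ ϵ for each supplied ϵ.
    d = sub(X, embed(value))
    return all(strict_less(embed(-e), d) and strict_less(d, embed(e))
               for e in epsilons)

X = (Fraction(3), Fraction(-1))
candidate = (X[0] + X[1]) / 2          # the claimed value (1/2)XH + (1/2)XT
eps = [Fraction(1, 10 ** k) for k in range(6)]

print(is_expectation(candidate, X, eps))                     # candidate passes
print(is_expectation(candidate + Fraction(1, 100), X, eps))  # a nearby value fails
```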
10.3 Relation to regularity of plausible preorder
Let ≲ be a plausible preorder and r be a real number.
if 0⋦1, then E(r)=r
if 0∼1, then E(r) is not defined
if ≲ is the maximal plausible preorder and X is a random
quantity, then E(X) is not defined
10.4 Preorder consistency
Let r be a real number and X,Y be random quantities. 10.1
implies that
if E(X) exists and r<E(X), then r⋦X
if E(X) exists and E(X)<r, then X⋦r
if both E(X) and E(Y) exist and E(X)<E(Y), then X⋦Y
if both E(X) and E(Y) exist and X≲Y, then E(X)≤E(Y)
10.5 Existence and uniqueness
Let ≲ be a plausible preorder and X be a random quantity.
Then X has expectation if and only if in the extended real line
R holds that sup{r∈R∣r⋦X}=inf{r∈R∣X⋦r}.
In such case, E(X)=sup{r∈R∣r⋦X}=inf{r∈R∣X⋦r}.
11 Conditional expectation
11.1 Definition
Let ≲ be a plausible preorder, X be a random quantity
and C be an event. Denoted E(X∣C), the conditional expectation
of X given C is defined using 10.1 as
the expectation of X naturally induced by the conditional preorder
≲C.
11.2 Motivational example
Consider the plausible preorder defined in 4.2. For
every random quantity X=(XH,XT) holds that E(X∣H)=XH
and E(X∣T)=XT.
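The same kind of spot check works for the conditional case. Here the strict part of the conditional preorder ≲C is obtained by multiplying both sides by C before comparing, as in 7.1. The sample X, the ϵ list and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def mul(x, y):
    return (x[0] * y[0], x[1] * y[1])

def embed(r):
    return (r, r)

def cond_strict_less(x, y, c):
    # X ⋦_C Y for the preorder of 4.2: X.C ≲ Y.C holds but Y.C ≲ X.C does not,
    # i.e. the coordinates of (Y − X).C sum to a positive number.
    d = mul(sub(y, x), c)
    return d[0] + d[1] > 0

def is_cond_expectation(value, X, C, epsilons):
    # Checks −ϵ ⋦_C X − value ⋦_C ϵ for each supplied ϵ.
    d = sub(X, embed(value))
    return all(cond_strict_less(embed(-e), d, C) and cond_strict_less(d, embed(e), C)
               for e in epsilons)

H, T, ZERO = (1, 0), (0, 1), (0, 0)
X = (Fraction(3), Fraction(-1))
eps = [Fraction(1, 10 ** k) for k in range(6)]

print(is_cond_expectation(X[0], X, H, eps))           # E(X|H) = XH
print(is_cond_expectation(X[1], X, T, eps))           # E(X|T) = XT
print(is_cond_expectation(Fraction(0), X, ZERO, eps)) # nothing qualifies given 0
```

The last line reflects 7.2: the strict part of ≲0 is empty, so no value can satisfy the definition when conditioning on the zero event.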
11.2.1 Relation to regularity of plausible preorder
Let A,C be events and r be a real number. Then
if 0⋦C and r is a real number, then E(r∣C)=E(rC∣C)=r
if 0∼C then neither E(r∣C) nor E(rC∣C) is defined
11.3 As a function
Per 10.3 and 10.5, conditional
expectation is a partial function from T×E0(T)
to R.
12 Rules
Let ≲ be a plausible preorder, E(X∣C) be the conditional
expectation naturally induced by ≲, X,Y be random quantities,
B,C,D be events and r be a real number. The rules the conditional
expectation follows are:
12.1 Consistency
E(X∣C) exists if and only if E(X.C∣C) exists. In case it exists,
\[ E(X \mid C) = E(X.C \mid C) \]
12.2 Real additivity
If E(X∣C) exists, then
\[ E(X + r \mid C) = E(X \mid C) + r \]
12.3 General additivity
If the expression E(X∣C)+E(Y∣C) makes sense, then
\[ E(X + Y \mid C) = E(X \mid C) + E(Y \mid C) \]
12.4 Homogeneity
If the expression rE(X∣C) makes sense, then
\[ E(rX \mid C) = r\,E(X \mid C) \]
12.5 Conditional probability
If E(C∣D) exists, we, compatibly with Thomas Bayes (1763),
denote
\[ P(C \mid D) = E(C \mid D) \]
and say that it is the conditional probability of C given
D. For P(C∣1) we also use a simpler notation P(C).
12.6 Monotonicity
If both P(B∣D) and P(C∣D) exist and B≤C in the natural
order of events, then
\[ P(B \mid D) \le P(C \mid D) \]
12.7 Minimal and maximal probability
If P(C∣D) exists, then
\[ 0 \le P(C \mid D) \le 1 \]
12.8 Completeness
If P(B∣D)=0 and C≤B in the natural order of events, then
\[ P(C \mid D) = 0 \]
If P(B∣D)=1 and B≤C in the natural order of events, then
\[ P(C \mid D) = 1 \]
12.9 Subadditivity
If A1,…,An are events and all of P(A1∣D),…,P(An∣D),P(⋁i=1nAi∣D)
exist, then
\[ P\Bigl(\bigvee_{i=1}^{n} A_i \Bigm| D\Bigr) \le \sum_{i=1}^{n} P(A_i \mid D) \]
12.10 Bayes’ rule
12.10.1 Chain form
If the expression E(X∣C.D).P(C∣D) makes sense, then
\[ E(X.C \mid D) = E(X \mid C.D) \cdot P(C \mid D) \]
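The chain form, which the appendix establishes as E(X.C∣D)=E(X∣C.D).P(C∣D), can be illustrated numerically. The sketch assumes a hypothetical three-outcome algebra of triples with positive weights, where 0≲X if the weighted sum of the coordinates of X is nonnegative. For this assumed regular preorder, the conditional expectation of 11.1 reduces to a weighted average over the outcomes in C; both the weights and that closed form are assumptions of the example, not statements from the text.

```python
from fractions import Fraction

# Hypothetical positive weights of three outcomes.
W = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))

def mul(x, y):
    return tuple(a * b for a, b in zip(x, y))

def weighted_sum(x):
    return sum(w * a for w, a in zip(W, x))

def cond_expectation(x, c):
    # Weighted average of x over the outcomes in c; None when 0 ∼ C.
    denom = weighted_sum(c)
    if denom == 0:
        return None
    return weighted_sum(mul(x, c)) / denom

X = (Fraction(4), Fraction(-3), Fraction(6))
C = (1, 1, 0)
D = (1, 0, 1)

lhs = cond_expectation(mul(X, C), D)                              # E(X.C | D)
rhs = cond_expectation(X, mul(C, D)) * cond_expectation(C, D)     # E(X | C.D) . P(C | D)
print(lhs == rhs)
```

Here C.D is the first outcome alone, so E(X∣C.D) is the first coordinate of X, and the product with P(C∣D) recovers E(X.C∣D).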
12.10.2 Chain form, zero E(X|C.D)
If E(X∣C.D)=0, then
\[ E(X.C \mid D) = 0 \]
12.10.3 Chain form, zero P(C|D)
If P(C∣D)=0 and there is a real number p such that
−p ≲_{C.D} X ≲_{C.D} p, then
\[ E(X.C \mid D) = 0 \]
12.10.4 Chain form, infinite E(X|C.D)
If E(X∣C.D)∈{−∞,+∞} and there is a positive
real number p such that p ≲_D C, then
\[ E(X.C \mid D) = E(X \mid C.D) \]
12.10.5 Conditional form
If the expression E(X.C∣D)/P(C∣D) makes sense, then
\[ E(X \mid C.D) = \frac{E(X.C \mid D)}{P(C \mid D)} \]
12.10.6 Conditional form, zero E(X.C|D)
If E(X.C∣D)=0 and there is a positive real number p
such that p ≲_D C, then
\[ E(X \mid C.D) = 0 \]
12.10.7 Conditional form, infinite E(X.C|D)
If E(X.C∣D)∈{−∞,+∞}, then
\[ E(X \mid C.D) = E(X.C \mid D) \]
12.10.8 Conditional form, zero P(C|D)
If P(C∣D)=0 and there is a real number p such that 1 ≲_D pX.C,
then p≠0 and
\[ E(X \mid C.D) = \frac{+\infty}{p} \]
12.10.9 P form
If the expression E(X.C∣D)/E(X∣C.D) makes sense, then
\[ P(C \mid D) = \frac{E(X.C \mid D)}{E(X \mid C.D)} \]
12.10.10 P form, zero E(X.C|D)
If E(X.C∣D)=0 and there is a real number p such that
1 ≲_{C.D} pX, then
\[ P(C \mid D) = 0 \]
12.10.11 P form, infinite E(X|C.D)
If E(X∣C.D)∈{−∞,+∞} and there
is a real number p such that −p ≲_D X.C ≲_D p,
then
\[ P(C \mid D) = 0 \]
13 Coherence
13.1 Definition
We say that PV is a coherent function if it is a partial
function from T×E(T)
to R such that if
n≥0,m≥1 are integers
q1,…,qn are nonnegative real numbers
r1,…,rm, s1,…,sm are real numbers
C1,…,Cn, D1,…,Dm are events
X1,…,Xm are random quantities
rj(PV(Xj∣Dj)+sj)>0 for every j∈{1,…,m}
then
[TABLE]
13.2 Kolmogorovian plausible values
Let R≥0 denote the set of nonnegative real numbers.
We say that PV is a Kolmogorovian plausible value if
PV is a function from F to R≥0,
where F is a nonempty subset of E(T)
closed under negation and conjunction
PV(1)=1 (unitarity)
if A,B∈F and A.B=0, then PV(A+B)=PV(A)+PV(B)
(additivity)
Using the notation PV(A∣1)=PV(A), we can handle every Kolmogorovian
plausible value as a function from F×{1} to R≥0.
13.2.1 Coherence
Every Kolmogorovian plausible value is coherent.
13.3 Coxian plausible values
We say that PV is a Coxian plausible value if
PV is a function from F×F0 to R≥0,
where F is a nonempty subset of E(T)
closed under negation and conjunction and F0 is the
set containing all elements of F except for 0
if C∈F0, then PV(C∣C)>0 (positivity)
if A∈F and C∈F0, then PV(1−A∣C)=1−PV(A∣C)
(negation formula)
if A,C,D are elements of F and C.D≠0, then PV(A.C∣D)=PV(A∣C.D).PV(C∣D)
(Bayes’ rule)
13.3.1 Basic properties
Let PV be a function from F×F0 to
R≥0 that satisfies definition 13.3,
A∈F and C∈F0. Then
PV(C∣C)=1
PV(1∣C)=1
PV(0∣C)=0
PV(A.C∣C)=PV(A∣C)
13.3.2 Sum rule
Let PV be a function from F×F0 to
R≥0 that satisfies definition 13.3.
Let A,B∈F such that A.B=0 and let C∈F0.
Then
\[ PV(A + B \mid C) = PV(A \mid C) + PV(B \mid C) \]
13.3.3 Subadditivity
Let PV be a function from F×F0 to
R≥0 that satisfies definition 13.3.
Let A1,…,An∈F and let C∈F0.
Then
\[ PV\Bigl(\bigvee_{i=1}^{n} A_i \Bigm| C\Bigr) \le \sum_{i=1}^{n} PV(A_i \mid C) \]
13.3.4 Coherence
Every Coxian plausible value is coherent.
13.4 Dupré-Tiplerian plausible values
We say that PV is a Dupré-Tiplerian plausible value if
PV is a partial function from T×C to
R, where C is a subset of E0(T)
closed under disjunction
if A is an event and C∈C, then PV(A∣C) exists
and PV(A∣C)≥0 (nonnegativity)
if C∈C, then PV(C∣C)>0 (positivity)
if r∈R, C∈C, X is a random quantity
and PV(X∣C) exists, then PV(rX∣C)=r.PV(X∣C) (homogeneity)
if C∈C, X, Y are random quantities and both PV(X∣C)
and PV(Y∣C) exist, then PV(X+Y∣C)=PV(X∣C)+PV(Y∣C) (additivity)
if C.D∈C, D∈C, X is a random quantity
and PV(X∣C.D) exists, then PV(X.C∣D)=PV(X∣C.D).PV(C∣D) (Bayes’
rule)
13.4.1 Subadditivity
Let PV be a function from T×C to R
that satisfies definition 13.4. Let A1,…,An
be events and let C∈C. Then
\[ PV\Bigl(\bigvee_{i=1}^{n} A_i \Bigm| C\Bigr) \le \sum_{i=1}^{n} PV(A_i \mid C) \]
13.4.2 Coherence
Every Dupré-Tiplerian plausible value is coherent.
13.5 Characterizations
Let PV be a partial function from T×E(T)
to R. Then the following characterizations
are equivalent:
1. PV is coherent
2. PV can be extended to a conditional expectation naturally induced
by a regular plausible preorder
3. PV can be extended to a conditional expectation naturally induced
by a plausible preorder
13.6 Probability as a plausibly complete function
We say that a function is plausibly complete, if it is naturally
induced by a regular plausible preorder. According to 13.2.1,
13.3.4, 13.4.2 and 13.5,
we can, without loss of generality, characterize probability as a
plausibly complete function.
14 Conclusion
Assigning the role of a primitive notion to the notion of a plausible
preorder, our formalization offers a different perspective on the
foundations of probability than the formalizations discussed in the
introduction. Our approach neither forces us to define conditional
probability by a ratio of unconditional probabilities which is criticized
as inadequate, nor does it force us to postulate conditional probability
to have other properties open to doubt. The formalization is supported
by theorem 13.5, confirming that it encompasses
all coherent instances of probability. We supplement it by verifying
that according to all formalizations of the probability notion discussed
in the introduction, probability is coherent. To illustrate that our
formalization satisfies Hájek’s (2003) main requirements,
consider a nonzero event C such that P(C) is either
zero or undefined. Because of that, the ratio P(A∧C)/P(C)
leaves the conditional probability P(A∣C) undefined.
On the other hand, once probability is coherent, theorem 13.5
confirms that it can be extended to a plausibly complete function,
i.e. to a conditional expectation naturally induced by a regular plausible
preorder. Definition 11.1 applied to a regular plausible
preorder yields that P(A∣C)=0 if A∧C=0 and
P(A∣C)=1 if A∧C=C, no matter whether P(C)
is zero or whether it is defined.
15 Appendix
15.1 Proof of 3.5
Let n≥1, p1,…,pn be positive real numbers and
C1,…,Cn be nonzero events. Let A be the
Boolean subalgebra of E(T) generated
by C1,…,Cn. Since A is finitely generated,
it is finite and atomic Givant, Halmos (2008). Since C1 is nonzero,
there is an atom D of A such that D≤C1 in
the natural order, i.e. C1.D=D. Since D is an atom, for every
i∈{2,…,n} either Ci.D=D or Ci.D=0. Therefore,
(∑i=1npiCi).D=p1D+∑i=2npiCi.D=pD,
where p≥p1 is a real number. Since D is nonzero and p
is a positive real number, pD is nonzero. Since (∑i=1npiCi).D
is nonzero, ∑i=1npiCi is nonzero.
15.2 Proof of 4.4.7
Let A1,…,An be events and ≲ be a plausible
preorder. Due to reflexivity of ≲, the inequality ⋁i=1nAi≲∑i=1nAi
holds for n=0 and n=1. Let the inequality ⋁i=1nAi≲∑i=1nAi
hold for some integer n≥1 and arbitrary events A1,…,An.
Let A1,…,An+1 be events. Define events B1=A1,…,Bn−1=An−1,Bn=An∨An+1.
Then ⋁i=1n+1Ai=⋁i=1nBi≲∑i=1nBi=∑i=1n+1Ai−An.An+1≲∑i=1n+1Ai.
By mathematical induction the inequality holds for every integer n.
15.3 Proof of 5.3.1
Let ∼ be the equivalence part of a plausible preorder ≲,
let p1,…,pn be positive real numbers, A1,…,An
be events and 0∼∑i=1npiAi. Without loss of generality,
we assume that n≥1 and prove that 0∼A1. Per the assumption,
∑i=1npiAi≲0. Per 4.1, −∑i=2npiAi≲0.
Therefore, A1=(1/p1)(∑i=1npiAi−∑i=2npiAi)≲0.
Together with the plausible property of ≲ guaranteeing that
0≲A1, we obtain that 0∼A1.
15.4 Proof of 5.4
Let ∼ be a relation having the fundamental properties of a plausible
equivalence. We define a relation ≲ so that 0≲X
if X=U+∑i=1nqiAi for some random quantity U∼0,
integer n≥0, nonnegative real numbers q1,…,qn
and events A1,…,An. We also define that X≲Y
if 0≲Y−X.
The task to verify that the relation ≲ defined this way
is a plausible preorder and that ∼ is its equivalence part,
is left as an exercise to the reader. Note also that the relation
≲ defined this way is the smallest plausible preorder with
respect to inclusion such that ∼ is its equivalence part.
15.5 Proof of 6.4
Let ⋦ be a relation having the fundamental properties of a
plausible strict partial order. Let X be a random quantity. We
define that 0≲X if either 0⋦X or if there are
real numbers r1,r2,…,rn and events A1,A2,…,An
such that X=∑i=1nriAi and for every i∈{1,2,…,n}
holds ¬(0⋦Ai). We also define that X≲Y if
0≲Y−X.
The task to verify that the relation ≲ defined this way
is a plausible preorder and that ⋦ is its strict part, is
left as an exercise to the reader. Note also that the relation ≲
defined this way is the smallest plausible preorder with respect to
inclusion such that ⋦ is its strict part.
15.6 Proof of 10.2
Let X=(XH,XT). Then X−((1/2)XH+(1/2)XT)=((1/2)XH−(1/2)XT)H+((1/2)XT−(1/2)XH)T.
Since ((1/2)XH−(1/2)XT)+((1/2)XT−(1/2)XH)=0,
4.2 implies that X−((1/2)XH+(1/2)XT)∼0.
Let ϵ be a positive real number. Per 4.2,
0⋦ϵ. Therefore, −ϵ⋦0≲X−((1/2)XH+(1/2)XT)≲0⋦ϵ.
15.7 Proof of 10.5
Let ≲ be a plausible preorder and X be a random quantity.
If E(X)=+∞, then in the extended real line, sup{r∈R∣r⋦X}=supR=+∞=inf∅=inf{r∈R∣X⋦r}.
If E(X)=−∞, then sup{r∈R∣r⋦X}=sup∅=−∞=infR=inf{r∈R∣X⋦r}.
If E(X)=x∈R, then per 10.4, for
r<x holds that r⋦X and for x<r holds that X⋦r.
Due to antireflexivity and transitivity of ⋦, there is no
real number r such that both r⋦X and X⋦r. Therefore,
if r⋦X then also r≤x. Similarly, if X⋦r, then
also x≤r. This means that x is an upper bound of {r∈R∣r⋦X}.
x is also the least upper bound of {r∈R∣r⋦X},
since for every real number r smaller than x there is a greater
real number s that is still smaller than x, which means that
r<x is not an upper bound of {r∈R∣r⋦X}. Similarly,
x is the greatest lower bound of {r∈R∣X⋦r}.
Vice versa, let in the extended real line sup{r∈R∣r⋦X}=x=inf{r∈R∣X⋦r}.
If x=+∞, then sup{r∈R∣r⋦X}=+∞,
which means that if y is a real number, then there is a real number
r such that y<r and r⋦X. Per 4.4.4
also y⋦X, which proves that E(X)=+∞=x.
If x=−∞, then inf{r∈R∣X⋦r}=−∞,
which means that if y is a real number, then there is a real number
r such that r<y and X⋦r. Per 4.4.4
also X⋦y, which proves that E(X)=−∞=x.
If x∈R and ϵ is a positive real number, then
since sup{r∈R∣r⋦X}=x, there is a real number
r such that x−ϵ<r⋦X, and due to 4.4.4
also x−ϵ⋦X. Since x=inf{r∈R∣X⋦r},
there is a real number r such that X⋦r<x+ϵ , and
due to 4.4.4 also X⋦x+ϵ.
Per 10.1, E(X)=x.
15.8 Proof of 12.10.4
Let E(X∣C.D)∈{+∞,−∞} and
p be a positive real number such that p ≲_D C. Per 7.1,
pD≲C.D. Define
\[ s = \begin{cases} 1 & \text{if } E(X \mid C.D) = +\infty \\ -1 & \text{if } E(X \mid C.D) = -\infty \end{cases} \]
and
\[ Y = sX \]
Per 12.4, E(Y∣C.D)=sE(X∣C.D)=+∞.
Let y be a real number. Define z=max(0,y)/p.
Then z≥0. Since E(Y∣C.D)=+∞ and z is a
real number, zC.D⋦Y.C.D. Therefore, yD≲max(0,y)D=zpD≲zC.D⋦Y.C.D.
This proves that E(Y.C∣D)=+∞. Since X=sY and
per 12.4, E(X.C∣D)=sE(Y.C∣D)=E(X∣C.D).
15.9 Proof of 12.10.1
Let x=E(X∣C.D) be a real number, c=P(C∣D)
be a real number and let ϵ be a positive real number. Define
δ=ϵ/(1+∣x∣). Then δ is a positive
real number and since x=E(X∣C.D), −δC.D⋦X.C.D−xC.D⋦δC.D.
In the natural order of events C.D≤D and per 4.4.3,
C.D≲D. Therefore, −δD⋦X.C.D−xC.D⋦δD.
Since c=P(C∣D), −δD⋦C.D−cD⋦δD,
implying that −∣x∣δD≲xC.D−xcD≲∣x∣δD.
Summing inequalities, we get that −(1+∣x∣)δD⋦X.C.D−xcD⋦(1+∣x∣)δD.
Since (1+∣x∣)δD=ϵD, also −ϵD⋦X.C.D−xcD⋦ϵD,
which proves that E(X.C∣D)=xc.
The only case when the formula E(X∣C.D).P(C∣D)
makes sense and at least one of E(X∣C.D), P(C∣D)
is not a real number, is the case when E(X∣C.D)∈{−∞,+∞}
and c=P(C∣D) is a positive real number. In this case,
note that 0<c/2<c, use 10.4 to obtain
that c/2 ⋦_D C and use 12.10.4 proven
above to get that E(X.C∣D)=E(X∣C.D)=E(X∣C.D).P(C∣D).
15.10 Proof of 12.10.2
Let ϵ be a positive real number. If E(X∣C.D)=0,
then per 11.1, −ϵC.D⋦X.C.D⋦ϵC.D.
In the natural order of events C.D≤D and per 4.4.3,
C.D≲D. Therefore, −ϵD≲−ϵC.D⋦X.C.D⋦ϵC.D≲ϵD,
which proves that E(X.C∣D)=0.
15.11 Proof of 12.10.3
Let P(C∣D)=0 and p be a real number such that −pC.D≲X.C.D≲pC.D.
Define q=max(1,p). Then both q>0 and q≥p.
Therefore, pC.D≲qC.D. Let ϵ be a positive real
number. Define δ=ϵ/q. Since P(C∣D)=0,
−δD⋦C.D⋦δD, i.e. also qC.D⋦qδD.
Therefore, −ϵD=−qδD⋦−qC.D≲−pC.D≲X.C.D≲pC.D≲qC.D⋦qδD=ϵD,
which proves that E(X.C∣D)=0.
15.12 Proof of 12.10.7
Let E(X.C∣D)∈{−∞,+∞}. Define
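\[ s = \begin{cases} 1 & \text{if } E(X.C \mid D) = +\infty \\ -1 & \text{if } E(X.C \mid D) = -\infty \end{cases} \]
and
\[ Y = sX \]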
Per 12.4, E(Y.C∣D)=s.E(X.C∣D)=+∞.
Let y be a real number. Define z=max(0,y). Then
both z≥0 and z≥y. Since E(Y.C∣D)=+∞,
zD⋦Y.C.D. In the natural order of events C.D≤D and
per 4.4.3, C.D≲D. Therefore,
yC.D≲zC.D≲zD⋦Y.C.D, which proves that E(Y∣C.D)=+∞.
Since X=sY and per 12.4, E(X∣C.D)=sE(Y∣C.D)=E(X.C∣D).
15.13 Proof of 12.10.5
Let x=E(X.C∣D) be a real number, c=P(C∣D)
be a nonzero real number. Per 12.7, c>0.
Let ϵ be a positive real number. Define δ=c.min(1,ϵ)/(2.(1+∣x∣/c)).
Then 0<δ≤c/2. Equality c=P(C∣D), the
fact that δ is positive and 10.1 give −δD⋦C.D−cD⋦δD.
Therefore, (c/2)D=cD−(c/2)D≲cD−δD⋦C.D,
i.e. D⋦(2/c)C.D. By 4.1.3,
−δ(∣x∣/c)D≲−(x/c)C.D+xD≲δ(∣x∣/c)D.
Equality x=E(X.C∣D) and 10.1 give −δD⋦X.C.D−xD⋦δD.
Summing the inequalities, we get that −δ(1+∣x∣/c)D⋦X.C.D−(x/c)C.D⋦δ(1+∣x∣/c)D.
Also, δ(1+∣x∣/c)D⋦δ(1+∣x∣/c)(2/c)C.D=min(1,ϵ)C.D≲ϵC.D.
Combining these inequalities, we get that −ϵC.D⋦−δ(1+∣x∣/c)D⋦X.C.D−(x/c)C.D⋦δ(1+∣x∣/c)D⋦ϵC.D,
which proves that E(X∣C.D)=x/c.
The only remaining case when the expression E(X.C∣D)/P(C∣D)
makes sense is the case when E(X.C∣D)∈{−∞,+∞}
and P(C∣D) is a positive real number. In this case, use the equality
E(X∣C.D)=E(X.C∣D) from 12.10.7
proven above and the fact that E(X.C∣D)=E(X.C∣D)/P(C∣D)
to finally obtain E(X∣C.D)=E(X.C∣D)/P(C∣D).
15.14 Proof of 12.10.6
Let E(X.C∣D)=0 and let p be a positive real number
such that pD≲C.D. Let ϵ be a positive real number.
Define δ=ϵp. Since E(X.C∣D)=0 and δ>0,
per 10.1 −δD⋦X.C.D⋦δD, implying
that −ϵC.D≲−ϵpD=−δD⋦X.C.D⋦δD=ϵpD≲ϵC.D,
which proves that E(X∣C.D)=0.
15.15 Proof of 12.10.8
Let P(C∣D)=0 and let p be a real number such that
1≲DpX.C. Defining Y=pX we get 1≲DY.C
and per 11.1, D≲Y.C.D. Let y be a real
number. Define ϵ=1/max(1,y). Then ϵ
is a positive real number and since P(C∣D)=0, −ϵD⋦C.D⋦ϵD.
Therefore, (1/ϵ)C.D⋦D and yC.D≲max(1,y)C.D=(1/ϵ)C.D⋦D≲Y.C.D.
This proves that E(Y∣C.D)=+∞. It also proves that
0⋦Y, implying that Y≠0 and since Y=pX also p≠0
and X=(1/p)Y. Per 12.4, E(X∣C.D)=(1/p)E(Y∣C.D)=(+∞)/p.
15.16 Proof of 12.10.11
Let E(X∣C.D)∈{−∞,+∞} and
let p be a real number such that −p≲DX.C≲Dp.
Per 7.1, −pD≲X.C.D≲pD.
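Define
\[ q = \max(1, p) \]
\[ s = \begin{cases} 1 & \text{if } E(X \mid C.D) = +\infty \\ -1 & \text{if } E(X \mid C.D) = -\infty \end{cases} \]
and
\[ Y = sX \]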
Then q>0 and q≥p. Therefore, −qD≲−pD≲X.C.D≲pD≲qD.
Since Y=sX and s∈{−1,1}, also −qD≲Y.C.D≲qD.
Per 12.4, E(Y∣C.D)=s.E(X∣C.D)=+∞.
Let ϵ be a positive real number. If we define y=q/ϵ,
then y>0. Since E(Y∣C.D)=+∞, yC.D⋦Y.C.D.
Combining inequalities, we get that yC.D⋦qD. Therefore, C.D⋦(q/y)D=ϵD.
Per 4.4.3, 0≲C.D. Combining
inequalities, we get that −ϵD⋦−C.D≲0≲C.D⋦ϵD,
proving that P(C∣D)=0.
15.17 Proof of 12.10.9
Let v=E(X.C∣D) be a real number and let x=E(X∣C.D)
be a nonzero real number. Let ϵ be a positive real number.
Define δ=∣x∣ϵ/2. Then δ
is a positive real number and since v=E(X.C∣D), −δD⋦X.C.D−vD⋦δD.
Since x=E(X∣C.D), −δC.D⋦X.C.D−xC.D⋦δC.D.
In the natural order of events, C.D≤D and per 4.4.3
C.D≲D. Therefore, −δD≲−δC.D⋦−X.C.D+xC.D⋦δC.D≲δD.
Summing inequalities, we get −2δD⋦xC.D−vD⋦2δD,
i.e. −ϵD=−(2δ/∣x∣)D⋦C.D−(v/x)D⋦(2δ/∣x∣)D=ϵD.
This proves that E(C∣D)=v/x.
If E(X.C∣D)∈{−∞,+∞}, then
per 12.10.7, E(X∣C.D)=E(X.C∣D),
i.e. the expression E(X.C∣D)/E(X∣C.D)
does not make sense.
Let v=E(X.C∣D) be a real number and let E(X∣C.D)∈{−∞,+∞}.
Define p=1+∣v∣. Per 11,
−1D⋦X.C.D−vD⋦1D, i.e. −pD=−(1+∣v∣)D≲−(1−v)D⋦X.C.D⋦(1+v)D≲(1+∣v∣)D=pD.
This demonstrates that the assumptions of 12.10.11
hold. Therefore, P(C∣D)=0=v/E(X∣C.D).
15.18 Proof of 12.10.10
Let E(X.C∣D)=0 and let p be a real number such that
1≲C.DpX. Define Y=pX. Per 12.4, E(Y.C∣D)=pE(X.C∣D)=0.
Also, 1≲C.DY. Per 7.1, C.D≲Y.C.D.
Let ϵ be a positive real number. Since E(Y.C∣D)=0,
−ϵD⋦Y.C.D⋦ϵD. Combining inequalities,
C.D⋦ϵD. Per 4.1.1,
0≲C.D. Combining inequalities, we get −ϵD⋦−C.D≲0≲C.D⋦ϵD,
proving that P(C∣D)=0.
15.19 Proof of 13.2.1
Let PV be a function from F to R≥0
that satisfies 13.2. Let n≥0, m≥1 be
integers, q1,…,qn be nonnegative real numbers, r1,…,rm,
s1,…,sm be real numbers, C1,…,Cn be events,
A1,…,Am be elements of F and let rj(PV(Aj)+sj)>0
for every j∈{1,…,m}.
Let A be a Boolean algebra generated by A1,…,Am
and B be a Boolean algebra generated by A1,…,Am,C1,…,Cn.
Then A⊆B⊆F. Since
both A and B are finitely generated, they
are finite and atomic Givant, Halmos (2008). Let at(A)
be the set of atoms of A and at(B)
be the set of atoms of B. Let span(B)
be the linear span of B. Let G∈at(B).
If X∈span(B), then there is a unique
real number r such that X.G=rG. This allows us to define a function
φ from span(B)×at(B)
to R such that X.G=φ(X,G)G. The properties
of the function φ are:
If B∈B, then φ(B,G)∈{0,1},
implying that φ(B,G).φ(B,G)=φ(B,G).
(idempotence)
If X,Y∈span(B) and X.Y=0, then
φ(X,G).φ(Y,G)=φ(X.Y,G)=0.
(orthogonality)
If X,Y∈span(B), then φ(X+Y,G)=φ(X,G)+φ(Y,G).
(additivity)
If r is a real number and X∈span(B),
then φ(rX,G)=rφ(X,G). (homogeneity)
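The function φ becomes concrete when events are modelled as subsets of a finite set, so that an element of span(B) is a real-valued function that is constant on every atom. The Python sketch below illustrates φ under that assumption; it is not the abstract construction used in the proof, and the names `atoms`, `indicator`, `phi` and the sample sets are hypothetical.

```python
from itertools import product

def atoms(omega, generators):
    """Atoms of the Boolean algebra of subsets of omega generated by `generators`."""
    found = []
    # Each atom is a nonempty intersection of generators or their complements.
    for signs in product([True, False], repeat=len(generators)):
        cell = set(omega)
        for keep, g in zip(signs, generators):
            cell &= g if keep else (omega - g)
        if cell:
            found.append(frozenset(cell))
    return found

def indicator(event, omega):
    """The event as a 0/1-valued function on omega."""
    return {w: (1 if w in event else 0) for w in omega}

def phi(x, atom):
    """The unique real r with X.G = rG, i.e. the constant value of x on the atom."""
    values = {x[w] for w in atom}
    if len(values) != 1:
        raise ValueError("x is not constant on the atom")
    return values.pop()

omega = frozenset(range(4))
A1, C1 = frozenset({0, 1}), frozenset({1, 2})
at_B = atoms(omega, [A1, C1])
G = frozenset({1})
# X = 2*A1 + 3*C1, an element of the linear span.
X = {w: 2 * indicator(A1, omega)[w] + 3 * indicator(C1, omega)[w] for w in omega}
```

Here `phi(indicator(A1, omega), G)` lies in {0,1}, and `phi(X, G)` equals `2*phi(A1, G) + 3*phi(C1, G)`, in line with the additivity and homogeneity properties listed above.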
Define a function ν from B to R so that
for B∈B, ν(B)=∑G∈at(B)φ(B,G).
The properties of the function ν are
If 0≠B∈B, then ν(B)>0.
Define a function F from span(B)
to R so that for X∈span(B),
[TABLE]
The properties of the function F are
If B∈B, then F(B)≥0. (nonnegativity)
F is additive.
F is homogeneous.
F coincides with PV on A.
Therefore, ∑j=1mrj(F(Aj)+sjF(1))=∑j=1mrj(PV(Aj)+sjPV(1))>0.
Also, F(∑i=1nqiCi)≥0 and F(∑i=1nqiCi+∑j=1mrj(Aj+sj))>0.
Since F(0)=0, this proves that ∑i=1nqiCi+∑j=1mrj(Aj+sj)≠0.
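To see how such an extension F can look in a concrete case, the Python sketch below uses one definition that has the four listed properties: it spreads PV(H) evenly over the atoms of the larger algebra lying below each atom H of the smaller one, i.e. F(X) is the sum over H of (PV(H)/ν(H)) times the sum of φ(X,G) over the atoms G below H. This formula is an assumption made for illustration and is not claimed to be the author's definition. Events are modelled as subsets of a finite set, and `atoms`, `PV_atom`, `nu`, `F`, `indicator` and the sample data are hypothetical names.

```python
from fractions import Fraction
from itertools import product

def atoms(omega, generators):
    """Atoms of the Boolean algebra of subsets of omega generated by `generators`."""
    found = []
    for signs in product([True, False], repeat=len(generators)):
        cell = set(omega)
        for keep, g in zip(signs, generators):
            cell &= g if keep else (omega - g)
        if cell:
            found.append(frozenset(cell))
    return found

omega = frozenset(range(6))
A1 = frozenset({0, 1, 2})          # generator of the smaller algebra
C1 = frozenset({2, 3})             # additional event
at_A = atoms(omega, [A1])
at_B = atoms(omega, [A1, C1])

# Assumed additive PV, given by its values on the atoms of the smaller algebra.
PV_atom = {frozenset({0, 1, 2}): Fraction(1, 4),
           frozenset({3, 4, 5}): Fraction(3, 4)}

def nu(b):
    """Number of atoms of the larger algebra contained in the event b."""
    return sum(1 for g in at_B if g <= b)

def F(x):
    """Assumed extension of PV to functions that are constant on the atoms in at_B."""
    total = Fraction(0)
    for h in at_A:
        share = PV_atom[h] / nu(h)
        # x[next(iter(g))] is the constant value of x on the atom g, i.e. phi(x, g).
        total += share * sum(x[next(iter(g))] for g in at_B if g <= h)
    return total

def indicator(b):
    return {w: (1 if w in b else 0) for w in omega}

# 2*C1 + 1*(A1 + 0): here q1 = 2, r1 = 1, s1 = 0 and r1*(PV(A1) + s1) = 1/4 > 0.
X = {w: 2 * indicator(C1)[w] + indicator(A1)[w] for w in omega}
```

With these data `F(indicator(A1))` equals PV(A1)=1/4, `F` vanishes on the zero function, and `F(X)` is 5/4, which is positive, so the combination X cannot be zero.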
15.20 Proof of 13.3.2
Let PV be a function from F×F0 to
R≥0 that satisfies definition 13.3.
Let A,B be elements of F such that A.B=0 and C∈F0.
Then A+B=1−(1−A).(1−B) is an element of F.
If (1−B).C=0, then B.C=C, i.e. PV(B∣C)=PV(B.C∣C)=PV(C∣C)=1.
Also A.C=A.B.C=0 implying PV(A∣C)=PV(A.C∣C)=0. Finally, PV(A+B∣C)=PV((A+B).C∣C)=PV(A.C+B.C∣C)=PV(C∣C)=1.
Therefore, PV(A+B∣C)=PV(A∣C)+PV(B∣C).
If (1−B).C≠0, then PV(A+B∣C)=1−PV((1−A).(1−B)∣C)
=1−PV((1−A)∣(1−B).C).PV(1−B∣C)
=1−[1−PV(A∣(1−B).C)].PV(1−B∣C)
=1−PV(1−B∣C)+PV(A∣(1−B).C).PV(1−B∣C)
=PV(B∣C)+PV(A.(1−B)∣C)
=PV(B∣C)+PV(A−A.B∣C)
=PV(B∣C)+PV(A∣C).
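The second case of the proof can be sanity-checked in the classical finite model, where PV(A∣C) is the ratio of the weight of A.C to the weight of C. This is a sketch under the assumption that every outcome has positive weight, so it covers only conditions of positive probability; `weights`, `pv` and the chosen events are illustrative.

```python
from fractions import Fraction

# Hypothetical weights on four outcomes, summing to 1.
weights = {0: Fraction(1, 10), 1: Fraction(2, 10),
           2: Fraction(3, 10), 3: Fraction(4, 10)}
omega = frozenset(weights)

def pv(a, c):
    """Classical conditional probability of a given a condition c of positive weight."""
    return sum(weights[w] for w in a & c) / sum(weights[w] for w in c)

A = frozenset({0})
B = frozenset({1})            # A.B = 0: the two events are disjoint
C = frozenset({0, 1, 2})      # (1-B).C is nonzero
not_A, not_B = omega - A, omega - B

lhs = pv(A | B, C)                                  # PV(A+B | C)
bayes = 1 - pv(not_A, not_B & C) * pv(not_B, C)     # 1 - PV(1-A | (1-B).C).PV(1-B | C)
rhs = pv(A, C) + pv(B, C)                           # PV(A | C) + PV(B | C)
```

All three values equal 1/2 for these data, matching the first, second and last expressions of the chain.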
15.21 Proof of 13.3.4
Let PV be a function from F×F0 to
R≥0 that satisfies definition 13.3.
Let n≥0,m≥1 be integers, q1,…,qn be nonnegative
real numbers, r1,…,rm, s1,…,sm be real
numbers, C1,…,Cn be events, A1,…,Am be
elements of F, D1,…,Dm be elements of F0
and for every j∈{1,…,m}, rj(PV(Aj∣Dj)+sj)>0.
Let D=⋁j=1mDj. Since PV is subadditive, PV(⋁j=1mDj∣D)≤∑j=1mPV(Dj∣D).
Since 0<PV(D∣D), there is a k∈{1,…,m} such that 0<PV(Dk∣D).
Let j∈{1,…,m}. Due to Bayes’ rule and nonnegativity,
rj(PV(Aj.Dj∣D)+sjPV(Dj∣D))=rj(PV(Aj∣Dj).PV(Dj∣D)+sjPV(Dj∣D))=rj(PV(Aj∣Dj)+sj).PV(Dj∣D)≥0.
If k∈{1,…,m} is such that 0<PV(Dk∣D), then rk(PV(Ak.Dk∣D)+skPV(Dk∣D))>0.
Since such k∈{1,…,m} exists, ∑j=1mrj(PV(Aj.Dj∣D)+sjPV(Dj∣D))>0.
Let A be the Boolean algebra generated by A1,…,Am,D1,…,Dm.
Since A is finitely generated, it is finite and atomic
Givant, Halmos (2008). Let at(A) be the set
of atoms of A.
Let B be the Boolean algebra generated by A1,…,Am,D1,…,Dm,C1,…,Cn.
Since B is finitely generated, it is finite and atomic
Givant, Halmos (2008). Let at(B) be the set
of atoms of B and span(B)
be the linear span of B.
Define a function φ from span(B)×at(B)
to R so that if X∈span(B)
and G∈at(B), then X.G=φ(X,G).G.
Also define a function ν from B to R
so that if H∈B, then ν(H)=∑G∈at(B)φ(H,G).
Finally, define a function F from span(B)×{D}
to R so that if X∈span(B),
then
[TABLE]
The reader can verify that F is homogeneous and additive on span(B)×{D},
nonnegative on B×{D} and that it coincides with
PV on A×{D}. Therefore,
F(∑j=1mrj(Aj+sj).Dj∣D)=∑j=1mrj(F(Aj.Dj∣D)+sjF(Dj∣D))=∑j=1mrj(PV(Aj.Dj∣D)+sjPV(Dj∣D))>0.
Also, F(∑i=1nqiCi∣D)≥0. Therefore, F(∑i=1nqiCi+∑j=1mrj(Aj+sj).Dj∣D)>0.
Since F(0∣D)=0, this proves that ∑i=1nqiCi+∑j=1mrj(Aj+sj).Dj≠0.
15.22 Proof of 13.4.2
Let PV be a partial function from T×C
to R that satisfies definition 13.4.
Let n≥0,m≥1 be integers, q1,…,qn be nonnegative
real numbers, r1,…,rm, s1,…,sm be real
numbers, C1,…,Cn, D1,…,Dm be events, X1,…,Xm
be random quantities and for every j∈{1,…,m}, rj(PV(Xj∣Dj)+sj)>0.
Due to additivity of PV, if C∈C, then PV(0∣C)=PV(0∣C)+PV(0∣C),
which implies that PV(0∣C)=0.
Let D=⋁j=1mDj. Due to subadditivity, PV(D∣D)≤∑j=1mPV(Dj∣D).
Since 0<PV(D∣D), there is a k∈{1,…,m} such that 0<PV(Dk∣D).
Let j∈{1,…,m}. Due to Bayes’ rule and nonnegativity,
rj(PV(Xj.Dj∣D)+sjPV(Dj∣D))=rj(PV(Xj∣Dj).PV(Dj∣D)+sjPV(Dj∣D))=rj(PV(Xj∣Dj)+sj).PV(Dj∣D)≥0.
If k∈{1,…,m} is such that 0<PV(Dk∣D), then rk(PV(Xk.Dk∣D)+skPV(Dk∣D))>0.
Since such k∈{1,…,m} exists, ∑j=1mrj(PV(Xj.Dj∣D)+sjPV(Dj∣D))>0.
Additivity and homogeneity of PV imply that PV(∑j=1mrj(Xj+sj).Dj∣D)>0.
Moreover, PV(∑i=1nqiCi∣D)≥0, i.e. also PV(∑i=1nqiCi+∑j=1mrj(Xj+sj).Dj∣D)>0.
Since PV(0∣D)=0, this proves that ∑i=1nqiCi+∑j=1mrj(Xj+sj).Dj≠0.
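The argument can be traced on a small example. The Python sketch below assumes the classical weighted-average expectation on a finite sample space with positive weights, which is one function satisfying the properties used in the proof. It builds a combination of the stated form with n=1 and m=2 and evaluates it given D=D1∨D2. All names and numbers are illustrative.

```python
from fractions import Fraction

# Hypothetical uniform weights on four outcomes.
weights = [Fraction(1, 4)] * 4

def cond_exp(x, d):
    """Weighted average of the quantity x over the nonzero event d (a 0/1 list)."""
    total = sum(w * di for w, di in zip(weights, d))
    return sum(w * xi * di for w, xi, di in zip(weights, x, d)) / total

q1, C1 = 1, [0, 1, 0, 0]
X = [[2, 0, 0, 0], [0, 0, 1, 5]]     # random quantities X1, X2
Dj = [[1, 1, 0, 0], [0, 0, 1, 1]]    # conditions D1, D2
r = [1, -1]
s = [0, -4]

# The hypothesis of the proof: rj*(PV(Xj | Dj) + sj) > 0 for every j.
hypothesis = [r[j] * (cond_exp(X[j], Dj[j]) + s[j]) for j in range(2)]

D = [max(a, b) for a, b in zip(Dj[0], Dj[1])]   # D = D1 v D2

# Z = q1*C1 + sum_j rj*(Xj + sj).Dj, computed outcome by outcome.
Z = [q1 * C1[w] + sum(r[j] * (X[j][w] + s[j]) * Dj[j][w] for j in range(2))
     for w in range(4)]

value = cond_exp(Z, D)
```

Both entries of `hypothesis` equal 1, the combination is `Z = [2, 1, 3, -1]`, and its expectation given D is 5/4. Since this value is positive while the expectation of 0 is 0, Z is not the zero quantity, even though it takes a negative value on one outcome.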
15.23 Proof of 13.5
We start by proving that 1⇒2.
Let PV be a coherent partial function from T×E(T)
to R.
Define the relation ⋦ so that 0⋦X if X=∑i=1npiCi+∑j=1mrj(Xj+sj).Dj
for some nonnegative integers n, m, positive real numbers p1,…,pn,
real numbers r1,…,rm, s1,…,sm, nonzero
events C1,…,Cn, events D1,…,Dm and random
quantities X1,…,Xm, such that at least one of n,
m is nonzero and rj(PV(Xj∣Dj)+sj)>0
for every j∈{1,…,m}. Define X⋦Y
if 0⋦Y−X. According to this definition, 0⋦C holds for every nonzero
event C. This guarantees both regularity and plausible
property 6.3.1. The antireflexivity of ⋦
is a consequence of the coherence of PV and 3.5.
We leave it as an exercise for the reader to verify that ⋦ has the
remaining fundamental properties of a plausible strict partial order
listed in 6.3.
Take ≲ as a plausible preorder having ⋦ as its strict
part. By 6.4, such a plausible preorder
exists.
If PV(X∣C)=+∞ and y is a real number, then PV(X∣C)−y=+∞>0,
i.e. yC⋦X.C, proving that E(X∣C)=+∞.
If PV(X∣C)=−∞ and y is a real number, then (−1).(PV(X∣C)−y)=+∞>0,
i.e. X.C⋦yC, proving that E(X∣C)=−∞.
If r=PV(X∣C) is a real number and ϵ is a positive
real number, then PV(X∣C)−r+ϵ>0, i.e., −ϵC⋦X.C−rC,
and (−1).(PV(X∣C)−r−ϵ)>0,
i.e. X.C−rC⋦ϵC, proving that E(X∣C)=r.
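The two inequalities for a real value r=PV(X∣C) follow by instantiating the definition of ⋦ with n=0, m=1, X1=X and D1=C. The choices of r1 and s1 below are spelled out here as a reading aid; they are inferred from the inequalities stated in the proof rather than quoted from it.

```latex
\begin{align*}
r_1 = 1,\ s_1 = \epsilon - r:
  &\quad r_1\bigl(P_V(X \mid C) + s_1\bigr) = \epsilon > 0, \\
  &\quad 0 \lnsim (X + \epsilon - r).C,
   \ \text{i.e. } -\epsilon C \lnsim X.C - rC; \\
r_1 = -1,\ s_1 = -r - \epsilon:
  &\quad r_1\bigl(P_V(X \mid C) + s_1\bigr) = \epsilon > 0, \\
  &\quad 0 \lnsim -(X - r - \epsilon).C,
   \ \text{i.e. } X.C - rC \lnsim \epsilon C.
\end{align*}
```

The infinite cases use the same pattern with s1=−y and r1=1 for PV(X∣C)=+∞, and with s1=−y and r1=−1 for PV(X∣C)=−∞.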
This completes the proof that the expectation naturally induced by
≲ extends the function PV.
The implication 2⇒3 is trivial.
We finish by proving that 3⇒1.
Let ≲ be a plausible preorder and E be the conditional
expectation induced by ≲. Let n≥0, m≥1 be integers,
q1,…,qn be nonnegative real numbers, r1,…,rm,
s1,…,sm be real numbers, C1,…,Cn, D1,…,Dm
be events, X1,…,Xm be random quantities and for every
j∈{1,…,m}, rj(E(Xj∣Dj)+sj)>0.
By 4.1, 0≲∑i=1nqiCi. By real
additivity and homogeneity, E(rj(Xj+sj)∣Dj)=rj(E(Xj∣Dj)+sj)>0.
By preorder consistency, if E(rj(Xj+sj)∣Dj)>0,
then 0⋦rj(Xj+sj).Dj, yielding 0⋦∑i=1nqiCi+∑j=1mrj(Xj+sj).Dj.
Due to 6.3.2, 0≠∑i=1nqiCi+∑j=1mrj(Xj+sj).Dj.