Products of Conditional Expectation Operators: Convergence and Divergence

Guolie Lan; Ze-Chun Hu; Wei Sun

arXiv:1903.03917·math.PR·July 8, 2019

Open Access

TL;DR

This paper studies the convergence of products of conditional expectation operators. It shows that such products can diverge on any probability space that is not purely atomic and always converge on purely atomic spaces, which settles a long-open conjecture in the negative.

Contribution

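It proves that products of conditional expectation operators involving 3 or 4 sub-$σ$-fields can diverge on probability spaces that are not purely atomic, and that products involving any finite set of sub-$σ$-fields always converge on purely atomic spaces, settling a long-open conjecture in the negative.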

Findings

01 Divergence of operator products in non-atomic spaces.

02 Convergence of operator products in purely atomic spaces.

03 Resolution of a long-standing conjecture on conditional expectations.

Abstract

In this paper, we investigate the convergence of products of conditional expectation operators. We show that if $(Ω, F, P)$ is a probability space that is not purely atomic, then divergent sequences of products of conditional expectation operators involving 3 or 4 sub- $σ$ -fields of $F$ can be constructed for a large class of random variables in $L^{2} (Ω, F, P)$ . This settles in the negative a long-open conjecture. On the other hand, we show that if $(Ω, F, P)$ is a purely atomic probability space, then products of conditional expectation operators involving any finite set of sub- $σ$ -fields of $F$ must converge for all random variables in $L^{1} (Ω, F, P)$ .

Equations (158)

$X_{n}={E}(X_{n-1}|\,\mathcal{F}_{n}),\quad n\geq 1.$

$x_{n}=P_{H_{k_{n}}}x_{n-1},\quad n\geq 1.$

$X_{n}={E}(X_{n-1}|\,\mathcal{C}_{k_{n}}),\quad n\geq 1$

$X_{0}={E}(X|\,\mathcal{C}_{0}),\quad X_{n}={E}(X_{n-1}|\,\mathcal{C}_{k_{n}}),\quad n\geq 1$

$X_{n}={E}(X_{n-1}|\,\mathcal{G}_{k_{n}}),\quad n\geq 1$

$X_{\infty}={E}\left(X_{0}\,\Big|\,\bigcap_{k=1}^{K}\overline{\mathcal{G}_{k}}\right).$

${E}(Y|X)=aX+c,\quad{E}(X|Y)=bY+d.$

${E}(XY)=a{E}(X^{2})=b{E}(Y^{2}).$

$(T_{X}T_{Y})^{n}X\rightarrow{E}\left[X|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right],\quad{\rm a.s.}$

$(T_{X}T_{Y})^{n}X=(ab)^{n}X\rightarrow 0,\quad{\rm a.s.}$

${E}\left[X|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right]=0\quad{\rm a.s.}$

${E}\left[Y|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right]=0\quad{\rm a.s.}$

${E}(X|Y)={E}(X)$ and ${E}(Y|X)={E}(Y).$

$\int_{\Omega}{E}(X|Y)\cdot Y^{2}\,dP={E}(XY^{2})={E}(Y^{2})=2[P(A)-P(A\cap B)]>0$

$\int_{\Omega}{E}(X)\cdot Y^{2}\,dP=P(A\cup B){E}(Y^{2}).$

${E}(X_{0}|X_{1},\dots,X_{n})=a_{0}+\sum_{i=1}^{n}a_{i}X_{i}.$

${E}(X_{0}|X_{1},\dots,X_{n})={E}(X_{0}).$

${E}(X_{0}|X_{i},i\geq 1)=\sum_{i=1}^{\infty}a_{i}X_{i}.$

${E}(X_{0}|X_{i},i\geq 1)=0.$

${E}(X_{0}|X_{1},\dots,X_{n})=\sum_{i=1}^{n}a_{n,i}X_{i}.$

${E}(X_{0}|X_{1},\dots,X_{n})\rightarrow{E}(X_{0}|X_{i},i\geq 1)$ in $L^{2}(\Omega,\mathcal{F},P),$

$Y=\sum_{i=1}^{\infty}a_{i}X_{i}.$

${E}(h|\,\mathcal{G})=P_{G}h.$

${E}(h|\,\mathcal{G})=\sum_{i\geq 1}a_{i}g_{i}.$

${E}(gh_{0}|\,\mathcal{G})=g\left[{E}(h|\,\mathcal{G})-\sum_{i\geq 1}a_{i}g_{i}\right]=0.$

$\sum_{i\geq 1}P(B_{i})=1,\quad P(B_{i}\cap B_{j})=0,\quad i\neq j.$

$u=\sum_{i=1}^{\infty}2^{-i}\cdot u[i],$

$h_{k}(u)=\sum_{i=1}^{\infty}2^{-i}\cdot u[2^{k-1}\cdot(2i-1)].$

$(0,1)\rightarrow(0,1)^{\infty},\quad u\mapsto(h_{1}(u),h_{2}(u),\dots).$

$g_{k}(u)=\Phi^{-1}\circ h_{k}(u).$

Taxonomy

Topics: Mathematical Analysis and Transform Methods · Approximation Theory and Sequence Spaces · Probability and Risk Models

Full text

Products of Conditional Expectation Operators: Convergence and Divergence

Guolie Lan$^{a}$, Ze-Chun Hu$^{b}$ and Wei Sun$^{c}$

a School of Economics and Statistics, Guangzhou University, China

[email protected]

b College of Mathematics, Sichuan University, China

[email protected]

c Department of Mathematics and Statistics, Concordia University, Canada

[email protected]

Abstract. In this paper, we investigate the convergence of products of conditional expectation operators. We show that if $(\Omega,\mathcal{F},P)$ is a probability space that is not purely atomic, then divergent sequences of products of conditional expectation operators involving 3 or 4 sub- $\sigma$ -fields of $\mathcal{F}$ can be constructed for a large class of random variables in $L^{2}(\Omega,\mathcal{F},P)$ . This settles in the negative a long-open conjecture. On the other hand, we show that if $(\Omega,\mathcal{F},P)$ is a purely atomic probability space, then products of conditional expectation operators involving any finite set of sub- $\sigma$ -fields of $\mathcal{F}$ must converge for all random variables in $L^{1}(\Omega,\mathcal{F},P)$ .

Keywords: product of conditional expectation operators, Amemiya-Ando conjecture, non-atomic $\sigma$ -field, purely atomic $\sigma$ -field, linear compatibility, deeply uncorrelated.

Mathematics Subject Classification (2010): 60A05, 60F15, 60F25

1 Introduction and main results

Conditional expectation is one of the most important concepts in probability theory. It plays a central role in probability and statistics. Let $(\Omega,\mathcal{F},P)$ be a probability space and ${\mathcal{G}}_{1},{\mathcal{G}}_{2},\dots,{\mathcal{G}}_{K}$ be sub- $\sigma$ -fields of $\mathcal{F}$ , where $K\in\mathbb{N}$ . For $k=1,\dots,K$ , denote by $E_{k}$ the conditional expectation operator with respect to $\mathcal{G}_{k}$ , i.e., $E_{k}X={E}(X|\,\mathcal{G}_{k})$ for $X\in L^{1}(\Omega,\mathcal{F},P)$ . Suppose that $\mathcal{F}_{1},\mathcal{F}_{2},\dots\in\{{\mathcal{G}}_{1},{\mathcal{G}}_{2},\dots,{\mathcal{G}}_{K}\}$ . For $X_{0}\in L^{1}(\Omega,\mathcal{F},P)$ , define the sequence $\{X_{n}\}$ successively by

$$X_{n}={E}(X_{n-1}|\,\mathcal{F}_{n}),\quad n\geq 1.\tag{1.1}$$

Then $X_{n}=E_{k_{n}}\cdots E_{k_{1}}X_{0}$ for some sequence $k_{1},k_{2},\dots\in\{1,2,\dots,K\}$ . In this paper, we will investigate the convergence of $\{X_{n}\}$ .
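As an illustration only, and not part of the original argument, the recursion (1.1) can be sketched on a finite sample space with the uniform probability, where conditioning on the $\sigma$-field generated by a partition amounts to averaging over each block. The sample space, the two partitions and the helper names `cond_exp` and `iterate` are hypothetical choices made for this sketch.

```python
from fractions import Fraction

def cond_exp(x, partition):
    """E(x | sigma-field generated by `partition`) on {0, ..., n-1} with uniform probability."""
    out = list(x)
    for block in partition:
        avg = sum(x[i] for i in block) / len(block)  # block average
        for i in block:
            out[i] = avg
    return out

def iterate(x0, partitions, ks):
    """Apply E_{k_1}, then E_{k_2}, ... to x0, as in the recursion X_n = E(X_{n-1} | F_n)."""
    x = list(x0)
    for k in ks:
        x = cond_exp(x, partitions[k])
    return x

# Hypothetical data: four equally likely points and two sub-sigma-fields given by partitions.
x0 = [Fraction(v) for v in (4, 0, 2, 6)]
partitions = {1: [[0, 1], [2, 3]], 2: [[0], [1, 2], [3]]}

x2 = iterate(x0, partitions, [1, 2])  # X_2 = E_2 E_1 X_0
print(x2)
```

In this sketch `x2` has the same mean as `x0`, since each conditional expectation preserves the expectation.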

Note that conditional expectation operators can be regarded as contraction operators on Banach spaces. The study of the convergence of $\{X_{n}\}$ is not only of intrinsic interest, but is also important in various applications, including numerical solutions of linear equations and partial differential equations [4, 14], linear inequalities [24], approximation theory [20, 9] and computed tomography [23].

In 1961, Burkholder and Chow [6] initiated the study of convergence of products of conditional expectations. They focused on the case $K=2$ and showed that $\{X_{n}\}$ converges almost everywhere and in $L^{2}$ -norm for $X_{0}\in L^{2}(\Omega,\mathcal{F},P)$ . Further, it follows from Stein [25] and Rota [21] that if $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ for some $p>1$ , then $\{X_{n}\}$ converges almost everywhere. On the other hand, Burkholder [5] and Ornstein [16] showed that for $X_{0}\in L^{1}(\Omega,\mathcal{F},P)$ almost everywhere convergence need not hold.

However, for $K\geq 3$ the convergence of $\{X_{n}\}$ becomes a very challenging problem. This paper is devoted to the following long-open conjecture on convergence of products of conditional expectations:

(CPCE) If $X_{0}\in L^{2}(\Omega,\mathcal{F},P)$ and all the $\mathcal{F}_{n}$ come from a finite set of sub- $\sigma$ -fields of ${\mathcal{F}}$ , then $\{X_{n}\}$ must converge in $L^{2}$ -norm.

Conjecture (CPCE) is closely related to the convergence of products of orthogonal projections in Hilbert spaces. Before stating the main results of this paper, let us recall the important results obtained so far for the convergence of products of orthogonal projections in Hilbert spaces.

Let $H$ be a Hilbert space and $H_{1},H_{2},\dots,H_{K}$ be closed subspaces of $H$ , where $K\in\mathbb{N}$ . Denote by $P_{H_{k}}$ the orthogonal projection of $H$ onto $H_{k}$ . For $x_{0}\in H$ and $k_{1},k_{2},\dots\in\{1,2,\dots,K\}$ , we define the sequence $\{x_{n}\}$ by

$$x_{n}=P_{H_{k_{n}}}x_{n-1},\quad n\geq 1.\tag{1.2}$$

If $K=2$ , the convergence of $\{x_{n}\}$ in $H$ follows from a classical result of von Neumann [15, Lemma 22]. If $K\geq 3$ and $H$ is finite dimensional, the convergence of $\{x_{n}\}$ was proved by Práger [18]. If $H$ is infinite dimensional and $\{k_{n}\}$ is periodic, the convergence of $\{x_{n}\}$ was obtained by Halperin [10]. Halperin’s result was then generalized to the quasi-periodic case by Sakai [22]. Based on the results on the convergence of products of orthogonal projections, Zaharopol [26], Delyon and Delyon [8], and Cohen [7] proved that if $\{\mathcal{F}_{n}\}$ is a periodic sequence with all the $\mathcal{F}_{n}$ coming from a finite set of $\sigma$ -fields, then for any $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ with $p>1$ , the sequence $\{X_{n}\}$ of the form (1.1) converges in $L^{p}$ -norm and almost everywhere.
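For the case $K=2$ , the alternating iteration of the form (1.2) can be sketched numerically in $\mathbb{R}^{3}$ with two planes through the origin. This is a minimal illustration under assumed data: the normals, the starting point and the helper `project_hyperplane` are hypothetical and are not taken from the cited works.

```python
def project_hyperplane(x, normal):
    """Orthogonal projection of x onto the hyperplane through 0 with the given normal."""
    coef = sum(a * b for a, b in zip(x, normal)) / sum(c * c for c in normal)
    return [a - coef * b for a, b in zip(x, normal)]

# Hypothetical subspaces of R^3:
#   H_1 = {z = 0}  (normal (0, 0, 1)),  H_2 = {y = z}  (normal (0, 1, -1)).
# Their intersection is the x-axis, so the limit should be the projection onto the x-axis.
normals = {1: (0.0, 0.0, 1.0), 2: (0.0, 1.0, -1.0)}

x = [1.0, 2.0, 3.0]                      # starting point x_0
for k in [2, 1] * 60:                    # alternate P_{H_2}, P_{H_1}, P_{H_2}, ...
    x = project_hyperplane(x, normals[k])

limit = [1.0, 0.0, 0.0]                  # projection of x_0 onto the x-axis
print(x, limit)
```

In this example each pair of projections halves the component orthogonal to the intersection, so the iterates approach `limit` geometrically.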

In 1965, Amemiya and Ando [2] considered the more general convergence problem when $K\geq 3$ and $\{k_{n}\}$ is non-periodic. They showed that for an arbitrary sequence $\{k_{n}\}$ , the sequence $\{x_{n}\}$ of the form (1.2) converges weakly in $H$ , and they posed the question of whether $\{x_{n}\}$ also converges in the norm of $H$ . In 2012, Paszkiewicz [17] constructed an ingenious example of 5 subspaces of $H$ and a sequence $\{x_{n}\}$ of the form (1.2) which does not converge in $H$ . Kopecká and Müller fully resolved the question of Amemiya and Ando in [12]. They refined Paszkiewicz’s construction to get an example of 3 subspaces of $H$ and a sequence $\{x_{n}\}$ which does not converge in $H$ . In [13], Kopecká and Paszkiewicz considerably simplified the construction of [12] and obtained improved results on the divergence of products of orthogonal projections in Hilbert spaces.

Note that a projection on $L^{2}(\Omega,\mathcal{F},P)$ cannot necessarily be represented as a conditional expectation operator. Thus, counterexamples for the convergence of products of orthogonal projections in Hilbert spaces do not necessarily yield divergent sequences of products of conditional expectations on probability spaces. Let ${\mathcal{B}}(\mathbb{R})$ be the Borel $\sigma$ -field of $\mathbb{R}$ and $dx$ be the Lebesgue measure. In 2017, Komisarski [11] showed that there exist $X_{0}\in L^{1}(\mathbb{R})\cap L^{2}(\mathbb{R})$ and $\{\mathcal{F}_{n}\}$ coming from 5 sub- $\sigma$ -fields of ${\mathcal{B}}(\mathbb{R})$ such that the sequence $\{X_{n}\}$ of the form (1.1) diverges in $L^{2}(\mathbb{R})$ . Note that $(\mathbb{R},{\mathcal{B}}(\mathbb{R}),dx)$ is only a $\sigma$ -finite measure space and the conditional expectation considered in [11] is understood in an extended sense. Conjecture (CPCE) still remains open for probability spaces. We would like to point out that Akcoglu and King [1] constructed an example of divergent sequences involving infinitely many sub- $\sigma$ -fields on the interval $[-\frac{1}{2},\frac{1}{2})$ .

In this paper, we will show that Conjecture (CPCE) fails if $(\Omega,\mathcal{F},P)$ is not a purely atomic probability space; however, it holds if $(\Omega,\mathcal{F},P)$ is a purely atomic probability space. More precisely, we will prove the following results.

Theorem 1.1

There exists a sequence $k_{1},k_{2},\dots\in\{1,2,3\}$ with the following property:

Suppose that $X_{0}$ is a Gaussian random variable on $(\Omega,\mathcal{F},P)$ and there exists a non-atomic $\sigma$ -field $\mathcal{C}\subset\mathcal{F}$ which is independent of $X_{0}$ . Then there exist three $\sigma$ -fields $\mathcal{C}_{1},\mathcal{C}_{2},\mathcal{C}_{3}\subset\mathcal{F}$ , such that the sequence $\{X_{n}\}$ defined by

$$X_{n}={E}(X_{n-1}|\,\mathcal{C}_{k_{n}}),\quad n\geq 1$$

does not converge in probability.

Denote by $\mathcal{P}(\mathbb{R})$ the space of all probability measures on $\mathbb{R}$ . Then $\mathcal{P}(\mathbb{R})$ becomes a complete metric space if it is equipped with the Lévy-Prokhorov metric.

Theorem 1.2

There exists a sequence $k_{1},k_{2},\dots\in\{1,2,3\}$ with the following property:

(1) Suppose that $(\Omega,\mathcal{F},P)$ is a non-atomic probability space. Then there exists a dense subset ${\mathcal{D}}$ of $\mathcal{P}(\mathbb{R})$ such that for any $\mu\in{\mathcal{D}}$ we can find a random variable $X\in L^{2}(\Omega,\mathcal{F},P)$ with distribution $\mu$ and four $\sigma$ -fields $\mathcal{C}_{0},\mathcal{C}_{1},\mathcal{C}_{2},\mathcal{C}_{3}\subset\mathcal{F}$ , such that the sequence $\{X_{n}\}$ defined by

$$X_{0}={E}(X|\,\mathcal{C}_{0}),\quad X_{n}={E}(X_{n-1}|\,\mathcal{C}_{k_{n}}),\quad n\geq 1$$

does not converge in probability.

(2) Let $(\Omega,\mathcal{F},P)$ be a probability space that is not purely atomic. Then there exist a random variable $X_{0}\in L^{2}(\Omega,\mathcal{F},P)$ and three $\sigma$ -fields $\mathcal{G}_{1},\mathcal{G}_{2},\mathcal{G}_{3}\subset\mathcal{F}$ , such that the sequence $\{X_{n}\}$ defined by

$$X_{n}={E}(X_{n-1}|\,\mathcal{G}_{k_{n}}),\quad n\geq 1$$

does not converge in probability.

Denote by ${\mathcal{N}}$ the collection of all null sets of $(\Omega,\mathcal{F},P)$ . For a sub- $\sigma$ -field ${\mathcal{G}}$ of ${\mathcal{F}}$ , we define $\overline{{\mathcal{G}}}$ to be the $\sigma$ -field generated by ${\mathcal{G}}$ and ${\mathcal{N}}$ .

Theorem 1.3

Suppose that $(\Omega,\mathcal{F},P)$ is a purely atomic probability space, ${\mathcal{G}}_{1},\dots,{\mathcal{G}}_{K}$ are sub- $\sigma$ -fields of $\mathcal{F}$ , and $\mathcal{F}_{1},\mathcal{F}_{2},\dots\in\{{\mathcal{G}}_{1},\dots,{\mathcal{G}}_{K}\}$ . Let $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ with $p\geq 1$ and $\{X_{n}\}$ be defined by (1.1). Then $\{X_{n}\}$ converges to some $X_{\infty}\in L^{p}$ in $L^{p}$ -norm and almost everywhere. If each ${\mathcal{G}}_{k}$ appears infinitely often in the sequence $\{{\mathcal{F}}_{n}\}$ , then

$$X_{\infty}={E}\left(X_{0}\,\Big|\,\bigcap_{k=1}^{K}\overline{\mathcal{G}_{k}}\right).$$
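The statement of Theorem 1.3 can be illustrated on a finite, hence purely atomic, space. The following sketch makes several assumptions that are not in the paper: the sample space has six equally likely points, each $\mathcal{G}_{k}$ is generated by a hypothetical partition, and the index sequence is an arbitrary deterministic, irregular choice in which every index keeps recurring. Since every point has positive probability, the intersection of the $\sigma$-fields is generated by the finest common coarsening of the partitions, which the helper `common_coarsening` computes with a union-find pass.

```python
def cond_exp(x, partition):
    """E(x | sigma-field generated by `partition`) under the uniform probability."""
    out = list(x)
    for block in partition:
        avg = sum(x[i] for i in block) / len(block)
        for i in block:
            out[i] = avg
    return out

def common_coarsening(n, partitions):
    """Finest partition of {0, ..., n-1} that is coarser than every given partition."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for partition in partitions:
        for block in partition:
            for i in block[1:]:
                parent[find(i)] = find(block[0])  # merge points sharing a block
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical partitions generating G_1, G_2, G_3.
G = {1: [[0, 1], [2, 3], [4, 5]],
     2: [[0], [1, 2], [3], [4], [5]],
     3: [[0, 3], [1], [2], [4], [5]]}
x0 = [1, 5, 2, 8, 3, 7]

# Irregular index sequence: k_n = 1 + (number of binary ones of n) mod 3.
ks = [1 + bin(n).count("1") % 3 for n in range(1, 401)]

x = list(x0)
for k in ks:
    x = cond_exp(x, G[k])

coarse = common_coarsening(len(x0), G.values())
target = cond_exp(x0, coarse)  # E(X_0 | intersection of the G_k)
print(x, target)
```

After 400 steps the iterate `x` agrees with `target` up to rounding error in this example.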

The rest of this paper is organized as follows. In Section 2, we discuss the linear compatibility under conditional expectations, which is essential for our construction of divergent sequences of products of conditional expectation operators. In Section 3, we consider divergent sequences of products of conditional expectation operators on probability spaces that are not purely atomic and prove Theorems 1.1 and 1.2. In Section 4, we investigate the convergence of products of conditional expectation operators on purely atomic probability spaces and prove Theorem 1.3.

2 Linear compatibility and deep uncorrelatedness

Let $(\Omega,\mathcal{F},P)$ be a probability space. We consider the linear compatibility defined by conditional linear equations, which is closely related to linear regression and optimal estimation (cf. Rao [19]).

Definition 2.1

Two integrable random variables $X,Y$ on $(\Omega,\mathcal{F},P)$ are said to be linearly compatible under conditional expectations, or linearly compatible in short, if there exist $a,b,c,d\in\mathbb{R}$ such that almost surely,

$${E}(Y|X)=aX+c,\quad{E}(X|Y)=bY+d.\tag{2.1}$$

Obviously, if $X$ and $Y$ are independent or perfectly collinear, i.e., $Y=aX+c$ , then they are linearly compatible. For non-trivial examples, note that if $(X,Y)$ has a 2-dimensional Gaussian distribution, then $X$ and $Y$ are linearly compatible, and if both $X$ and $Y$ follow two-point distributions, then they must be linearly compatible.

Lemma 2.2

Let $X$ and $Y$ be two random variables on $(\Omega,\mathcal{F},P)$ with $0<{\rm Var}(X)<\infty$ and $0<{\rm Var}(Y)<\infty$ . Suppose that (2.1) holds. Denote by $\rho_{XY}$ the correlation coefficient of $X$ and $Y$ , and denote by $\sigma(X)$ and $\sigma(Y)$ the $\sigma$ -fields generated by $X$ and $Y$ , respectively. Then,

(i) $0\leq ab\leq 1$ and ${ab}=\rho_{XY}^{2}$ ;

(ii) $ab=1$ implies that $Y=aX+c$ a.s.;

(iii) $ab<1$ implies that ${E}\left[X|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right]={E}(X)$ and ${E}\left[Y|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right]=E(Y)$ a.s.;

(iv) $ab=0$ implies that $a=b=0$ .

Proof. We assume without loss of generality that $E(X)=E(Y)=0$ . Then, $c=d=0$ in (2.1).

(i) It follows from (2.1) that ${E}(XY|X)=X{E}(Y|X)=aX^{2}$ and ${E}(XY|Y)=Y{E}(X|Y)=bY^{2}$ . Taking expectations, we get

$${E}(XY)=a{E}(X^{2})=b{E}(Y^{2}).\tag{2.2}$$

Then $\rho_{XY}^{2}=ab$ , which implies that $0\leq ab\leq 1$ .

(ii) is proved by Rao ([19, Proposition 2.1]), where only the finiteness of expectations is assumed.

(iii) We define the operators $T_{X}$ and $T_{Y}$ on $L^{2}(\Omega,\mathcal{F},P)$ by $T_{X}Z={E}(Z|X)$ and $T_{Y}Z={E}(Z|Y)$ for $Z\in L^{2}(\Omega,\mathcal{F},P)$ . By Burkholder and Chow [6, Theorem 3], we have

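$$(T_{X}T_{Y})^{n}X\rightarrow{E}\left[X|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right],\quad{\rm a.s.}\tag{2.3}$$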

On the other hand, since (2.1) holds with $ab<1$ ,

$$(T_{X}T_{Y})^{n}X=(ab)^{n}X\rightarrow 0,\quad{\rm a.s.}\tag{2.4}$$

By (2.3) and (2.4), we get

$${E}\left[X|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right]=0\quad{\rm a.s.}$$

Similarly, we can show that

$${E}\left[Y|\overline{\sigma(X)}\cap\overline{\sigma(Y)}\right]=0\quad{\rm a.s.}$$

(iv) is a direct consequence of (2.2) since ${E}(X^{2})$ and ${E}(Y^{2})$ are non-zero.
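The identity $ab=\rho_{XY}^{2}$ of Lemma 2.2 (i) can be checked by hand on two-point distributions, for which the conditional expectations are automatically affine. The joint probabilities below are hypothetical values chosen for this sketch, and the helpers `expect` and `cond_mean` are illustrative names only.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y) on {0, 1} x {0, 1}.
pmf = {(0, 0): F(3, 10), (0, 1): F(1, 10), (1, 0): F(2, 10), (1, 1): F(4, 10)}

def expect(f):
    """E f(X, Y) under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())

def cond_mean(of, given, value):
    """E(coordinate `of` | coordinate `given` == value), coordinates: 0 for X, 1 for Y."""
    mass = sum(p for pt, p in pmf.items() if pt[given] == value)
    return sum(p * pt[of] for pt, p in pmf.items() if pt[given] == value) / mass

# On a two-point support, E(Y|X) = aX + c with slope a = E(Y|X=1) - E(Y|X=0); similarly for b.
a = cond_mean(1, 0, 1) - cond_mean(1, 0, 0)
b = cond_mean(0, 1, 1) - cond_mean(0, 1, 0)

ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - ex * ey
var_x = expect(lambda x, y: x * x) - ex * ex
var_y = expect(lambda x, y: y * y) - ey * ey
rho_sq = cov * cov / (var_x * var_y)

print(a, b, a * b, rho_sq)
```

Exact rational arithmetic is used so that the two sides can be compared without rounding.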

Motivated by Lemma 2.2 (iv), we introduce the definition of deep uncorrelatedness for two random variables.

Definition 2.3

Two integrable random variables $X,Y$ on $(\Omega,\mathcal{F},P)$ are said to be deeply uncorrelated if

$${E}(X|Y)={E}(X)\quad\text{and}\quad{E}(Y|X)={E}(Y).$$

Remark 2.4

It is clear that if $X$ and $Y$ are integrable and independent then they are deeply uncorrelated, and if $X$ and $Y$ have finite variances and are deeply uncorrelated then they are uncorrelated, i.e., $\rho_{XY}=0$ . The following examples show that deep uncorrelatedness is equivalent to neither independence nor uncorrelatedness.

(i) Let $(X,Y)$ be a pair of random variables with the uniform distribution on the unit disc $\{(x,y):x^{2}+y^{2}\leq 1\}$ . It can be checked that $X$ and $Y$ are deeply uncorrelated but not independent.

(ii) Let $A$ and $B$ be two measurable sets satisfying $P(A)=P(B)>0$ and $P(A\cap B)<P(A\cup B)<1$ . Define $X={1}_{A\cup B}$ and $Y={1}_{A}-{1}_{B}$ . Note that $XY=Y$ . Then ${E}(XY)={E}(Y)=0$ , which implies that $X$ and $Y$ are uncorrelated. However, we have that

$$\int_{\Omega}{E}(X|Y)\cdot Y^{2}\,dP={E}(XY^{2})={E}(Y^{2})=2[P(A)-P(A\cap B)]>0$$

and

$$\int_{\Omega}{E}(X)\cdot Y^{2}\,dP=P(A\cup B){E}(Y^{2}).$$

Hence ${E}(X|Y)\neq E(X)$ , which implies that $X$ and $Y$ are not deeply uncorrelated.
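Example (ii) can be made concrete on a four-point space. The sketch below assumes a uniform probability on $\{0,1,2,3\}$ with $A=\{0,1\}$ and $B=\{1,2\}$ ; these sets and the helper `mean` are hypothetical choices that satisfy the stated conditions, not data from the paper.

```python
from fractions import Fraction as F

omega = range(4)           # hypothetical sample space {0, 1, 2, 3}
prob = F(1, 4)             # uniform probability of each point
A, B = {0, 1}, {1, 2}      # P(A) = P(B) = 1/2, P(A and B) = 1/4 < P(A or B) = 3/4 < 1

X = [1 if (w in A or w in B) else 0 for w in omega]   # indicator of A union B
Y = [(w in A) - (w in B) for w in omega]              # 1_A - 1_B

def mean(Z):
    """Expectation of a random variable given as a list of values on omega."""
    return sum(prob * z for z in Z)

cov = mean([x * y for x, y in zip(X, Y)]) - mean(X) * mean(Y)

# E(X | Y = v) for each value v taken by Y.
cond_x_given_y = {}
for v in set(Y):
    ws = [w for w in omega if Y[w] == v]
    cond_x_given_y[v] = F(sum(X[w] for w in ws), len(ws))

lhs = sum(prob * cond_x_given_y[Y[w]] * Y[w] ** 2 for w in omega)  # integral of E(X|Y) Y^2
rhs = mean(X) * mean([y * y for y in Y])                           # integral of E(X) Y^2

print(cov, cond_x_given_y, lhs, rhs)
```

Here the covariance vanishes while the two integrals differ, so $X$ and $Y$ are uncorrelated but not deeply uncorrelated in this instance.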

We now define linear compatibility and deep uncorrelatedness for a family of random variables.

Definition 2.5

(1) A family of integrable random variables $X_{S}=\{X_{s}\}_{s\in S}$ on $(\Omega,\mathcal{F},P)$ is said to be linearly compatible under conditional expectations, or linearly compatible in short, if for any finite sequence $X_{0},X_{1},\dots,X_{n}$ in $X_{S}$ , there exist $a_{0},a_{1},\dots,a_{n}\in\mathbb{R}$ such that almost surely,

$${E}(X_{0}|X_{1},\dots,X_{n})=a_{0}+\sum_{i=1}^{n}a_{i}X_{i}.\tag{2.5}$$

(2) $X_{S}$ is called a deeply uncorrelated family if for any finite sequence $X_{0},X_{1},\dots,X_{n}\in X_{S}$ ,

$${E}(X_{0}|X_{1},\dots,X_{n})={E}(X_{0}).$$

Remark 2.6

(i) Let $X\in L^{2}(\Omega,\mathcal{F},P)$ and ${\mathcal{C}}\subset{\mathcal{F}}$ . It is well-known that ${E}(X|\mathcal{C})$ provides the $L^{2}$ -optimal estimation of $X$ given $\mathcal{C}$ . Thus (2.5) implies that the $L^{2}$ -optimal estimation of $X_{0}$ via $X_{1},\dots,X_{n}$ is consistent with the optimal linear estimation via $X_{1},\dots,X_{n}$ .

(ii) An important class of linearly compatible families is that of Gaussian processes. For a Gaussian process $X_{T}=\{X_{t}\}_{t\in T}$ , every finite collection of random variables $\{X_{0},X_{1},\dots,X_{n}\}\subset X_{T}$ has a multivariate normal distribution. Thus (2.5) holds and therefore $X_{T}$ is linearly compatible.

(iii) Let $\{X_{n}\}_{n\geq 0}$ be a deeply uncorrelated family with ${E}X_{n}=0$ , $\forall n\geq 0$ . Define $Y_{n}=X_{0}+X_{1}+\cdots+X_{n}$ . Then $\{Y_{n}\}_{n\geq 0}$ is a martingale.

Lemma 2.7

Let $X_{S}=\{X_{s}\}_{s\in S}\subseteq L^{2}(\Omega,\mathcal{F},P)$ be a linearly compatible family with ${E}(X_{s})=0$ , $\forall s\in S$ . Then for any infinite sequence $\{X_{n}\}_{n\geq 0}\subset X_{S}$ , there exists $\{a_{n}\}_{n\geq 1}\subset\mathbb{R}$ such that almost surely,

$${E}(X_{0}|X_{i},i\geq 1)=\sum_{i=1}^{\infty}a_{i}X_{i}.\tag{2.6}$$

In particular, if $X_{S}$ is a deeply uncorrelated family then for any $\{X_{n}\}_{n\geq 0}\subset X_{S}$ ,

$${E}(X_{0}|X_{i},i\geq 1)=0.\tag{2.7}$$

Proof. Since ${E}(X_{n})=0$ for ${n\geq 0}$ , it follows from (2.5) that there exists $\{a_{n,m}:1\leq m\leq n,n\in\mathbb{N}\}\subset\mathbb{R}$ such that for each $n$ ,

$${E}(X_{0}|X_{1},\dots,X_{n})=\sum_{i=1}^{n}a_{n,i}X_{i}.$$

By the martingale convergence theorem, we have

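$${E}(X_{0}|X_{1},\dots,X_{n})\rightarrow{E}(X_{0}|X_{i},i\geq 1)\quad\text{in }L^{2}(\Omega,\mathcal{F},P),$$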

which implies that $\sum_{i=1}^{n}a_{n,i}X_{i}$ converges in $L^{2}(\Omega,\mathcal{F},P)$ .

Denote by $Y$ the limit of $\sum_{i=1}^{n}a_{n,i}X_{i}$ in $L^{2}(\Omega,\mathcal{F},P)$ . Then there exists $\{a_{n}\}_{n\geq 1}\subset\mathbb{R}$ such that

$$Y=\sum_{i=1}^{\infty}a_{i}X_{i}.$$

Thus we obtain (2.6). The proof of (2.7) is similar and we omit the details.

Note that for $X\in L^{2}(\Omega,\mathcal{F},P)$ and ${\mathcal{C}}\subset{\mathcal{F}}$ , the conditional expectation ${E}(X|\,\mathcal{C})$ can be regarded as the orthogonal projection of $X$ onto the closed subspace $L^{2}(\Omega,\mathcal{C},P)$ . However, in general, the orthogonal projection of $L^{2}(\Omega,\mathcal{F},P)$ onto a closed linear subspace cannot necessarily be represented as a conditional expectation operator. The following lemma shows that linear compatibility ensures a one-to-one correspondence between conditional expectation operators and orthogonal projections.

Lemma 2.8

Let $H$ be a closed linear subspace of $L^{2}(\Omega,\mathcal{F},P)$ . Suppose that $H$ is a linearly compatible family with ${E}(h)=0$ for any $h\in H$ . Then for each closed linear subspace $G\subseteq H$ with countable basis, there exists a sub- $\sigma$ -field $\mathcal{G}$ of $\mathcal{F}$ such that for any $h\in H$ ,

$${E}(h|\,\mathcal{G})=P_{G}h.\tag{2.8}$$

Proof. Let $\{g_{i},i\geq 1\}$ be an orthonormal basis of $G$ and define $\mathcal{G}=\sigma(g_{i},i\geq 1)$ . By Lemma 2.7, for any $h\in H$ , there exists $\{a_{i}\}\subset\mathbb{R}$ such that

$${E}(h|\,\mathcal{G})=\sum_{i\geq 1}a_{i}g_{i}.\tag{2.9}$$

Denote $h_{0}=h-\sum_{i\geq 1}a_{i}g_{i}.$ Then for any $g\in G$ , almost surely

$${E}(gh_{0}|\,\mathcal{G})=g\left[{E}(h|\,\mathcal{G})-\sum_{i\geq 1}a_{i}g_{i}\right]=0.$$

Hence ${E}(\,h_{0}g\,)=0$ , which implies that $h_{0}$ and $g$ are orthogonal in $L^{2}(\Omega,\mathcal{F},P)$ . Since $g\in G$ is arbitrary, the right hand side of (2.9) equals the orthogonal projection of $h$ onto $G$ . Therefore, (2.8) holds.

3 Divergent sequences on probability spaces that are not purely atomic

Definition 3.1

Let $(\Omega,\mathcal{F},P)$ be a probability space and $\mathcal{C}$ be a sub- $\sigma$ -field of $\mathcal{F}$ .

(1) A measurable set $B\in\mathcal{C}$ is called $\mathcal{C}$ -atomic if $P(B)>0$ and for any $\mathcal{C}$ -measurable set $A\subset B$ , it holds that either $P(A)=0$ or $P(A)=P(B)$ .

(2) $\mathcal{C}$ is called non-atomic if it contains no $\mathcal{C}$ -atomic set, i.e., for each $B\in\mathcal{C}$ with $P(B)>0$ , there exists a $\mathcal{C}$ -measurable set $A\subset B$ such that $0<P(A)<P(B)$ .

(3) $\mathcal{C}$ is called purely atomic if it contains a countable number of $\mathcal{C}$ -atomic sets $B_{1},B_{2},\dots$ such that

$$\sum_{i\geq 1}P(B_{i})=1,\qquad P(B_{i}\cap B_{j})=0,\quad i\neq j.$$

(4) $(\Omega,\mathcal{F},P)$ is said to be non-atomic if $\mathcal{F}$ is non-atomic. $(\Omega,\mathcal{F},P)$ is said to be purely atomic if $\mathcal{F}$ is purely atomic.

Remark 3.2

(i) Note that $\mathcal{C}$ is non-atomic if it is generated by a random variable whose cumulative distribution function (cdf) is continuous on $\mathbb{R}$ , for example, a continuous random variable. Moreover, $\mathcal{C}$ is non-atomic if and only if there exists a random variable $X\in\mathcal{C}$ which has a uniform distribution on $(0,1)$ (cf. [3, §2]).

(ii) $\mathcal{C}$ is purely atomic if it is generated by a discrete random variable. Conversely, if $\mathcal{C}$ is purely atomic, then each $\mathcal{C}$ -measurable random variable has a discrete distribution on $\mathbb{R}$ .

We now prove Theorems 1.1 and 1.2, which are stated in §1. Our proofs are based on the following remarkable result.

Theorem 3.3

(Kopecká and Paszkiewicz [13, Theorem 2.6]) There exists a sequence $k_{1},k_{2},\dots\in\{1,2,3\}$ with the following property:

If $H$ is an infinite-dimensional Hilbert space and $0\neq w_{0}\in H$ , then there exist three closed subspaces $G_{1},G_{2},G_{3}\subset H$ , such that the sequence of iterates $\{w_{n}\}_{n\geq 1}$ defined by $w_{n}=P_{G_{k_{n}}}w_{n-1}$ does not converge in $H$ .

Proof of Theorem 1.1 Denote by $\gamma^{1}$ the standard Gaussian measure on $\mathbb{R}$ and denote by $\gamma^{\infty}=\gamma^{1}\times\gamma^{1}\times\cdots$ the standard Gaussian measure on $\mathbb{R}^{\infty}$ . For $u\in(0,1)$ , we consider its binary representation:

$$u=\sum_{i=1}^{\infty}2^{-i}\cdot u[i],$$

where $u[1],u[2],\dots\in\{0,1\}$ . For $k\geq 1$ , define

$$h_{k}(u)=\sum_{i=1}^{\infty}2^{-i}\cdot u[2^{k-1}\cdot(2i-1)].$$

Let $\Psi$ be the map

$$(0,1)\rightarrow(0,1)^{\infty},\quad u\mapsto(h_{1}(u),h_{2}(u),\dots).$$

Denote by $dx$ the Lebesgue measure on $(0,1)$ . Then it can be checked that the image measure of $dx$ under $\Psi$ equals the infinite product measure $(dx)^{\infty}$ on $(0,1)^{\infty}$ . Let $\Phi$ be the cdf of $\gamma^{1}$ . Define

$$g_{k}(u)=\Phi^{-1}\circ h_{k}(u).$$

Let $g$ be the map

[TABLE]

Then the image measure of $dx$ under $g$ equals the standard Gaussian measure $\gamma^{\infty}$ on $\mathbb{R}^{\infty}$ .
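The digit-splitting construction above can be sketched on a finite truncation of the binary expansion. The sketch assumes that $u$ is given by its first 16 binary digits (a hypothetical digit string), and the helper names `positions`, `h` and `g` are illustrative. It relies on the fact that every positive integer can be written uniquely as $2^{k-1}(2i-1)$ , so the digit positions used by different $h_{k}$ are disjoint.

```python
from statistics import NormalDist

def positions(k, n_bits):
    """Digit positions 2^(k-1) * (2i - 1), i = 1, 2, ..., that do not exceed n_bits."""
    step = 2 ** (k - 1)
    return [step * (2 * i - 1) for i in range(1, n_bits + 1) if step * (2 * i - 1) <= n_bits]

def h(k, u_bits):
    """Truncated h_k(u): the sum of 2^(-i) * u[2^(k-1) * (2i - 1)] over available digits."""
    return sum(2.0 ** -i * u_bits[p - 1]
               for i, p in enumerate(positions(k, len(u_bits)), start=1))

def g(k, u_bits):
    """Truncated g_k(u) = Phi^{-1}(h_k(u)), with Phi the standard normal cdf."""
    return NormalDist().inv_cdf(h(k, u_bits))

# Hypothetical first 16 binary digits u[1], ..., u[16] of some u in (0, 1).
u_bits = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

# Positions used by h_1, ..., h_5 partition {1, ..., 16}.
all_positions = sorted(p for k in range(1, 6) for p in positions(k, 16))

h1, h2 = h(1, u_bits), h(2, u_bits)
g1, g2 = g(1, u_bits), g(2, u_bits)
print(all_positions, h1, h2, g1, g2)
```

Because the coordinates $h_{k}$ read disjoint sets of digits, they are independent when the digits are independent fair bits, which is the mechanism behind the product-measure claim.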

Since $\mathcal{C}$ is a non-atomic sub- $\sigma$ -field which is independent of $X_{0}$ , there exists a random variable $Y\in\mathcal{C}$ which has a uniform distribution on $(0,1)$ and is independent of $X_{0}$ . Define

[TABLE]

Set

[TABLE]

Then

[TABLE]

is a sequence of independent standard Gaussian random variables. Let

[TABLE]

be the closed linear span of $(Z_{0},Z_{1},Z_{2},\dots)$ . Then $H$ is an infinite-dimensional Gaussian Hilbert space, i.e., a Gaussian process which is also a Hilbert subspace of $L^{2}(\Omega,\mathcal{F},P)$ .

We now show that $H$ is a linearly compatible family. Take $u,v_{1},\dots,v_{n}\in H$ . Let

[TABLE]

be the linear span of $(v_{1},\dots,v_{n})$ . Then the orthogonal projection of $u$ onto $V$ can be written as

[TABLE]

for some $a_{1},\dots,a_{n}\in\mathbb{R}$ . Define

[TABLE]

Then $u_{0}$ is orthogonal to $V$ and hence is independent of $\sigma(V)=\sigma(v_{1},\dots,v_{n})$ , since $(u,v_{1},\dots,v_{n})$ have a joint Gaussian distribution. Therefore,

[TABLE]

which implies that $H$ is linearly compatible.

Applying Theorem 3.3 to the infinite-dimensional Hilbert space $H$ , we find that there exists a sequence $k_{1},k_{2},\dots\in\{1,2,3\}$ with the following property:

For $0\neq x_{0}\in H$ , there exist three closed subspaces $G_{1},G_{2},G_{3}\subset H$ , such that the sequence $\{x_{n}\}$ defined by

[TABLE]

does not converge in $L^{2}$ -norm.

Hence there exist three closed subspaces $H_{1},H_{2},H_{3}\subset H$ , such that the sequence $\{X_{n}\}$ defined by

[TABLE]

does not converge in $L^{2}$ -norm.

On the other hand, by Lemma 2.8, there exist three sub- $\sigma$ -fields $\mathcal{C}_{1},\mathcal{C}_{2},\mathcal{C}_{3}\subset\mathcal{F}$ such that

[TABLE]

Therefore,

[TABLE]

Finally, we show that $\{X_{n}\}$ does not converge in probability. Suppose that $X_{n}$ converges to some $X_{\infty}$ in probability. Note that

[TABLE]

which implies that $\{X_{n}^{2}\}$ is uniformly integrable. Therefore $X_{n}\rightarrow X_{\infty}$ in $L^{2}$ -norm. We have arrived at a contradiction.

Proof of Theorem 1.2 (1). Since $\mathcal{F}$ is non-atomic, there exists a random variable $Z\in\mathcal{F}$ which has a uniform distribution on $(0,1)$ . Following the first part of the proof of Theorem 1.1, we can construct three independent standard Gaussian random variables $Y_{0},Y_{1},Y_{2}$ on $(\Omega,\mathcal{F},P)$ .

For $\varepsilon>0$ , let $\gamma_{\varepsilon}$ be the Gaussian measure on $\mathbb{R}$ with mean 0 and variance $\varepsilon^{2}$ . For $\nu\in\mathcal{P}(\mathbb{R})$ , define $\nu\ast\gamma_{\varepsilon}$ to be the convolution of $\nu$ and $\gamma_{\varepsilon}$ , i.e.,

[TABLE]

Then all the moments of $\nu\ast\gamma_{\varepsilon}$ are finite and $\nu\ast\gamma_{\varepsilon}\rightarrow\nu$ weakly as $\varepsilon\rightarrow 0$ . Define

[TABLE]

$\mathcal{P}_{\gamma}(\mathbb{R})$ is a dense subset of $\mathcal{P}(\mathbb{R})$ with respect to the Lévy-Prokhorov metric.

For $\mu\in\mathcal{P}_{\gamma}(\mathbb{R})$ with $\mu=\nu\ast\gamma_{\varepsilon}$ , define

[TABLE]

Then $G\circ\Phi(Y_{1})$ has the probability distribution $\nu$ , where $\Phi$ is the cdf of a standard Gaussian random variable. Define

[TABLE]

Then $X$ has the probability distribution $\mu$ . Write

[TABLE]

Then

[TABLE]

is a Gaussian random variable which is independent of $\mathcal{C}$ . Therefore, by Theorem 1.1 we can find three sub- $\sigma$ -fields $\mathcal{C}_{1},\mathcal{C}_{2},\mathcal{C}_{3}\subset\mathcal{F}$ , such that the sequence $\{X_{n}\}$ defined by

[TABLE]

does not converge in probability.

Proof of Theorem 1.2 (2). Suppose that $(\Omega,\mathcal{F},P)$ is neither purely atomic nor non-atomic. Then there exist an $N\in\mathbb{N}\cup\{\infty\}$ and a sequence of atomic sets $\{B_{i}\}_{i=1}^{N}\subset\mathcal{F}$ such that $\Omega\setminus\bigcup_{i=1}^{N}B_{i}\in{\mathcal{F}}$ is a non-atomic set. Denote

[TABLE]

Define

[TABLE]

and

[TABLE]

Then $(C,\mathcal{F}_{C},P_{C})$ is a non-atomic probability space.

For a sub- $\sigma$ -field $\mathcal{G}$ of $\mathcal{F}_{C}$ , define

[TABLE]

Then $\mathcal{G}\uplus D$ is a sub- $\sigma$ -field of $\mathcal{F}$ . For a random variable $X$ on $(C,\mathcal{F}_{C},P_{C})$ , we can extend it to a random variable $1_{C}\cdot X$ on $(\Omega,\mathcal{F},P)$ by defining $(1_{C}\cdot X)(\omega)=0$ for $\omega\in D$ . We claim that

[TABLE]

where $E_{C}(\cdot|\,\mathcal{G})$ denotes the conditional expectation on $(C,\mathcal{F}_{C},P_{C})$ . In fact, the right hand side of (3.2) is obviously $\mathcal{G}\uplus D$ -measurable. Thus it is sufficient to show that

[TABLE]

for any $B\in\mathcal{G}\uplus D$ .

Note that by (3.1) we have that $B=G\cup D$ or $B=G$ for some $G\in\mathcal{G}$ .

Case 1: Suppose that $B=G\cup D$ for $G\in\mathcal{G}$ . Then the left hand side of (3.3) is

[TABLE]

Thus (3.3) holds.

Case 2: Suppose that $B=G$ for some $G\in\mathcal{G}$ . The proof is similar to that of Case 1 and we omit the details.

Note that $(C,\mathcal{F}_{C},P_{C})$ is a non-atomic probability space. Then there exists on $(C,\mathcal{F}_{C},P_{C})$ a random variable $Z_{C}$ which has a uniform distribution on $(0,1)$ . Following the first part of the proof of Theorem 1.1, we can construct on $(C,\mathcal{F}_{C},P_{C})$ two independent standard Gaussian random variables $Y_{0}$ and $Y_{C}$ . Let

[TABLE]

Then $\mathcal{C}$ is a non-atomic sub- $\sigma$ -field of $\mathcal{F}_{C}$ and is independent of the Gaussian random variable $Y_{0}$ . Thus, by Theorem 1.1, we can find three sub- $\sigma$ -fields $\mathcal{C}_{1},\mathcal{C}_{2},\mathcal{C}_{3}$ of $\mathcal{F}_{C}$ such that the sequence of iterates $\{Y_{n}\}$ on $(C,\mathcal{F}_{C},P_{C})$ defined by

[TABLE]

does not converge in probability. Hence $\{Y_{n}\}$ must diverge in the $L^{2}$ -norm of $(C,\mathcal{F}_{C},P_{C})$ , i.e.,

[TABLE]

where $\|Y\|_{2,\,C}:=[{E}_{C}(\,Y^{2})]^{1/2}$ is the $L^{2}$ -norm of $(C,\mathcal{F}_{C},P_{C})$ for $Y\in L^{2}(C,\mathcal{F}_{C},P_{C})$ .

Now we can construct on $(\Omega,\mathcal{F},P)$ three sub- $\sigma$ -fields $\mathcal{G}_{1},\mathcal{G}_{2},\mathcal{G}_{3}$ by

$$\mathcal{G}_{i}:=\mathcal{C}_{i}\uplus D,\quad i=1,2,3,\tag{3.6}$$

and a sequence of random variables $\{X_{n}\}$ by

$$X_{0}:=1_{C}\cdot Y_{0},\qquad X_{n}:=E(X_{n-1}\,|\,\mathcal{G}_{k_{n}}),\quad n\geq 1.\tag{3.7}$$

Note that by (3.2), (3.4), (3.6) and (3.7) we have

[TABLE]

which implies that

[TABLE]

To show that $\{X_{n}\}$ does not converge in probability, note that

[TABLE]

Then we obtain by (3.5) and (3.8) that

[TABLE]

On the other hand, we can check that

[TABLE]

which implies that $\{X_{n}^{2}\}$ is uniformly integrable. Suppose that $\{X_{n}\}$ converges in probability. Then, by the uniform integrability of $\{X_{n}^{2}\}$, $\{X_{n}\}$ converges in the $L^{2}$ -norm, which contradicts (3.9). Therefore, $\{X_{n}\}$ does not converge in probability.
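The transfer from $(C,\mathcal{F}_{C},P_{C})$ to $(\Omega,\mathcal{F},P)$ used above can be summarized in a short worked derivation. It is a sketch under stated assumptions, not the authors' displayed equations: it assumes that (3.2) reads $E(1_{C}\cdot X|\,\mathcal{G}\uplus D)=1_{C}\cdot E_{C}(X|\,\mathcal{G})$, that $X_{0}=1_{C}\cdot Y_{0}$, that $\mathcal{G}_{i}=\mathcal{C}_{i}\uplus D$, that both iterations use the same index sequence $\{k_{n}\}$, and that $P_{C}(A)=P(A)/P(C)$.

```latex
% Induction step (assumed forms of (3.2), (3.4), (3.6), (3.7)):
% if X_{n-1} = 1_C \cdot Y_{n-1}, then
\begin{align*}
X_n = E\bigl(1_C\cdot Y_{n-1}\,\big|\,\mathcal{C}_{k_n}\uplus D\bigr)
    = 1_C\cdot E_C\bigl(Y_{n-1}\,\big|\,\mathcal{C}_{k_n}\bigr)
    = 1_C\cdot Y_n .
\end{align*}
% Norm comparison for Y \in L^2(C,\mathcal{F}_C,P_C):
\begin{align*}
\|1_C\cdot Y\|_2^2 = \int_C Y^2\,dP = P(C)\int_C Y^2\,dP_C = P(C)\,\|Y\|_{2,C}^2 ,
\end{align*}
% so that \|X_n - X_m\|_2 = \sqrt{P(C)}\,\|Y_n - Y_m\|_{2,C} for all m, n.
```

Since $P(C)>0$, the sequence $\{X_{n}\}$ is Cauchy in $L^{2}(\Omega,\mathcal{F},P)$ exactly when $\{Y_{n}\}$ is Cauchy in $L^{2}(C,\mathcal{F}_{C},P_{C})$.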

4 Convergence on purely atomic probability spaces

In this section we will prove Theorem 1.3, which is stated in §1. First, we give a lemma that holds for any probability space.

Lemma 4.1

Let $(\Omega,\mathcal{F},P)$ be a probability space, $\{\mathcal{F}_{n}\}$ be a family of sub- $\sigma$ -fields of ${\mathcal{F}}$ and $\{X_{n}\}$ be defined by (1.1). Then the following statements are equivalent.

(1) $\{X_{n}\}$ converges in ${L^{p}}$ -norm for any $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ with $p\geq 1$ .

(2) $\{X_{n}\}$ converges in probability for any $X_{0}\in L^{\infty}(\Omega,\mathcal{F},P)$ .

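Proof. The implication $(1)\Rightarrow(2)$ is immediate, since $L^{\infty}(\Omega,\mathcal{F},P)\subset L^{p}(\Omega,\mathcal{F},P)$ and convergence in $L^{p}$ -norm implies convergence in probability. It is therefore sufficient to show that $(2)\Rightarrow(1)$ .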

For $X\in L^{p}(\Omega,\mathcal{F},P)$ , we define the operators $T_{n}$ recursively by $T_{0}X=X$ and

$$T_{n}X:=E(T_{n-1}X\,|\,\mathcal{F}_{n}),\quad n\geq 1.$$

Then $\{T_{n}\}$ is a family of linear contraction operators on $L^{p}(\Omega,\mathcal{F},P)$ and $\sup_{n\geq 1}\|T_{n}Y\|_{\infty}\leq\|Y\|_{\infty}$ for $Y\in L^{\infty}(\Omega,\mathcal{F},P)$ . Let $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ . Then $T_{n}X_{0}=X_{n}$ for $n\geq 1$ .

Suppose that (2) holds, i.e., the sequence $\{T_{n}Y\}$ converges in probability for each $Y\in L^{\infty}(\Omega,\mathcal{F},P)$ . Then we obtain by the bounded convergence theorem that

[TABLE]

Hence $\{T_{n}X_{0}\}$ is a Cauchy sequence in $L^{p}(\Omega,\mathcal{F},P)$ and therefore converges in $L^{p}$ -norm.
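The passage from bounded random variables to a general $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ is a standard density argument. The following is one way to write it out, offered as a sketch rather than as the authors' displayed estimate; the symbols $\varepsilon$ and $Y$ are introduced here for illustration.

```latex
% Fix \varepsilon > 0 and choose Y \in L^\infty with \|X_0 - Y\|_p \le \varepsilon
% (bounded random variables are dense in L^p for 1 \le p < \infty).
% Each T_n is a linear contraction on L^p, so for all m, n \ge 1:
\begin{align*}
\|T_n X_0 - T_m X_0\|_p
  &\le \|T_n (X_0 - Y)\|_p + \|T_n Y - T_m Y\|_p + \|T_m (Y - X_0)\|_p \\
  &\le 2\varepsilon + \|T_n Y - T_m Y\|_p .
\end{align*}
% By (2), T_n Y converges in probability, and |T_n Y| \le \|Y\|_\infty a.s.,
% so bounded convergence gives \|T_n Y - T_m Y\|_p \to 0 as m, n \to \infty. Hence
\begin{align*}
\limsup_{m,n\to\infty} \|T_n X_0 - T_m X_0\|_p \le 2\varepsilon .
\end{align*}
```

Letting $\varepsilon\downarrow 0$ gives the Cauchy property.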

Proof of Theorem 1.3. Let $B_{1},B_{2},\dots\in\mathcal{F}$ be atomic sets (cf. Definition 3.1) such that

[TABLE]

Let $X_{0}\in L^{\infty}(\Omega,\mathcal{F},P)$ . Then for each $X_{n}$ , there exists a sequence $b_{n,\,1},b_{n,\,2},\dots\in\mathbb{R}$ such that

$$X_{n}=\sum_{i\geq 1}b_{n,\,i}\,1_{B_{i}}\quad\text{a.s.}$$

We define the orthogonal projections $P_{n}$ , $n\geq 1$ , on $L^{2}(\Omega,\mathcal{F},P)$ by

$$P_{n}X:=E(X\,|\,\mathcal{F}_{n}),\quad X\in L^{2}(\Omega,\mathcal{F},P).$$

Then $P_{n}\cdots P_{2}P_{1}X_{0}=X_{n}$ . By Amemiya-Ando [2, Theorem], $X_{n}$ converges weakly in $L^{2}(\Omega,\mathcal{F},P)$ . Thus $b_{n,i}=\frac{1}{P({B_{i}})}\int_{B_{i}}X_{n}dP$ converges as $n\rightarrow\infty$ for each $i\geq 1$ , which implies that $X_{n}$ converges almost everywhere. Since $X_{0}\in L^{\infty}(\Omega,\mathcal{F},P)$ is arbitrary, we obtain by Lemma 4.1 that $\{X_{n}\}$ converges to some $X_{\infty}\in L^{p}$ in ${L^{p}}$ -norm for any $X_{0}\in L^{p}(\Omega,\mathcal{F},P)$ with $p\geq 1$ . Further, $\{X_{n}\}$ converges to $X_{\infty}$ almost everywhere since $(\Omega,\mathcal{F},P)$ is a purely atomic probability space.

We now show that $X_{\infty}={E}(X_{0}|\bigcap_{k=1}^{K}\overline{{\mathcal{G}}_{k}})$ . For each $k\in\{1,\cdots,K\}$ , we can find an infinite subsequence $k_{1},k_{2},\dots\in\mathbb{N}$ such that $\mathcal{F}_{k_{n}}={\mathcal{G}}_{k},\ n\geq 1$ . It follows that

[TABLE]

By the almost sure convergence of $\{X_{n}\}$ , we have that

[TABLE]

which implies that $X_{\infty}\in\overline{{\mathcal{G}}_{k}}$ for each ${\mathcal{G}}_{k}$ . Hence

[TABLE]

Let $A\in\bigcap_{k=1}^{K}\overline{{\mathcal{G}}_{k}}=\bigcap_{n=1}^{\infty}\overline{\mathcal{F}_{n}}$ . By (1.1), we get

[TABLE]

Then,

[TABLE]

Thus

[TABLE]

Since $A\in\bigcap_{k=1}^{K}\overline{{\mathcal{G}}_{k}}$ is arbitrary, the proof is complete.
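The statement of Theorem 1.3 can be checked numerically on a finite (hence purely atomic) probability space. The sketch below is illustrative only and is not part of the paper: the point masses `p`, the starting variable `x0`, the partitions `G1`, `G2`, `G3`, the cyclic ordering and the helper names `cond_exp` and `common_coarsening` are all assumptions made for this example. Each sub-$\sigma$-field is represented by the partition generating it, and because every point has positive mass, completion plays no role.

```python
import numpy as np


def cond_exp(x, p, partition):
    """E(x | sigma(partition)) on a finite space with point masses p."""
    out = np.empty(len(x), dtype=float)
    for block in partition:
        idx = list(block)
        # On each block the conditional expectation is the p-weighted mean.
        out[idx] = np.dot(p[idx], x[idx]) / p[idx].sum()
    return out


def common_coarsening(n, partitions):
    """Blocks generating the intersection of the sigma-fields.

    Two points lie in the same block iff they are linked by a chain of
    blocks taken from the given partitions (union-find on the points).
    """
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for partition in partitions:
        for block in partition:
            block = list(block)
            for j in block[1:]:
                parent[find(j)] = find(block[0])
    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    return list(comps.values())


# Hypothetical example data: six atoms and three sub-sigma-fields.
p = np.array([0.1, 0.2, 0.3, 0.15, 0.05, 0.2])
x0 = np.array([1.0, -2.0, 0.5, 4.0, 3.0, -1.0])
G1 = [[0, 1], [2, 3], [4, 5]]
G2 = [[0], [1, 2], [3], [4, 5]]
G3 = [[0, 1], [2], [3], [4], [5]]
partitions = [G1, G2, G3]

# Iterate X_n = E(X_{n-1} | F_n), cycling through G1, G2, G3 so that
# each sigma-field is used infinitely often.
x = x0.copy()
for n in range(300):
    x = cond_exp(x, p, partitions[n % 3])

# Candidate limit: conditional expectation given the intersection.
meet = common_coarsening(len(p), partitions)
target = cond_exp(x0, p, meet)
print("iterate after 300 steps:", x)
print("E(X_0 | intersection):  ", target)
```

Cycling through the partitions is only one admissible ordering; any ordering in which each of the three partitions appears infinitely often fits the setting of the theorem.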

Acknowledgments This work was supported by the China Scholarship Council (No. 201809945013), National Natural Science Foundation of China (No. 11771309) and Natural Sciences and Engineering Research Council of Canada.

Bibliography


[1] M. Akcoglu, J. L. King: An example of pointwise non-convergence of iterated conditional expectation operators, Israel J. Math., 94, 179-188 (1996).
[2] I. Amemiya, T. Ando: Convergence of random products of contractions in Hilbert space, Acta Scientiarum Mathematicarum (Szeged), 26, 239-244 (1965).
[3] P. Berti, L. Pratelli, P. Rigo: Atomic intersection of $\sigma$-fields and some of its consequences, Probab. Theory Relat. Fields, 148, 269-283 (2010).
[4] F. E. Browder: On some approximation methods for solutions of the Dirichlet problem for linear elliptic equations of arbitrary order, J. Math. Mech., 7, 69-80 (1958).
[5] D. L. Burkholder: Successive conditional expectations of an integrable function, Ann. Math. Stat., 33, 887-893 (1962).
[6] D. L. Burkholder, Y. S. Chow: Iterates of conditional expectation operators, Proc. Amer. Math. Soc., 12, 490-495 (1961).
[7] G. Cohen: Iterates of a product of conditional expectation operators, J. Func. Anal., 242, 658-668 (2007).
[8] B. Delyon, F. Delyon: Generalization of von Neumann's spectral sets and integral representation of operators, Bull. Soc. Math. France, 127, 25-41 (1999).