Decomposition of Gaussian processes, and factorization of positive definite kernels

Palle Jorgensen; Feng Tian

arXiv:1812.10850·math.FA·December 31, 2018


TL;DR

This paper establishes a duality between factorizations of positive definite kernels and Gaussian processes, providing explicit correspondences and applications in various fields like point processes and graph Laplacians.

Contribution

It introduces a novel duality framework linking kernel factorizations with Gaussian process factorizations, addressing measure-theoretic challenges in infinite dimensions.

Findings

01

Explicit duality between kernel and Gaussian process factorizations

02

Measure-theoretic methods for infinite-dimensional factorizations

03

Applications to point processes, graph Laplacians, and boundary-value problems



Table 6.1. Three p.d. kernels and their respective Gaussian realizations.

| | $X$ | $K$ | $k_{x}$, with $M=[0,1]\cong\mathbb{T}^{1}$ and $\mathscr{F}(K)=\{(k_{x},\mu)\}$ | $\mu$ |
|---|---|---|---|---|
| Ex 1 | $[0,1]$ | $x\wedge y$ | $k_{x}(s)=\chi_{[0,x]}(s)$ | $\lambda_{1}$ |
| Ex 2 | $\mathbb{D}$ | $\frac{1}{1-z\overline{w}}$ | $k_{z}(t)=\frac{1}{1-z\overline{e(t)}}$ | $\lambda_{1}$ on $\mathbb{T}^{1}$ |
| Ex 3 | $\mathbb{D}$ | $\prod_{n=0}^{\infty}(1+z^{4^{n}}\overline{w}^{4^{n}})$ | $k_{z}(t)=\prod_{n=0}^{\infty}(1+z^{4^{n}}\overline{e(4^{n}t)})$ | $\mu_{4}$ |
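The first two rows of the table can be sanity-checked numerically. The sketch below is illustrative only: it assumes the convention $e(t)=e^{i2\pi t}$, and it pairs the factors as $\int k_{z}\overline{k_{w}}\,d\lambda_{1}$ (the placement of the complex conjugate is an assumption made here so that the result matches $K(z,w)=1/(1-z\overline{w})$). All function names are hypothetical.

```python
import cmath
import math

def e(t):
    # Assumed convention: e(t) = exp(2*pi*i*t), identifying [0, 1) with the circle.
    return cmath.exp(2j * math.pi * t)

def szego_factor(z, t):
    # Ex 2 factor: k_z(t) = 1 / (1 - z * conj(e(t))).
    return 1.0 / (1.0 - z * e(t).conjugate())

def szego_from_factors(z, w, n=4096):
    # Equally spaced Riemann sum of k_z(t) * conj(k_w(t)) over [0, 1);
    # this rule is very accurate for smooth periodic integrands.
    total = sum(szego_factor(z, j / n) * szego_factor(w, j / n).conjugate()
                for j in range(n))
    return total / n

def indicator(x, s):
    # Ex 1 factor: k_x(s) = chi_[0, x](s).
    return 1.0 if s <= x else 0.0

def brownian_from_factors(x, y, n=4096):
    # Midpoint rule for the integral of chi_[0,x] * chi_[0,y] over [0, 1].
    return sum(indicator(x, (j + 0.5) / n) * indicator(y, (j + 0.5) / n)
               for j in range(n)) / n

z, w = 0.3 + 0.2j, -0.4 + 0.1j
szego_gap = abs(szego_from_factors(z, w) - 1.0 / (1.0 - z * w.conjugate()))
brownian_gap = abs(brownian_from_factors(0.3, 0.7) - min(0.3, 0.7))
```

Both gaps should be small: the first is limited by floating-point rounding, the second by the grid width `1 / n`.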

Equations

(2.1) \sum_{x\in F}\sum_{y\in F}\overline{\xi}_{x}\xi_{y}K(x,y)\geq 0.

(2.2) \sum_{x\in F}\xi_{x}K(\cdot,x)

(2.3) \left\langle \sum_{x\in F}\xi_{x}K(\cdot,x),\sum_{y\in F}\eta_{y}K(\cdot,y)\right\rangle_{\mathscr{H}(K)}:=\sum\sum_{F\times F}\overline{\xi}_{x}\eta_{y}K(x,y).

(2.4) \left\langle K(\cdot,x),h\right\rangle_{\mathscr{H}(K)}=h(x);

(2.5) \left|\sum_{x\in F}\xi_{x}h(x)\right|^{2}\leq C_{h}\sum_{x\in F}\sum_{y\in F}\overline{\xi}_{x}\xi_{y}K(x,y).

(2.6) L_{h}\left(\sum_{x\in F}\xi_{x}K(\cdot,x)\right):=\sum_{x\in F}\xi_{x}h(x).

(2.7) L_{h}(\psi)=\left\langle \psi,H\right\rangle_{\mathscr{H}(K)}

(3.1) \mathbb{E}(\cdots)=\int_{\Omega}(\cdots)\,d\mathbb{P}.

(3.2) \mathbb{E}(f\circ V)=\int_{\mathbb{R}\,(\text{or }\mathbb{C})}f\,dg;

(3.3) \mathbb{P}(V\in B)=\int_{B}dg=g(B)

(3.4) \mathbb{P}((V_{1},\cdots,V_{N})\in B)=g_{N}(B).

(3.5) G_{N}(j_{1},j_{2})=\int_{\mathbb{R}^{N}}x_{j_{1}}x_{j_{2}}\,g_{N}(x_{1},\cdots,x_{N})\,dx_{1}\cdots dx_{N}

(3.6) \mathbb{E}(\overline{V}_{x}V_{y})=K(x,y)

(3.7) K_{F}(x,y)=G_{F}(x,y),

\mathscr{F}_{fin}=\{A\in\mathscr{F}_{M}\mid 0<\mu(A)<\infty\}.

(4.1) K^{(\mu)}(A,B)=\mu(A\cap B),\quad A,B\in\mathscr{F}_{fin}

(4.2) \mathbb{E}(W_{A}^{(\mu)}W_{B}^{(\mu)})=\mu(A\cap B),

(4.3) \lim_{(A_{i})}\sum_{i}(W_{A_{i}}^{(\mu)})^{2}=\mu(A).

(4.4) A=\cup_{i}A_{i},\quad A_{i}\cap A_{j}=\emptyset\text{ if }i\neq j,\text{ and }\lim\mu(A_{i})=0.

(4.5) \lim_{(A_{i})}\mathbb{E}\left(\Big|\mu(A)\mathbbm{1}-\sum\nolimits_{i}(W_{A_{i}}^{(\mu)})^{2}\Big|^{2}\right)=0

(4.6) W^{(\mu)}(f):=\int_{M}f(s)\,dW_{s}^{(\mu)},

(4.7) \mathbb{E}\left(\left|\int_{M}f(s)\,dW_{s}^{(\mu)}\right|^{2}\right)=\int_{M}\left|f(s)\right|^{2}d\mu(s).

(4.8) L^{2}(M,\mu)\ni f\longmapsto W^{(\mu)}(f)\in L^{2}(\Omega,\mathbb{P})

(4.9) K^{(\mu)}(A,B):=\mu(A\cap B),

(4.10) \Phi(A)=\int_{A}\varphi\,d\mu,

(4.11) \left\|\Phi\right\|_{\mathscr{H}(K^{(\mu)})}=\left\|\varphi\right\|_{L^{2}(\mu)}.

(4.12) \left|\sum_{i=1}^{N}\xi_{i}\Phi(A_{i})\right|^{2}\leq C_{\Phi}\sum_{i}\sum_{j}\xi_{i}\xi_{j}K^{(\mu)}(A_{i},A_{j}).

(4.13) \left|\left\langle H,\Phi\right\rangle_{\mathscr{H}(K^{(\mu)})}\right|\leq\left\|H\right\|_{\mathscr{H}(K^{(\mu)})}\left\|\varphi\right\|_{L^{2}(\mu)},

(4.14) \left\langle H,\Phi\right\rangle_{\mathscr{H}(K^{(\mu)})}=\int_{M}h\varphi\,d\mu

(4.15) H(A)=\int_{A}h\,d\mu;


Taxonomy

Topics: Spectral Theory in Mathematical Physics · Topological and Geometric Data Analysis · Matrix Theory and Algorithms

Full text


Decomposition of Gaussian processes, and factorization of positive definite kernels

Palle Jorgensen

(Palle E.T. Jorgensen) Department of Mathematics, The University of Iowa, Iowa City, IA 52242-1419, U.S.A.

[email protected] http://www.math.uiowa.edu/~jorgen/ and

Feng Tian

(Feng Tian) Department of Mathematics, Hampton University, Hampton, VA 23668, U.S.A.

[email protected]

Abstract.

We establish a duality for two factorization questions, one for general positive definite (p.d.) kernels $K$, and the other for Gaussian processes, say $V$. The latter notion, for Gaussian processes, is stated via Ito-integration. Our approach to factorization for p.d. kernels is intuitively motivated by matrix factorizations, but in infinite dimensions, subtle measure theoretic issues must be addressed. Consider a given p.d. kernel $K$, presented as a covariance kernel for a Gaussian process $V$. We then give an explicit duality for these two seemingly different notions of factorization, for the p.d. kernel $K$ vs. for the Gaussian process $V$. Our result is in the form of an explicit correspondence. It states that the analytic data which determine the variety of factorizations for $K$ are exactly the same as those which yield factorizations for $V$. Examples and applications are included: point-processes, sampling schemes, constructive discretization, graph-Laplacians, and boundary-value problems.

Key words and phrases:

Reproducing kernel Hilbert space, frames, generalized Ito-integration, the measurable category, analysis/synthesis, interpolation, Gaussian free fields, non-uniform sampling, optimization, transform, covariance, feature space.

2000 Mathematics Subject Classification:

Primary 47L60, 46N30, 46N50, 42C15, 65R10, 05C50, 05C75, 31C20, 60J20; Secondary 46N20, 22E70, 31A15, 58J65, 81S25, 68T05.

1 Introduction
2 Positive definite kernels
3 Gaussian processes
4 Sigma-finite measure spaces and Gaussian processes
5 Factorizations and stochastic integrals
6 Examples and applications
7 The case of $\left(k_{x},\mu\right)\in\mathscr{F}\left(K\right)$ when $\mu$ is atomic
8 Point processes: The case when $\left\{\delta_{x}\right\}\subset\mathscr{H}\left(K\right)$
9 Boundary value problems
10 Sampling in $\mathscr{H}\left(K\right)$

1. Introduction

We give an integrated approach to positive definite (p.d.) kernels and Gaussian processes, with an emphasis on factorizations, and their applications. Positive definite kernels serve as powerful tools in such diverse areas as Fourier analysis, probability theory, stochastic processes, boundary theory, potential theory, approximation theory, interpolation, signal/image analysis, operator theory, spectral theory, mathematical physics, representation theory, complex function-theory, moment problems, integral equations, numerical analysis, boundary-value problems for partial differential equations, machine learning, geometric embedding problems, and information theory. While there is no single book which covers all these applications, the reference [PR16] goes some of the way. As for the use of RKHS analysis in machine learning, we refer to [SZ07] and [Wes13].

Here, we give a new and explicit duality for positive definite functions (kernels) on the one hand, and Gaussian processes on the other. A covariance kernel for a general stochastic process is positive definite. In general, the stochastic process in question is not determined by its covariance kernel. But in the special case when the process is Gaussian, it is. In fact (Theorem 3.1), every p.d. kernel $K$ is the covariance kernel of a Gaussian process. The construction is natural; starting with the p.d. kernel $K$, there is a canonical inductive limit construction leading to the Gaussian process for this problem, following a realization of Gaussian processes dating back to Kolmogorov. The interplay between analytic properties of p.d. kernels and their associated Gaussian processes is the focus of our present study.

We formulate two different factorization questions, one for general p.d. kernels $K$, and the other for Gaussian processes, say $V$. The latter notion, for Gaussian processes, is a subordination approach. Our approach to factorization for p.d. kernels is directly motivated by matrix factorizations, but in infinite dimensions, there are subtle measure theoretic issues involved. If the given p.d. kernel $K$ is already presented as a covariance kernel for a Gaussian process $V$, we then give an explicit duality for these two seemingly different notions of factorization. Our main result, Theorem 5.1, states that the analytic data which determine the variety of factorizations for $K$ are exactly the same as those which yield factorizations for $V$.

2. Positive definite kernels

The notion of a positive definite (p.d.) kernel has come to serve as a versatile tool in a host of problems in pure and applied mathematics. The abstract notion of a p.d. kernel is in fact a generalization of that of a positive definite function, or a positive-definite matrix. Indeed, the matrix point of view lends itself naturally to the particular factorization question which we shall address in Section 5 below. The general idea of p.d. kernels arose first in various special cases in the first half of the 20th century: It occurs in work by J. Mercer in the context of solving integral operator equations; in the work of G. Szegő and S. Bergmann in the study of harmonic analysis and the theory of complex domains; and in the work by N. Aronszajn in boundary value problems for PDEs. It was Aronszajn who introduced the natural notion of reproducing kernel Hilbert space (RKHS) which will play a central role here; see especially (2.4) below. References covering the areas mentioned above include: [AJL17, Aro50, Hid80, IM65, Jor18, Jr68, JS18b], and [JT16c].

Right up to the present, p.d. kernels have arisen as powerful tools in many and diverse areas of mathematics. A partial list includes the areas listed above in the Introduction. An important new area of application of RKHS theory includes the following [ADD90, AD93, AB97, ADRdS01, ABK02, AM03, AD06, AL08].

Positive definite kernels and their reproducing kernel Hilbert spaces

Let $X$ be a set and let $K$ be a complex valued function on $X\times X$. We say that $K$ is positive definite (p.d.) if, for every finite subset $F\subset X$ and all complex numbers $\left(\xi_{x}\right)_{x\in F}$, we have:

$$\sum_{x\in F}\sum_{y\in F}\overline{\xi}_{x}\xi_{y}K\left(x,y\right)\geq0.\tag{2.1}$$

In other words, the $\left|F\right|\times\left|F\right|$ matrix $\left(K\left(x,y\right)\right)_{F\times F}$ is positive definite in the usual sense of linear algebra. We refer to the rich literature regarding theory and applications of p.d. functions [AJ12, JT16a, HKL+14, RAKK05, CXY15, Sko13, Her12].
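As a rough illustration of the matrix point of view, the sketch below forms the finite matrix $\left(K(x,y)\right)_{F\times F}$ for a real symmetric kernel and tests it with a hand-written Cholesky factorization. This is a minimal sketch under stated assumptions: the kernel $K(x,y)=\min(x,y)$, the point sets, and the function names are chosen here for illustration, and a successful Cholesky factorization certifies strict positivity only (a kernel can satisfy (2.1) while this test stops at a zero pivot).

```python
import math

def gram(kernel, points):
    # The |F| x |F| matrix (K(x, y)) for x, y in F.
    return [[kernel(x, y) for y in points] for x in points]

def cholesky(a, tol=1e-12):
    """Return lower-triangular L with L L^T = a, or None if a pivot is <= tol."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return None
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def quadratic_form(a, xi):
    # The double sum in (2.1) for real coefficients xi.
    n = len(a)
    return sum(xi[i] * a[i][j] * xi[j] for i in range(n) for j in range(n))

points = [0.1, 0.4, 0.5, 0.9]
G = gram(min, points)                       # K(x, y) = min(x, y)
is_strictly_pd = cholesky(G) is not None
form_value = quadratic_form(G, [1.0, -2.0, 0.5, 3.0])
not_pd = cholesky(gram(lambda x, y: x + y, [0.1, 0.4])) is None
```

The last line uses $K(x,y)=x+y$ as a contrasting symmetric function whose $2\times2$ matrix has negative determinant, so the factorization fails.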

We shall also need the Aronszajn [Aro50] reproducing kernel Hilbert spaces (R.K.H.S.), denoted $\mathscr{H}\left(K\right)$ : It is the Hilbert completion of all functions

$$\sum_{x\in F}\xi_{x}K\left(\cdot,x\right)\tag{2.2}$$

where $F$ and $\left(\xi_{x}\right)_{x\in F}$ are as above.

If $F$ (finite) is fixed, and $\left(\xi_{x}\right)_{x\in F}$ , $\left(\eta_{x}\right)_{x\in F}$ are vectors in $\mathbb{C}^{\left|F\right|}$ , we set

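$$\left\langle \sum_{x\in F}\xi_{x}K\left(\cdot,x\right),\sum_{y\in F}\eta_{y}K\left(\cdot,y\right)\right\rangle_{\mathscr{H}\left(K\right)}:=\sum\sum_{F\times F}\overline{\xi}_{x}\eta_{y}K\left(x,y\right).\tag{2.3}$$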

With the definition of the R.K.H.S. $\mathscr{H}\left(K\right)$ , we get directly that the functions $\left\{K\left(\cdot,x\right)\right\}_{x\in X}$ are automatically in $\mathscr{H}\left(K\right)$ ; and that, for all $h\in\mathscr{H}\left(K\right)$ , we have

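$$\left\langle K\left(\cdot,x\right),h\right\rangle_{\mathscr{H}\left(K\right)}=h\left(x\right);\tag{2.4}$$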

i.e., the reproducing property holds.

Further recall (see e.g. [PR16]) that, given $K$ , then the R.K.H.S. $\mathscr{H}\left(K\right)$ is determined uniquely, up to isometric isomorphism in Hilbert space.

Lemma 2.1.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be a p.d. kernel, and let $\mathscr{H}\left(K\right)$ be the corresponding RKHS (see (2.3)-(2.4)). Let $h$ be a function defined on $X$ ; then TFAE:

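(i) $h\in\mathscr{H}\left(K\right)$;

(ii) there is a constant $C=C_{h}<\infty$ such that, for every finite subset $F\subset X$ and all $\left(\xi_{x}\right)_{x\in F}$, $\xi_{x}\in\mathbb{C}$, the following a priori estimate holds:

$$\left|\sum_{x\in F}\xi_{x}h\left(x\right)\right|^{2}\leq C_{h}\sum_{x\in F}\sum_{y\in F}\overline{\xi}_{x}\xi_{y}K\left(x,y\right).\tag{2.5}$$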

Proof.

The implication (i) $\Rightarrow$ (ii) is immediate, and in this case, we may take $C_{h}=\left\|h\right\|_{\mathscr{H}\left(K\right)}^{2}$ .

Now for the converse, assume (ii) holds for some finite constant. On the $\mathscr{H}\left(K\right)$ -dense span in (2.2), define a linear functional

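$$L_{h}\left(\sum_{x\in F}\xi_{x}K\left(\cdot,x\right)\right):=\sum_{x\in F}\xi_{x}h\left(x\right).\tag{2.6}$$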

From the assumption (2.5) in (ii), we conclude that $L_{h}$ (in (2.6)) is a well defined bounded linear functional on $\mathscr{H}\left(K\right)$ . Initially, $L_{h}$ is only defined on the span (2.2), but by (2.5), it is bounded, and so extends uniquely by $\mathscr{H}\left(K\right)$ -norm limits. We may therefore apply Riesz’ lemma to the Hilbert space $\mathscr{H}\left(K\right)$ , and conclude that there is a unique $H\in\mathscr{H}\left(K\right)$ such that

$$L_{h}\left(\psi\right)=\left\langle \psi,H\right\rangle_{\mathscr{H}\left(K\right)}\tag{2.7}$$

for all $\psi\in\mathscr{H}\left(K\right)$ . Now, setting $\psi\left(\cdot\right):=K\left(\cdot,x\right)$ , for $x\in X$ , we conclude from (2.7) that $h\left(x\right)=H\left(x\right)$ ; and so $h\in\mathscr{H}\left(K\right)$ , proving (i). ∎
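The a priori estimate in Lemma 2.1 (ii) can be probed numerically in the real case. The following is a minimal sketch, not part of the paper: it assumes the illustrative kernel $K(x,y)=\min(x,y)$, takes $h$ in the finite span of the functions $K(\cdot,x_{j})$ so that $\left\|h\right\|^{2}$ is a finite double sum, and uses $C_{h}=\left\|h\right\|_{\mathscr{H}(K)}^{2}$ as in the proof. The helper names are hypothetical.

```python
import random

def k_min(x, y):
    # Illustrative real p.d. kernel: K(x, y) = min(x, y) on [0, 1].
    return min(x, y)

def span_element(centers, coeffs, kernel=k_min):
    """h = sum_j c_j K(., x_j) and ||h||^2 = sum_{i,j} c_i c_j K(x_i, x_j)."""
    def h(x):
        return sum(c * kernel(x, xj) for c, xj in zip(coeffs, centers))
    norm_sq = sum(ci * cj * kernel(xi, xj)
                  for ci, xi in zip(coeffs, centers)
                  for cj, xj in zip(coeffs, centers))
    return h, norm_sq

def estimate_sides(h, c_h, points, xi, kernel=k_min):
    """Left and right sides of the a priori estimate, real coefficients xi."""
    lhs = sum(a * h(x) for a, x in zip(xi, points)) ** 2
    rhs = c_h * sum(a * b * kernel(x, y)
                    for a, x in zip(xi, points)
                    for b, y in zip(xi, points))
    return lhs, rhs

rng = random.Random(0)
h, c_h = span_element([0.2, 0.7], [1.5, -0.5])
gaps = []
for _ in range(200):
    points = [rng.random() for _ in range(5)]       # a random finite set F
    xi = [rng.uniform(-1.0, 1.0) for _ in range(5)]
    lhs, rhs = estimate_sides(h, c_h, points, xi)
    gaps.append(rhs - lhs)
worst_gap = min(gaps)
```

By the Schwarz inequality in $\mathscr{H}(K)$, `worst_gap` should be nonnegative up to floating-point rounding.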

3. Gaussian processes

The interest in positive definite (p.d.) functions has at least three roots: (i) Fourier analysis, and harmonic analysis more generally; (ii) Optimization and approximation problems, involving for example spline approximations as envisioned by I. Schoenberg; and (iii) Stochastic processes. See [vNS41, Sch83].

Below, we sketch a few details regarding (iii). A stochastic process is an indexed family of random variables based on a fixed probability space. In some cases, the processes will be indexed by some group $G$ , or by a subset of $G$ . For example, $G=\mathbb{R}$ , or $G=\mathbb{Z}$ , correspond to processes indexed by real time, respectively discrete time. A main tool in the analysis of stochastic processes is an associated covariance function.

A process $\{X_{g}\mid g\in G\}$ is called Gaussian if each random variable $X_{g}$ is Gaussian, i.e., its distribution is Gaussian. For Gaussian processes, we only need two moments. So if we normalize, setting the mean equal to $0$, then the process is determined by its covariance function. In general, the covariance function is a function on $G\times G$, or on a subset, but if the process is stationary, the covariance function will in fact be a p.d. function defined on $G$, or a subset of $G$. For a systematic study of positive definite functions on groups $G$, on subsets of groups, and the variety of the extensions to p.d. functions on $G$, see e.g. [JPT16].

By a theorem of Kolmogorov [Kol83], every Hilbert space may be realized as a (Gaussian) reproducing kernel Hilbert space (RKHS), see Theorem 3.1 below, and also [PS75, IM65, SNFBK10].

Now every positive definite kernel is also the covariance kernel of a Gaussian process; a fact which is a point of departure in our present analysis: Given a positive definite kernel, we shall explore its use in the analysis of the associated Gaussian process; and vice versa.

This point of view is especially fruitful when one is dealing with problems from stochastic analysis. Even restricting to stochastic analysis, we have the exciting area of applications to statistical learning theory [SZ07, Wes13].

Let $\left(\Omega,\mathscr{F},\mathbb{P}\right)$ be a probability space, i.e., $\Omega$ is a fixed set (sample space), $\mathscr{F}$ is a specified sigma-algebra (events) of subsets in $\Omega$ , and $\mathbb{P}$ is a probability measure on $\mathscr{F}$ .

A Gaussian random variable is a function $V:\Omega\rightarrow\mathbb{R}$ (in the real case), or $V:\Omega\rightarrow\mathbb{C}$ , such that $V$ is measurable with respect to the sigma-algebra $\mathscr{F}$ on $\Omega$ , and the corresponding sigma-algebra of Borel subsets in $\mathbb{R}$ (or in $\mathbb{C}$ ). Let $\mathbb{E}$ denote the expectation defined from $\mathbb{P}$ , i.e.,

$$\mathbb{E}\left(\cdots\right)=\int_{\Omega}\left(\cdots\right)d\mathbb{P}.\tag{3.1}$$

The requirement on $V$ is that its distribution is Gaussian. If $g$ denotes a Gaussian on $\mathbb{R}$ (or on $\mathbb{C}$ ), the requirement is that

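$$\mathbb{E}\left(f\circ V\right)=\int_{\mathbb{R}\,(\text{or }\mathbb{C})}f\,dg;\tag{3.2}$$

or equivalently

$$\mathbb{P}\left(V\in B\right)=\int_{B}dg=g\left(B\right)\tag{3.3}$$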

for all Borel sets $B$ ; see 3.1.

If $N\in\mathbb{N}$ , and $V_{1},\cdots,V_{N}$ are random variables, the Gaussian requirement is (see 3.2) that the joint distribution of $\left(V_{1},\cdots,V_{N}\right)$ is an $N$ -dimensional Gaussian, say $g_{N}$ , so if $B\subset\mathbb{R}^{N}$ then

$$\mathbb{P}\left(\left(V_{1},\cdots,V_{N}\right)\in B\right)=g_{N}\left(B\right).\tag{3.4}$$

For our present purpose we may restrict to the case where the mean (of the respective Gaussians) is assumed zero. In that case, a finite joint distribution is determined by its covariance matrix. In the $\mathbb{R}^{N}$ case, it is specified as follows (the extension to $\mathbb{C}^{N}$ is immediate) $\left(G_{N}\left(j_{1},j_{2}\right)\right)_{j_{1},j_{2}=1}^{N}$ ,

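$$G_{N}\left(j_{1},j_{2}\right)=\int_{\mathbb{R}^{N}}x_{j_{1}}x_{j_{2}}\,g_{N}\left(x_{1},\cdots,x_{N}\right)dx_{1}\cdots dx_{N}\tag{3.5}$$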

where $dx_{1}\cdots dx_{N}=\lambda_{N}$ denotes the standard Lebesgue measure on $\mathbb{R}^{N}$ .

The following is known:

Theorem 3.1 (Kolmogorov [KR60], see also [Hid80, Hid92]).

A kernel $K:X\times X\rightarrow\mathbb{C}$ is positive definite if and only if there is a (mean zero) Gaussian process $\left(V_{x}\right)_{x\in X}$ indexed by $X$ such that

$$\mathbb{E}\left(\overline{V}_{x}V_{y}\right)=K\left(x,y\right)\tag{3.6}$$

where $\overline{V}_{x}$ denotes complex conjugation.

Moreover (see Hida [Hid71, Hid92]), the process in (3.6) is uniquely determined by the kernel $K$ in question. If $F\subset X$ is finite, then the covariance kernel for $\left(V_{x}\right)_{x\in F}$ is $K_{F}$ given by

$$K_{F}\left(x,y\right)=G_{F}\left(x,y\right),\tag{3.7}$$

for all $x,y\in F$ , see (3.5) above.
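For a finite subset $F\subset X$, one standard way to realize a mean zero Gaussian vector with covariance matrix $K_{F}$ is to apply a Cholesky factor of $K_{F}$ to i.i.d. standard normals. The sketch below illustrates the finite-dimensional statement only, and is not the inductive limit construction referred to in the text. It assumes the real kernel $K(x,y)=\min(x,y)$ and a strictly positive $K_{F}$; names and sample sizes are illustrative.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a (a real, symmetric, strictly positive)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def sample_process(L, rng):
    """One draw of (V_x) for x in F: V = L Z with Z i.i.d. standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]

def empirical_covariance(L, n_samples, rng):
    # Monte Carlo estimate of E(V_x V_y) for x, y in F.
    n = len(L)
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(n_samples):
        v = sample_process(L, rng)
        for i in range(n):
            for j in range(n):
                acc[i][j] += v[i] * v[j]
    return [[acc[i][j] / n_samples for j in range(n)] for i in range(n)]

F = [0.25, 0.5, 0.9]                        # a finite subset of X = [0, 1]
K_F = [[min(x, y) for y in F] for x in F]   # illustrative kernel min(x, y)
cov = empirical_covariance(cholesky(K_F), 20000, random.Random(1))
max_err = max(abs(cov[i][j] - K_F[i][j]) for i in range(3) for j in range(3))
```

With this sample size the entries of `cov` should agree with $K_{F}$ to within Monte Carlo error, roughly $10^{-2}$.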

In the subsequent sections, we shall address a number of properties of Gaussian processes important for their stochastic calculus. Our analysis deals with both the general case, and particular examples from applications. We begin in Section 4 with certain Wiener processes which are indexed by sigma-finite measures. For this class, the corresponding p.d. kernel has a special form; see (4.1) in Definition 4.1. (The case of fractal measures is part of Section 6 below.) In Section 5, we address the general case: We prove our duality result for factorization, Theorem 5.1. The remaining sections are devoted to examples and applications.

4. Sigma-finite measure spaces and Gaussian processes

We shall consider a $\sigma$-finite measure space $\left(M,\mathscr{F}_{M},\mu\right)$ where $M$ is a set, $\mathscr{F}_{M}$ a $\sigma$-algebra of subsets in $M$, and $\mu$ is a positive measure defined on $\mathscr{F}_{M}$. It is further assumed that there is a countably indexed family $\left(A_{i}\right)_{i\in\mathbb{N}}$ s.t. $0<\mu\left(A_{i}\right)<\infty$, $M=\cup_{i}A_{i}$; and further that the measure space $\left(M,\mathscr{F}_{M},\mu\right)$ is complete; so the Radon-Nikodym theorem holds. We shall also restrict to the case when $\mu$ is assumed non-atomic. The case when $\mu$ is atomic is different, and is addressed in Section 7 below.

Definition 4.1.

Set

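$$\mathscr{F}_{fin}=\left\{A\in\mathscr{F}_{M}\mid0<\mu\left(A\right)<\infty\right\}.$$

Note then

$$K^{\left(\mu\right)}\left(A,B\right)=\mu\left(A\cap B\right),\quad A,B\in\mathscr{F}_{fin}\tag{4.1}$$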

is positive definite. The corresponding Gaussian process $(W_{A}^{\left(\mu\right)})_{A\in\mathscr{F}_{fin}}$ is called the Wiener process [Hid71, Hid92]. In particular, we have

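$$\mathbb{E}\left(W_{A}^{\left(\mu\right)}W_{B}^{\left(\mu\right)}\right)=\mu\left(A\cap B\right),\tag{4.2}$$

and

$$\lim_{\left(A_{i}\right)}\sum_{i}\left(W_{A_{i}}^{\left(\mu\right)}\right)^{2}=\mu\left(A\right).\tag{4.3}$$

The precise limit in (4.3), quadratic variation, is as follows: Given $\mu$ as above, and $A\in\mathscr{F}_{fin}$, we then take the limit over the filter of all partitions of $A$ (see (4.4)) relative to the standard notion of refinement:

$$A=\cup_{i}A_{i},\quad A_{i}\cap A_{j}=\emptyset\text{ if }i\neq j,\text{ and }\lim\mu\left(A_{i}\right)=0.\tag{4.4}$$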

Details: Let $\left(\Omega,Cyl,\mathbb{P}\right)$ , $\mathbb{P}=\mathbb{P}^{\left(\mu\right)}$ be the probability space which realizes $W^{\left(\mu\right)}$ as a Gaussian process (or generalized Wiener process), i.e., s.t. (4.2) holds for all pairs in $\mathscr{F}_{fin}$ . In particular, we have that $W_{A}^{\left(\mu\right)}\underset{\left(\text{dist}\right)}{\sim}N\left(0,\mu\left(A\right)\right)$ , i.e., mean zero, Gaussian, and variance = $\mu\left(A\right)$ . Then:

Lemma 4.2 (see e.g., [AJL17]).

With the assumptions as above, we have

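$$\lim_{\left(A_{i}\right)}\mathbb{E}\left(\Big|\mu\left(A\right)\mathbbm{1}-\sum\nolimits_{i}(W_{A_{i}}^{\left(\mu\right)})^{2}\Big|^{2}\right)=0\tag{4.5}$$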

where (in (4.5)) the limit is taken over the filter of all partitions $\left(A_{i}\right)$ of $A$ , and $\mathbbm{1}$ denotes the constant function “one” on $\Omega$ .
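The limit in Lemma 4.2 can be observed in a simple special case. The sketch below assumes $A=[0,1]$ with Lebesgue measure and partitions into $n$ cells of equal measure, so that the $W_{A_{i}}^{(\mu)}$ are independent $N(0,\mu(A)/n)$ variables; in that setting a direct variance computation gives $\mathbb{E}\left|\mu(A)-\sum_{i}(W_{A_{i}}^{(\mu)})^{2}\right|^{2}=2\mu(A)^{2}/n$. Function names and sample sizes are illustrative.

```python
import random

def quadratic_variation_sample(n, rng, total=1.0):
    # Sum of squares of n independent N(0, total / n) increments.
    sd = (total / n) ** 0.5
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(n))

def mean_square_error(n, trials, rng, total=1.0):
    # Monte Carlo estimate of E | mu(A) - sum_i W_{A_i}^2 |^2.
    return sum((total - quadratic_variation_sample(n, rng, total)) ** 2
               for _ in range(trials)) / trials

rng = random.Random(2)
mse_coarse = mean_square_error(10, 200, rng)     # exact value 2 / 10
mse_fine = mean_square_error(1000, 200, rng)     # exact value 2 / 1000
```

Refining the partition from 10 to 1000 cells should shrink the mean square error by about two orders of magnitude.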

As a result, we get the following Ito-integral

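$$W^{\left(\mu\right)}\left(f\right):=\int_{M}f\left(s\right)dW_{s}^{\left(\mu\right)},\tag{4.6}$$

defined for all $f\in L^{2}\left(M,\mathscr{F},\mu\right)$, and

$$\mathbb{E}\left(\left|\int_{M}f\left(s\right)dW_{s}^{\left(\mu\right)}\right|^{2}\right)=\int_{M}\left|f\left(s\right)\right|^{2}d\mu\left(s\right).\tag{4.7}$$

We note that the following operator,

$$L^{2}\left(M,\mu\right)\ni f\longmapsto W^{\left(\mu\right)}\left(f\right)\in L^{2}\left(\Omega,\mathbb{P}\right)\tag{4.8}$$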

is isometric.
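For a step function $f=\sum_{i}c_{i}\chi_{A_{i}}$ with disjoint $A_{i}\in\mathscr{F}_{fin}$, the Ito-integral reduces to the finite sum $\sum_{i}c_{i}W_{A_{i}}^{(\mu)}$, and the isometry (4.7) can be checked by simulation. This is a minimal sketch for that special case; the particular values and cell measures are assumptions chosen for illustration.

```python
import random

def ito_integral_step(values, cell_measures, rng):
    """W(f) for f = sum_i c_i chi_{A_i}: disjoint cells, mu(A_i) = cell_measures[i]."""
    # Each W_{A_i} is an independent N(0, mu(A_i)) variable.
    return sum(c * rng.gauss(0.0, m ** 0.5)
               for c, m in zip(values, cell_measures))

def l2_norm_sq(values, cell_measures):
    # The integral of |f|^2 d(mu) for the same step function.
    return sum(c * c * m for c, m in zip(values, cell_measures))

values = [1.0, -2.0, 0.5]
cell_measures = [0.2, 0.3, 0.5]
rng = random.Random(3)
n_samples = 40000
second_moment = sum(ito_integral_step(values, cell_measures, rng) ** 2
                    for _ in range(n_samples)) / n_samples
exact = l2_norm_sq(values, cell_measures)
```

Here `exact` equals 1.525, and `second_moment` should match it up to Monte Carlo error.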

In our subsequent considerations, we shall need the following precise formula (see Lemma 4.3) for the RKHS associated with the p.d. kernel

$$K^{\left(\mu\right)}\left(A,B\right):=\mu\left(A\cap B\right),\tag{4.9}$$

defined on $\mathscr{F}_{fin}\times\mathscr{F}_{fin}$ . We denote the RKHS by $\mathscr{H}(K^{\left(\mu\right)})$ .

Lemma 4.3.

Let $\mu$ be as above, and let $K^{\left(\mu\right)}$ be the p.d. kernel on $\mathscr{F}_{fin}$ defined in (4.9). Then the corresponding RKHS $\mathscr{H}(K^{\left(\mu\right)})$ is as follows: A function $\Phi$ on $\mathscr{F}_{fin}$ is in $\mathscr{H}(K^{\left(\mu\right)})$ if and only if there is a $\varphi\in L^{2}\left(M,\mathscr{F}_{M},\mu\right)\left(=:L^{2}\left(\mu\right)\right)$ such that

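$$\Phi\left(A\right)=\int_{A}\varphi\,d\mu,\tag{4.10}$$

for all $A\in\mathscr{F}_{fin}$. Then

$$\left\|\Phi\right\|_{\mathscr{H}(K^{\left(\mu\right)})}=\left\|\varphi\right\|_{L^{2}\left(\mu\right)}.\tag{4.11}$$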

Proof.

To show that $\Phi$ in (4.10) is in $\mathscr{H}(K^{\left(\mu\right)})$, we must choose a finite constant $C_{\Phi}$ such that, for every finite family $\left(A_{i}\right)_{i=1}^{N}$, $A_{i}\in\mathscr{F}_{fin}$, and $\left\{\xi_{i}\right\}_{i=1}^{N}$, $\xi_{i}\in\mathbb{R}$, we get the following a priori estimate:

$$\left|\sum_{i=1}^{N}\xi_{i}\Phi\left(A_{i}\right)\right|^{2}\leq C_{\Phi}\sum_{i}\sum_{j}\xi_{i}\xi_{j}K^{\left(\mu\right)}\left(A_{i},A_{j}\right).\tag{4.12}$$

But a direct application of Schwarz to $L^{2}\left(\mu\right)$ shows that (4.12) holds, and for a finite $C_{\Phi}$, we may take $C_{\Phi}=\left\|\varphi\right\|_{L^{2}\left(\mu\right)}^{2}$, where $\varphi$ is the $L^{2}\left(\mu\right)$-function in (4.10). The desired conclusion now follows from an application of Lemma 2.1.

We have proved one implication from the statement of the lemma: Functions $\Phi$ on $\mathscr{F}_{fin}$ of the form (4.10) are in the RKHS $\mathscr{H}\left(K^{\left(\mu\right)}\right)$, and the norm $\left\|\cdot\right\|_{\mathscr{H}\left(K^{\left(\mu\right)}\right)}$ is as stated in (4.11). Below, we shall denote these elements in $\mathscr{H}\left(K^{\left(\mu\right)}\right)$ as pairs $\left(\Phi,\varphi\right)$. We shall also restrict attention to the case of real valued functions.

For the converse implication, let $H$ be a function on $\mathscr{F}_{fin}$ , and assume $H\in\mathscr{H}\left(K^{\left(\mu\right)}\right)$ . Then by Schwarz applied to $\left\langle\cdot,\cdot\right\rangle_{\mathscr{H}\left(K^{\left(\mu\right)}\right)}$ we get

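$$\left|\left\langle H,\Phi\right\rangle_{\mathscr{H}\left(K^{\left(\mu\right)}\right)}\right|\leq\left\|H\right\|_{\mathscr{H}\left(K^{\left(\mu\right)}\right)}\left\|\varphi\right\|_{L^{2}\left(\mu\right)},\tag{4.13}$$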

where we used (4.11). Hence when Schwarz is applied to $L^{2}\left(\mu\right)$ , we get a unique $h\in L^{2}\left(\mu\right)$ such that

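$$\left\langle H,\Phi\right\rangle_{\mathscr{H}\left(K^{\left(\mu\right)}\right)}=\int_{M}h\varphi\,d\mu\tag{4.14}$$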

for all $\left(\Phi,\varphi\right)$ as in (4.10). Now specialize to $\varphi=\chi_{A}$ , $A\in\mathscr{F}_{fin}$ , in (4.14) and we conclude that

$$H\left(A\right)=\int_{A}h\,d\mu;\tag{4.15}$$

which translates into the assertion that the pair $\left(H,h\right)$ has the desired form (4.10). And hence by (4.11) we have $\left\|H\right\|_{\mathscr{H}\left(K^{\left(\mu\right)}\right)}=\left\|h\right\|_{L^{2}\left(\mu\right)}$ as stated. This concludes the proof of the converse inclusion. ∎
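The estimate used in the first half of the proof can be tested directly when $M=[0,1]$ carries Lebesgue measure and the sets $A_{i}$ are intervals. The sketch below assumes the illustrative choice $\varphi(s)=s$, for which $\Phi([a,b])=(b^{2}-a^{2})/2$ and $\left\|\varphi\right\|_{L^{2}}^{2}=1/3$; these choices and the helper names are not from the paper.

```python
import random

def overlap(a, b):
    """K(A, B): Lebesgue measure of the intersection of intervals a and b."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def Phi(a):
    """Phi(A), the integral of phi over A, for the assumed phi(s) = s."""
    return (a[1] ** 2 - a[0] ** 2) / 2.0

PHI_NORM_SQ = 1.0 / 3.0   # ||phi||^2 in L^2([0, 1]) for phi(s) = s

def sides(intervals, xi):
    # Left and right sides of the a priori estimate with C_Phi = ||phi||^2.
    lhs = sum(c * Phi(a) for c, a in zip(xi, intervals)) ** 2
    rhs = PHI_NORM_SQ * sum(c * d * overlap(a, b)
                            for c, a in zip(xi, intervals)
                            for d, b in zip(xi, intervals))
    return lhs, rhs

def random_interval(rng):
    u, v = sorted((rng.random(), rng.random()))
    return (u, v)

rng = random.Random(6)
gaps = []
for _ in range(200):
    intervals = [random_interval(rng) for _ in range(4)]
    xi = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    lhs, rhs = sides(intervals, xi)
    gaps.append(rhs - lhs)
worst_gap = min(gaps)
```

Since the left side is the square of $\int\varphi\cdot\left(\sum_{i}\xi_{i}\chi_{A_{i}}\right)d\mu$, the Schwarz inequality in $L^{2}(\mu)$ keeps `worst_gap` nonnegative up to rounding.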

5. Factorizations and stochastic integrals

In Sections 2 and 3, we introduced the related notions of positive definite (p.d.) functions (kernels) on the one hand, and Gaussian processes on the other. One notes the immediate fact that a covariance kernel for a general stochastic process is positive definite. In general, the stochastic process in question is not determined by its covariance kernel. But in the special case when the process is Gaussian, it is.

In Theorem 3.1, we stated that every p.d. kernel $K$ is the covariance kernel of a Gaussian process. The construction is natural; starting with the p.d. kernel $K$, there is a canonical inductive limit construction leading to the Gaussian process for this problem. The basic idea for this particular construction of Gaussian processes dates back to pioneering work by Kolmogorov [Kol83, Hid80].

In the present section, we formulate two different factorization questions, one for general p.d. kernels $K$, and the other for Gaussian processes, say $V$. For details, see the respective definitions in (5.2) and (5.3) below. If $K$ is the covariance kernel for a Gaussian process $V$, it is natural to try to relate these two seemingly different notions of factorization. (In the case of Gaussian processes, a better name is perhaps “subordination” (see (5.10) below), but our theorem justifies the use of factorization in both of these contexts.) Our main result, Theorem 5.1, states that the data determining factorization for $K$ are exactly the same as the data which yield factorization for $V$.

Let $K$ be a positive definite kernel $X\times X\xrightarrow{\;K\;}\mathbb{C}$ ; and let $V=V_{K}$ be the corresponding Gaussian (mean zero) process, indexed by $X$ , i.e., $V_{x}\in L^{2}\left(\Omega,\mathbb{P}\right)$ , $\forall x\in X$ , and

$$\mathbb{E}\left(\overline{V}_{x}V_{y}\right)=K\left(x,y\right),\quad\forall x,y\in X.\tag{5.1}$$

We set

[TABLE]

Further, if $V$ is the Gaussian process (from (5.1)), we set

[TABLE]

Following parallel terminology from measure theory, we say that a Gaussian process $V$ admits a disintegration, via suitable Ito-integrals, when there is a measure space with measure $\mu$ such that the corresponding Wiener process $W^{\left(\mu\right)}$ satisfies (5.3). Our theorem below (Theorem 5.1) shows that this disintegration question may be decided instead by the answer to an equivalent spectral decomposition question; the latter of course formulated for the covariance kernel for $V$. As is shown in the examples/applications below, given a Gaussian process, it is not at all clear what disintegrations hold; see for example 6.7.

Theorem 5.1.

Let $K:X\times X\rightarrow\mathbb{C}$ be given positive definite, and let $\left\{V_{x}\right\}_{x\in X}$ be the corresponding Gaussian (mean zero) process, then

$$\mathscr{F}\left(K\right)=\mathscr{M}\left(V\right).\tag{5.4}$$

Proof.

We shall need the following lemma.

Lemma 5.2.

From the definition of $\mathscr{F}\left(K\right)$ , with $K$ fixed and assumed p.d., we get to every $\left(\left(k_{x}\right)_{x\in X},\mu\right)\in\mathscr{F}\left(K\right)$ a natural isometry $T_{\mu}:\mathscr{H}\left(K\right)\longrightarrow L^{2}\left(M,\mu\right)$ . It is denoted by

[TABLE]

and the adjoint operator $T_{\mu}^{*}:L^{2}\left(M,\mu\right)\longrightarrow\mathscr{H}\left(K\right)$ is as follows: For all $f\in L^{2}\left(M,\mu\right)$ we have

[TABLE]

Moreover, we also have

[TABLE]

Proof.

Since $\left(k_{x},\mu\right)\in\mathscr{F}\left(K\right)$ , we have the factorization property (5.2), and so it follows from (5.5) that this extends by linearity and norm-completion to an isometry $\mathscr{H}\left(K\right)\xrightarrow{\;T_{\mu}\;}L^{2}\left(\mu\right)$ as stated.

By the definition of the adjoint operator $L^{2}\left(\mu\right)\xrightarrow{\;T_{\mu}^{*}\;}\mathscr{H}\left(K\right)$ , we have for $f\in L^{2}\left(\mu\right)$ :

[TABLE]

which is the assertion in the lemma.

From the properties of $\mathscr{H}\left(K\right)$ (see Section 2), it follows that (5.7) holds iff

[TABLE]

for all $y\in X$ . But we may compute both sides in eq. (5.8) as follows:

[TABLE]

∎

Proof of Theorem 5.1 continued.

The proof is divided into two parts, one for each of the inclusions $\subseteq$ and $\supseteq$ in (5.4).

Part 1 “ $\subseteq$ ”. Assume a pair $\left(\left(k_{x}\right)_{x\in X},\mu\right)$ is in $\mathscr{F}\left(K\right)$ ; see (5.2). Then by definition, the factorization (5.3) holds on $X\times X$ . Now let $W^{\left(\mu\right)}$ denote the Wiener process associated with $\mu$ , i.e., $W^{\left(\mu\right)}$ is a Gaussian process indexed by $\mathscr{F}_{fin}$ , and

$$\mathbb{E}\left(W_{A}^{\left(\mu\right)}W_{B}^{\left(\mu\right)}\right)=\mu\left(A\cap B\right)$$

for all $A,B\in\mathscr{F}_{fin}$ ; see (4.1) above. Now form the Ito-integral

$$V_{x}:=\int_{M}k_{x}\left(s\right)dW_{s}^{\left(\mu\right)},\quad x\in X.\tag{5.10}$$

We stress that then $V_{x}$, as defined by (5.10), is a Gaussian process indexed by $X$. To see this, use the general theory of Ito-integration, see also [JS18b, JT17a, JT17b, JT16c, JT16b, Hid71, Hid80]. The approximation in (5.10) is over the filter of all partitions

[TABLE]

see (4.4). From the property of $W_{A_{i}}^{\left(\mu\right)}$ , $i\in\mathbb{N}$ , we conclude that, for all $s_{i}\in A_{i}$ , we have that

[TABLE]

is Gaussian (mean zero) with

[TABLE]

where we used (5.11). Passing to the limit over the filter of all partitions of $M$ (as in (5.11)), we then get

[TABLE]

and with definition (5.10), therefore:

[TABLE]

where the last step in the derivation (5.14) uses the assumption that $\left(\left(k_{x}\right)_{x\in X},\mu\right)\in\mathscr{F}\left(K\right)$ ; see (5.2).

Part 2 “ $\supseteq$ ”. Assume now that some pair $\left(\left(k_{x}\right)_{x\in X},\mu\right)$ is in $\mathscr{M}\left(V\right)$ where $K$ is given assumed p.d.; and where $\left(V_{x}\right)_{x\in X}$ is “the” associated (mean zero) Gaussian process; i.e., with $K$ as its covariance kernel; see (5.1).

We claim that $\left(\left(k_{x}\right)_{x\in X},\mu\right)$ must then be in $\mathscr{F}\left(K\right)$ , i.e., that the factorization (5.3) holds. This in turn follows from the following chain of identities:

[TABLE]

valid for all $\left(x,y\right)\in X\times X$, and the conclusion follows. Note that the first step in the derivation of (5.15) uses the Ito-isometry. Hence, initially $K$ may possibly be the covariance kernel for a mean zero Gaussian process, say $\left(V^{\prime}_{x}\right)$, different from $V_{x}:=\int_{M}k_{x}\left(s\right)dW_{s}^{\left(\mu\right)}$. But we proved that the two Gaussian processes $V_{x}$ and $V_{x}^{\prime}$ have the same covariance kernel. It then follows that the two processes must be equivalent. This is by general theory; see e.g. [Jr68, Itô04, AJL17].

This last uniqueness assertion is valid only because we consider Gaussian processes. Other stochastic processes are typically not determined uniquely by their respective covariance kernels. ∎

*Remark 5.3**.*

In the statement of 5.1 there are two isometries: Starting with $\left(\left(k_{x}\right)_{x\in X},\mu\right)\in\mathscr{F}\left(K\right)$ we get the canonical isometry $T_{\mu}:\mathscr{H}\left(K\right)\rightarrow L^{2}\left(\mu\right)$ given by

[TABLE]

see (5.5) of 5.2. But with $\mu$ , we then also get the Wiener process $W^{\left(\mu\right)}$ and the Ito-integral

[TABLE]

as an isometry. Here $\left(\Omega,Cyl,\mathbb{P}\right)$ denotes the standard probability space, with $Cyl$ abbreviation for the cylinder sigma-algebra of subsets of $\Omega:=\mathbb{R}^{M}$ . For finite subsets $\left(s_{1},s_{2},\cdots,s_{k}\right)$ in $M$ , and Borel subsets $B_{k}$ in $\mathbb{R}^{k}$ , the corresponding cylinder set

[TABLE]

In summary, we get the following diagram of isometries, corresponding to a fixed $\left(\left(k_{x}\right)_{x\in X},\mu\right)\in\mathscr{F}\left(K\right)$ , where $K$ is a fixed p.d. function on $X\times X$ :

6. Examples and applications

Below we present four examples in order to illustrate the technical points in 5.1. In the first example $X=\left[0,1\right]$ , the unit interval, and in the next two examples $X=\mathbb{D}=\left\{z\in\mathbb{C}\mathrel{;}\left|z\right|<1\right\}$ the open complex disk. In the fourth example, the Drury-Arveson kernel, we have $X=\mathbb{C}^{k}$ .

We begin with a note on identifications: For $t\in\left[0,1\right]$ , we set

[TABLE]

We write $\lambda_{1}$ for the Lebesgue measure restricted to $\left[0,1\right]$ ; and we make the identification:

[TABLE]

Hence, for $L^{2}\left(\left[0,1\right],\lambda_{1}\right)$ we have the familiar Fourier expansion: With

[TABLE]

On $\left[0,1\right]$ , we shall also consider the Cantor measure $\mu_{4}$ with support equal to the Cantor set

[TABLE]

see 6.1 and [JP98, Jor18].

It is known that $\mu_{4}$ is the unique probability measure s.t.

[TABLE]

For the Fourier transform $\widehat{\mu}_{4}$ we have

[TABLE]

In Table 6.1, we summarize the three examples with the data from 5.1. We now turn to the details of the respective examples:

Example 6.1.

If $K\left(x,y\right):=x\wedge y$ is considered a kernel on $\left[0,1\right]\times\left[0,1\right]$ , then the corresponding RKHS $\mathscr{H}\left(K\right)$ is the Hilbert space of functions $f$ on $\left[0,1\right]$ such that the distribution derivative $f^{\prime}=df/dx$ is in $L^{2}\left(\left[0,1\right],\lambda_{1}\right)$ , $\lambda_{1}=dx$ , $f\left(0\right)=0$ , and

[TABLE]

and it is immediate that $\left(k_{x},\lambda_{1}\right)\in\mathscr{F}\left(K\right)$ where $k_{x}\left(s\right):=\chi_{\left[0,x\right]}\left(s\right)$ , the indicator function; see 6.2.

The process $W^{\left(\lambda_{1}\right)}$ is of course the standard Brownian motion on $\left[0,1\right]$ , pinned at $x=0$ ; see 6.3, and compare with the $W^{\left(\mu_{4}\right)}$ -process in 6.4. For Monte Carlo simulation, see e.g. [KBTB14, LCRK18].
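The factorization in this example can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates $\int_{0}^{1}\chi_{\left[0,x\right]}\chi_{\left[0,y\right]}\,d\lambda_{1}$ by a midpoint Riemann sum, and it estimates $\mathbb{E}\left(W_{x}W_{y}\right)$ by Monte Carlo, using independent Gaussian increments. The function names, the sample sizes, and the seed are illustrative assumptions.

```python
import math
import random


def indicator_inner_product(x, y, n=1000):
    """Midpoint Riemann sum of chi_[0,x](s) * chi_[0,y](s) over [0, 1]."""
    count = 0
    for i in range(n):
        s = (i + 0.5) / n
        if s <= x and s <= y:
            count += 1
    return count / n


def simulate_covariance(x, y, n_paths=20000, seed=0):
    """Monte Carlo estimate of E[W_x W_y] for Brownian motion pinned at 0.

    Uses W_lo ~ N(0, lo) and the independent increment W_hi - W_lo ~ N(0, hi - lo).
    """
    rng = random.Random(seed)
    lo, hi = min(x, y), max(x, y)
    acc = 0.0
    for _ in range(n_paths):
        w_lo = rng.gauss(0.0, math.sqrt(lo))
        w_hi = w_lo + rng.gauss(0.0, math.sqrt(hi - lo))
        acc += w_lo * w_hi
    return acc / n_paths


if __name__ == "__main__":
    print(indicator_inner_product(0.3, 0.7))  # close to min(0.3, 0.7)
    print(simulate_covariance(0.3, 0.7))      # close to min(0.3, 0.7), up to sampling error
```

Both quantities should approximate $x\wedge y$ ; the first up to discretization error, the second up to Monte Carlo error.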

The Hilbert space characterized by (6.6) is called the Cameron-Martin space, see e.g., [Hid80]. Moreover, to see that (6.6) is indeed the precise characterization of the RKHS for this kernel, one again applies 2.1.

It follows immediately from 5.1 that the Gaussian processes corresponding to the data in Table 6.1 are as follows:

Example 6.2.

$z\in\mathbb{D}$ :

[TABLE]

realized as an Ito-integral.

As an application of 5.1, we get:

[TABLE]

Example 6.3.

$z\in\mathbb{D}$ :

[TABLE]

where the $W^{\left(\mu_{4}\right)}$ -Ito integral is supported on the Cantor set $C_{4}\subset\left[0,1\right]$ , see 6.1.

As an application of 5.1, we get:

[TABLE]

The reasoning of 6.3 is based on a theorem of the paper [JP98] (see also [Jor18]). Set

[TABLE]

then the Fourier functions $\left\{e\left(\lambda t\right)\mathrel{;}\lambda\in\Lambda_{4}\right\}$ form an orthonormal basis in $L^{2}\left(C_{4},\mu_{4}\right)$ , i.e., every $f\in L^{2}\left(C_{4},\mu_{4}\right)$ has its Fourier expansion

[TABLE]

and

[TABLE]

Lemma 6.4.

Consider the set $\Lambda_{4}$ in (6.9), and, for $s\in\mathbb{D}$ , let

[TABLE]

be the corresponding generating function. Then we have the following infinite-product representation

[TABLE]

Proof.

From (6.9) we have the following self-similarity for $\Lambda_{4}$ : It is the following identity of sets

[TABLE]

Note that (6.12) is an algorithm for generating points in $\Lambda_{4}$ . Hence,

[TABLE]

and by induction.

Hence, if $s\in\mathbb{D}$ , the infinite-product is absolutely convergent, and the desired product formula (6.11) follows. ∎
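The self-similarity argument can be illustrated by a small computation. The sketch below is an illustration only, and it rests on an assumption: since (6.9) is not reproduced here, we take $\Lambda_{4}$ to be the set of finite sums $\sum_{i}b_{i}4^{i}$ with $b_{i}\in\left\{0,1\right\}$ , as in [JP98], so that $\Lambda_{4}=\left\{0,1\right\}+4\Lambda_{4}$ . Under that assumption, the truncated generating function and the truncated product $\prod_{n<D}\left(1+s^{4^{n}}\right)$ agree for every depth $D$ . All function names are hypothetical.

```python
def lambda4(depth):
    """Points sum_{i < depth} b_i * 4**i, b_i in {0, 1}, via Lambda = {0, 1} + 4 * Lambda."""
    points = [0]
    for _ in range(depth):
        points = sorted({b + 4 * lam for lam in points for b in (0, 1)})
    return points


def partial_sum(s, depth):
    """Truncated generating function: sum of s**lam over the first 2**depth points."""
    return sum(s ** lam for lam in lambda4(depth))


def partial_product(s, depth):
    """Truncated product: prod_{n < depth} (1 + s**(4**n))."""
    prod = 1.0
    for n in range(depth):
        prod *= 1 + s ** (4 ** n)
    return prod


if __name__ == "__main__":
    print(lambda4(3))
    print(partial_sum(0.5, 4), partial_product(0.5, 4))
```

The same functions accept complex arguments $s$ with $\left|s\right|<1$ , since only multiplication and integer powers are used.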

*Remark 6.5**.*

Note that, in combination with the theorem from [JP98] (see also [Jor18]), this property of the generating function $F=F_{\Lambda_{4}}$ from 6.4 is used in the derivation of the assertions made about the factorization properties in 6.3; this includes the two formulas (Ex 3) as stated in 6.1; as well as of the verification that $\left(k_{z},\mu_{4}\right)\in\mathscr{F}\left(K\right)$ , where $k_{z}$ , $\mu_{4}$ , and $K$ are as stated.

A direct computation of the two cases, 6.1 and 6.3, is of interest. Our result, 4.3, is useful in the construction: When computing the two Wiener processes $W^{\left(\lambda_{1}\right)}$ and $W^{\left(\mu\right)}$ one notes that the covariances computed on intervals $\left[0,x\right]$ , for $0<x<1$ , are as follows:

[TABLE]

So the two functions have the representations as in 6.5.

Example 6.6.

The following example illustrates the need for a distinction between $X$ , and families of choices $M$ in 5.1. A priori, one might expect that if $X\times X\xrightarrow{\;K\;}\mathbb{C}$ is given and p.d., it would be natural to try to equip $X$ with a $\sigma$ -algebra $\mathscr{F}_{X}$ of subsets, and a measure $\mu$ such that the condition in (5.2) holds for $\left(X,\mathscr{F}_{X},\mu\right)$ , i.e.,

[TABLE]

with $\left\{k_{x}\right\}_{x\in X}$ a system in $L^{2}\left(X,\mathscr{F}_{X},\mu\right)$ . It turns out that there are interesting examples where this is known to *not* be feasible. The best known such example is perhaps the Drury-Arveson kernel; see [Arv98] and [ARS08, ARS10].

Specifics. Consider $\mathbb{C}^{k}$ for $k\geq 2$ , and $B_{k}\subset\mathbb{C}^{k}$ the complex ball defined for $z=\left(z_{1},\cdots,z_{k}\right)\in\mathbb{C}^{k}$ ,

[TABLE]

For $z,w\in\mathbb{C}^{k}$ , set

[TABLE]

Corollary 6.7 (Arveson [Arv98, Coroll 2]).

Let $k\geq 2$ , and let $\mathscr{H}\left(K_{DA}\right)$ be the RKHS of the D-A kernel in (6.17). Then there is no Borel measure on $\mathbb{C}^{k}$ such that $\left(\mathbb{C}^{k},\mathscr{B}_{k},\mu\right)\in\mathscr{F}\left(K_{DA}\right)$ ; i.e., there is no solution to the formula

[TABLE]

for all $f\left(z\right)$ $k$ -polynomials.

*Remark 6.8**.*

It is natural to ask about disintegration properties for the Gaussian process $V_{DA}$ corresponding to the Drury-Arveson kernel (6.17). Combining our 5.1 above with the corollary (Coroll 6.7), we conclude that, in two or more complex dimensions $k$ , the question of finding the admissible disintegrations of this Gaussian process $V_{DA}$ is subtle. It must necessarily involve measure spaces going beyond $\mathbb{C}^{k}$ .

7. The case of $\left(k_{x},\mu\right)\in\mathscr{F}\left(K\right)$ when $\mu$ is atomic

Below we present a case where $\mu$ from pairs in $\mathscr{F}\left(K\right)$ may be chosen to be atomic. The construction is general, but for the sake of simplicity we shall assume that a given p.d. $K$ is such that the RKHS $\mathscr{H}\left(K\right)$ is separable, i.e., it has an orthonormal basis (ONB) indexed by $\mathbb{N}$ (and then every ONB is countable).

Definition 7.1.

Let $\mathscr{H}$ be a Hilbert space (separable), and let $\left\{g_{n}\right\}_{n\in\mathbb{N}}$ be a system of vectors in $\mathscr{H}$ such that

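$$\sum_{n\in\mathbb{N}}\left|\left\langle \psi,g_{n}\right\rangle_{\mathscr{H}}\right|^{2}=\left\|\psi\right\|_{\mathscr{H}}^{2}\tag{7.1}$$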

holds for all $\psi\in\mathscr{H}$ . We then say that $\left\{g_{n}\right\}_{n\in\mathbb{N}}$ is a Parseval frame for $\mathscr{H}$ . (Also see 10.1.)

An equivalent assumption is that the mapping

[TABLE]

is isometric. One checks that then the adjoint $T^{*}:l^{2}\rightarrow\mathscr{H}$ is:

[TABLE]
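A standard finite-dimensional instance may help fix ideas. The sketch below is an illustration, not taken from the paper: it uses three vectors in $\mathbb{R}^{2}$ at mutual angles of 120 degrees, scaled by $\sqrt{2/3}$ , which form a Parseval frame that is not an ONB. The names `FRAME`, `analysis`, and `synthesis` are hypothetical; `analysis` plays the role of $T$ and `synthesis` the role of $T^{*}$ , assuming $T^{*}\left(\xi\right)=\sum_{n}\xi_{n}g_{n}$ .

```python
import math

# Three unit vectors at 120-degree angles, scaled by sqrt(2/3): a Parseval frame for R^2.
FRAME = [
    (math.sqrt(2 / 3) * math.cos(2 * math.pi * k / 3),
     math.sqrt(2 / 3) * math.sin(2 * math.pi * k / 3))
    for k in range(3)
]


def analysis(psi):
    """T: psi -> (<psi, g_n>)_n, the sequence of frame coefficients."""
    return [psi[0] * g[0] + psi[1] * g[1] for g in FRAME]


def synthesis(coeffs):
    """T^*: (xi_n)_n -> sum_n xi_n g_n."""
    return (
        sum(c * g[0] for c, g in zip(coeffs, FRAME)),
        sum(c * g[1] for c, g in zip(coeffs, FRAME)),
    )


if __name__ == "__main__":
    psi = (3.0, 4.0)
    coeffs = analysis(psi)
    print(sum(c * c for c in coeffs))  # squared norm of psi, up to rounding
    print(synthesis(coeffs))           # recovers psi, up to rounding
```

The isometry property corresponds to $T^{*}T=I_{\mathscr{H}}$ , which is what the round trip through `analysis` and `synthesis` checks.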

For general background references on frames in Hilbert space, we refer to [HKLW07, KLZ09, SD13, KOPT13, HJL+13, Pes13, CM13, FPWW14, JT17b], and also see [KOPT13, Oko16, WO17, BBCO17, JS18a].

Lemma 7.2.

Let $K$ be given p.d. on $X\times X$ , and assume that $\left\{g_{n}\right\}_{n\in\mathbb{N}}$ is a Parseval frame in $\mathscr{H}\left(K\right)$ ; then

[TABLE]

with the sum on the RHS in (7.3) absolutely convergent.

Proof.

By the reproducing property of $\mathscr{H}\left(K\right)$ , see 2, we get, for all $\left(x,y\right)\in X\times X$ :

[TABLE]

∎

Now a direct application of the argument in the proof of 5.1 yields the following:

Corollary 7.3.

Let $K$ be given p.d. on $X\times X$ such that $\mathscr{H}\left(K\right)$ is separable, and let $\left\{g_{n}\right\}_{n\in\mathbb{N}}$ be a Parseval frame, for example an ONB in $\mathscr{H}\left(K\right)$ . Let $\left\{\zeta_{n}\right\}_{n\in\mathbb{N}}$ be a chosen system of i.i.d. (independent identically distributed) standard Gaussians, i.e., with $N\left(0,1\right)$ -distribution, with density $\frac{1}{\sqrt{2\pi}}e^{-s^{2}/2}$ , $s\in\mathbb{R}$ . Then the following sum defines a Gaussian process,

[TABLE]

i.e., $\left\{V_{x}\right\}_{x\in X}$ is well-defined in $L^{2}\left(\Omega,Cyl,\mathbb{P}\right)$ , as stated, where $\Omega=\mathbb{R}^{\mathbb{N}}$ as a realization in an infinite Cartesian product with the usual cylinder $\sigma$ -algebra, and $\left\{V_{x}\right\}_{x\in X}$ has $K$ as covariance kernel, i.e.,

[TABLE]

see (5.15).

Proof.

This is a direct application of 7.2, and we leave the remaining verifications to the reader. ∎
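For a concrete instance of the corollary, consider the kernel $K\left(x,y\right)=\cos\left(x-y\right)=\cos x\cos y+\sin x\sin y$ on $\mathbb{R}\times\mathbb{R}$ ; this example is our own illustration and does not appear in the text above. Here the expansion of 7.2 holds with the two functions $g_{1}=\cos$ and $g_{2}=\sin$ , and the process is $V_{x}=\zeta_{1}\cos x+\zeta_{2}\sin x$ . The sketch below estimates its covariance by Monte Carlo; the names, the sample size, and the seed are assumptions.

```python
import math
import random


def sample_process(points, rng):
    """One realization of V_x = zeta_1 * cos(x) + zeta_2 * sin(x) at the given points."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return [z1 * math.cos(x) + z2 * math.sin(x) for x in points]


def empirical_covariance(x, y, n_samples=20000, seed=1):
    """Monte Carlo estimate of E[V_x V_y]; the exact value is cos(x - y)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        vx, vy = sample_process([x, y], rng)
        acc += vx * vy
    return acc / n_samples


if __name__ == "__main__":
    print(empirical_covariance(0.2, 1.0), math.cos(0.2 - 1.0))
```

With infinitely many frame vectors one would truncate the sum; the two-term case avoids that issue.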

8. Point processes: The case when $\left\{\delta_{x}\right\}\subset\mathscr{H}\left(K\right)$

Let $X\times X\xrightarrow{\;K\;}\mathbb{R}$ be a fixed positive definite kernel. We know that the RKHS $\mathscr{H}\left(K\right)$ consists of functions $h$ on $X$ subject to the a priori estimate in 2.1. For recent work on point-processes over infinite networks [JP19, JP14, JT18, JT16b, JT15b, GD18, QLS18, NP19, CH18], the case when the Dirac measures $\delta_{x}$ are in $\mathscr{H}\left(K\right)$ is of special significance. In this case there is an abstract Laplace operator $\Delta$ , defined as follows:

[TABLE]

For the $\left\|\cdot\right\|_{\mathscr{H}\left(K\right)}$ -norm of $\delta_{x}$ , we have

[TABLE]

immediate from (8.1).

For every finite subset $F\subset X$ , we consider the induced $\left|F\right|\times\left|F\right|$ matrix

[TABLE]

Note that $K_{F}$ is a positive definite square matrix. Its spectrum consists of eigenvalues $\lambda_{s}\left(F\right)$ .

If $\left(K,X\right)$ is as described, i.e., $X\times X\xrightarrow{\;K\;}\mathbb{R}\left(\text{or }\mathbb{C}\right)$ p.d., and if

[TABLE]

we shall see that $X$ must then be discrete. (In interesting cases, also countable.) If (8.4) holds, we shall say that $\left(K,X\right)$ is a point process. We shall further show that point processes arise by restriction as follows:

Let $\left(K,X\right)$ be given with $K$ a p.d. kernel. If a countable subset $S\subset X$ is such that $K^{\left(S\right)}:=K\big{|}_{S\times S}$ has

[TABLE]

then we shall say that $\left(K^{\left(S\right)},S\right)$ is an induced point process.

8.1. Nets of finite submatrices, and their limits

Let $(K,X)$ be as above, with $K$ p.d. and defined on $X\times X$ . The finite submatrices in the subsection header are indexed by the net of all finite subsets $F$ of $X$ as follows: Given $F$ , the corresponding $\left|F\right|\times\left|F\right|$ square matrix $K_{F}$ is simply the restriction of $K$ to $F\times F$ . Of course, each matrix $K_{F}$ is positive definite, and so it has a finite list of eigenvalues. These eigenvalue lists figure in the discussion below.

Lemma 8.1.

Let $K$ , $F$ , and $K_{F}$ be as above, with $\lambda_{s}\left(F\right)$ denoting the numbers in the list of eigenvalues for the matrix $K_{F}$ . Then

[TABLE]

Proof.

Consider the eigenvalue equation

[TABLE]

From 2.1 and for $x\in F$ , we then get

[TABLE]

Now apply $\sum_{x\in F}$ to both sides in (8.8), and the desired conclusion (8.6) follows. ∎

*Remark 8.2**.*

A consequence of the lemma is that the matrices $K_{F}^{-1}$ and $K_{F}^{-1/2}$ automatically are well defined (by the spectral theorem) with associated spectral bounds.

Definition 8.3.

Let $K$ , $F$ , and $K_{F}$ be as above; and with the condition $\delta_{x}\in\mathscr{H}\left(K\right)$ in force. Set

[TABLE]

It is a finite-dimensional (and therefore closed) subspace in $\mathscr{H}\left(K\right)$ . The orthogonal projection onto $\mathscr{H}_{K}\left(F\right)$ will be denoted $P_{F}:\mathscr{H}\left(K\right)\rightarrow\mathscr{H}_{K}\left(F\right)$ .

Lemma 8.4.

Let $K$ , $F$ , $K_{F}$ , and $\mathscr{H}_{K}\left(F\right)$ be as above. Then the orthogonal projection $P_{F}$ is as follows: For $h\in\mathscr{H}\left(K\right)$ , set $h_{F}=h\big{|}_{F}$ , restriction:

[TABLE]

Proof.

It is immediate from the definition that $P_{F}h$ has the form

[TABLE]

with $\left(\xi_{y}\right)_{y\in F}\in\mathbb{C}^{\left|F\right|}$ . Since $P_{F}$ is the orthogonal projection,

[TABLE]

(orthogonality in the $\mathscr{H}\left(K\right)$ -inner product) which yields:

[TABLE]

and therefore, $\xi=K_{F}^{-1}h_{F}$ , which is the desired formula (8.10). ∎
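The formula $\xi=K_{F}^{-1}h_{F}$ is easy to test in a small case. The sketch below is an illustration under stated assumptions: it uses the kernel $K\left(x,y\right)=x\wedge y$ on $\left[0,1\right]$ (where the Dirac functions are not in the RKHS, but the projection formula for $P_{F}$ only needs $K_{F}$ to be invertible), the function $h\left(x\right)=x^{2}$ , and exact rational arithmetic. It checks that $P_{F}h$ agrees with $h$ on $F$ , and it computes $\left\|P_{F}h\right\|_{\mathscr{H}\left(K\right)}^{2}=\left\langle h_{F},K_{F}^{-1}h_{F}\right\rangle$ . All function names are hypothetical.

```python
from fractions import Fraction


def kernel(x, y):
    """Covariance kernel of Brownian motion: K(x, y) = min(x, y)."""
    return min(x, y)


def solve(matrix, rhs):
    """Solve matrix * xi = rhs by Gauss-Jordan elimination (exact for Fractions)."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def project(h, nodes):
    """Return the function P_F h and its squared RKHS norm, for F = nodes."""
    k_f = [[kernel(x, y) for y in nodes] for x in nodes]
    h_f = [h(x) for x in nodes]
    xi = solve(k_f, h_f)  # xi = K_F^{-1} h_F

    def p_f_h(x):
        return sum(c * kernel(x, y) for c, y in zip(xi, nodes))

    energy = sum(c * v for c, v in zip(xi, h_f))  # <h_F, K_F^{-1} h_F>
    return p_f_h, energy


if __name__ == "__main__":
    nodes = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
    p, energy = project(lambda x: x * x, nodes)
    print([p(x) for x in nodes], energy)
```

For this choice, $\left\|h\right\|_{\mathscr{H}\left(K\right)}^{2}=\int_{0}^{1}\left(2x\right)^{2}dx=4/3$ , and the computed value stays below it, consistent with $P_{F}$ being a projection.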

Corollary 8.5.

Let $X$ , $K$ , $\mathscr{H}\left(K\right)$ be as above, and assume $\delta_{x}\in\mathscr{H}\left(K\right)$ for some $x\in X$ . Then a function $h$ on $X$ is in $\mathscr{H}\left(K\right)$ if and only if

[TABLE]

where the supremum is over all finite subsets $F$ of $X$ . If $h$ is finite energy, then

[TABLE]

Proof.

The proof follows from an application of Hilbert space geometry to the RKHS $\mathscr{H}\left(K\right)$ , on the family of orthogonal projections $P_{F}$ indexed by the finite subsets $F$ in $X$ . With the standard lattice operations, applied to projections, we have $\sup_{F}P_{F}=I_{\mathscr{H}\left(K\right)}$ . The conclusions (8.13)-(8.14) follow from this since, by the lemma,

[TABLE]

∎

*Remark 8.6**.*

The advantage with the use of this system of orthogonal projections $P_{F}$ , indexed by the finite subsets $F$ of $X$ , is that we may then take advantage of the known lattice operations for orthogonal projections in Hilbert space. But it is important that we get approximation with respect to the canonical norm in the RKHS $\mathscr{H}\left(K\right)$ . This works because, by our construction, the orthogonality properties for the projections $P_{F}$ refer precisely to the inner product in $\mathscr{H}\left(K\right)$ . Naturally we get the best $\mathscr{H}\left(K\right)$ -approximation properties when $X$ is further assumed countable. But the formula for the $\mathscr{H}\left(K\right)$ -norm holds in general.

Corollary 8.7.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be fixed, assumed p.d., and let $\mathscr{H}\left(K\right)$ be the corresponding RKHS. Let $x\in X$ be given. Then $\delta_{x}\in\mathscr{H}\left(K\right)$ if and only if

[TABLE]

In this case, we have:

[TABLE]

Proof.

The result is immediate from 8.5 applied to $h:=\delta_{x}$ , where $x$ is fixed. Here the terms in (8.14) are, for $F$ finite, $x\in F$ :

[TABLE]

and the stated conclusion is now immediate. ∎
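The criterion can be explored numerically. Since the displayed formulas are not reproduced here, the sketch below assumes that the relevant quantity is the diagonal entry $\left(K_{F}^{-1}\right)_{x,x}$ , i.e., $\left\|K_{F}^{-1/2}\delta_{x}\big{|}_{F}\right\|^{2}$ for $x\in F$ . For $K\left(x,y\right)=x\wedge y$ it compares two nets of finite sets: initial segments of the positive integers, where the entry stays bounded, and refining grids in $\left(0,1\right]$ , where it grows without bound. The function names are hypothetical.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * xi = rhs by Gauss-Jordan elimination (exact for Fractions)."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def inverse_diagonal_entry(nodes, x):
    """Entry (K_F^{-1})_{x,x} for K(a, b) = min(a, b) and F = nodes."""
    k_f = [[min(a, b) for b in nodes] for a in nodes]
    e_x = [Fraction(int(a == x)) for a in nodes]
    xi = solve(k_f, e_x)  # column of K_F^{-1} indexed by x
    return xi[nodes.index(x)]


def integer_grid(n):
    """F = {1, 2, ..., n}."""
    return [Fraction(j) for j in range(1, n + 1)]


def refined_grid(m):
    """F = {1/m, 2/m, ..., 1}."""
    return [Fraction(j, m) for j in range(1, m + 1)]


if __name__ == "__main__":
    print([inverse_diagonal_entry(integer_grid(n), Fraction(2)) for n in (3, 5, 8)])
    print([inverse_diagonal_entry(refined_grid(m), Fraction(1, 2)) for m in (2, 4, 8)])
```

In this sketch the first list is constant while the second doubles with each refinement, which is in line with the contrast between the discrete and the continuous case discussed later.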

Corollary 8.8.

Let $X$ , $K$ , and $\mathscr{H}\left(K\right)$ be as above, but assume now that $X$ is countable, with a monotone net of finite sets:

[TABLE]

then a function $h$ on $X$ is in $\mathscr{H}\left(K\right)$ iff $\sup_{i}\left\|K_{F_{i}}^{-1/2}h\big{|}_{F_{i}}\right\|_{l^{2}\left(F_{i}\right)}<\infty$ .

Moreover,

[TABLE]

where, the convergence in (8.19) is monotone.

Proof.

From the definition of the order of orthogonal projections, we have

[TABLE]

and therefore,

[TABLE]

with $\lim_{i\rightarrow\infty}\left\|P_{F_{i}}h\right\|_{\mathscr{H}\left(K\right)}^{2}=\left\|h\right\|_{\mathscr{H}\left(K\right)}^{2}$ . But by (8.15) and the proof of 8.5, we have

[TABLE]

and, so, by (8.21), we get:

[TABLE]

The conclusion now follows. ∎

8.2. Restrictions of p.d. kernels

Below we shall be considering pairs $(K,X)$ with $K$ a fixed p.d. kernel defined on $X\times X$ , and, as before, we denote by $\mathscr{H}\left(K\right)$ the corresponding RKHS with its canonical inner product. In general, $X$ is an arbitrary set, typically of large cardinality, in particular uncountable: It may be a complex domain, a generalized boundary, or it may be a manifold arising from problems in physics, in signal processing, or in machine learning models. Moreover, for such general pairs $(K,X)$ , with $K$ a fixed p.d. kernel, the Dirac functions $\delta_{x}$ are typically not in $\mathscr{H}\left(K\right)$ .

Here we shall turn to induced systems, indexed by suitable countable discrete subsets $S$ of $X$ . Indeed, for a number of sampling or interpolation problems, it is possible to identify countable discrete subsets $S$ of $X$ , such that when $K$ is restricted to $S\times S$ , i.e., $K^{\left(S\right)}:=K\big{|}_{S\times S}$ , then for $x\in S$ , the Dirac functions $\delta_{x}$ will be in $\mathscr{H}\left(K^{\left(S\right)}\right)$ ; i.e., we get induced point processes indexed by $S$ . In fact, with 8.8, we will be able to identify a variety of such subsets $S$ .

Moreover, each such choice of subset $S$ yields a point process, an induced graph, and a graph Laplacian; see (8.1)-(8.2). These issues will be taken up in detail in the two subsequent sections. In the following 8.9, for illustration, we identify a particular instance of this, when $X=\mathbb{R}$ (the reals), and $S=\mathbb{Z}$ (the integers), and where $K$ is the covariance kernel of standard Brownian motion on $\mathbb{R}$ .

Example 8.9 (**Discretizing the covariance function for Brownian motion on $\mathbb{R}$ ).**

The present example is a variant of 6.1, but with $X=\mathbb{R}$ (instead of the interval $\left[0,1\right]$ ). We now set

[TABLE]

It is immediate that (6.6) in 6.1 carries over, but now with $\mathbb{R}$ in place of $\left[0,1\right]$ . The normalization $f\left(0\right)=0$ is carried over. We get that: A function $f\left(x\right)$ on $\mathbb{R}$ is in $\mathscr{H}\left(K\right)$ iff it has distribution-derivative $f^{\prime}=df/dx$ in $L^{2}\left(\mathbb{R}\right)$ , see (8.23). As before, we conclude that the $\mathscr{H}\left(K\right)$ -norm is:

[TABLE]

*Remark 8.10**.*

The determinant of $K_{F_{N}}^{\left(\mathbb{Z}\right)}$ is 1 for all $N$ . Proof. By eliminating the first column, and then the first row, $\det(K_{F_{N}}^{\left(\mathbb{Z}\right)})$ is reduced to $\det(K_{F_{N-1}}^{\left(\mathbb{Z}\right)})$ . So by induction, the determinant is 1.

Note that

[TABLE]

which yields the factorization

[TABLE]

i.e.,

[TABLE]

where $A_{N}$ is the $N\times N$ lower triangular matrix given by

[TABLE]

In particular, we get that $\det(K_{F_{N}}^{\left(\mathbb{Z}\right)})=1$ immediately. This is a special case of 5.1.
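The factorization and the determinant claim can be verified directly for small $N$ . The sketch below is an illustration and makes two assumptions, since the displayed matrices are not reproduced here: that $F_{N}=\left\{1,2,\cdots,N\right\}$ , so that $K_{F_{N}}^{\left(\mathbb{Z}\right)}$ has entries $i\wedge j$ , and that $A_{N}$ is the lower triangular matrix with all entries on and below the diagonal equal to 1. The function names are hypothetical.

```python
def brownian_matrix(n):
    """The n x n matrix with entries min(i, j), i, j = 1..n."""
    return [[min(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]


def lower_ones(n):
    """Lower triangular n x n matrix with ones on and below the diagonal."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def times_transpose(a):
    """Compute A A^T for a real matrix A given as a list of rows."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in a] for ri in a]


def determinant(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total


if __name__ == "__main__":
    for n in range(1, 6):
        same = times_transpose(lower_ones(n)) == brownian_matrix(n)
        print(n, same, determinant(brownian_matrix(n)))
```

Since $A_{N}$ is triangular with unit diagonal, $\det A_{N}=1$ , and the determinant of the product is then 1 as well.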

For the general case, let $F_{N}=\left\{x_{j}\right\}_{j=1}^{N}$ be a finite subset of $\mathbb{R}$ , assuming $x_{1}<x_{2}<\cdots<x_{N}$ . Then the factorization (8.30) holds with

[TABLE]

Thus,

[TABLE]

In the setting of 5 (finite sums of standard Gaussians), we have the following: Let $\left\{x_{i}\right\}_{i=1}^{N}$ be as in (8.31), and let $1\leq n,m\leq N$ . Let $\left\{Z_{i}\right\}_{i=1}^{N}$ be a system of i.i.d. (independent identically distributed) standard Gaussians $N\left(0,1\right)$ . Set

[TABLE]

Then one checks that

[TABLE]

which is the desired Gaussian realization of $K$ .
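A short computation illustrates this realization. The displayed formulas are not reproduced here, so the sketch below rests on assumptions: the points satisfy $0<x_{1}<\cdots<x_{N}$ , we set $x_{0}:=0$ , and $V_{n}=\sum_{i\leq n}\sqrt{x_{i}-x_{i-1}}\,Z_{i}$ . Because the $Z_{i}$ are independent with unit variance, the covariance of $V_{n}$ and $V_{m}$ is the inner product of their coefficient vectors, which the code evaluates exactly (no sampling). The function names are hypothetical.

```python
import math


def coefficient_matrix(points):
    """Lower triangular A with A[i][j] = sqrt(x_j - x_{j-1}) for j <= i, where x_0 = 0.

    Row i holds the coefficients of V_i in terms of the i.i.d. Gaussians Z_j.
    """
    previous = [0.0] + list(points[:-1])
    increments = [math.sqrt(b - a) for a, b in zip(previous, points)]
    n = len(points)
    return [[increments[j] if j <= i else 0.0 for j in range(n)] for i in range(n)]


def realization_covariance(points):
    """Covariance matrix E[V_n V_m] = (A A^T)[n][m]."""
    a = coefficient_matrix(points)
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in a] for ri in a]


if __name__ == "__main__":
    pts = [0.5, 1.25, 3.0]
    for row in realization_covariance(pts):
        print(row)  # entries agree with min(x_n, x_m), up to rounding
```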

Alternatively, $K_{F_{N}}^{\left(\mathbb{Z}\right)}$ assumes the following factorization via non-square matrices: Assume $F_{N}\subset\mathbb{Z}_{+}$ , then

[TABLE]

where $A$ is the $N\times x_{N}$ matrix such that

[TABLE]

That is, $A$ takes the form:

[TABLE]


*Remark 8.11** (Spectrum of the matrices $K_{F}$ ; see also [HHT13]).*

It is known that the factorization as in (8.30) can be used to obtain the spectrum of positive definite matrices. The algorithm is as follows: Let $K$ be a given p.d. matrix.

Initialization: $B:=K$ ;

Iterations: $k=1,2,\cdots,n-1$ ,

(i)

$B=AA^{*}$ ;

(ii)

$B=A^{*}A$ ;

Here $A$ in step (i) denotes the lower triangular matrix in the Cholesky decomposition of $B$ (see (8.30)), and step (ii) replaces $B$ by $A^{*}A$ . As the number of iterations tends to infinity, $B$ converges to a diagonal matrix consisting of the eigenvalues of $K$ .
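The iteration above can be sketched in a few lines. The code below is a minimal illustration for real symmetric positive definite matrices, not the implementation from [HHT13]: it uses a textbook Cholesky factorization and a fixed number of iterations rather than a stopping criterion. The function names and the iteration count are assumptions.

```python
import math


def cholesky(b):
    """Lower triangular A with A A^T = B, for a symmetric positive definite B."""
    n = len(b)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = b[i][j] - sum(a[i][k] * a[j][k] for k in range(j))
            a[i][j] = math.sqrt(s) if i == j else s / a[j][j]
    return a


def cholesky_iteration(k, n_iter=100):
    """Repeat: (i) factor B = A A^T, (ii) set B := A^T A. Returns the final B."""
    n = len(k)
    b = [row[:] for row in k]
    for _ in range(n_iter):
        a = cholesky(b)
        b = [[sum(a[m][i] * a[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
    return b


if __name__ == "__main__":
    k = [[min(i, j) for j in range(1, 4)] for i in range(1, 4)]
    b = cholesky_iteration(k)
    print([b[i][i] for i in range(3)])  # approximate eigenvalues of K
```

Each step replaces $B$ by $A^{*}A=A^{-1}BA$ , which is similar to $B$ , so the spectrum is preserved throughout; in particular the trace of $B$ stays equal to the trace of $K$ .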

We now resume consideration of the general case of p.d. kernels $K$ on $X\times X$ and their restrictions: A setting for harmonic functions.

*Remark 8.12**.*

In the general case of (8.2) and 8.1, we still have a Laplace operator $\Delta$ . It is a densely defined symmetric operator on $\mathscr{H}\left(K\right)$ . Moreover (general case),

[TABLE]

(assuming that $\delta_{x}\in\mathscr{H}\left(K\right)$ ). The dot “ $\cdot$ ” in (8.37) refers to the action variable for the operator $\Delta$ . In other words, $K\left(\cdot,\cdot\right)$ is a generalized Green's kernel.

Definition 8.13.

Let $X\times X\xrightarrow{\;K\>}\mathbb{C}$ be given p.d., and assume

[TABLE]

Let $\Delta$ denote the induced Laplace operator. A function $h$ (in $\mathscr{H}\left(K\right)$ ) is said to be harmonic iff (Def.) $\Delta h=0$ .

Corollary 8.14.

Let $\left(X,K,\mathscr{H}\left(K\right)\right)$ be as above. Assume (8.38), and let $\Delta$ be the induced Laplace operator. Then we have the following orthogonal decomposition for $\mathscr{H}\left(K\right)$ :

[TABLE]

where “clospan” in (8.39) refers to the norm in $\mathscr{H}\left(K\right)$ .

Proof.

It is immediate from (8.1) that

[TABLE]

where the orthogonality “ $\perp$ ” in (8.40) refers to the inner product $\left\langle\cdot,\cdot\right\rangle_{\mathscr{H}\left(K\right)}$ . Since, by Hilbert space geometry, $\left(\left\{\delta_{x}\right\}_{x\in X}\right)^{\perp\perp}=clospan^{\mathscr{H}\left(K\right)}\left(\left\{\delta_{x}\right\}_{x\in X}\right)$ , we only need to observe that $\left\{h\in\mathscr{H}\left(K\right)\mathrel{;}\Delta h=0\right\}$ is closed in $\mathscr{H}\left(K\right)$ . But this is immediate from (8.1). ∎

Corollary 8.15 (Duality).

Let $X\times X\xrightarrow{\;K\;}\mathbb{R}$ be given, assumed p.d., and let $S\subset X$ be a countable subset such that

[TABLE]

(i)

Then the following duality holds for the two induced kernels:

[TABLE]

both p.d. kernels on $S\times S$ .

For every pair $x,y\in S$ , we have the following matrix-inversion formula:

[TABLE]

where the summation on the LHS in (8.44) is a limit over a net of finite subsets $\left\{F_{i}\right\}_{i\in\mathbb{N}}$ , $F_{1}\subset F_{2}\subset\cdots$ , s.t. $\cup_{i}F_{i}=S$ ; and the result is independent of choice of net.

(ii)

We get an induced graph with $S$ as the set of vertices, and edge set $E$ as follows: $E\subset\left(S\times S\right)\backslash\left(\text{diagonal}\right)$ .

An edge is a pair $\left(x,y\right)\in\left(S\times S\right)\backslash\left(\text{diagonal}\right)$ such that

[TABLE]

Proof.

The result follows from an application of Corollaries 8.7 and 8.8, and Remark 8.12. ∎

Let $X$ , $K$ , and $S$ be as stated, with $S$ countably infinite, and with assumptions as in the previous two results. We showed that the subset $S$ then acquires the structure of a vertex set in an induced infinite graph (Corollary 8.15 (ii)). If $\Delta$ denotes the corresponding graph Laplacian, then the following boundary value problem is of great interest: Make precise the boundary conditions at “infinity” for this graph Laplacian $\Delta$ . An answer to this will require identification of a Hilbert space, and of a limit at “infinity.” The result below is such an answer, and the limit notion will be the limit over the filter of all finite subsets in $S$ ; see Corollary 8.7. Another key tool in the arguments below will again be the net of orthogonal projections $\left\{P_{F}\right\}$ from Lemma 8.4, and the convergence results from Corollaries 8.5 and 8.7.

Corollary 8.16.

Let $X\times X\xrightarrow{\;K\;}\mathbb{R}$ , and $S\subset X$ be as in the statement of 8.15. Let $\mathscr{F}_{fin}\left(S\right)$ denote the filter of finite subsets $F\subset S$ . Let $\Delta=\Delta_{S}$ be the graph Laplacian defined in (8.2), i.e.,

[TABLE]

for all $x\in S$ , $h\in\mathscr{H}(K^{\left(S\right)})$ . Then the following equivalent conditions hold:

(i)

For all $h\in\mathscr{H}(K^{\left(S\right)})$ ,

[TABLE]

(ii)

For $\forall F\in\mathscr{F}_{fin}\left(S\right)$ , $x\in F$ , $h\in\mathscr{H}(K^{\left(S\right)})$ ,

[TABLE]

(iii)

$K_{F}\Delta P_{F}h=h\big{|}_{F}$ .

Proof.

On account of 8.8, we only need to verify (8.46). Let $F\in\mathscr{F}_{fin}\left(S\right)$ , $h\in\mathscr{H}(K^{\left(S\right)})$ , then we proved that

[TABLE]

Now apply $\left\langle\delta_{x},\cdot\right\rangle_{\mathscr{H}(K^{\left(S\right)})}$ to both sides in (8.47); and we get

[TABLE]

where we used $\left\langle\delta_{x},K\left(\cdot,y\right)\right\rangle_{\mathscr{H}(K^{\left(S\right)})}=\delta_{x,y}$ . The desired conclusion (8.46) now follows from (8.49). Also note that $\left(\Delta\left(P_{F}h\right)\right)\left(x\right)=0$ if $x\in X\backslash F$ . ∎

8.3. Canonical isometries computed from point processes

Below we consider p.d. kernels $K$ defined initially on $X\times X$ . Our present aim is to consider restrictions to $S\times S$ when $S$ is a suitable subset of $X$ . Our first observation is the identification of a canonical isometry $T_{S}$ between the respective reproducing kernel Hilbert spaces; $T_{S}$ identifies $\mathscr{H}(K^{\left(S\right)})$ as an isometric subspace inside $\mathscr{H}(K)$ . This isometry $T_{S}$ exists in general. However, we shall show that, when the subset $S$ is further restricted, the respective RKHSs and the isometry $T_{S}$ will admit explicit characterizations. For example, if $S$ is countable, and if the Dirac functions $\delta_{s}$ , $s\in S$ , are in $\mathscr{H}(K^{\left(S\right)})$ , we shall show that this setting leads to a point process. In this case, we further identify an induced (infinite) graph with the set $S$ as vertices, and with associated edges defined by an induced $\delta_{s}$ kernel.

Theorem 8.17.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be a p.d. kernel, and let $S\subset X$ be a subset. Set $K^{\left(S\right)}:=K\big{|}_{S\times S}$ . Let $\mathscr{H}\left(K\right)$ , and $\mathscr{H}(K^{\left(S\right)})$ , be the respective RKHSs.

(i)

Then there is a canonical isometric embedding

[TABLE]

given by the following formula: For $s\in S$ , set

[TABLE]

(Note that $K^{\left(S\right)}\left(\cdot,s\right)$ on the LHS in (8.50) is a function on $S$ , while $K\left(\cdot,s\right)$ on the RHS is a function on $X$ .)

(ii)

The adjoint operator $T^{*}$ ,

[TABLE]

is given by restriction, i.e., if $f\in\mathscr{H}(K)$ , and $s\in S$ , then $\left(T^{*}f\right)\left(s\right)=f\left(s\right)$ ; or equivalently, for all $f\in\mathscr{H}(K)$ ,

[TABLE]

Proof.

To show that $T$ in (8.50) is isometric, proceed as follows: Let $\left\{s_{i}\right\}_{i=1}^{N}$ be a finite subset of $S$ , and $\left\{\xi_{i}\right\}_{i=1}^{N}\in\mathbb{C}^{N}$ , then

[TABLE]

which is the desired isometric property.

We now turn to (8.52), the restriction formula: Let $s\in S$ , and $f\in\mathscr{H}\left(K\right)$ , then

[TABLE]

But, for the LHS in (8.3), we have

[TABLE]

and so the desired formula (8.52) follows. ∎

*Remark 8.18**.*

We consider the canonical isometry for Example 8.9 ( $\mathbb{Z}$ -discretization of the covariance function for Brownian motion on $\mathbb{R}$ ). From 8.17, we know that the canonical isometry $T$ maps $\mathscr{H}(K^{\left(\mathbb{Z}\right)})$ into $\mathscr{H}\left(K\right)$ ; see (8.22). But (8.23) and (8.25) in the Example offer exact characterizations of these two Hilbert spaces. So, in the special case of 8.9, the canonical isometry $T$ maps from functions $\Phi$ on $\mathbb{Z}$ into functions on $\mathbb{R}$ . In view of (8.23), this assignment turns out to be a precise spline realization of the point grids realized by these sequences $\Phi$ .

Below we present an explicit formula, and graphics, for the spline realizations. By (8.26), the embedding of $\delta_{n}$ from $\mathscr{H}(K^{\left(\mathbb{Z}\right)})$ into $\mathscr{H}\left(K\right)$ is given by

[TABLE]

See 8.1. Therefore, for all $h\in\mathscr{H}\left(K\right)$ , we get

[TABLE]

which is the spline interpolation.
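The spline realization can be made concrete in a one-sided setting. The sketch below is an illustration under assumptions, since the displayed formulas are not reproduced here: we take $K\left(x,y\right)=x\wedge y$ for $x,y\geq0$ , integers $n\geq1$ , and we assume that $\delta_{n}=2K\left(\cdot,n\right)-K\left(\cdot,n-1\right)-K\left(\cdot,n+1\right)$ on the integer grid. Its image under $T$ is then the same combination viewed as a function on $\left[0,\infty\right)$ , i.e., the tent function centered at $n$ , and summing $h\left(n\right)$ times these tents gives the piecewise linear interpolant. The function names are hypothetical.

```python
def embedded_delta(n, x):
    """T(delta_n)(x) = 2 min(x, n) - min(x, n - 1) - min(x, n + 1), for x >= 0, n >= 1.

    This is the tent function: 1 at x = n, 0 outside (n - 1, n + 1), linear in between.
    """
    return 2 * min(x, n) - min(x, n - 1) - min(x, n + 1)


def spline(samples, x):
    """Piecewise linear interpolant sum_n h(n) * T(delta_n)(x).

    `samples` maps integers n >= 1 to the values h(n); unlisted integers count as 0.
    """
    return sum(value * embedded_delta(n, x) for n, value in samples.items())


if __name__ == "__main__":
    h = {1: 1.0, 2: 4.0, 3: 9.0}
    print([spline(h, x) for x in (1, 2, 3)])  # reproduces the samples
    print(spline(h, 2.5))                     # midpoint of h(2) and h(3)
```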

Corollary 8.19.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be a p.d. kernel, and let $S\subset X$ be a subset. Assume further that $\left\{\delta_{s}\right\}_{s\in S}\subset\mathscr{H}(K^{\left(S\right)})$ . Then every finitely supported function $h$ on $S$ is in $\mathscr{H}(K^{\left(S\right)})$ , and we have the following generalized spline interpolation; i.e., isometrically extending $h$ from $S$ to $X$ :

[TABLE]

where $F_{0}=suppt\left(h\right)$ , and the sup is taken over the filter of all finite subsets of $X$ containing $F_{0}$ .

Proof.

Assume $h\in\mathscr{H}(K^{\left(S\right)})$ , supported on a finite subset $F_{0}\subset S$ . Then,

[TABLE]

where the last step follows from (8.10), and $P_{F}$ is the orthogonal projection from $\mathscr{H}\left(K\right)$ onto the subspace $\mathscr{H}_{K}\left(F\right)$ . ∎

Corollary 8.20.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be a given p.d. kernel, and let $S\subset X$ be a subset. Let $T=T_{S}$ , $\mathscr{H}(K^{\left(S\right)})\xrightarrow{\;T\;}\mathscr{H}\left(K\right)$ , be the canonical isometry. Then a function $f$ in $\mathscr{H}\left(K\right)$ satisfies $\left\langle f,T(\mathscr{H}(K^{\left(S\right)}))\right\rangle_{\mathscr{H}\left(K\right)}=0$ if and only if

[TABLE]

Proof.

Immediate from part (ii) in 8.17. ∎

*Remark 8.21**.*

Let $\left(X,K,S\right)$ be as in 8.20, and let $T_{S}$ be the canonical isometry. Let $P_{S}:=T_{S}T_{S}^{*}$ be the corresponding projection. Then $I_{\mathscr{H}\left(K\right)}-P_{S}$ is the projection onto the subspace given in (8.55).

Corollary 8.22.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be given p.d.; and let $S\subset X$ be a subset with induced kernel

[TABLE]

Consider the two sets $\mathscr{F}\left(K\right)$ and $\mathscr{F}(K^{\left(S\right)})$ from (5.2) and 5.1. Let $T_{S}:\mathscr{H}(K^{\left(S\right)})\rightarrow\mathscr{H}\left(K\right)$ be the canonical isometry (8.50) in 8.17. Then the following implication holds:

[TABLE]

Proof.

Assuming (8.22), we get the representation (5.2):

[TABLE]

But then, for all $\left(s_{1},s_{2}\right)\in S\times S$ , we have

[TABLE]

which is the desired conclusion. ∎

9. Boundary value problems

Our setting in the present section is the discrete case, i.e., RKHSs of functions defined on a prescribed countably infinite discrete set $S$ . We are concerned with a characterization of those RKHSs $\mathscr{H}$ which contain the Dirac masses $\delta_{x}$ for all points $x\in S$ . Of the examples and applications where this question plays an important role, we emphasize two: (i) discrete Brownian motion-Hilbert spaces, i.e., discrete versions of the Cameron-Martin Hilbert space; (ii) energy-Hilbert spaces corresponding to graph-Laplacians.

The problems addressed here are motivated in part by applications to analysis on infinite weighted graphs, to stochastic processes, and to numerical analysis (discrete approximations), and to applications of RKHSs to machine learning. Readers are referred to the following papers, and the references cited there, for details regarding this: [AJS14, AJ12, AJL11, JPT15, JP14, JP11, DG13, Kre13, ZXZ09, Nas84, NS13].

The discrete case can be understood as restrictions of analogous PDE-models. In traditional numerical analysis, one builds discrete and algorithmic models (finite element methods), each aiming at finding approximate solutions to PDE-boundary value problems. They typically use multiresolution-subdivision schemes, applied to the continuous domain, subdividing it into simpler discretized parts, called finite elements. With variational methods, one then minimizes various error-functions. In this paper, we turn the tables: our objects of study are the discrete models, and the analysis of suitable continuous PDE boundary problems serves as a tool for solutions in the discrete world.

Definition 9.1.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be a given p.d. kernel on $X$ . The RKHS $\mathscr{H}=\mathscr{H}\left(K\right)$ is said to have the discrete mass property ( $\mathscr{H}$ is called a discrete RKHS), if $\delta_{x}\in\mathscr{H}$ , for all $x\in X$ .

In fact, it is known [JT16a] that every fundamental solution for a Dirichlet boundary value problem on a bounded open domain $\Omega$ in $\mathbb{R}^{\nu}$ admits discrete restrictions (i.e., to vertices sampled in $\Omega$) which have the desired “discrete mass” property.

We recall the following result to stress the distinction of the discrete models vs their continuous counterparts.

Let $\Omega$ be a bounded, open, and connected domain in $\mathbb{R}^{\nu}$ with smooth boundary $\partial\Omega$. Let $K:\Omega\times\Omega\rightarrow\mathbb{R}$ be continuous and p.d., given as the Green’s function of $\Delta_{0}$, where

[TABLE]

for the Dirichlet boundary condition. Thus, $\Delta_{0}$ is positive selfadjoint, and

[TABLE]

Let $\mathscr{H}_{CM}\left(\Omega\right)$ be the corresponding Cameron-Martin RKHS.

For $\nu=1$ , $\Omega=\left(0,1\right)$ , take

[TABLE]

For $\nu>1$ , let

[TABLE]

Theorem 9.2.

Let $\Omega$ , and $S\subset\Omega$ , be given. Then

(i) Discrete case: Fix $S\subset\Omega$, $\#S=\aleph_{0}$, where $S=\left\{x_{j}\right\}_{j=1}^{\infty}$, $x_{j}\in\Omega$. Assume $\exists\varepsilon>0$ s.t. $\left\|x_{i}-x_{j}\right\|\geq\varepsilon$, $\forall i,j$, $i\neq j$. Let

[TABLE]

then $\delta_{x_{j}}\in\mathscr{H}\left(S\right)$.

(ii) Continuous case; by contrast: $K_{x}^{\left(S\right)}\in\mathscr{H}_{CM}\left(S\right)$, but $\delta_{x}\notin\mathscr{H}_{CM}\left(\Omega\right)$, $x\in\Omega$.

Proof.

The result follows from an application of Corollaries 8.7 and 8.8. It extends earlier results [JT15a, JT16a] by the co-authors. ∎
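The discrete mass property can be checked by hand in the simplest case. The following is a minimal sketch, not the authors' construction: it assumes $\nu=1$, $\Omega=\left(0,1\right)$, and the standard Dirichlet Green's function $K\left(x,y\right)=x\wedge y-xy$ of $-d^{2}/dx^{2}$, restricted to a finite sample set $F$. For a finite set with invertible Gram matrix $K_{F}$, every function on $F$ lies in the RKHS, and $\left\|\delta_{x_{j}}\right\|^{2}=(K_{F}^{-1})_{jj}$. The function names and the sample points are hypothetical.

```python
from fractions import Fraction


def green_matrix(points):
    """Gram matrix K_F(i, j) = min(x_i, x_j) - x_i * x_j on a finite set F in (0, 1)."""
    return [[min(x, y) - x * y for y in points] for x in points]


def dirichlet_precision(points):
    """Tridiagonal candidate for K_F^{-1}: a discrete Dirichlet Laplacian.

    The boundary points 0 and 1 are appended so that every sample point
    has a left and a right neighbour. Points must be sorted and lie in (0, 1).
    """
    n = len(points)
    ext = [Fraction(0)] + list(points) + [Fraction(1)]
    T = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        left_gap = ext[j + 1] - ext[j]
        right_gap = ext[j + 2] - ext[j + 1]
        T[j][j] = 1 / left_gap + 1 / right_gap
        if j + 1 < n:
            T[j][j + 1] = T[j + 1][j] = -1 / right_gap
    return T


def matmul(A, B):
    """Plain matrix product for square lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical sample points; exact rational arithmetic avoids rounding.
points = [Fraction(1, 5), Fraction(1, 2), Fraction(7, 10)]
K_F = green_matrix(points)
T = dirichlet_precision(points)
product = matmul(K_F, T)
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

# Squared RKHS norms of the Dirac masses are the diagonal entries of K_F^{-1}.
delta_norms_sq = [T[j][j] for j in range(3)]
```

In this sketch the squared norm of $\delta_{x_{j}}$ depends only on the two gaps adjacent to $x_{j}$, which is consistent with the role of the separation assumption $\left\|x_{i}-x_{j}\right\|\geq\varepsilon$ in part (i): the norms stay bounded when the gaps are bounded below.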

10. Sampling in $\mathscr{H}\left(K\right)$

In the present section, we study classes of reproducing kernels $K$ on general domains with the property that there are non-trivial restrictions to countable discrete sample subsets $S$ such that every function in $\mathscr{H}\left(K\right)$ has an $S$-sample representation. In this general framework, we study how positive definite kernels $K$ behave under sampling from “small” subsets, with sampling formulas required to hold for all functions in the associated Hilbert space $\mathscr{H}\left(K\right)$.

We are motivated by concrete kernels which are used in a number of applications, for example, on one extreme, the Shannon kernel for band-limited functions, which admits many sampling realizations; and on the other, the covariance kernel of Brownian motion which has no non-trivial countable discrete sample subsets.

Definition 10.1.

Let $X\times X\xrightarrow{\;K\;}\mathbb{C}$ be a p.d. kernel, and $\mathscr{H}\left(K\right)$ be the associated RKHS. We say that $K$ has the non-trivial sampling property if there exists a countable subset $S\subset X$, and $a,b\in\mathbb{R}_{+}$, such that

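$$a\left\|f\right\|_{\mathscr{H}\left(K\right)}^{2}\leq\sum_{s\in S}\left|f\left(s\right)\right|^{2}\leq b\left\|f\right\|_{\mathscr{H}\left(K\right)}^{2},\quad\forall f\in\mathscr{H}\left(K\right).\tag{10.1}$$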

If equality holds in (10.1) with $a=b=1$, then we say that $\left\{K\left(\cdot,s\right)\right\}_{s\in S}$ is a Parseval frame. (Also see Definition 7.1.)

It follows that sampling holds in the form

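$$f\left(x\right)=\sum_{s\in S}f\left(s\right)K\left(x,s\right),\quad\forall f\in\mathscr{H}\left(K\right),\;\forall x\in X$$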

if and only if $\left\{K\left(\cdot,s\right)\right\}_{s\in S}$ is a Parseval frame.

Lemma 10.2.

Suppose $K$, $X$, $a$, $b$, and $S$ satisfy condition (10.1). Then the linear span of $\left\{K\left(\cdot,s\right)\right\}_{s\in S}$ is dense in $\mathscr{H}\left(K\right)$. Moreover, there is a positive operator $B$ in $\mathscr{H}\left(K\right)$ with bounded inverse such that

[TABLE]

is a convergent interpolation formula valid for all $f\in\mathscr{H}\left(K\right)$ .

Equivalently,

[TABLE]

Proof.

Define $A:\mathscr{H}\left(K\right)\rightarrow l^{2}\left(S\right)$ by $\left(Af\right)\left(s\right)=f\left(s\right)$ , $s\in S$ . Then the adjoint operator $A^{*}:l^{2}\left(S\right)\rightarrow\mathscr{H}\left(K\right)$ is given by $A^{*}\xi=\sum_{s\in S}\xi_{s}K\left(\cdot,s\right)$ , $\forall\xi\in l^{2}\left(S\right)$ , and

$$A^{*}Af=\sum_{s\in S}f\left(s\right)K\left(\cdot,s\right)$$

holds in $\mathscr{H}\left(K\right)$, with $\mathscr{H}\left(K\right)$-norm convergence. Now set $B=\left(A^{*}A\right)^{-1}$, and note that $\left\|B\right\|_{\mathscr{H}\left(K\right)\rightarrow\mathscr{H}\left(K\right)}\leq a^{-1}$, where $a$ is the lower bound in (10.1). ∎
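The steps of this proof can be spelled out as a short derivation. This is a sketch for real-valued kernels, under the assumption that (10.1) is the usual two-sided frame inequality $a\|f\|^{2}\leq\sum_{s\in S}|f(s)|^{2}\leq b\|f\|^{2}$; it uses only the reproducing property $f\left(s\right)=\left\langle f,K\left(\cdot,s\right)\right\rangle _{\mathscr{H}\left(K\right)}$.

```latex
% Sketch, real-valued case; (10.1) is assumed to be the frame inequality
%   a ||f||^2 <= sum_{s in S} |f(s)|^2 <= b ||f||^2   for all f in H(K).
\begin{align*}
\langle Af,\xi\rangle_{l^{2}(S)}
  &=\sum_{s\in S}f(s)\,\xi_{s}
   =\sum_{s\in S}\xi_{s}\,\langle f,K(\cdot,s)\rangle_{\mathscr{H}(K)}
   =\Big\langle f,\sum_{s\in S}\xi_{s}K(\cdot,s)\Big\rangle_{\mathscr{H}(K)},\\
\langle f,A^{*}Af\rangle_{\mathscr{H}(K)}
  &=\|Af\|_{l^{2}(S)}^{2}
   =\sum_{s\in S}|f(s)|^{2}\;\geq\;a\,\|f\|_{\mathscr{H}(K)}^{2},\\
f&=BA^{*}Af=\sum_{s\in S}f(s)\,B\,K(\cdot,s),\qquad\|B\|\leq a^{-1}.
\end{align*}
```

The first line identifies $A^{*}$, the second shows $A^{*}A\geq a\,I$ (so $A^{*}A$ is invertible with $\left\|B\right\|\leq a^{-1}$), and the third is the interpolation formula. Density of the span follows because $Af=0$ forces $f=0$ by the lower bound.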

Theorem 10.3.

Let $K:X\times X\rightarrow\mathbb{R}$ be a p.d. kernel, and let $S\subset X$ be a countable discrete subset. For all $s\in S$ , set $K_{s}\left(\cdot\right)=K\left(\cdot,s\right)$ . Then TFAE:

(i) The family $\left\{K_{s}\right\}_{s\in S}$ is a Parseval frame in $\mathscr{H}\left(K\right)$;

(ii) [TABLE]

(iii) [TABLE]

(iv) [TABLE]

where the sum converges in the norm of $\mathscr{H}\left(K\right)$ .

Proof.

The proof is simple, and follows the steps in the proof of Lemma 7.2. Details are left to the reader. ∎
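A toy computation may help fix ideas about the Parseval property. The sketch below is hypothetical and finite-dimensional: it takes $X=S=\left\{0,1,2\right\}$ and a kernel $K\left(x,y\right)=\left\langle\varphi_{x},\varphi_{y}\right\rangle$ built from three vectors in $\mathbb{R}^{2}$ at mutual angles of $120^{\circ}$, scaled by $\sqrt{2/3}$. It checks three standard consequences of the Parseval property: the norm identity, the diagonal identity $K\left(x,x\right)=\sum_{s}K\left(x,s\right)^{2}$, and the reproduction formula. That these coincide with items (ii) to (iv) as stated is an assumption here.

```python
import math

# Hypothetical feature vectors phi_0, phi_1, phi_2 in R^2 (a scaled
# "three directions at 120 degrees" configuration).
SCALE = math.sqrt(2.0 / 3.0)
PHI = [(SCALE * math.cos(2 * math.pi * k / 3),
        SCALE * math.sin(2 * math.pi * k / 3)) for k in range(3)]
X = range(3)


def kernel(x, y):
    """K(x, y) = <phi_x, phi_y>, a p.d. kernel on X = {0, 1, 2}."""
    return PHI[x][0] * PHI[y][0] + PHI[x][1] * PHI[y][1]


def evaluate(c, x):
    """f(x) = <c, phi_x>; the H(K)-norm of f equals the Euclidean norm of c."""
    return c[0] * PHI[x][0] + c[1] * PHI[x][1]


c = (0.7, -1.9)                       # arbitrary element of H(K)
samples = [evaluate(c, s) for s in X]

norm_sq = c[0] ** 2 + c[1] ** 2       # ||f||^2 in H(K)
sample_energy = sum(v * v for v in samples)

# f(x) recovered from its samples via sum_s f(s) K(x, s).
reconstructed = [sum(samples[s] * kernel(x, s) for s in X) for x in X]

# Largest deviation in K(x, x) = sum_s K(x, s)^2.
diag_gap = max(abs(kernel(x, x) - sum(kernel(x, s) ** 2 for s in X))
               for x in X)
```

Here the Gram matrix of $\left\{K_{s}\right\}$ is a rank-two projection on $\mathbb{R}^{3}$, which is the finite-dimensional signature of a Parseval frame that is not an orthonormal basis.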

We now turn to a dichotomy: existence versus non-existence of countable discrete sampling sets.

Example 10.4.

Let $X=\mathbb{R}$ , and let $K:\mathbb{R}\times\mathbb{R}\rightarrow\mathbb{R}$ be the Shannon kernel, where

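$$K\left(x,y\right)=\frac{\sin\pi\left(x-y\right)}{\pi\left(x-y\right)},\quad x,y\in\mathbb{R}.\tag{10.2}$$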

We may choose $S=\mathbb{Z}$ , and then $\left\{K\left(\cdot,n\right)\right\}_{n\in\mathbb{Z}}$ is even an orthonormal basis (ONB) in $\mathscr{H}\left(K\right)$ , but there are many other examples of countable discrete subsets $S\subset\mathbb{R}$ such that (10.1) holds for finite $a,b\in\mathbb{R}_{+}$ .

The RKHS $\mathscr{H}\left(K\right)$ in (10.2) is the Hilbert space $\subset L^{2}\left(\mathbb{R}\right)$ consisting of all $f\in L^{2}\left(\mathbb{R}\right)$ such that $suppt(\hat{f})\subset\left[-\pi,\pi\right]$ , where “suppt” stands for support of the Fourier transform $\hat{f}$ . Note $\mathscr{H}\left(K\right)$ consists of functions on $\mathbb{R}$ which have entire analytic extensions to $\mathbb{C}$ . Using the above observations, we get

[TABLE]
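The sampling representation for the Shannon kernel with $S=\mathbb{Z}$ can be tested numerically. The following sketch assumes the normalization $K\left(x,y\right)=\sin\pi\left(x-y\right)/\pi\left(x-y\right)$ and truncates the series $\sum_{n\in\mathbb{Z}}f\left(n\right)K\left(x,n\right)$ at a hypothetical cutoff; the test function $f=K\left(\cdot,0.3\right)$ and the evaluation point are arbitrary choices.

```python
import math


def shannon_kernel(x, y):
    """K(x, y) = sin(pi (x - y)) / (pi (x - y)), equal to 1 on the diagonal."""
    t = math.pi * (x - y)
    return 1.0 if t == 0.0 else math.sin(t) / t


def sample_series(f, x, cutoff):
    """Truncated sampling sum over the integers n with |n| <= cutoff."""
    return sum(f(n) * shannon_kernel(x, n) for n in range(-cutoff, cutoff + 1))


def f(x):
    """A band-limited test function: the kernel section K(., 0.3)."""
    return shannon_kernel(x, 0.3)


x0 = 1.75
approx = sample_series(f, x0, cutoff=2000)
exact = f(x0)
error = abs(approx - exact)
```

The terms of the series decay like $n^{-2}$, so the truncation error shrinks only like the reciprocal of the cutoff. The integer translates are orthonormal because $K\left(m,n\right)=\delta_{m,n}$ for $m,n\in\mathbb{Z}$, up to floating-point rounding in this sketch.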

Example 10.5.

Let $K$ be the covariance kernel of standard Brownian motion, with $X:=[0,\infty)$ or $[0,1)$, and

$$K\left(x,y\right)=x\wedge y=\min\left(x,y\right),\quad x,y\in X.\tag{10.3}$$

Theorem 10.6.

Let $K$, $X$ be as in (10.3); then there is no countable discrete subset $S\subset X$ such that the linear span of $\left\{K\left(\cdot,s\right)\right\}_{s\in S}$ is dense in $\mathscr{H}\left(K\right)$.

Proof.

Suppose $S=\left\{x_{n}\right\}$ , where

[TABLE]

then consider the following function

[TABLE]

On the respective intervals $\left[x_{n},x_{n+1}\right]$ , the function $f$ is as follows:

[TABLE]

In particular, $f\left(x_{n}\right)=f\left(x_{n+1}\right)=0$ , and on the midpoints:

[TABLE]

see Figure 10.1.

Choose $\left\{c_{n}\right\}_{n\in\mathbb{N}}$ such that

[TABLE]

Admissible choices for the slope-values $c_{n}$ include

[TABLE]

We will now show that $f\in\mathscr{H}\left(K\right)$ . For the distribution derivative computed from (10.5), we get

[TABLE]

which is the desired conclusion, see (10.5). ∎
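The mechanism of the proof can be illustrated on a finite truncation. The sketch below is an assumption-laden reading of the construction: it takes hypothetical nodes $x_{n}=n$ (with $x_{0}=0$), slopes $c_{n}=1/\left(n+1\right)$, and a tent function rising with slope $c_{n}$ to the midpoint of $\left[x_{n},x_{n+1}\right]$ and falling with slope $-c_{n}$. It relies on the standard fact that, for $K\left(x,y\right)=x\wedge y$, the norm in $\mathscr{H}\left(K\right)$ is $\left\|f\right\|^{2}=\int\left|f'\right|^{2}dx$ for $f$ with $f\left(0\right)=0$.

```python
def tent(x, nodes, slopes):
    """Piecewise linear f vanishing at every node.

    On [x_n, x_{n+1}] the slope is +c_n up to the midpoint and -c_n after it.
    """
    for left, right, c in zip(nodes, nodes[1:], slopes):
        if left <= x <= right:
            mid = (left + right) / 2
            return c * (x - left) if x <= mid else c * (right - x)
    return 0.0


def energy(nodes, slopes):
    """Integral of |f'|^2, i.e. sum of c_n^2 (x_{n+1} - x_n)."""
    return sum(c * c * (right - left)
               for left, right, c in zip(nodes, nodes[1:], slopes))


# Hypothetical choices: integer nodes and square-summable slopes.
nodes = [float(n) for n in range(51)]
slopes = [1.0 / (n + 1) for n in range(50)]

values_at_nodes = [tent(x, nodes, slopes) for x in nodes]
norm_sq = energy(nodes, slopes)
midpoint_value = tent(0.5, nodes, slopes)
```

Since $\left\langle f,K\left(\cdot,s\right)\right\rangle =f\left(s\right)=0$ for every node $s$, a non-zero $f$ of finite energy is orthogonal to all kernel sections $K\left(\cdot,s\right)$, $s\in S$, so their span cannot be dense.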

Corollary 10.7.

For the kernel $K\left(x,y\right)=x\wedge y$ in (10.3), $X=[0,\infty)$ , the following holds:

Given $\left\{x_{j}\right\}_{j\in\mathbb{N}}\subset\mathbb{R}_{+}$ , $\left\{y_{j}\right\}_{j\in\mathbb{N}}\subset\mathbb{R}$ , then the interpolation problem

[TABLE]

is solvable if

[TABLE]

Proof.

Let $f$ be the piecewise linear spline (see Figure 10.2) for the problem (10.8); then the $\mathscr{H}\left(K\right)$-norm is as follows:

[TABLE]

when (10.9) holds. ∎
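The norm of the piecewise linear spline is easy to compute for finite data. The sketch below assumes increasing nodes $0<x_{1}<x_{2}<\cdots$, the starting point $\left(0,0\right)$ forced by $f\left(0\right)=0$, and the standard identity $\left\|f\right\|_{\mathscr{H}\left(K\right)}^{2}=\int\left|f'\right|^{2}dx$ for $K\left(x,y\right)=x\wedge y$; the function name and the data are hypothetical, and the exact form of condition (10.9) is not reproduced here.

```python
def spline_energy(xs, ys):
    """Squared H(K)-norm of the piecewise linear interpolant.

    The interpolant passes through (0, 0) and the points (x_j, y_j),
    and is constant after the last node. Each segment contributes
    (slope)^2 * (length) = (y_j - y_{j-1})^2 / (x_j - x_{j-1}).
    Requires 0 < xs[0] < xs[1] < ...
    """
    prev_x, prev_y, total = 0.0, 0.0, 0.0
    for x, y in zip(xs, ys):
        total += (y - prev_y) ** 2 / (x - prev_x)
        prev_x, prev_y = x, y
    return total


# Finite data set: segments contribute 1/1 + 1/1 + 4/2.
small = spline_energy([1.0, 2.0, 4.0], [1.0, 0.0, 2.0])

# Alternating data y_j = (-1)^j at x_j = j: each further node adds 4,
# so the partial energies grow without bound as more nodes are included.
xs = [float(j) for j in range(1, 11)]
ys = [(-1.0) ** j for j in range(1, 11)]
alternating = spline_energy(xs, ys)
```

The second data set shows how interpolation can fail in the limit: the partial energies grow linearly in the number of nodes, so no function of finite $\mathscr{H}\left(K\right)$-norm takes those values on all of $\mathbb{N}$.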

Remark 10.8.

Let $K$ be as in (10.3), $X=[0,\infty)$ . For all $0\leq x_{j}<x_{j+1}<\infty$ , let

[TABLE]

Assuming (10.6) holds, then

[TABLE]

Theorem 10.9.

Let $X$ be a set of cardinality $c$ of the continuum, and let $K:X\times X\rightarrow\mathbb{R}$ be a positive definite kernel. Let $S=\left\{x_{j}\right\}_{j\in\mathbb{N}}$ be a discrete subset of $X$ . Suppose there are weights $\left\{w_{j}\right\}_{j\in\mathbb{N}}$ , $w_{j}\in\mathbb{R}_{+}$ , such that

[TABLE]

for all $f\in\mathscr{H}\left(K\right)$ . Suppose further that there is a point $t_{0}\in X\backslash S$ , a $y_{0}\in\mathbb{R}\backslash\left\{0\right\}$ , and $\alpha\in\mathbb{R}_{+}$ such that the infimum

[TABLE]

is strictly positive.

Then $S$ is not an interpolation set for $\left(K,X\right)$.

Proof.

This result follows from 10.2 and 10.3 above. We also refer readers to [JT16b]. ∎

Acknowledgement.

The co-authors thank the following colleagues for helpful and enlightening discussions: Professors Daniel Alpay, Sergii Bezuglyi, Ilwoo Cho, Myung-Sin Song, Wayne Polyzou, and members of the Math Physics seminar at The University of Iowa.

Bibliography

[AB97] D. Alpay and V. Bolotnikov, On tangential interpolation in reproducing kernel Hilbert modules and applications, Topics in interpolation theory (Leipzig, 1994), Oper. Theory Adv. Appl., vol. 95, Birkhäuser, Basel, 1997, pp. 37–68. MR 1473250
[ABK02] Daniel Alpay, Vladimir Bolotnikov, and H. Turgay Kaptanoğlu, The Schur algorithm and reproducing kernel Hilbert spaces in the ball, Linear Algebra Appl. 342 (2002), 163–186. MR 1873434
[AD93] Daniel Alpay and Harry Dym, On a new class of structured reproducing kernel spaces, J. Funct. Anal. 111 (1993), no. 1, 1–28. MR 1200633
[AD06] D. Alpay and C. Dubi, Some remarks on the smoothing problem in a reproducing kernel Hilbert space, J. Anal. Appl. 4 (2006), no. 2, 119–132. MR 2223568
[ADD90] Daniel Alpay, Patrick Dewilde, and Harry Dym, Lossless inverse scattering and reproducing kernels for upper triangular operators, Extension and interpolation of linear operators and matrix functions, Oper. Theory Adv. Appl., vol. 47, Birkhäuser, Basel, 1990, pp. 61–135. MR 1120274
[ADRdS01] D. Alpay, A. Dijksma, J. Rovnyak, and H. S. V. de Snoo, Realization and factorization in reproducing kernel Pontryagin spaces, Operator theory, system theory and related topics (Beer-Sheva/Rehovot, 1997), Oper. Theory Adv. Appl., vol. 123, Birkhäuser, Basel, 2001, pp. 43–65. MR 1821907
[AJ12] Daniel Alpay and Palle E. T. Jorgensen, Stochastic processes induced by singular operators, Numer. Funct. Anal. Optim. 33 (2012), no. 7-9, 708–735. MR 2966130
[AJL11] Daniel Alpay, Palle Jorgensen, and David Levanony, A class of Gaussian processes with fractional spectral measures, J. Funct. Anal. 261 (2011), no. 2, 507–541. MR 2793121
