On linear weak predictability with single point spectrum degeneracy

Nikolai Dokuchaev

arXiv:1705.02746·cs.IT·January 10, 2020

TL;DR

This paper investigates continuous time processes with spectrum degeneracy at a single point, demonstrating their weak linear predictability using universal, time-invariant predictors that are robust to noise.

Contribution

It introduces explicit universal predictors for processes with single point spectrum degeneracy, applicable without spectrum details, and analyzes their robustness.

Findings

01

Predictors are explicitly constructed in the frequency domain.

02

Predictors are universal for the entire class of processes with spectrum degeneracy.

03

Predictors exhibit robustness to noise contamination.

Abstract

The paper studies properties of continuous time processes with spectrum degeneracy at a single point where their Fourier transforms vanish with a certain rate. It appears that these processes are linearly predictable in some weak sense, meaning that convolution integrals over future times can be approximated by causal convolutions over past times. The corresponding predicting kernels are time invariant, and they are presented explicitly in the frequency domain via their transfer functions. These predictors are "universal", meaning that they do not require knowledge of the details of the spectrum of the underlying processes; the same predictor can be used for the entire class of processes with a single point spectrum degeneracy. The predictors feature some robustness with respect to noise contamination.

Equations

$$\int_{-\infty}^{\infty}\frac{\log\phi(\omega)}{1+\omega^{2}}\,d\omega=-\infty;$$

$$({\cal F}x)(i\omega)=\int_{-\infty}^{\infty}e^{-i\omega t}x(t)\,dt,\quad\omega\in{\bf R}.$$

$$({\cal L}x)(z)=\int_{0}^{\infty}e^{-zt}x(t)\,dt,\quad z\in{\bf C}^{+}.$$

$$K(i\omega)=\frac{d(i\omega)}{\delta(i\omega)},\quad\omega\in{\bf R},$$

$$y(t)=\int_{t}^{+\infty}\kappa(t-s)x(s)\,ds,\qquad\widehat{y}(t)=\int_{-\infty}^{t}\widehat{\kappa}(t-s)x(s)\,ds.$$

$$\|y-\widehat{y}_{j}\|_{{\cal Y}_{p}}\to 0\quad\hbox{as}\quad j\to+\infty\quad\forall x\in\bar{\cal X},$$

$$\|y-\widehat{y}\|_{{\cal Y}_{p}}\leq\varepsilon\|x\|_{\bar{\cal X}}\quad\forall x\in\bar{\cal X},$$

$$X_{m}(i\omega)=X(i\omega){\mathbb{I}}_{D_{m}}(\omega),\quad m=1,2,\quad X={\cal F}x.$$

$$h(\omega,q,c)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\exp\frac{c}{|\omega|^{q}}.$$

$$\|x\|_{{\cal X}(q,c)}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\mathop{\rm ess\,sup}_{\omega\in{\bf R}}|X(i\omega)|h(\omega,q,c)<+\infty,\quad\hbox{where}\quad X={\cal F}x.$$

$$V_{j}(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}1-\exp\Bigl(-\gamma\frac{z-a_{j}}{z+\gamma^{-r}}\Bigr),\qquad V(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\prod_{j=1}^{m}V_{j}(z),$$

$$\widehat{K}(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}V(z)K(z),\qquad\widehat{\kappa}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}{\cal F}^{-1}\bigl(\widehat{K}|_{i{\bf R}}\bigr).$$

$$\|y-\widehat{y}\|_{L_{\infty}({\bf R})}\leq\varepsilon,$$

$$\|y-\widehat{y}\|_{L_{\infty}({\bf R})}\leq J_{0}+J_{\eta},$$

$$J_{0}=\frac{1}{2\pi}\|(\widehat{K}(i\omega)-K(i\omega))X_{0}(i\omega)\|_{L_{1}({\bf R})},\qquad J_{\eta}=\frac{1}{2\pi}\|(\widehat{K}(i\omega)-K(i\omega))N(i\omega)\|_{L_{1}({\bf R})}.$$

$$\|y-\widehat{y}\|_{L_{\infty}({\bf R})}\leq\varepsilon+\nu(\varkappa+1),$$

$$y(s)=\int_{s}^{\infty}e^{\lambda(s-t)}x(t)\,dt=\int_{s}^{\infty}e^{-\lambda(t-s)}x(t)\,dt=0\quad\forall\lambda>0.$$

$$2\pi\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}=\int_{\bf R}{\mathbb{I}}_{D_{m}}(\omega)|K(i\omega)-\widehat{K}_{j}(i\omega)|^{2}|X_{m}(i\omega)|^{2}d\omega$$

$$\begin{aligned}2\pi\sum_{m=1,2}\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}&=\sum_{m=1,2}\int_{\bf R}|K(i\omega)-\widehat{K}_{j}(i\omega)|^{2}{\mathbb{I}}_{D_{m}}(\omega)|X_{m}(i\omega)|^{2}d\omega\\&=\int_{\bf R}|K(i\omega)-\widehat{K}_{j}(i\omega)|^{2}|X(i\omega)|^{2}d\omega\\&=\int_{\bf R}[K(i\omega)-\widehat{K}_{j}(i\omega)]X(i\omega)\overline{[K(i\omega)-\widehat{K}_{j}(i\omega)]X(i\omega)}\,d\omega.\end{aligned}$$

$$\begin{aligned}2\pi\sum_{m=1,2}\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}&=\bigl((K(i\omega)-\widehat{K}_{j}(i\omega))X(i\omega),(K(i\omega)-\widehat{K}_{j}(i\omega))X(i\omega)\bigr)_{L_{2}({\bf R})}\\&=\|K(i\omega)X(i\omega)\|_{L_{2}({\bf R})}^{2}+\|\widehat{K}_{j}(i\omega)X(i\omega)\|_{L_{2}({\bf R})}^{2}-2R,\end{aligned}$$

$$R={\rm Re}\int_{\bf R}\overline{K(i\omega)}\,\overline{X(i\omega)}X(i\omega)\widehat{K}_{j}(i\omega)\,d\omega={\rm Re}\int_{\bf R}\overline{Y(i\omega)}\widehat{K}_{j}(i\omega)X(i\omega)\,d\omega,$$

$$R={\rm Re}\,\bigl(Y(i\omega),\widehat{K}_{j}(i\omega)X(i\omega)\bigr)_{L_{2}({\bf R})}={\rm Re}\,\bigl(K(i\omega)X(i\omega),\widehat{K}_{j}(i\omega)X(i\omega)\bigr)_{L_{2}({\bf R})}=0.$$

$$2\pi\sum_{m=1,2}\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}=\int_{\bf R}\bigl(|K(i\omega)|^{2}-2\,{\rm Re}\,[\overline{K(i\omega)}\widehat{K}_{j}(i\omega)]+|\widehat{K}_{j}(i\omega)|^{2}\bigr)|X(i\omega)|^{2}d\omega.$$

$${\cal X}_{\psi}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\{x\in L_{2}({\bf R}):\ X(i\omega)=\Psi(i\omega)Y(i\omega),\ X={\cal F}x,\ Y={\cal F}y,\ y\in{\cal X}(q,0)\}.$$

$$1-\exp(-z)=-\sum_{k=1}^{+\infty}\frac{(-1)^{k}z^{k}}{k!}.$$

$$V_{j}(z)=-\sum_{k=1}^{+\infty}\frac{(-1)^{k}\gamma^{k}(z-a_{j})^{k}}{k!\,(z+\gamma^{-r})^{k}}=-(z-a_{j})\sum_{k=1}^{+\infty}\frac{(-1)^{k}\gamma^{k}(z-a_{j})^{k-1}}{k!\,(z+\gamma^{-r})^{k}}.$$

$$\frac{i\omega-a_{j}}{i\omega+\gamma^{-r}}=\frac{(-a_{j}+i\omega)(\gamma^{-r}-i\omega)}{\omega^{2}+\gamma^{-2r}}=\frac{\omega^{2}-a_{j}\gamma^{-r}}{\omega^{2}+\gamma^{-2r}}+i\,\frac{a_{j}\omega+\gamma^{-r}\omega}{\omega^{2}+\gamma^{-2r}}.$$

$${\rm Re}\,\frac{i\omega-a_{j}}{i\omega+\gamma^{-r}}=\frac{\omega^{2}-a_{j}\gamma^{-r}}{\omega^{2}+\gamma^{-2r}}\geq\frac{\omega^{2}-\Omega(\gamma)^{2}}{\omega^{2}+\gamma^{-2r}}>0,\quad\omega\in D_{+}(\gamma).$$

$$|V_{j}(i\omega)-1|=\Bigl|\exp\Bigl(-\gamma\frac{i\omega-a_{j}}{i\omega+\gamma^{-r}}\Bigr)\Bigr|=\exp\Bigl(-\gamma\,{\rm Re}\,\frac{i\omega-a_{j}}{i\omega+\gamma^{-r}}\Bigr).$$

Full text

(Submitted: May 8, 2017. Revised: January 9, 2020)

Keywords: Fourier transform, spectrum degeneracy, pathwise setting, linear predictors.

I Introduction

The paper studies properties of continuous time processes with spectrum degeneracy in a pathwise deterministic setting, i.e., without probabilistic assumptions on the ensemble, where an underlying process is deemed to be unique and such that one cannot rely on statistics collected from observations of other similar paths. A decision (a prediction, an estimate, etc.) has to be based on the intrinsic properties of this single observed path.

There are some opportunities for prediction and interpolation of continuous time processes in the pathwise setting when their spectrum has a certain degeneracy.

• In the stochastic setting for continuous time stationary Gaussian processes, there exist optimal predictors represented by causal linear integral operators; see the review of these results in [10, 27]. The predictors are optimal in the sense of minimization of the mean square error; their selection is defined by the spectral density $\phi$ of the underlying process. By the Kolmogorov-Krein Theorem, this error can be zero if and only if

$$\int_{-\infty}^{\infty}\frac{\log\phi(\omega)}{1+\omega^{2}}\,d\omega=-\infty;$$

see, e.g., [11], p. 57.

• The classical Nyquist-Shannon-Kotelnikov interpolation theorem states that a band-limited function can be uniquely recovered without error from an infinite equidistant sampling sequence. The sampling rate must be at least twice the maximum frequency present in the signal (the critical Nyquist rate).

• Functions are uniquely defined by the samples taken with the rate defined by the measure of the spectrum support only; see [14], p.39.

• Functions with certain periodicity of the location of gaps in the spectrum and with some restrictions on the measure or on accumulation at infinity of the spectrum gap are uniquely defined by sparse samples taken below the Nyquist rate at sampling points deviating slightly from arithmetic progressions [12, 13, 18, 19, 23, 24].

• Band-limited functions are analytic and are uniquely defined by their values on an arbitrarily small time interval. In particular, band-limited functions are uniquely defined by their past values, i.e. they are predictable.

• Functions with exponential decrease of energy on higher frequencies are uniquely defined by their past values. Moreover, there exist linear predictors that do not require knowledge of the spectrum, with the prediction horizon defined by the rate of the energy decrease [4].

• Functions with the Fourier transform vanishing on an arbitrarily small interval $(-\Omega,\Omega)$ for some $\Omega>0$ are uniquely defined by their past values. There are linear predictors, defined by $\Omega$ only, that allow prediction of anticausal convolutions involving the future values [3].

The present paper shows that a degeneracy of the Fourier transform at a single point only still ensures some linear extrapolation opportunities for continuous time processes in the pathwise deterministic setting. It shows that processes featuring this degeneracy are linearly predictable in some weak sense, meaning that anti-causal convolution integrals over future time can be approximated by causal convolution integrals over past time (Theorem 1). This result sheds some new light on the impact of spectrum degeneracy on predictability and extrapolation.

To prove the predictability of the anti-causal convolutions, we obtained a family of new linear predictors represented by causal convolutions (Theorem 2). The predictors are given explicitly in the frequency domain.

The predictors suggested in the paper are not error free; however, the prediction error can be made arbitrarily small, and there is some robustness with respect to noise contamination. The predictors suggested here are constructed using the approach developed in [3, 4, 5, 6] but are quite different.

We emphasize that this result is not a straightforward rewording of linear extrapolation results known for stochastic Gaussian processes with spectral densities. One reason for this is that the properties of these stationary processes are quite special and cannot be mechanically transferred to deterministic functions and their spectra. For example, it appears that the criterion of recoverability of a single value for a discrete time stationary Gaussian process is different from the criterion in the pathwise deterministic setting ([8], p.86). Furthermore, the optimal extrapolating operators known for Gaussian stationary processes have to be constructed for a particular shape of the spectral density (see e.g. [10, 26, 17, 25, 16, 27]). On the other hand, unlike the linear predictors known for Gaussian processes, the predictors introduced below are "universal", meaning that they do not require knowledge of the shape of the spectrum (i.e. the Fourier transform) of the underlying processes; the same predictor can be used for a large class of different processes.

The paper is organized in the following manner. In Section II, we formulate the definitions and background facts related to the linear weak predictability. In Section III, we formulate the main theorems on predictability and predictors (Theorem 1 and Theorem 2). In Section IV, we discuss the robustness of the predictors. Section V contains the proofs. Finally, in Section VI, we discuss our results.

II Definitions and background

Let ${\mathbb{I}}$ denote the indicator function, ${\bf R}^{+}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}[0,+\infty)$ , ${\bf C}^{+}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\{z\in{\bf C}:\ {\rm Re\,}z>0\}$ , $i=\sqrt{-1}$ .

For complex valued functions $x\in L_{1}({\bf R})$ or $x\in L_{2}({\bf R})$ , we denote by ${\cal F}x$ the function defined on $i{\bf R}$ as the Fourier transform of $x$ :

$$({\cal F}x)(i\omega)=\int_{-\infty}^{\infty}e^{-i\omega t}x(t)\,dt,\quad\omega\in{\bf R}.$$

If $x\in L_{2}({\bf R})$ , then $X$ is defined as an element of $L_{2}({\bf R})$ (meaning $L_{2}(i{\bf R})$ ).

For $x\in L_{2}({\bf R})$ such that $x(t)=0$ for $t<0$ , we denote by ${\cal L}x$ the Laplace transform

$$({\cal L}x)(z)=\int_{0}^{\infty}e^{-zt}x(t)\,dt,\quad z\in{\bf C}^{+}.$$

Let $H^{p}$ be the Hardy space of functions $Q(z)$ that are holomorphic on ${\bf C}^{+}$ and have finite norm $\|Q\|_{H^{p}}=\sup_{s>0}\|Q(s+i\omega)\|_{L_{p}({\bf R})}$ , $p\in[1,+\infty]$ ; see, e.g., [9], Chapter 11.

By the Paley-Wiener Theorem, $X\in H^{2}$ if and only if $X={\cal L}x$ for some $x\in L_{2}({\bf R})$ such that $x(t)=0$ for $t<0$ ; see e.g. Theorem 19.2 in [20], p.372.

The definitions below in this section are similar to the definitions introduced in [3].

Definition 1

Let ${\cal K}$ be the class of functions $\kappa:{\bf R}\to{\bf R}$ such that $\kappa(t)=0$ for $t>0$ and such that, for any $\kappa\in{\cal K}$ , there exists an integer $m>0$ , a set $\{a_{k}\}_{k=1}^{m}\subset(0,+\infty)$ , and a polynomial $d$ such that ${\rm deg\,}d<m$ and $K={\cal F}\kappa$ is represented as

$$K(i\omega)=\frac{d(i\omega)}{\delta(i\omega)},\quad\omega\in{\bf R},\tag{3}$$

where $\delta(i\omega)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\prod_{j=1}^{m}(i\omega-a_{j})$ .

In particular, the class ${\cal K}$ includes all linear combinations of functions $e^{\lambda t}{\mathbb{I}}_{\{t\leq 0\}}$ , where $\lambda\in(0,+\infty)$ .
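As a quick numerical sanity check of Definition 1, the sketch below takes $\kappa(t)=e^{\lambda t}{\mathbb{I}}_{\{t\leq 0\}}$ and compares a quadrature approximation of ${\cal F}\kappa$ with the rational form $d(i\omega)/\delta(i\omega)$ for $m=1$ , $a_{1}=\lambda$ , $d\equiv-1$ (direct integration gives $K(i\omega)=1/(\lambda-i\omega)$ ). The function names, the truncation length `T`, and the grid size are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def transfer_numeric(omega, lam, T=60.0, n=600001):
    """Approximate K(i*omega) = int_{-T}^{0} exp(-i*omega*t) * exp(lam*t) dt.

    The infinite lower limit is truncated at -T (an assumption of this sketch).
    """
    t = np.linspace(-T, 0.0, n)
    f = np.exp((lam - 1j * omega) * t)
    # composite trapezoid rule, written out by hand
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))

def transfer_rational(omega, lam):
    """Closed form d(i*omega)/delta(i*omega) with d = -1 and delta(z) = z - lam."""
    return -1.0 / (1j * omega - lam)

lam, omega = 0.7, 2.0
print(transfer_numeric(omega, lam), transfer_rational(omega, lam))
```

The two printed values should agree up to the quadrature error, which illustrates why a single exponential kernel corresponds to one pole $a_{1}=\lambda$ in the right half-plane.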

Definition 2

Let $\widehat{\cal K}$ be the class of functions $\widehat{\kappa}:{\bf R}\to{\bf R}$ such that $\widehat{\kappa}(t)=0$ for $t<0$ and $K={\cal L}\widehat{\kappa}\in H^{2}\cap H^{\infty}$ .

We will use the notation “ $*$ ” for the convolution in $L_{2}({\bf R})$ .

We are going to study linear predictors for anti-causal convolutions $y=\kappa*x$ with $\kappa\in{\cal K}$ . More precisely, we will study the possibility of their approximation by causal convolutions $\widehat{y}=\widehat{\kappa}*x$ with $\widehat{\kappa}\in\widehat{\cal K}$ . By the choice of ${\cal K}$ and $\widehat{\cal K}$ , it follows that

$$y(t)=\int_{t}^{+\infty}\kappa(t-s)x(s)\,ds,\qquad\widehat{y}(t)=\int_{-\infty}^{t}\widehat{\kappa}(t-s)x(s)\,ds.$$

The corresponding predictors are linear; they are represented by causal time-invariant convolutions and allow frequency representations via transfer functions, which is preferable in electronic engineering, systems, and control. This makes them convenient for applications, in particular because linear time-invariant systems can be realised via fixed electronic hardware schemes.
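To make the anti-causal convolution concrete, here is a minimal sketch that evaluates $y(t)=\int_{t}^{+\infty}\kappa(t-s)x(s)ds$ by quadrature for the kernel $\kappa(t)=e^{\lambda t}{\mathbb{I}}_{\{t\leq 0\}}$ , so that $y(t)=\int_{t}^{+\infty}e^{-\lambda(s-t)}x(s)ds$ is a discounted average of future values. The test signal $x(s)=e^{-as}{\mathbb{I}}_{\{s\geq 0\}}$ , the truncation `horizon`, and all names are assumptions chosen for illustration; for this signal and $t\geq 0$ the integral equals $e^{-at}/(\lambda+a)$ .

```python
import numpy as np

def trapezoid(f, s):
    """Composite trapezoid rule on a (possibly non-uniform) grid s."""
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s))

def anticausal_average(x, t, lam, horizon=80.0, n=400001):
    """y(t) = int_t^{t+horizon} exp(-lam*(s-t)) * x(s) ds.

    The infinite upper limit is truncated at t + horizon (sketch assumption).
    """
    s = np.linspace(t, t + horizon, n)
    return trapezoid(np.exp(-lam * (s - t)) * x(s), s)

a, lam = 0.5, 1.5
x = lambda s: np.exp(-a * s) * (s >= 0)   # illustrative test signal

y0 = anticausal_average(x, 0.0, lam)      # should be close to 1 / (lam + a)
print(y0, 1.0 / (lam + a))
```

Note that this evaluation uses the future values of $x$ ; the point of the paper is that, for suitable classes of $x$ , the same quantity can be approximated by a causal convolution over the past.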

For $p\in[1,+\infty]$ , we define linear normed spaces ${\cal Y}_{p}$ of complex valued functions such that ${\cal Y}_{\infty}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}C({\bf R})$ and ${\cal Y}_{p}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}L_{p}({\bf R})$ for $p\in[1,+\infty)$ .

Definition 3

Let $p\in[1,+\infty]$ be given. Let $\bar{\cal X}\subset{\cal Y}_{p}$ be a given set of functions $x:{\bf R}\to{\bf R}$ .

(i) We say that the set $\bar{\cal X}$ is predictable at time $s\in{\bf R}$ if, for any $x_{1},x_{2}\in\bar{\cal X}$ such that $x_{1}(t)=x_{2}(t)$ for a.e. $t<s$ , we have $x_{1}(t)=x_{2}(t)$ for a.e. $t\in{\bf R}$ .

(ii) We say that the set $\bar{\cal X}$ is linearly ${\cal Y}_{p}$ -predictable in the weak sense if, for any $\kappa\in{\cal K}$ , there exists a sequence $\{\widehat{\kappa}_{j}\}_{j=1}^{+\infty}=\{\widehat{\kappa}_{j}(\cdot,\bar{\cal X},\kappa)\}_{j=1}^{+\infty}\subset\widehat{\cal K}$ such that

$$\|y-\widehat{y}_{j}\|_{{\cal Y}_{p}}\to 0\quad\hbox{as}\quad j\to+\infty\quad\forall x\in\bar{\cal X},\tag{4}$$

where $y=\kappa*x$ and $\widehat{y}_{j}=\widehat{\kappa}_{j}*x$ .

(iii) Let $\bar{\cal X}$ be a set of processes which is also a linear normed space provided with a norm $\|\cdot\|_{\bar{\cal X}}$ . We say that the set $\bar{\cal X}$ is linearly ${\cal Y}_{p}$ -predictable in the weak sense uniformly with respect to the norm $\|\cdot\|_{\bar{\cal X}}$ if, for any $\kappa\in{\cal K}$ and $\varepsilon>0$ , there exists $\widehat{\kappa}=\widehat{\kappa}(\cdot,\bar{\cal X},\kappa,\|\cdot\|,\varepsilon)\in\widehat{\cal K}$ such that

$$\|y-\widehat{y}\|_{{\cal Y}_{p}}\leq\varepsilon\|x\|_{\bar{\cal X}}\quad\forall x\in\bar{\cal X},$$

where $y=\kappa*x$ and $\widehat{y}=\widehat{\kappa}*x$ .

We call functions $\widehat{\kappa}_{j}$ and $\widehat{\kappa}$ in Definition 3 predicting kernels.

Proposition 1

Let $\bar{\cal X}$ be such as in Definition 3(ii) with $p=2$ . Then $\bar{\cal X}$ is predictable in the sense of Definition 3(i).

The proof of Proposition 1 given below is based on the completeness of the system ${\cal K}$ in $L_{2}(-\infty,0)$ . In fact, even a smaller set of finite linear combinations of exponents $e^{\lambda_{k}t}{\mathbb{I}}_{t<0}$ , $k=1,2,...$ , with $\lambda_{k}>0$ such that $\sum_{k}\lambda_{k}(1+\lambda_{k}^{2})^{-1}=+\infty$ , is everywhere dense in $L_{2}(-\infty,0)$ ; see e.g. Crum [2], Sedletskij [21]. In theory, this may provide an approximate linear prediction method for the entire paths of processes that are predictable in the sense of Definition 3(ii). For example, assume that the path $x|_{t\leq 0}$ is observable. Then, for $t>0$ , a prediction $\widehat{x}(t)$ of $x(t)$ can be constructed as $\widehat{x}(t)\approx\sum_{k}\widehat{\xi}_{k}f_{k}(-t)$ , where $\{f_{k}\}_{k=1}^{\infty}$ is an orthonormal basis in $L_{2}(-\infty,0)$ constructed from the sequence $\{e^{\lambda_{k}t}{\mathbb{I}}_{t<0}\}_{k=1}^{\infty}$ by the Gram-Schmidt orthonormalization procedure, and where the values $\widehat{\xi}_{k}$ are the predictions of the integrals $\xi_{k}=\int_{0}^{\infty}x(s)f_{k}(-s)ds$ that can be found under the assumptions of Definition 3(ii). This would be numerically challenging since the predictors have to be constructed for each $f_{k}$ individually. In this paper, we focus on the prediction of single anti-causal convolutions.
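The Gram-Schmidt step mentioned above can be sketched numerically as follows. The inner products of the exponents are available in closed form, $\int_{-\infty}^{0}e^{\lambda_{i}t}e^{\lambda_{j}t}dt=1/(\lambda_{i}+\lambda_{j})$ , so an orthonormal family can be obtained from a Cholesky factorization of this Gram matrix instead of an explicit Gram-Schmidt loop (a swapped-in but equivalent technique for linearly independent functions). The choice $\lambda_{k}=k$ and the function name are illustrative assumptions.

```python
import numpy as np

def orthonormal_coefficients(lams):
    """Return (C, G): G is the Gram matrix of e_k(t) = exp(lam_k * t) on (-inf, 0),
    and row k of C holds the coefficients of f_k = sum_j C[k, j] * e_j, where
    {f_k} is orthonormal in L2(-inf, 0)."""
    lams = np.asarray(lams, dtype=float)
    gram = 1.0 / (lams[:, None] + lams[None, :])   # <e_i, e_j> = 1 / (lam_i + lam_j)
    L = np.linalg.cholesky(gram)                   # gram = L @ L.T, L lower triangular
    C = np.linalg.inv(L)                           # lower triangular, as in Gram-Schmidt
    return C, gram

lams = [1.0, 2.0, 3.0, 4.0, 5.0]
C, G = orthonormal_coefficients(lams)
print(np.round(C @ G @ C.T, 8))                    # Gram matrix of the f_k: identity
print("condition number of G:", np.linalg.cond(G))
```

The printed condition number gives a rough sense of why the paper calls this route numerically challenging: the Gram matrix of exponentials becomes badly conditioned as more functions are added.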

The following examples illustrate the difference between different types of predictability in Definition 3.

Example 1

(i) Any singleton set $\bar{\cal X}$ is predictable at any time in the sense of Definition 3(i).

(ii) Let $a>0$ , and let $x_{0}(t)=e^{-at}{\mathbb{I}}_{\{t\geq 0\}}$ . The singleton set $\{x_{0}\}$ is predictable at any time in the sense of Definition 3(i) but is not linearly predictable in the sense of Definition 3(ii) or Definition 3(iii).

(iii) Let $\Omega>0$ be given, and let $\bar{\cal X}_{\Omega}$ be the set of all band-limited processes $x\in L_{2}({\bf R})$ such that $X(i\omega)=0$ if $|\omega|>\Omega$ , where $X={\cal F}x$ . Then $\bar{\cal X}_{\Omega}$ is linearly predictable in the sense of Definition 3(ii).

(iv) Let $\Omega>0$ be given, and let $\widetilde{\cal X}_{\Omega}$ be the set of all high-frequency processes $x\in L_{2}({\bf R})$ such that $X(i\omega)=0$ if $|\omega|<\Omega$ , where $X={\cal F}x$ . Then $\widetilde{\cal X}_{\Omega}$ is linearly predictable in the sense of Definition 3(ii).

(v) Let $\lambda>0$ , and let $x(t)={\mathbb{I}}_{\{t\geq 0\}}e^{-\lambda t}$ . Let a domain $D_{1}\subset{\bf R}$ be given such that ${\rm mes\,}D_{1}\in(0,+\infty)$ . Let $D_{2}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}{\bf R}\setminus D_{1}$ . Let $x_{m}={\cal F}^{-1}X_{m}$ , $m=1,2$ , where

$$X_{m}(i\omega)=X(i\omega){\mathbb{I}}_{D_{m}}(\omega),\quad m=1,2,\quad X={\cal F}x.$$

Then the twin set $\{x_{1},x_{2}\}$ is predictable at any time in the sense of Definition 3(i) but is not linearly predictable in the sense of Definition 3(ii) with $p=2$ .

Example 1(v) implies that any larger class containing $\{x_{1},x_{2}\}$ is not linearly predictable in the sense of Definition 3(ii).

It can be noted that processes from $\widetilde{\cal X}_{\Omega}$ , which have an interval spectrum gap at zero, feature frequent oscillations (see [1]), and yet Example 1(iv) states that this set is linearly predictable in the sense of Definition 3(ii).

III The main result

For $q>0$ , $c>0$ , and $\omega\in{\bf R}$ , set

$$h(\omega,q,c)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\exp\frac{c}{|\omega|^{q}}.$$

Let ${\cal X}(q,c)$ be the class of all processes $x\in L_{2}({\bf R})$ such that

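$$\|x\|_{{\cal X}(q,c)}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\mathop{\rm ess\,sup}_{\omega\in{\bf R}}|X(i\omega)|h(\omega,q,c)<+\infty,\quad\hbox{where}\quad X={\cal F}x.\tag{5}$$

This class consists of processes $x\in L_{2}({\bf R})$ whose Fourier transforms vanish at $\omega=0$ at the rate defined by $h$ .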

We consider ${\cal X}(q,c)$ as a linear normed space with the corresponding norm.

Note that $h(\omega,q,c)\to+\infty$ as $\omega\to 0$ and that (5) holds for processes with spectrum degeneracy such that $X\left(i\omega\right)$ approaches zero sufficiently fast as $\omega\to 0$ . In particular, the class ${\cal X}$ includes all band-limited processes $x\in L_{2}({\bf R})$ such that there exists $\bar{\Omega}>0$ such that $X\left(i\omega\right)=0$ for $\omega\notin[-\bar{\Omega},\bar{\Omega}]$ , where $X={\cal F}x$ . However, the spectrum degeneracy for functions from ${\cal X}$ is mild compared with band-limitedness; in particular, these functions are not necessarily analytic, and their Fourier transform can be non-zero for all $\omega\neq 0$ .
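The membership condition (5) can be explored numerically. The sketch below works with logarithms, since $h(\omega,q,c)=\exp(c/|\omega|^{q})$ overflows in floating point near $\omega=0$ ; it evaluates $\max_{\omega}\bigl(\log|X(i\omega)|+c|\omega|^{-q}\bigr)$ on a grid, which approximates the logarithm of the norm in (5). The example spectrum $X(i\omega)=\exp(-2/\omega^{2})/(1+\omega^{2})$ , the grid, and the function names are assumptions made for illustration only.

```python
import numpy as np

def log_norm_Xqc(log_abs_X, q, c, omegas):
    """Grid approximation of log ||x||_{X(q,c)} = sup_w [ log|X(iw)| + c / |w|^q ]."""
    return np.max(log_abs_X(omegas) + c / np.abs(omegas) ** q)

# Hypothetical spectrum: |X(iw)| = exp(-2 / w^2) / (1 + w^2), vanishing fast at w = 0.
log_abs_X = lambda w: -2.0 / w ** 2 - np.log1p(w ** 2)

omegas = np.linspace(1e-3, 50.0, 500001)   # positive half-line; the spectrum is even

for c in (1.0, 2.0, 3.0):
    print(f"q = 2, c = {c}: log-norm on the grid = {log_norm_Xqc(log_abs_X, 2.0, c, omegas):.4g}")
```

For this spectrum the value stays bounded for $c\leq 2$ and blows up as the grid approaches $\omega=0$ for $c>2$ , which mirrors how the pair $(q,c)$ quantifies the rate of vanishing at the degeneracy point.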

Example 2

(i) For any $q\geq 1$ and $c>0$ , the class ${\cal X}(q,c)$ is predictable in the sense of Definition 3(i).

(ii) If either $q\in(0,1)$ or $c\leq 0$ , then the class ${\cal X}(q,c)$ is not predictable in the sense of Definition 3(i).

Theorems 1 and 2 below give, for the case where $q>1$ , a constructive method for predicting future averages of the processes described via convolutions; for example, the values $\int_{0}^{\infty}e^{-at}x(t)dt$ can be predicted for $a>0$ using the observations $x|_{t<0}$ and these predictors. Moreover, it is shown in Section IV below that this prediction is robust with respect to noise contamination. These results extend the result of [3] to the case of processes with a single point spectrum degeneracy.

Let ${\cal X}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\cup_{q>1,c>0}{\cal X}(q,c)$ .

Theorem 1

Let either $p=2$ or $p=+\infty$ .

(i) The class ${\cal X}$ is linearly ${\cal Y}_{p}$ -predictable in the weak sense as described in Definition 3(ii).

(ii) For any $c_{0}>0$ and $q_{0}>1$ , the class ${\cal X}(q_{0},c_{0})$ is linearly ${\cal Y}_{p}$ -predictable in the weak sense uniformly with respect to the norm $\|\cdot\|_{{\cal X}(q_{0},c_{0})}$ as described in Definition 3(iii).

The predictability stated in Theorem 1 is equivalent to the existence of certain predicting kernels. The required kernels are presented explicitly in the following theorem.

Theorem 2

Let $\kappa\in{\cal K}$ be given and represented as (3) for some given $m>0$ and $\{a_{j}\}_{j=1}^{m}\subset(0,+\infty)$ . Let $r>2/(q-1)$ be given. For $\gamma>0$ and $z\in{\bf C}^{+}\cup(i{\bf R})$ , set

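$$V_{j}(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}1-\exp\Bigl(-\gamma\frac{z-a_{j}}{z+\gamma^{-r}}\Bigr),\qquad V(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\prod_{j=1}^{m}V_{j}(z),$$

$$\widehat{K}(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}V(z)K(z),\qquad\widehat{\kappa}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}{\cal F}^{-1}\bigl(\widehat{K}|_{i{\bf R}}\bigr).$$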

Then $\widehat{K}\in H^{\infty}\cap H^{2}$ , and, for any sequence $\gamma=\gamma_{j}\to+\infty$ , the corresponding sequence of kernels $\widehat{\kappa}$ ensures prediction required in Theorem 1 (i)-(ii).

In particular, by the Paley-Wiener Theorem, it follows that $\widehat{\kappa}(t)=0$ for $t<0$ , where $\widehat{\kappa}={\cal F}^{-1}\widehat{K}(i\omega)$ . Also, we have that $\widehat{\kappa}={\cal F}^{-1}\widehat{K}$ is real valued, since $\kappa$ is real valued and $K\left(-i\omega\right)=\overline{K\left(i\omega\right)}$ , $V\left(-i\omega\right)=\overline{V\left(i\omega\right)}$ .
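The transfer functions in Theorem 2 are straightforward to evaluate numerically. The sketch below implements $V(z)=\prod_{j}\bigl[1-\exp\bigl(-\gamma\frac{z-a_{j}}{z+\gamma^{-r}}\bigr)\bigr]$ and $\widehat{K}=VK$ on the imaginary axis and reports how close $\widehat{K}(i\omega)$ is to $K(i\omega)$ away from $\omega=0$ as $\gamma$ grows. The poles `a`, the numerator $d\equiv 1$ , the value $q=2$ (hence $r=3>2/(q-1)$ ), and the frequency grid are illustrative assumptions, not values from the paper.

```python
import numpy as np

def V(z, a, gamma, r):
    """V(z) = prod_j [1 - exp(-gamma * (z - a_j) / (z + gamma**(-r)))]."""
    z = np.asarray(z, dtype=complex)
    out = np.ones_like(z)
    for a_j in a:
        out = out * (1.0 - np.exp(-gamma * (z - a_j) / (z + gamma ** (-r))))
    return out

def K(z, a):
    """K(z) = d(z) / delta(z) with the illustrative numerator d = 1."""
    z = np.asarray(z, dtype=complex)
    delta = np.ones_like(z)
    for a_j in a:
        delta = delta * (z - a_j)
    return 1.0 / delta

def K_hat(z, a, gamma, r):
    """Transfer function of the causal predicting kernel, K_hat = V * K."""
    return V(z, a, gamma, r) * K(z, a)

a = [1.0, 2.5]                          # assumed poles a_j > 0 of K
r = 3.0                                 # r > 2 / (q - 1) for the assumed q = 2
omega = np.linspace(0.5, 50.0, 2000)    # frequencies kept away from omega = 0
for gamma in (5.0, 10.0, 20.0):
    gap = np.max(np.abs(K_hat(1j * omega, a, gamma, r) - K(1j * omega, a)))
    print(f"gamma = {gamma:5.1f}   max |K_hat - K| on the grid = {gap:.3e}")
```

Two features of the construction are visible here: each factor $V_{j}$ vanishes at $z=a_{j}$ , which cancels the corresponding pole of $K$ in ${\bf C}^{+}$ , and $|V(i\omega)-1|$ decays roughly like $e^{-\gamma}$ for frequencies bounded away from zero.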

Predicting kernels $\widehat{\kappa}$ in Theorem 2 represent a modification of the construction introduced in [3] for continuous time processes with the spectrum vanishing on an interval.

Remark 1

Since predicting kernels $\widehat{\kappa}$ in Theorem 2 are real valued, it follows that the corresponding processes $\widehat{y}=\widehat{\kappa}*x$ are real valued if $x$ is real valued. This implies that Theorems 1-2 hold with a modification of Definition 3 involving real valued processes $x,y,\widehat{y}_{j}$ , and $\widehat{y}$ .

Any particular predictor described in Theorem 2 is not error-free and ensures predictability in an approximate sense only. However, the error $\varepsilon$ can be made arbitrarily small; this can be achieved by selecting a large enough $\gamma$ .

The predictors in Theorem 2 do not depend on the polynomial $d$ in (3); however, they depend on $m$ and $\{a_{k}\}_{k=1}^{m}$ in (3).

The rate of spectrum vanishing for predictable processes considered in Theorem 1 is characterized by the pairs $(q,c)\in(1,+\infty)\times(0,+\infty)$ . The following proposition shows that the choice of the critical values here is sharp.

IV On robustness of the predictors with respect to noise contamination

Let us show that the predictors introduced in Theorem 2 and designed for processes from ${\cal X}$ feature some robustness with respect to noise contamination. Suppose that these predictors are applied to a process $x\in L_{2}({\bf R})$ with a small noise contamination such that $x=x_{0}+\eta$ , where $x_{0}\in{\cal X}$ , and where $\eta\in L_{\infty}({\bf R})\cap L_{2}({\bf R})$ represents the noise. Let $X={\cal F}x$ , $X_{0}={\cal F}x_{0}$ , and $N={\cal F}\eta$ . We assume that $X_{0}(i\cdot)\in L_{1}({\bf R})$ and $\|N(i\cdot)\|_{L_{1}({\bf R})}=\nu$ ; we can write this as $X_{0}\in L_{1}(i{\bf R})$ and $\|N\|_{L_{1}(i{\bf R})}=\nu$ . The parameter $\nu\geq 0$ represents the intensity of the noise.

By the assumptions, the predictors are constructed as in Theorem 2 under the hypothesis that $\nu=0$ , i.e. that $\eta=0$ and $x=x_{0}\in{\cal X}$ . By Theorems 1-2, for an arbitrarily small $\varepsilon>0$ , there exists $\gamma$ such that, if the hypothesis that $\nu=0$ is correct, then

$$\|y-\widehat{y}\|_{L_{\infty}({\bf R})}\leq\varepsilon,$$

where $y$ and $\widehat{y}$ are such as in Definition 3. Let us estimate the prediction error for the case where $\nu>0$ . We have that

$$\|y-\widehat{y}\|_{L_{\infty}({\bf R})}\leq J_{0}+J_{\eta},$$

where

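$$J_{0}=\frac{1}{2\pi}\|(\widehat{K}(i\omega)-K(i\omega))X_{0}(i\omega)\|_{L_{1}({\bf R})},\qquad J_{\eta}=\frac{1}{2\pi}\|(\widehat{K}(i\omega)-K(i\omega))N(i\omega)\|_{L_{1}({\bf R})}.$$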

The value $J_{\eta}$ represents the additional error caused by the presence of unexpected high-frequency noise (when $\nu>0$ ). It follows that

$$\|y-\widehat{y}\|_{L_{\infty}({\bf R})}\leq\varepsilon+\nu(\varkappa+1),\tag{6}$$

where $\varkappa\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\sup_{\omega\in{\bf R}}|\widehat{K}\left(i\omega\right)|$ .

Therefore, it can be concluded that the prediction is robust with respect to noise contamination for any given $\varepsilon$ . On the other hand, if $\varepsilon\to 0$ then $\gamma\to+\infty$ and $\varkappa\to+\infty$ . In this case, error (6) is increasing for any given $\nu>0$ . Therefore, the error in the presence of noise will be large for a predictor targeting too small a size of the error for the noiseless processes from ${\cal X}$ .
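The trade-off described above can be illustrated with a small numerical sketch: it estimates $\varkappa=\sup_{\omega}|\widehat{K}(i\omega)|$ on a grid for a single-pole kernel and shows that it grows quickly with $\gamma$ , so that the bound $\varepsilon+\nu(\varkappa+1)$ deteriorates for fixed $\nu>0$ . The pole $a=1$ , the numerator $d\equiv 1$ , the exponent $r=3$ , the grid, and the function names are assumptions made for illustration.

```python
import numpy as np

def K_hat_single(omega, a, gamma, r):
    """K_hat(i*omega) = V(i*omega) / (i*omega - a) for one pole a and numerator d = 1."""
    z = 1j * np.asarray(omega, dtype=float)
    V = 1.0 - np.exp(-gamma * (z - a) / (z + gamma ** (-r)))
    return V / (z - a)

def sup_gain(a, gamma, r, omegas):
    """Grid estimate of varkappa = sup_omega |K_hat(i*omega)|."""
    return np.max(np.abs(K_hat_single(omegas, a, gamma, r)))

def noisy_error_bound(eps, nu, varkappa):
    """Right-hand side of the robustness estimate: eps + nu * (varkappa + 1)."""
    return eps + nu * (varkappa + 1.0)

omegas = np.linspace(-20.0, 20.0, 40001)   # grid that passes through omega = 0
for gamma in (1.0, 1.5, 2.0):
    kap = sup_gain(1.0, gamma, 3.0, omegas)
    print(f"gamma = {gamma}: varkappa on the grid = {kap:.3e}")
```

In this example the largest gain occurs near $\omega=0$ , where $|V(i\omega)|$ is of order $\exp(a\gamma^{1+r})$ ; this is consistent with the remark that targeting a very small noiseless error makes the predictor sensitive to noise.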

The equations describing the dependence of $(\varepsilon,\varkappa)$ on $\gamma$ could be derived similarly to estimates in [5], Section 6, where it was done for different predicting kernels and for band-limited processes. We leave it for future research.

V Proofs

Proof of Proposition 1. It suffices to show that if $s\in{\bf R}$ and $x\in\bar{\cal X}$ are such that $x|_{t\leq s}=0$ , then $x|_{t>s}=0$ .

Suppose that $s\in{\bf R}$ and $x\in\bar{\cal X}$ are such that $x|_{t\leq s}=0$ . For $\kappa\in{\cal K}$ , let $y$ , $\widehat{y}_{j}$ , and $\widehat{\kappa}_{j}$ be as described in Definition 3(ii). Since $x|_{t\leq s}=0$ and the kernels $\widehat{\kappa}_{j}$ are causal, it follows that $\widehat{y}_{j}(t)=0$ for any $j$ and any $t\leq s$ . On the other hand, (4) holds by the assumption on $\bar{\cal X}$ . Hence $y(t)=0$ for a.e. $t\leq s$ ; since $y$ is continuous as a convolution of two functions from $L_{2}({\bf R})$ , we obtain that $y(s)=0$ for any $\kappa\in{\cal K}$ . Furthermore, the class ${\cal K}$ contains functions $\kappa(t)=e^{\lambda t}{\mathbb{I}}_{t\leq 0}$ for all $\lambda>0$ ; it follows for these functions that

$$y(s)=\int_{s}^{\infty}e^{\lambda(s-t)}x(t)\,dt=\int_{s}^{\infty}e^{-\lambda(t-s)}x(t)\,dt=0\quad\forall\lambda>0.$$

The Müntz-Szász Theorem implies that there exists a set $\{\lambda_{k}\}_{k=1}^{\infty}\subset(0,+\infty)$ such that the set of finite linear combinations of exponents $e^{-\lambda_{k}t}$ is complete in $L_{2}(s,+\infty)$ , meaning that it is everywhere dense in $L_{2}(s,+\infty)$ ; see e.g. Crum [2], Sedletskij [21]. It follows that $x|_{t>s}=0$ . This completes the proof of Proposition 1. $\Box$
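The completeness argument relies on exponent sequences satisfying the divergence condition $\sum_{k}\lambda_{k}(1+\lambda_{k}^{2})^{-1}=+\infty$ stated in Section II. The sketch below compares partial sums for two illustrative choices, $\lambda_{k}=k$ (the sums keep growing, roughly like $\log N$ ) and $\lambda_{k}=k^{2}$ (the sums level off, so the condition fails). Both sequences and the function name are assumptions chosen for illustration.

```python
import numpy as np

def muntz_partial_sum(lams):
    """Partial sum of lam_k / (1 + lam_k^2) over the given exponents."""
    lams = np.asarray(lams, dtype=float)
    return float(np.sum(lams / (1.0 + lams ** 2)))

k = np.arange(1, 10 ** 6 + 1, dtype=float)
for N in (10 ** 2, 10 ** 4, 10 ** 6):
    linear = muntz_partial_sum(k[:N])          # lam_k = k
    quadratic = muntz_partial_sum(k[:N] ** 2)  # lam_k = k^2
    print(f"N = {N:>7}: lam_k = k -> {linear:.4f},  lam_k = k^2 -> {quadratic:.4f}")
```

A finite table cannot prove divergence, of course; it only makes the different growth of the two partial sums visible.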

Proof for Example 1. The proof for Examples 1(i-ii) is obvious. The proof for Examples 1(iii-iv) is given in [3].

Let us prove Example 1(v). We have that $x_{1}(t)=-x_{2}(t)$ for a.e. $t\leq 0$ , and that the process $x_{1}(t)$ is band-limited and hence continuous.

Suppose that $x_{1}(t)=x_{2}(t)$ for a.e. $t\leq s$ for some $s\in{\bf R}$ . It would imply that $x_{1}(t)=x_{2}(t)=0$ for a.e. $t\leq\min(s,0)$ . This is impossible: since $x_{1}\neq 0$ is a band-limited process, it cannot vanish on an open interval; otherwise, its unique analytic extension would be zero. Therefore, we have proved that the set $\{x_{1},x_{2}\}$ is predictable at any time in the sense of Definition 3(i).

Let us show that the set $\{x_{1},x_{2}\}$ is not linearly predictable in the sense of Definition 3(ii).

Let $\kappa\in{\cal K}$ be fixed, and let $y_{m}=\kappa*x_{m}$ , $m=1,2$ . Suppose that there exist kernels $\widehat{\kappa}_{j}\in\widehat{\cal K}$ required in Definition 3(ii) for the set $\{x_{1},x_{2}\}$ . Let $\widehat{y}_{m,j}=\widehat{\kappa}_{j}*x_{m}$ and $\widehat{Y}_{m,j}={\cal F}\widehat{y}_{m,j}$ .

We have that

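$$2\pi\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}=\int_{\bf R}{\mathbb{I}}_{D_{m}}(\omega)|K(i\omega)-\widehat{K}_{j}(i\omega)|^{2}|X_{m}(i\omega)|^{2}d\omega\tag{7}$$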

for $m=1,2$ . Hence

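$$\begin{aligned}2\pi\sum_{m=1,2}\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}&=\sum_{m=1,2}\int_{\bf R}|K(i\omega)-\widehat{K}_{j}(i\omega)|^{2}{\mathbb{I}}_{D_{m}}(\omega)|X_{m}(i\omega)|^{2}d\omega\\&=\int_{\bf R}|K(i\omega)-\widehat{K}_{j}(i\omega)|^{2}|X(i\omega)|^{2}d\omega\\&=\int_{\bf R}[K(i\omega)-\widehat{K}_{j}(i\omega)]X(i\omega)\overline{[K(i\omega)-\widehat{K}_{j}(i\omega)]X(i\omega)}\,d\omega.\end{aligned}\tag{8}$$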

Let ${\bf C}^{-}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\{z\in{\bf C}:\ {\rm Re\,}z<0\}$ , and let $H^{2}_{-}$ be the set of functions $F(z)$ defined on ${\bf C}^{-}$ such that $F(-\bar{z})\in H^{2}$ ; the inverse Fourier transforms $f\in L_{2}({\bf R})$ of these functions are such that $f(t)=0$ for $t>0$ .

By (7)-(8), it follows that

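$$\begin{aligned}2\pi\sum_{m=1,2}\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}^{2}&=\bigl((K(i\omega)-\widehat{K}_{j}(i\omega))X(i\omega),(K(i\omega)-\widehat{K}_{j}(i\omega))X(i\omega)\bigr)_{L_{2}({\bf R})}\\&=\|K(i\omega)X(i\omega)\|_{L_{2}({\bf R})}^{2}+\|\widehat{K}_{j}(i\omega)X(i\omega)\|_{L_{2}({\bf R})}^{2}-2R,\end{aligned}$$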

where

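$$R={\rm Re}\int_{\bf R}\overline{K(i\omega)}\,\overline{X(i\omega)}X(i\omega)\widehat{K}_{j}(i\omega)\,d\omega={\rm Re}\int_{\bf R}\overline{Y(i\omega)}\widehat{K}_{j}(i\omega)X(i\omega)\,d\omega,$$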

where $Y(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}K(z)X(z)$ .

Assume that $K\in{\cal K}$ be such that $K(z)=d(z)/\delta(z)$ where $\delta(z)$ is a polynomial of order $m>1$ with the roots containing in the set $\{z\in{\bf C}:\ {\rm Re\,}z>0,\ z\neq\lambda\}$ , and where $d(z)=(z+\lambda)d_{0}(z)$ for a non-zero polynomial $d_{0}(z)$ such that $\deg d_{0}<m-1$ . By the choice of $x$ , we have that $X(i\omega)=(\lambda+i\omega)^{-1}$ ; this implies that $Y=KX\in H^{2}_{-}$ . By the orthogonality in $L_{2}(i{\bf R})$ of the traces of functions from Hardy spaces $H^{2}$ and $H_{-}^{2}$ respectively, we obtain that

[TABLE]
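The claim $Y=KX\in H^{2}_{-}$ used above can be checked directly from the stated form of $K$ and $X$ ; the following is a sketch of that check rather than a quotation from the paper:

$$
Y(z)=K(z)X(z)=\frac{(z+\lambda)\,d_{0}(z)}{\delta(z)}\cdot\frac{1}{z+\lambda}=\frac{d_{0}(z)}{\delta(z)} .
$$

All roots of $\delta$ lie in $\{{\rm Re\,}z>0\}$ , so $Y$ is analytic on the closed left half-plane. Since $\deg d_{0}\leq m-2$ , we have $|Y(z)|=O(|z|^{-2})$ as $|z|\to\infty$ , so that $\sup_{s<0}\int_{-\infty}^{\infty}|Y(s+i\omega)|^{2}d\omega<+\infty$ .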

By (7)-(8), it follows that

[TABLE]

It follows from (10) that any choice of $\widehat{\kappa}_{j}$ cannot ensure that $\|y_{m}-\widehat{y}_{m,j}\|_{L_{2}({\bf R})}\to 0$ simultaneously for $m=1$ and $m=2$ , which is inconsistent with the supposition that conditions in Definition 3 are satisfied for the set $\{x_{1},x_{2}\}$ . This completes the proof of Example 1. $\Box$

It can be noted that both singletons $\{x_{1}\}$ and $\{x_{2}\}$ defined in Example 1(v) are linearly predictable in the sense of Definition 3(ii) with $p=2$ , and yet the twin set $\{x_{1},x_{2}\}$ is not linearly predictable in this sense.

Proof of Example 2. It is known that if $x\in L_{2}({\bf R})$ , $x\neq 0$ , and $x|_{t<0}=0$ a.e., then $X={\cal L}x\in H^{2}$ and $\int_{-\infty}^{\infty}\log|X(i\omega)|(1+\omega^{2})^{-1}d\omega>-\infty$ ; see, e.g., Theorems 11.6 and 11.7 from [9]. This implies that, for any $q\geq 1$ and $c>0$ , ${\cal X}(q,c)\cap H^{2}=\{0\}$ . Hence it cannot happen simultaneously that $x=x_{1}-x_{2}\neq 0$ , $x_{1},x_{2}\in{\cal X}(q,c)$ , and $x\in H^{2}$ (i.e. $x(t)=0$ for $t<0$ ). This implies that the class ${\cal X}(q,c)$ is predictable in the sense of Definition 3(i).
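The divergence behind ${\cal X}(q,c)\cap H^{2}=\{0\}$ can be sketched as follows. This assumes that the weight has the form $h(\omega,q,c)=\exp(c|\omega|^{-q})$ for $c>0$ (an assumption here, consistent with the choice of $\widetilde{\Psi}$ in the proof of Example 2(ii)), and that $|X(i\omega)|h(\omega,q,c)\leq C$ a.e. for some constant $C>0$ introduced for illustration. Then

$$
\int_{-\infty}^{\infty}\frac{\log|X(i\omega)|}{1+\omega^{2}}\,d\omega\leq\pi\log C-c\int_{-\infty}^{\infty}\frac{|\omega|^{-q}}{1+\omega^{2}}\,d\omega=-\infty ,\qquad q\geq 1,
$$

since $\int_{0}^{1}\omega^{-q}d\omega=+\infty$ for $q\geq 1$ . This is incompatible with the finiteness of the logarithmic integral for a non-zero $X\in H^{2}$ .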

Let us prove Example 2(ii). For any $c\leq 0$ , by the definitions, ${\cal X}(q,c)$ is the class of $x\in L_{2}({\bf R})$ such that $X={\cal F}x\in L_{\infty}(i{\bf R})$ ; obviously, this class is too wide and cannot be predictable in the sense of Definition 3(i-iii). Therefore, it suffices to consider $q\in(0,1)$ and $c>0$ only.

Assume that $q\in(0,1)$ and $c>0$ are given. Consider a filter with the transfer function $\Psi(z)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}(1+z)^{-2}\widetilde{\Psi}(z)$ , where $\widetilde{\Psi}(z)\in H^{\infty}$ is such that $|\widetilde{\Psi}\left(i\omega\right)|=\exp(-c|\omega|^{-q})$ , $\omega\in{\bf R}$ . Since $q<1$ , we have that $(1+\omega^{2})^{-1}\log|\widetilde{\Psi}\left(i\omega\right)|\in L_{1}({\bf R})$ . Hence such $\widetilde{\Psi}$ exists; see, e.g., Theorem 11.6 in [9], p. 193. By the choice of $\Psi$ , this filter is causal. Let

[TABLE]
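The integrability condition used above for the existence of $\widetilde{\Psi}$ can be verified in closed form. The following worked computation is added for illustration and uses the standard integral $\int_{0}^{\infty}u^{s-1}(1+u^{2})^{-1}du=\pi/(2\sin(\pi s/2))$ for $s\in(0,2)$ , with $s=1-q$ :

$$
\int_{-\infty}^{\infty}\frac{\bigl|\log|\widetilde{\Psi}(i\omega)|\bigr|}{1+\omega^{2}}\,d\omega=2c\int_{0}^{\infty}\frac{\omega^{-q}}{1+\omega^{2}}\,d\omega=\frac{c\pi}{\cos(\pi q/2)}<+\infty ,\qquad q\in(0,1).
$$

The right-hand side blows up as $q\to 1-0$ , which matches the threshold $q=1$ separating the two parts of Example 2.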

Suppose that the class ${\cal X}(q,c)$ is linearly predictable in the sense of Definition 3(ii). By the definitions, ${\cal X}_{\psi}\subseteq{\cal X}(q,c)$ ; hence the class ${\cal X}_{\psi}$ should also be linearly predictable in the sense of Definition 3(ii). On the other hand, ${\cal X}_{\psi}$ consists of processes from ${\cal X}(q,0)$ transformed by a causal filter. As was mentioned above, the class ${\cal X}(q,0)$ cannot be linearly predictable. Therefore, the class ${\cal X}_{\psi}$ is also not linearly predictable in the sense of Definition 3. Hence the supposition is incorrect, and the class ${\cal X}(q,c)$ cannot be linearly predictable in this sense for $q\in(0,1)$ . This completes the proof of Example 2. $\Box$

To proceed further, we need to establish some properties of the function $V$ .

Let $\kappa\in{\cal K}$ and the corresponding set $\{a_{k}\}_{k=1}^{m}\subset(0,+\infty)$ be given. Let $\bar{a}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\max_{j=1,...,m}a_{j}$ , $\Omega(\gamma)=\sqrt{\bar{a}\gamma^{-r}}$ , let $D(\gamma)=[-\Omega(\gamma),\Omega(\gamma)]$ , and let $D_{+}(\gamma)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}{\bf R}\backslash D(\gamma)$ .

Lemma 1

(i)

$V\in H^{\infty}$ and $\widehat{K}\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}KV\in H^{\infty}\cap H^{2}$ for any $\gamma>0$ .

(ii)

${\rm Re\,}\left(\frac{i\omega-a_{j}}{i\omega+\alpha}\right)>0$ and $|V_{j}\left(i\omega\right)-1|<1$ for any $\gamma>0$ , $\omega\in D_{+}(\gamma)$ , and $j\in\{1,...,m\}$ .

(iii)

$V(i\omega)\to 1$ as $\gamma\to+\infty$ for all $\omega\in{\bf R}\setminus\{0\}$ .

(iv)

For any $q>1$ and $c>0$ , there exists $\gamma_{0}>0$ such that $|V\left(i\omega\right)|h(\omega,q,c)^{-1}\leq 1$ for any $\gamma\geq\gamma_{0}$ and $\omega\in D(\gamma)$ .

Proof of Lemma 1. Clearly,

[TABLE]

Hence

[TABLE]

Hence $V\in H^{\infty}$ . It also follows that $\delta(z)^{-1}V(z)\in H^{2}\cap H^{\infty}$ , since each pole of $\delta(z)^{-1}$ at $z=a_{j}$ is compensated by multiplication by $V_{j}(z)$ . Then statement (i) follows from the Paley-Wiener theorem.

Further, we have for $\omega\in{\bf R}$ and $j=1,...,m$ that

[TABLE]

Hence

[TABLE]

By the definitions, it follows that

[TABLE]

Hence

[TABLE]

This implies statement (ii).

Further, $\gamma^{-r}\to 0$ and $\Omega(\gamma)\to 0$ as $\gamma\to+\infty$ . Hence, by (11), there exists $M>0$ such that, for any $\nu>0$ , there exists $\gamma_{\nu,M}>0$ such that

[TABLE]

This and (12) imply statement (iii).

Let us prove statement (iv). Let $\bar{\gamma}>0$ be selected, and let $\Gamma\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}\sup_{\gamma\geq\bar{\gamma},\ j=1,...,m}\sqrt{\Omega(\gamma)^{2}+a_{j}^{2}}$ . For $j\in\{1,...,m\}$ and $\omega\in D(\gamma)$ , we have that

[TABLE]

and

[TABLE]

By the assumptions on $r$ , we have that $1-r(q-1)/2<0$ . Hence, for any $q>1$ and $c>0$ , there exists $\gamma_{0}>0$ such that for any $\gamma\geq\gamma_{0}$

[TABLE]

By the choice of $h$ , it follows that

[TABLE]

Hence

[TABLE]

This completes the proof of statement (iv) and Lemma 1. $\Box$

Proof of Theorem 1. Theorem 1 follows immediately from Theorem 2, whose proof is given below. $\Box$

Proof of Theorem 2. Let $\kappa\in{\cal K}$ be given, $K={\cal F}\kappa$ . Let $\gamma=\gamma_{j}\to+\infty$ , and let $(V,\widehat{K})$ be the corresponding functions.

Let $\kappa={\cal F}^{-1}K$ and $\widehat{\kappa}={\cal F}^{-1}\widehat{K}$ . For $x\in{\cal X}$ , let $X\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}{\cal F}x$ and

[TABLE]

Let $Y\left(i\omega\right)\stackrel{{\scriptstyle{\scriptscriptstyle\Delta}}}{{=}}({\cal F}y)\left(i\omega\right)=K\left(i\omega\right)X\left(i\omega\right)$ . By the definitions, it follows that $\widehat{Y}\left(i\omega\right)=\widehat{K}\left(i\omega\right)X\left(i\omega\right)$ .

Further, let $\rho=2$ if $p=2$ and $\rho=1$ if $p=+\infty$ .

We have that $\|\widehat{Y}\left(i\omega\right)-Y\left(i\omega\right)\|_{L_{\rho}({\bf R})}^{\rho}=I_{1}+I_{2},$ where

[TABLE]

By the assumptions, there exists $c>0$ such that $\|X\left(i\omega\right)h(\omega,q,c)\|_{L_{\infty}({\bf R})}<+\infty$ . Hence

[TABLE]

where

[TABLE]

Clearly, $c_{h}(\gamma)\to 0$ as $\gamma\to+\infty$ . Further, the measure of the set $D(\gamma)$ is $2\sqrt{\bar{a}\alpha}$ . By Lemma 1 (iv),

[TABLE]

as $\gamma\to+\infty$ for any $\rho\geq 1$ and any $j\in\{1,...,m\}$ . It follows that

[TABLE]

Therefore, $I_{1}\to 0$ as $\gamma\to+\infty$ .

Let us estimate $I_{2}$ . We have that

[TABLE]

where

[TABLE]

Here ${\mathbb{I}}$ denotes the indicator function.

By Lemma 1(iii), ${\mathbb{I}}_{D_{+}(\gamma)}(\omega)|K\left(i\omega\right)(1-V\left(i\omega\right))|^{\rho}\to 0$ a.e. as $\gamma\to+\infty$ . By Lemma 1(ii), $|V_{j}\left(i\omega\right)-1|\leq 1$ for all $\omega\in D_{+}(\gamma)$ . Hence

[TABLE]

By the Lebesgue dominated convergence theorem, it follows that $\psi(\gamma)\to 0$ as $\gamma\to+\infty$ . It follows that $I_{1}+I_{2}\to 0$ for any $q>1$ , $c>0$ , and $x\in{\cal X}(q,c)$ . By the definition of $\rho$ , we have that $1/\rho+1/p=1$ . Hence $\|\widehat{y}-y\|_{L_{p}({\bf R})}\to 0$ as $\gamma\to+\infty$ for any $x\in{\cal X}$ . It follows that the predicting kernels $\widehat{\kappa}={\cal F}^{-1}\widehat{K}$ are such as required in statement (i) of Theorem 1. This completes the proof of statement (i).
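The passage from the frequency-domain norm $L_{\rho}$ to the time-domain norm $L_{p}$ relies on two standard facts, written here as a sketch for the Fourier transform convention $({\cal F}x)(i\omega)=\int_{-\infty}^{\infty}e^{-i\omega t}x(t)dt$ . For $p=\rho=2$ , the Plancherel identity gives

$$
\|\widehat{y}-y\|_{L_{2}({\bf R})}^{2}=\frac{1}{2\pi}\|\widehat{Y}\left(i\omega\right)-Y\left(i\omega\right)\|_{L_{2}({\bf R})}^{2},
$$

and for $p=+\infty$ , $\rho=1$ , the inversion formula gives

$$
\|\widehat{y}-y\|_{L_{\infty}({\bf R})}\leq\frac{1}{2\pi}\|\widehat{Y}\left(i\omega\right)-Y\left(i\omega\right)\|_{L_{1}({\bf R})}.
$$

In both cases the right-hand side is controlled by $I_{1}+I_{2}$ .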

Let us show that these kernels are such as required in statement (ii) of Theorem 1. Let

[TABLE]

We have that

[TABLE]

for any $x\in{\cal X}(q_{0},c_{0})$ . It follows from the proofs above that $\xi(\gamma)\to 0$ as $\gamma\to+\infty$ . Hence (4) holds for the corresponding $y={\cal F}^{-1}Y$ and $\widehat{y}_{j}={\cal F}^{-1}\widehat{Y}_{j}$ . In addition, it follows that the predicting kernels $\widehat{\kappa}={\cal F}^{-1}\widehat{K}$ are such as required in statement (ii) of Theorem 1.

Since $X\left(i\omega\right)\in L_{2}({\bf R})\cap L_{\infty}({\bf R})$ , $K\left(i\omega\right)\in L_{2}({\bf R})\cap L_{\infty}({\bf R})$ and $\widehat{K}\in H^{\infty}\cap H^{2}$ , it follows that $y\in C({\bf R})$ and $\widehat{y}\in C({\bf R})$ . For this $y$ and $\widehat{y}$ , the norms in $L_{\infty}({\bf R})$ are the same as the norms in ${\cal Y}_{\infty}=C({\bf R})$ . This completes the proof of Theorem 2. $\Box$

VI Discussion and future research

The present paper is focused on the impact of spectrum degeneracy at a single point for continuous time processes in a pathwise deterministic setting. The paper suggests frequency criteria for linear predictability of anti-causal convolutions, together with linear predictors described explicitly in the frequency domain. The predictability is feasible for classes of processes with a single point spectrum degeneracy.

(i)

The family of predictors suggested in Theorem 2 does not depend on the shape of the spectrum of the underlying process. This could be useful for applications.

(ii)

The predictors from Theorem 2 are not error-free; however, the error can be made arbitrarily small by choosing a large $\gamma$ . In addition, these predictors feature some robustness with respect to noise contamination. If the predictor targets too small an error, the norm of the transfer function will be large; this could lead to a larger error caused by the presence of noise.

(iii)

There is some similarity with a result obtained in [5] for discrete time processes (sequences): they are predictable if their Z-transforms vanish at a point of the unit circle ${\mathbb{T}}=\{z\in{\bf C}:\ |z|=1\}$ . However, the result in [5] was less unexpected, since a sequence is band-limited and predictable if its Z-transform vanishes on an arbitrarily small arc of ${\mathbb{T}}$ .

(iv)

It is still unclear whether linear predictability is feasible for the class ${\cal X}(1,c)$ with some $c>0$ .

(v)

Processes with an interval spectrum gap at zero feature frequent oscillations (sign changes) [1]; it would be interesting to see if the processes from ${\cal X}$ have similar properties.
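The trade-off described in remark (ii) can be illustrated by a simple bound for $p=2$ . This is a sketch under the assumption that the predictor is applied to a contaminated input $x+\eta$ , where $\eta\in L_{2}({\bf R})$ is a noise process (the symbol $\eta$ is introduced here for illustration only), and where $\widehat{y}=\widehat{\kappa}*x$ :

$$
\|\widehat{\kappa}*(x+\eta)-y\|_{L_{2}({\bf R})}\leq\|\widehat{y}-y\|_{L_{2}({\bf R})}+\mathop{\rm ess\,sup}_{\omega\in{\bf R}}|\widehat{K}\left(i\omega\right)|\,\|\eta\|_{L_{2}({\bf R})}.
$$

The first term can be made small by choosing a large $\gamma$ , whereas the factor in front of $\|\eta\|_{L_{2}({\bf R})}$ is the norm of the transfer function mentioned in remark (ii).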

Bibliography

[1] Blank, N., Ulanovskii, A. (2011). Paley–Wiener Functions with a Generalized Spectral Gap. J Fourier Anal Appl 17, 899–915.
[2] Crum, M.M. (1956). On the theorems of Müntz and Szász. Journal of the London Mathematical Society s1-31 (4), 433–437.
[3] Dokuchaev, N. (2008). The predictability of band-limited, high-frequency, and mixed processes in the presence of ideal low-pass filters. Journal of Physics A: Mathematical and Theoretical 41 (38), 382002 (7pp).
[4] Dokuchaev, N. (2010). Predictability on finite horizon for processes with exponential decrease of energy on higher frequencies. Signal Processing 90 (2), 696–701.
[5] Dokuchaev, N. (2012). Predictors for discrete time processes with energy decay on higher frequencies. IEEE Transactions on Signal Processing 60 (11), 6027–6030.
[6] Dokuchaev, N. (2012). On predictors for band-limited and high-frequency time series. Signal Processing 92 (10), 2571–2575.
[7] Dokuchaev, N. (2016). Near-ideal causal smoothing filters for the real sequences. Signal Processing 118 (1), 285–293.
[8] Dokuchaev, N. (2017). On exact and optimal recovering of missing values for sequences. Signal Processing 135, 81–86.