Asymptotic normality of element-wise weighted total least squares estimator in a multivariate errors-in-variables model

Yaroslav Tsaregorodtsev

arXiv:1702.00842·math.ST·March 17, 2017

Open Access

TL;DR

This paper investigates the asymptotic normality of a weighted total least squares estimator in a multivariate errors-in-variables model with complex error structures, providing conditions for its distribution as data size grows.

Contribution

It establishes conditions under which the element-wise weighted total least squares estimator is asymptotically normal in a multivariable errors-in-variables model.

Findings

01

Conditions for asymptotic normality are derived.

02

The limiting Gaussian distribution has a nonsingular covariance.

03

The model accounts for row-wise correlated errors and varying error covariances.

Abstract

A multivariable measurement error model $A X \approx B$ is considered. Here $A$ and $B$ are input and output matrices of measurements and $X$ is a rectangular matrix of fixed size to be estimated. The errors in $[A, B]$ are row-wise independent, but within each row the errors may be correlated. Some of the columns are observed without errors and the error covariance matrices may differ from row to row. The total covariance structure of the errors is known up to a scalar factor. The fully weighted total least squares estimator of $X$ is studied. We give conditions for asymptotic normality of the estimator as the number of rows in $A$ increases. We show that the covariance structure of the limiting Gaussian random matrix is nonsingular.

Equations

$$A=A_{0}+\tilde{A},\quad B=B_{0}+\tilde{B},$$

$$A_{0}X_{0}=B_{0}.$$

$$a_{i}=a_{0i}+\tilde{a}_{i},\quad b_{i}=b_{0i}+\tilde{b}_{i},\quad b_{0i}=X_{0}^{\top}a_{0i},\quad i=1,\dots,m.$$

$$C=[A,B],\quad C_{0}=[A_{0},B_{0}],\quad \tilde{C}=[\tilde{A},\tilde{B}],\quad Z_{0}=\begin{bmatrix}X_{0}\\ -\mathrm{I}_{d}\end{bmatrix}.$$

$$C=C_{0}+\tilde{C},\quad C_{0}Z_{0}=0.$$

$$\operatorname{\mathbf{cov}}(\tilde{c}_{ij},\ j\in J)=\sigma^{2}\Sigma_{i},\quad i=1,2,\dots,$$

$$Z_{0J}=(z_{0,jk},\ j\in J,\ k=1,\dots,d).$$

$$\mathrm{rank}(Z_{0J})=d.$$

$$\min_{(X\in\mathbb{R}^{n\times d},\,\Delta A,\,\Delta B)}\ \sum_{i=1}^{m}\|\Sigma_{i}^{-1/2}\Delta c_{i}^{J}\|^{2}$$

$$(A-\Delta A)X=B-\Delta B,\quad \Delta c_{ij}=0,\quad i=1,\dots,m,\ j\notin J.$$

$$\Delta c_{i}^{J}:=(\Delta c_{ij},\ j\in J)\in\mathbb{R}^{|J|}.$$

$$\mathrm{rank}(Z_{J})=d$$

$$\sup_{(i\geq 1,\,j\in J)}\operatorname{\mathsf{E}}|\tilde{c}_{ij}|^{2r}<\infty.$$

$$S_{i}:=\frac{1}{\sigma^{2}}\operatorname{\mathbf{cov}}(\tilde{c}_{i}),\quad i=1,2,\dots$$

$$q(c,S;X)=c^{\top}Z(Z^{\top}SZ)^{-1}Z^{\top}c,$$

$$Q(X)=\sum_{i=1}^{m}q(c_{i},S_{i};X),\quad X\in\mathbb{R}^{n\times d},\quad \mathrm{rank}(Z_{J})=d.$$

$$s(a,b,S;X)=\tilde{s}\cdot(Z^{\top}SZ)^{-1},$$

$$\tilde{s}=\tilde{s}(a,b,S;X):=ac^{\top}Z-[S_{a},S_{ab}]Z(Z^{\top}SZ)^{-1}Z^{\top}cc^{\top}Z.$$

$$c=\begin{bmatrix}a\\ b\end{bmatrix},\quad a\in\mathbb{R}^{n\times 1};\qquad S=\begin{bmatrix}S_{a}&S_{ab}\\ S_{ba}&S_{b}\end{bmatrix},\quad S_{a}\in\mathbb{R}^{n\times n}.$$

$$\sum_{i=1}^{m}s(a_{i},b_{i},S_{i};X)=0,\quad X\in\mathbb{R}^{n\times d},\quad \mathrm{rank}(Z_{J})=d.$$

$$\operatorname{\mathsf{E}}_{X_{0}}[s_{X}^{\prime}(a_{i},b_{i},S_{i};X_{0})\cdot H]=a_{0i}a_{0i}^{\top}H(Z_{0}^{\top}S_{i}Z_{0})^{-1}.$$

$$\frac{1}{m^{1+\delta/2}}\sum_{i=1}^{m}\|a_{0i}\|^{2+\delta}\to 0,\quad\text{as } m\to\infty.$$

$$\operatorname{\mathsf{E}}\tilde{c}_{ip}\tilde{c}_{iq}\tilde{c}_{ir}=0.$$

$$W_{i}=(a_{0i}\tilde{c}_{i}^{\top},\ \tilde{c}_{i}\tilde{c}_{i}^{\top}-\sigma^{2}S_{i}).$$

$$\frac{1}{\sqrt{m}}\sum_{i=1}^{m}W_{i}\stackrel{\mathrm{d}}{\rightarrow}\Gamma=(\Gamma_{1},\Gamma_{2}),\quad\text{as } m\to\infty,$$

$$\sqrt{m}(\hat{X}-X_{0})\stackrel{\mathrm{d}}{\rightarrow}V_{A}^{-1}\Gamma(X_{0}),\quad\text{as } m\to\infty,$$

$$\Gamma(X):=\Gamma_{1}Z+P_{a}\Gamma_{2}Z-[S_{a}^{\infty},S_{ab}^{\infty}]Z(Z^{\top}S_{\infty}Z)^{-1}(Z^{\top}\Gamma_{2}Z),$$

$$S_{\infty}=\begin{bmatrix}S_{a}^{\infty}&S_{ab}^{\infty}\\ S_{ba}^{\infty}&S_{b}^{\infty}\end{bmatrix}=\lim_{i\to\infty}S_{i},\quad Z=\begin{bmatrix}X\\ -\mathrm{I}_{d}\end{bmatrix}.$$

$$\overline{ab^{\top}}=m^{-1}\sum_{i=1}^{m}a_{i}b_{i}^{\top},\quad \overline{S}=m^{-1}\sum_{i=1}^{m}S_{i}.$$

$$\hat{Z}=\begin{bmatrix}\hat{X}\\ -\mathrm{I}_{d}\end{bmatrix},\quad \hat{\sigma}^{2}=\frac{1}{d}\,\mathrm{tr}\left[(\hat{Z}^{\top}\overline{cc^{\top}}\hat{Z})(\hat{Z}^{\top}\overline{S}\hat{Z})^{-1}\right],$$

$$\hat{V}_{A}=\overline{aa^{\top}}-\hat{\sigma}^{2}\overline{S}_{a}.$$

$$\hat{\sigma}^{2}\stackrel{\mathrm{P}}{\rightarrow}\sigma^{2},\quad \hat{V}_{A}\stackrel{\mathrm{P}}{\rightarrow}V_{A}.$$

Taxonomy

Topics: Statistical and numerical algorithms · Advanced Statistical Methods and Models · Geochemistry and Geologic Mapping

Full text

Asymptotic normality of element-wise weighted total least squares estimator in a multivariate errors-in-variables model

Ya. V. Tsaregorodtsev

Department of Mathematical Analysis, Faculty of Mechanics and Mathematics, Taras Shevchenko National University of Kyiv, Building 4-e, Akademika Glushkova Avenue, Kyiv, Ukraine, 03127

[email protected]

Key words and phrases:

Asymptotic normality, element-wise weighted total least squares estimator, heteroscedastic errors, multivariate errors-in-variables model

2000 Mathematics Subject Classification:

62E20; 62F12; 62J05; 62H12; 65F20

1. Introduction

We deal with an overdetermined set of linear equations $AX\approx B,$ which is common in linear parameter estimation problems [12]. If both the data matrix $A$ and observation matrix $B$ are contaminated with errors, and all the errors are uncorrelated and have equal variances, the total least squares (TLS) technique is appropriate for solving this set [4], [12]. Under mild conditions, the TLS estimator of $X$ is consistent and asymptotically normal, as the number of rows in $A$ is increasing [3], [7].
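For orientation, the classical TLS estimator in the homoscedastic case can be sketched with an SVD of $C=[A,B]$. This is a minimal illustration under assumptions that are ours, not the paper's: the function name `tls`, the simulated sizes, and the noise level are hypothetical, and the bottom-right block of the right singular vectors is assumed nonsingular.

```python
import numpy as np

def tls(A, B):
    """Classical TLS solution of A X ~ B via the SVD of C = [A, B] (illustrative sketch)."""
    n, d = A.shape[1], B.shape[1]
    C = np.hstack([A, B])
    # Rows of Vt are right singular vectors; the last d span the estimated null space.
    _, _, Vt = np.linalg.svd(C, full_matrices=True)
    V = Vt.T
    V12 = V[:n, n:]          # top-right block, n x d
    V22 = V[n:, n:]          # bottom-right block, d x d, assumed nonsingular
    return -V12 @ np.linalg.inv(V22)

rng = np.random.default_rng(0)
m, n, d = 200, 3, 2          # hypothetical sizes
A0 = rng.normal(size=(m, n))
X0 = rng.normal(size=(n, d))
B0 = A0 @ X0

# Noise-free data: TLS recovers X0 up to rounding.
X_exact = tls(A0, B0)

# Uncorrelated errors with equal variances in both A and B.
sigma = 0.05
A = A0 + sigma * rng.normal(size=(m, n))
B = B0 + sigma * rng.normal(size=(m, d))
X_hat = tls(A, B)
```

With noise-free input the null space of $C_{0}$ is spanned by the columns of $\begin{bmatrix}X_{0}\\ -\mathrm{I}_{d}\end{bmatrix},$ which is why the block formula returns $X_{0}.$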

In this paper we consider heteroscedastic errors. The errors in $[A,B]$ are row-wise independent, but within each row the errors may be correlated. Some of the columns are observed without errors, and the error covariance matrices may differ from row to row. The total error covariance structure is assumed known up to a scalar factor. For this model, the element-wise weighted total least squares (EW-TLS) estimator is introduced and its consistency is proven in [6]. Concerning the computation of the estimator see [10], [5]. The EW-TLS estimator $\hat{X}$ is applied, e.g., in geodesy [9].

Our goal is to extend the asymptotic normality result of [7] to the EW-TLS estimator. We work under the conditions of Theorem 2, [6] about the consistency of $\hat{X}.$ We use the objective function of the estimator, see formula (22) in [6], and the rules of matrix calculus [2].

The paper is organized as follows. In Section 2, we describe the model, introduce the main assumptions, refer to the consistency result for $\hat{X}$ and present the objective function and the matrix estimating function. In Section 3, we state the asymptotic normality result and provide a nonsingular covariance structure for a limiting random matrix. In Section 4, we derive consistent estimators for nuisance parameters of the model in order to estimate consistently the asymptotic covariance structure of $\hat{X},$ and Section 5 concludes. The proofs are given in the Appendix.

Throughout the paper all vectors are column ones, $\operatorname{\mathsf{E}}$ stands for expectation and acts as an operator on the total product, $\operatorname{\mathbf{cov}}(x)$ denotes the covariance matrix of a random vector $x,$ and for a sequence of random matrices $\{X_{m},m\geq 1\}$ of the same size, notation $X_{m}=O_{p}(1)$ means that the sequence $\{||X_{m}||\}$ is stochastically bounded, and $X_{m}=o_{p}(1)$ means that $||X_{m}||\stackrel{{\scriptstyle\text{\rm P}}}{{\rightarrow}}0.$ $\mathrm{I}_{p}$ denotes the identity matrix of size $p.$

2. Observation model and consistency of the estimator

2.1. The EW-TLS problem

We deal with the model $AX\approx B.$ Here $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{m\times d}$ are matrices of observations, and the matrix $X\in\mathbb{R}^{n\times d}$ is to be estimated. Assume that

$$A=A_{0}+\tilde{A},\quad B=B_{0}+\tilde{B},\tag{2.1}$$

and that there exists $X_{0}\in\mathbb{R}^{n\times d}$ such that

$$A_{0}X_{0}=B_{0}.\tag{2.2}$$

Here $A_{0}$ is a nonrandom true input matrix, $B_{0}$ is a true output matrix, and $\tilde{A},$ $\tilde{B}$ are error matrices. $X_{0}$ is the true value of the matrix parameter.

It is useful to rewrite the model (2.1) and (2.2) as a classical errors-in-variables (EIV) model [1]. Denote $a_{i}^{\top},$ $a_{0i}^{\top},$ $\tilde{a}_{i}^{\top},$ $b_{i}^{\top},$ $b_{0i}^{\top},$ $\tilde{b}_{i}^{\top},$ $i=1,\dots,m,$ the rows of $A,$ $A_{0},$ $\tilde{A},$ $B,$ $B_{0}$ and $\tilde{B},$ respectively. Then the model above is equivalent to the EIV model

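$$a_{i}=a_{0i}+\tilde{a}_{i},\quad b_{i}=b_{0i}+\tilde{b}_{i},\quad b_{0i}=X_{0}^{\top}a_{0i},\quad i=1,\dots,m.\tag{2.3}$$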

Vectors $a_{0i}$ are nonrandom and unknown, and vectors $\tilde{a}_{i},$ $\tilde{b}_{i}$ are random errors. Based on observations $a_{i},$ $b_{i},$ $i=1,\dots,m,$ one has to estimate $X_{0}.$

Rewrite the model (2.1) and (2.2) in an implicit way. Introduce matrices

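$$C=[A,B],\quad C_{0}=[A_{0},B_{0}],\quad \tilde{C}=[\tilde{A},\tilde{B}],\quad Z_{0}=\begin{bmatrix}X_{0}\\ -\mathrm{I}_{d}\end{bmatrix}.\tag{2.4}$$

Then (2.1), (2.2) is equivalent to the next relations:

$$C=C_{0}+\tilde{C},\quad C_{0}Z_{0}=0.$$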

Let $\tilde{C}=(\tilde{c}_{ij},i=1,\dots,m,\ \ j=1,\dots,n+d).$ Following [6] we state global assumptions of the paper, conditions (i) to (iv).

(i).

Vectors $\tilde{c}_{i}:=(\tilde{c}_{i1},\dots,\tilde{c}_{i,n+d})^{\top},$ $i=1,2,\dots,$ are independent with zero mean and finite second moments.

Let $\sigma_{ij}^{2}=\operatorname{\mathsf{E}}\tilde{c}_{ij}^{2},$ $i=1,2,\dots,$ $j=1,\dots,n+d.$ We allow that some of $\sigma_{ij}^{2}$ are vanishing.

(ii).

For a fixed $J\subset\{1,2,\dots,n+d\},$ every $j\notin J$ and every $i=1,2,\dots$ satisfy $\sigma_{ij}^{2}=0.$ Moreover

$$\operatorname{\mathbf{cov}}(\tilde{c}_{ij},\ j\in J)=\sigma^{2}\Sigma_{i},\quad i=1,2,\dots,$$

with an unknown positive factor of proportionality $\sigma^{2}$ and known matrices $\Sigma_{i}.$

(iii).

There exists $\varkappa>0$ such that for every $i=1,2,\dots,$ it holds $\lambda_{\min}(\Sigma_{i})\geq\varkappa^{2}.$

For the matrix $Z_{0}=(z_{0,jk})$ given in (2.4) and the set $J$ from condition (ii), denote

$$Z_{0J}=(z_{0,jk},\ j\in J,\ k=1,\dots,d).$$

(iv).

$$\mathrm{rank}(Z_{0J})=d.$$
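To make conditions (i) to (iv) concrete, the following sketch simulates one data set in which the first column of $C$ is observed without error and the remaining columns carry row-dependent correlated errors. All numerical choices (sizes, seed, the index set `J`, the way the matrices `Sigmas` are generated, Gaussian errors) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, d = 50, 3, 2                      # hypothetical sizes
sigma2 = 0.25                           # scale factor, unknown in practice
J = np.array([1, 2, 3, 4])              # 0-based columns of C with errors; column 0 is error-free

A0 = rng.normal(size=(m, n))
X0 = rng.normal(size=(n, d))
Z0 = np.vstack([X0, -np.eye(d)])        # Z0 = [X0; -I_d]
C0 = np.hstack([A0, A0 @ X0])           # C0 = [A0, B0], so C0 @ Z0 = 0

# Known matrices Sigma_i that differ from row to row; adding 0.5*I keeps
# their smallest eigenvalues bounded away from zero, as in condition (iii).
Sigmas = []
for _ in range(m):
    G = rng.normal(size=(len(J), len(J)))
    Sigmas.append(G @ G.T / len(J) + 0.5 * np.eye(len(J)))

# Independent zero-mean rows with cov = sigma2 * Sigma_i on the columns in J.
C_tilde = np.zeros((m, n + d))
for i in range(m):
    L = np.linalg.cholesky(sigma2 * Sigmas[i])
    C_tilde[i, J] = L @ rng.normal(size=len(J))

C = C0 + C_tilde
min_eigs = np.array([np.linalg.eigvalsh(Sig).min() for Sig in Sigmas])
```

Here `Z0[J]` plays the role of $Z_{0J};$ it contains the block $-\mathrm{I}_{d},$ so condition (iv) holds for this particular choice of $J.$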

The EW-TLS problem consists in finding the value $\hat{X}$ of the unknown matrix $X$ and values of disturbances $\Delta\hat{A},$ $\Delta\hat{B}$ minimizing the weighted sum of squared corrections:

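$$\min_{(X\in\mathbb{R}^{n\times d},\,\Delta A,\,\Delta B)}\ \sum_{i=1}^{m}\|\Sigma_{i}^{-1/2}\Delta c_{i}^{J}\|^{2}\tag{2.5}$$

subject to the constraints

$$(A-\Delta A)X=B-\Delta B,\quad \Delta c_{ij}=0,\quad i=1,\dots,m,\ j\notin J.$$

Here $C=[A,B]=(c_{ij}),$ $\Delta C=[\Delta A,\Delta B]=(\Delta c_{ij})$ and the column vectors

$$\Delta c_{i}^{J}:=(\Delta c_{ij},\ j\in J)\in\mathbb{R}^{|J|}.$$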

2.2. EW-TLS estimator and its consistency

For a random realization, it can happen that the problem (2.5) has no solution. Assume conditions (i) – (iv).

Definition 1.

The EW-TLS estimator $\hat{X}=\hat{X}_{EW-TLS}$ of $X_{0}$ in the model (2.1), (2.2) is a Borel measurable mapping of the data matrix $C$ into $\mathbb{R}^{n\times d}\cup\{\infty\},$ which solves the problem (2.5) under the additional constraint

$$\mathrm{rank}(Z_{J})=d\tag{2.6}$$

$\left(\text{here }Z=\begin{bmatrix}X\\ -\mathrm{I}_{d}\end{bmatrix}=(z_{jk}),\>Z_{J}:=(z_{jk},j\in J,k=1,\dots,d)\right),$ if there exists a solution, and $\hat{X}=\infty$ otherwise.

The EW-TLS estimator always exists due to [11]. We need more conditions to provide the consistency of $\hat{X}.$

(v).

There exists $r\geq 2$ with $r>d\left(|J|-\dfrac{d+1}{2}\right)$ such that

$$\sup_{(i\geq 1,\,j\in J)}\operatorname{\mathsf{E}}|\tilde{c}_{ij}|^{2r}<\infty.$$

(vi).

$\dfrac{\lambda_{\min}(A_{0}^{\top}A_{0})}{\sqrt{m}}\to\infty,$ as $m\to\infty.$

(vii).

$\dfrac{\lambda_{\min}^{2}(A_{0}^{\top}A_{0})}{\lambda_{\max}(A_{0}^{\top}A_{0})}\to\infty,$ as $m\to\infty.$

The next result on weak consistency is stated in Theorem 2, [6].

Theorem 2.

Assume conditions (i) to (vii). Then the EW-TLS estimator $\hat{X}$ is finite with probability tending to one, and $\hat{X}$ tends to $X_{0}$ in probability, as $m\to\infty.$

Notice that under slightly stronger assumptions on the eigenvalues of $A_{0}^{\top}A_{0},$ the estimator $\hat{X}$ is strongly consistent, see Theorem 3, [6].

2.3. The estimating function

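Recall that the error vectors $\tilde{c}_{i}$ enter condition (i) and the matrix $Z=Z(X)$ is introduced in Definition 1. Let

$$S_{i}:=\frac{1}{\sigma^{2}}\operatorname{\mathbf{cov}}(\tilde{c}_{i}),\quad i=1,2,\dots$$

Denote also

$$q(c,S;X)=c^{\top}Z(Z^{\top}SZ)^{-1}Z^{\top}c,\tag{2.7}$$

where $c=\begin{bmatrix}a\\ b\end{bmatrix}\in\mathbb{R}^{(n+d)\times 1},$ $S\in\mathbb{R}^{(n+d)\times(n+d)},$ and

$$Q(X)=\sum_{i=1}^{m}q(c_{i},S_{i};X),\quad X\in\mathbb{R}^{n\times d},\quad \mathrm{rank}(Z_{J})=d.\tag{2.8}$$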

Notice that due to (iv) $|J|\geq d,$ and under constraint (2.6) $Z_{J}$ is of full rank. Then, under conditions (i) – (iii) the matrix $Z^{\top}S_{i}Z$ is nonsingular, $i=1,2,\dots$

The EW-TLS estimator is known to minimize the objective function (2.7), see Theorem 1, [6].

Lemma 3.

Assume conditions (i) to (iv). The EW-TLS estimator $\hat{X}$ is finite if, and only if, there exists an unconditional minimum of the function (2.8), and then $\hat{X}$ is a minimum point of this function.
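The link between the constrained problem and the function $q(c,S;X)=c^{\top}Z(Z^{\top}SZ)^{-1}Z^{\top}c$ can be checked numerically in a simplified setting. The sketch below assumes that every column carries an error, so that $S$ is nonsingular; in that case a Lagrange-multiplier computation (ours, not quoted from [6]) gives the cheapest correction $\Delta c=SZ(Z^{\top}SZ)^{-1}Z^{\top}c$ for a fixed $X,$ and its weighted cost $\Delta c^{\top}S^{-1}\Delta c$ equals $q(c,S;X).$ The helper names `make_Z`, `q`, `Q` are hypothetical.

```python
import numpy as np

def make_Z(X):
    """Z = [X; -I_d]."""
    return np.vstack([X, -np.eye(X.shape[1])])

def q(c, S, X):
    """Elementary objective c^T Z (Z^T S Z)^{-1} Z^T c for one row c."""
    Z = make_Z(X)
    r = Z.T @ c
    return float(r @ np.linalg.solve(Z.T @ S @ Z, r))

def Q(C, S_list, X):
    """Sum of q over the rows of C, each with its own matrix S_i."""
    return sum(q(c, S, X) for c, S in zip(C, S_list))

rng = np.random.default_rng(2)
n, d = 3, 2
X = rng.normal(size=(n, d))
c = rng.normal(size=n + d)
G = rng.normal(size=(n + d, n + d))
S = G @ G.T + np.eye(n + d)            # nonsingular S: simplifying assumption

Z = make_Z(X)
# Cheapest correction delta such that the corrected row satisfies (c - delta)^T Z = 0.
delta = S @ Z @ np.linalg.solve(Z.T @ S @ Z, Z.T @ c)
cost = float(delta @ np.linalg.solve(S, delta))

# Any other feasible correction differs by a vector w with Z^T w = 0 and costs more.
w = rng.normal(size=n + d)
w -= Z @ np.linalg.solve(Z.T @ Z, Z.T @ w)
other = delta + w
other_cost = float(other @ np.linalg.solve(S, other))
```

When some columns are error-free, $S$ is singular and this shortcut does not apply directly; the objective function itself remains well defined as long as $Z^{\top}SZ$ is nonsingular.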

Introduce an estimating function related to the loss function (2.7):

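$$s(a,b,S;X)=\tilde{s}\cdot(Z^{\top}SZ)^{-1}.\tag{2.9}$$

Here

$$\tilde{s}=\tilde{s}(a,b,S;X):=ac^{\top}Z-[S_{a},S_{ab}]Z(Z^{\top}SZ)^{-1}Z^{\top}cc^{\top}Z,\tag{2.10}$$

$$c=\begin{bmatrix}a\\ b\end{bmatrix},\quad a\in\mathbb{R}^{n\times 1};\qquad S=\begin{bmatrix}S_{a}&S_{ab}\\ S_{ba}&S_{b}\end{bmatrix},\quad S_{a}\in\mathbb{R}^{n\times n}.\tag{2.11}$$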

Corollary 4.

Assume conditions (i) – (vii). Then the next two statements hold true.

(a)

With probability tending to one $\hat{X}$ is a solution to the equation

$$\sum_{i=1}^{m}s(a_{i},b_{i},S_{i};X)=0,\quad X\in\mathbb{R}^{n\times d},\quad \mathrm{rank}(Z_{J})=d.$$

(b)

The function (2.9) is an unbiased estimating function, i.e., for each $i\geq 1,$ $\operatorname{\mathsf{E}}_{X_{0}}s(a_{i},b_{i},S_{i};X_{0})=0.$

For fixed $a,$ $b,$ $S,$ the function (2.9) maps $X$ into $\mathbb{R}^{n\times d}.$ The derivative $s^{\prime}_{X}$ is a linear operator in this space.
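Both parts of Corollary 4 can be sanity-checked numerically. The sketch below, with hypothetical helper names and a nonsingular $S$ chosen for convenience, compares a finite-difference gradient of $q(c,S;X)=c^{\top}Z(Z^{\top}SZ)^{-1}Z^{\top}c$ with $2s(a,b,S;X),$ which is what a direct differentiation under the inner product $\mathrm{tr}(AB^{\top})$ gives. It also verifies unbiasedness exactly: $\tilde{s}$ is linear in $ac^{\top}$ and $cc^{\top},$ so its expectation is obtained by plugging in $\operatorname{\mathsf{E}}cc^{\top}=c_{0}c_{0}^{\top}+\sigma^{2}S.$

```python
import numpy as np

def make_Z(X):
    """Z = [X; -I_d]."""
    return np.vstack([X, -np.eye(X.shape[1])])

def q(c, S, X):
    Z = make_Z(X)
    r = Z.T @ c
    return float(r @ np.linalg.solve(Z.T @ S @ Z, r))

def s_tilde(M_ac, M_cc, S, X):
    """s-tilde as a function of M_ac = a c^T and M_cc = c c^T (linear in both)."""
    n = X.shape[0]
    Z = make_Z(X)
    SaZ = S[:n, :] @ Z                                  # [S_a, S_ab] Z
    return M_ac @ Z - SaZ @ np.linalg.solve(Z.T @ S @ Z, Z.T @ M_cc @ Z)

def s(a, b, S, X):
    c = np.concatenate([a, b])
    Z = make_Z(X)
    return s_tilde(np.outer(a, c), np.outer(c, c), S, X) @ np.linalg.inv(Z.T @ S @ Z)

def num_grad_q(c, S, X, h=1e-6):
    """Central finite differences of q with respect to the entries of X."""
    g = np.zeros_like(X)
    for j in range(X.shape[0]):
        for k in range(X.shape[1]):
            E = np.zeros_like(X)
            E[j, k] = h
            g[j, k] = (q(c, S, X + E) - q(c, S, X - E)) / (2 * h)
    return g

rng = np.random.default_rng(3)
n, d = 3, 2
X0 = rng.normal(size=(n, d))
G = rng.normal(size=(n + d, n + d))
S = G @ G.T + np.eye(n + d)                             # assumed nonsingular
a, b = rng.normal(size=n), rng.normal(size=d)
grad_gap = np.abs(num_grad_q(np.concatenate([a, b]), S, X0) - 2 * s(a, b, S, X0)).max()

# Exact expectation of s-tilde at X0 for c = c0 + error, cov(error) = sigma2 * S.
a0 = rng.normal(size=n)
c0 = np.concatenate([a0, X0.T @ a0])
sigma2 = 0.7
M_cc = np.outer(c0, c0) + sigma2 * S                    # E[c c^T]
E_s_tilde = s_tilde(M_cc[:n, :], M_cc, S, X0)           # E[a c^T] is the top block
```

The second check does not rely on sampling, so it holds for any error distribution with the stated first two moments.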

Lemma 5.

Under conditions (i) – (vii), for each $H\in\mathbb{R}^{n\times d}$ and $i\geq 1$ it holds

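$$\operatorname{\mathsf{E}}_{X_{0}}[s_{X}^{\prime}(a_{i},b_{i},S_{i};X_{0})\cdot H]=a_{0i}a_{0i}^{\top}H(Z_{0}^{\top}S_{i}Z_{0})^{-1}.$$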

3. Asymptotic normality of the estimator

Introduce further assumptions.

(viii).

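For some $\delta>0,$ $\displaystyle\sup_{(i\geq 1,j\in J)}\operatorname{\mathsf{E}}|\tilde{c}_{ij}|^{4+2\delta}<\infty.$

(ix).

For $\delta$ from the condition (viii),

$$\frac{1}{m^{1+\delta/2}}\sum_{i=1}^{m}\|a_{0i}\|^{2+\delta}\to 0,\quad\text{as } m\to\infty.$$

(x).

$\dfrac{1}{m}A_{0}^{\top}A_{0}\to V_{A},$ as $m\to\infty,$ where $V_{A}$ is a nonsingular matrix.

Notice that condition (x) implies assumptions (vi), (vii). Indeed, under (x) we have $\lambda_{\min}(A_{0}^{\top}A_{0})=m(\lambda_{\min}(V_{A})+o(1))$ and $\lambda_{\max}(A_{0}^{\top}A_{0})=m(\lambda_{\max}(V_{A})+o(1))$ with $\lambda_{\min}(V_{A})>0,$ so the ratio in (vi) grows like $\sqrt{m}$ and the ratio in (vii) grows like $m.$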

(xi).

For the matrices from condition (ii), $\Sigma_{i}\to\Sigma_{\infty},$ as $i\to\infty,$ where $\Sigma_{\infty}$ is a certain matrix.

Notice that conditions (xi), (iii) imply that $\Sigma_{\infty}$ is nonsingular.

(xii).

If $p,q,r\in J$ (they are not necessarily distinct) and $i\geq 1,$ then

$$\operatorname{\mathsf{E}}\tilde{c}_{ip}\tilde{c}_{iq}\tilde{c}_{ir}=0.$$

(xiii).

If $p,q,r,u\in J$ (they are not necessarily distinct), then $\displaystyle\frac{1}{m}\sum_{i=1}^{m}\operatorname{\mathsf{E}}\tilde{c}_{ip}\tilde{c}_{iq}\tilde{c}_{ir}\tilde{c}_{iu}$ converges to a finite limit $\mu_{4}(p,q,r,u),$ as $m$ tends to infinity.

Introduce a random element in the space of couples of matrices:

$$W_{i}=(a_{0i}\tilde{c}_{i}^{\top},\ \tilde{c}_{i}\tilde{c}_{i}^{\top}-\sigma^{2}S_{i}).\tag{3.1}$$

Hereafter $\stackrel{{\scriptstyle\text{\rm d}}}{{\rightarrow}}$ stands for the convergence in distribution.

Lemma 6.

Assume conditions (i), (ii) and (viii) – (xiii). Then

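$$\frac{1}{\sqrt{m}}\sum_{i=1}^{m}W_{i}\stackrel{\mathrm{d}}{\rightarrow}\Gamma=(\Gamma_{1},\Gamma_{2}),\quad\text{as } m\to\infty,\tag{3.2}$$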

where $\Gamma$ is a Gaussian centered random element with independent matrix components $\Gamma_{1}$ and $\Gamma_{2}.$

Now, we state the asymptotic normality of the EW-TLS estimator.

Theorem 7.

Assume conditions (i) – (v) and (viii) – (xiii). Then

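$$\sqrt{m}(\hat{X}-X_{0})\stackrel{\mathrm{d}}{\rightarrow}V_{A}^{-1}\Gamma(X_{0}),\quad\text{as } m\to\infty,\tag{3.3}$$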

where $V_{A}$ enters condition (x), $P_{a}$ is the projector with $P_{a}\begin{bmatrix}a\\ b\end{bmatrix}=a,$ $\Gamma_{1}$ and $\Gamma_{2}$ enter relation (3.2), and

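$$\Gamma(X):=\Gamma_{1}Z+P_{a}\Gamma_{2}Z-[S_{a}^{\infty},S_{ab}^{\infty}]Z(Z^{\top}S_{\infty}Z)^{-1}(Z^{\top}\Gamma_{2}Z),\tag{3.4}$$

$$S_{\infty}=\begin{bmatrix}S_{a}^{\infty}&S_{ab}^{\infty}\\ S_{ba}^{\infty}&S_{b}^{\infty}\end{bmatrix}=\lim_{i\to\infty}S_{i},\quad Z=\begin{bmatrix}X\\ -\mathrm{I}_{d}\end{bmatrix}.\tag{3.5}$$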

Moreover the limiting random matrix $X_{\infty}:=V_{A}^{-1}\Gamma(X_{0})$ has a nonsingular covariance structure, i.e., for each nonzero vector $u\in\mathbb{R}^{d\times 1},$ $\operatorname{\mathbf{cov}}(X_{\infty}u)$ is a nonsingular matrix.
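The map $\Gamma(X)=\Gamma_{1}Z+P_{a}\Gamma_{2}Z-[S_{a}^{\infty},S_{ab}^{\infty}]Z(Z^{\top}S_{\infty}Z)^{-1}(Z^{\top}\Gamma_{2}Z)$ is linear in the pair $(\Gamma_{1},\Gamma_{2}).$ A short sketch may help to see where it comes from: applying the same linear map, with $S_{i}$ in place of $S_{\infty},$ to the pair $W_{i}=(a_{0i}\tilde{c}_{i}^{\top},\ \tilde{c}_{i}\tilde{c}_{i}^{\top}-\sigma^{2}S_{i})$ reproduces $\tilde{s}(a_{i},b_{i},S_{i};X_{0})$ for a single observation. The function name `gamma_map` and all simulated values are hypothetical, and the identity is checked here by direct substitution rather than quoted from the proof.

```python
import numpy as np

def gamma_map(G1, G2, X, S):
    """Linear map (G1, G2) -> G1 Z + P_a G2 Z - [S_a, S_ab] Z (Z^T S Z)^{-1} Z^T G2 Z."""
    n, d = X.shape
    Z = np.vstack([X, -np.eye(d)])
    Pa_G2 = G2[:n, :]                     # P_a keeps the first n rows
    SaZ = S[:n, :] @ Z                    # [S_a, S_ab] Z
    return G1 @ Z + Pa_G2 @ Z - SaZ @ np.linalg.solve(Z.T @ S @ Z, Z.T @ G2 @ Z)

rng = np.random.default_rng(4)
n, d = 3, 2
X0 = rng.normal(size=(n, d))
Z0 = np.vstack([X0, -np.eye(d)])
G = rng.normal(size=(n + d, n + d))
S = G @ G.T + np.eye(n + d)               # plays the role of S_i (assumed nonsingular)
sigma2 = 0.3

a0 = rng.normal(size=n)
c0 = np.concatenate([a0, X0.T @ a0])      # true row, satisfies c0^T Z0 = 0
c_err = rng.normal(size=n + d)            # one realization of the error vector
c = c0 + c_err
a = c[:n]

# Pair W_i for this single observation.
W1 = np.outer(a0, c_err)
W2 = np.outer(c_err, c_err) - sigma2 * S
lhs = gamma_map(W1, W2, X0, S)

# s-tilde computed directly from the observed row.
SaZ = S[:n, :] @ Z0
rhs = np.outer(a, c) @ Z0 - SaZ @ np.linalg.solve(Z0.T @ S @ Z0, Z0.T @ np.outer(c, c) @ Z0)
```

The terms involving $\sigma^{2}S$ cancel because $P_{a}SZ_{0}=[S_{a},S_{ab}]Z_{0},$ so the value of `sigma2` does not affect the comparison.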

4. Construction of confidence region for a linear functional of $X_{0}$

4.1. Estimation of nuisance parameters

Theorem 7 can be applied, e.g., to construct a confidence region for a linear functional of $X_{0}.$ For this purpose one has to estimate consistently the covariance structure of the limiting random matrix $V_{A}^{-1}\Gamma(X_{0}).$ Besides $X_{0},$ this structure depends on nuisance parameters. Some of them can be estimated consistently.

Hereafter bar means average for rows $i=1,\dots,m,$ e.g.,

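$$\overline{ab^{\top}}=m^{-1}\sum_{i=1}^{m}a_{i}b_{i}^{\top},\quad \overline{S}=m^{-1}\sum_{i=1}^{m}S_{i}.$$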

Lemma 8.

Assume conditions of Theorem 7. Define

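$$\hat{Z}=\begin{bmatrix}\hat{X}\\ -\mathrm{I}_{d}\end{bmatrix},\quad \hat{\sigma}^{2}=\frac{1}{d}\,\mathrm{tr}\left[(\hat{Z}^{\top}\overline{cc^{\top}}\hat{Z})(\hat{Z}^{\top}\overline{S}\hat{Z})^{-1}\right],\tag{4.1}$$

$$\hat{V}_{A}=\overline{aa^{\top}}-\hat{\sigma}^{2}\overline{S}_{a}.\tag{4.2}$$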

Then, as $m\to\infty,$

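$$\hat{\sigma}^{2}\stackrel{\mathrm{P}}{\rightarrow}\sigma^{2},\quad \hat{V}_{A}\stackrel{\mathrm{P}}{\rightarrow}V_{A}.$$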
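The estimators of Lemma 8 are simple functions of the averaged moments. The sketch below is an illustration under stated assumptions: the function name `nuisance_estimates` is hypothetical, and the $n\times n$ block $\overline{S}_{a}$ of $\overline{S}$ is used in $\hat{V}_{A}$ because the dimensions require it. Feeding in the population moment $\overline{c_{0}c_{0}^{\top}}+\sigma^{2}\overline{S}$ and the true $X_{0}$ returns $\sigma^{2}$ and $m^{-1}A_{0}^{\top}A_{0}$ exactly, which shows why the estimators are consistent.

```python
import numpy as np

def nuisance_estimates(mean_ccT, S_bar, X_hat):
    """Return (sigma2_hat, V_A_hat) from the average of c_i c_i^T and the average of S_i."""
    n, d = X_hat.shape
    Z = np.vstack([X_hat, -np.eye(d)])
    num = Z.T @ mean_ccT @ Z
    den = Z.T @ S_bar @ Z
    sigma2_hat = np.trace(num @ np.linalg.inv(den)) / d
    # Top-left n x n blocks: average of a_i a_i^T and the a-block of S_bar.
    V_A_hat = mean_ccT[:n, :n] - sigma2_hat * S_bar[:n, :n]
    return sigma2_hat, V_A_hat

rng = np.random.default_rng(5)
m, n, d = 40, 3, 2                       # hypothetical sizes
sigma2 = 0.4
A0 = rng.normal(size=(m, n))
X0 = rng.normal(size=(n, d))
C0 = np.hstack([A0, A0 @ X0])
G = rng.normal(size=(n + d, n + d))
S_bar = G @ G.T + np.eye(n + d)          # stands in for the average of S_1, ..., S_m

# Population version of the average of c_i c_i^T.
mean_ccT = C0.T @ C0 / m + sigma2 * S_bar
sigma2_hat, V_A_hat = nuisance_estimates(mean_ccT, S_bar, X0)
```

In practice `mean_ccT` would be computed from the observed rows and `X_hat` would be the EW-TLS estimate, so the two outputs are only approximately equal to their targets.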

4.2. Estimation of the asymptotic covariance structure of $X_{0}$

Let $u\in\mathbb{R}^{d\times 1},$ $u\neq 0.$ Theorem 7 implies the convergence

[TABLE]

with nonsingular matrix $S_{u}=\operatorname{\mathbf{cov}}(V_{A}^{-1}\Gamma(X_{0})u).$

We start with the case of normal errors $\tilde{c}_{i},$ $i=1,2,\dots$ Then condition (xii) holds true, and Theorem 7 is applicable. The asymptotic covariance matrix $S_{u}$ is a continuous function $S_{u}=S_{u}(X_{0},V_{A},\sigma^{2},S_{\infty})$ of unknown parameters (here the limiting covariance matrix $S_{\infty}$ could be unknown, though for a given $m,$ matrices $S_{1},\dots,S_{m}$ are assumed known). Due to Theorem 2 and Lemma 8 the matrix

[TABLE]

is a consistent estimator of $S_{u}.$

Now, we do not assume the normality of the errors. Then the exact formula for $S_{u}$ does not allow us to estimate it consistently, because the formula involves higher moments of the errors, which are difficult to estimate consistently. Instead, we use Corollary 4 to construct the so-called sandwich estimator [1] for $S_{u}.$ Denote

[TABLE]

with $\tilde{s}$ introduced in (2.10).

Lemma 9.

Assume conditions of Theorem 7. For $u\in\mathbb{R}^{d\times 1},$ $u\neq 0,$ define

[TABLE]

with $\hat{V}_{A}$ given in (4.2), (4.1). Then $\hat{S}_{u}\stackrel{{\scriptstyle\text{\rm P}}}{{\rightarrow}}S_{u},$ as $m\to\infty.$

Remark. In the case of normal errors, the estimator (4.4) is asymptotically more efficient than the estimator (4.6), cf. the discussion in [1], p. 369.

Given a consistent estimator $\hat{S}_{u}$ of $S_{u},$ we have from (4.3) that

[TABLE]

Based on (4.7), one can construct in a standard way an asymptotic confidence ellipsoid for $X_{0}u.$ Similarly a confidence ellipsoid can be constructed for any finite set of linear combinations of $X_{0}$ entries.
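One standard construction can be sketched as follows, assuming that $\sqrt{m}(\hat{X}u-X_{0}u)$ is asymptotically $N(0,S_{u})$ with nonsingular $S_{u}$ and that $\hat{S}_{u}$ is consistent: the quadratic form $m(\hat{X}u-x)^{\top}\hat{S}_{u}^{-1}(\hat{X}u-x)$ is compared with a chi-square quantile with $n$ degrees of freedom. The function name and the numbers are hypothetical; the quantile is passed in by the caller, and for $n=2$ it has the closed form $-2\ln\alpha.$

```python
import math
import numpy as np

def in_confidence_ellipsoid(x, Xhat_u, S_u_hat, m, chi2_quantile):
    """True if x lies in {x : m (Xhat_u - x)^T S_u_hat^{-1} (Xhat_u - x) <= chi2_quantile}."""
    diff = np.asarray(Xhat_u, dtype=float) - np.asarray(x, dtype=float)
    stat = m * float(diff @ np.linalg.solve(S_u_hat, diff))
    return bool(stat <= chi2_quantile)

# Hypothetical numbers for n = 2 and level alpha = 0.05.
alpha = 0.05
quantile = -2.0 * math.log(alpha)        # chi-square quantile with 2 degrees of freedom
m = 100
Xhat_u = np.array([1.0, 2.0])
S_u_hat = np.eye(2)

inside_center = in_confidence_ellipsoid([1.0, 2.0], Xhat_u, S_u_hat, m, quantile)
inside_near = in_confidence_ellipsoid([1.2, 2.0], Xhat_u, S_u_hat, m, quantile)
inside_far = in_confidence_ellipsoid([1.3, 2.0], Xhat_u, S_u_hat, m, quantile)
```

For other values of $n$ the quantile has no elementary closed form and would come from a statistical table or library.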

5. Conclusion

We proved the asymptotic normality of the EW-TLS estimator in a multivariate errors-in-variables model $AX\approx B$ with heteroscedastic errors. We assumed the convergence (xi) of the second error moments, vanishing third moments (xii), and the convergence of averaged fourth moments (xiii). The condition (xii) ensured that the asymptotic covariance structure of $\hat{X}$ is nonsingular. This condition holds true in two cases: (a) all the error vectors $\tilde{c}_{i}$ are symmetrically distributed, or (b) for each $i,$ random variables $\tilde{c}_{ip},$ $p\in J,$ are independent and have vanishing coefficient of asymmetry.

The obtained asymptotic normality result made it possible to construct a confidence ellipsoid for a linear functional of $X_{0}.$ Another plausible application is goodness-of-fit test in the model $AX\approx B$ with heteroscedastic errors (see [7] for such a test in the model with homoscedastic errors).

The author is grateful to Prof. A. Kukush for the problem statement and fruitful discussions.

Appendix

Proof of Corollary 4

(a) The space $\mathbb{R}^{n\times d}$ is endowed with the natural inner product $\langle A,B\rangle=\mathrm{tr}(AB^{\top}).$ The matrix derivative $q_{X}^{\prime}$ of the functional (2.7) is a linear functional on $\mathbb{R}^{n\times d},$ and based on the inner product, this functional can be identified with a certain matrix from $\mathbb{R}^{n\times d}.$

Recall that $Z=Z(X)$ is introduced in Definition 1. Using the rules of matrix calculus [2], we have for $H\in\mathbb{R}^{n\times d}:$

[TABLE]

Recall relations (2.11). Collecting similar terms, we obtain:

[TABLE]

and

[TABLE]

Using the inner product in $\mathbb{R}^{n\times d}$ we obtain

[TABLE]

with $\tilde{s}(X)=\tilde{s}(a,b,S;X)$ given in (2.10). Now, Theorem 2 and Lemma 3 imply the statement of Corollary 4 (a).

(b) We set

[TABLE]

where $a_{0}$ is a nonrandom vector and like in (2.3),

[TABLE]

Then

[TABLE]

Therefore, see (2.9),

[TABLE]

The statement (b) of Corollary 4 is proven.

Proof of Lemma 5

The derivative $s_{X}^{\prime}$ of the function (2.9) with respect to $X$ is a linear operator in $\mathbb{R}^{n\times d}.$ Denote $f=f(Z)=Z(Z^{\top}SZ)^{-1}.$ For $H\in\mathbb{R}^{n\times d},$ it holds:

[TABLE]

We set (A.1), use relations

[TABLE]

and get:

[TABLE]

Next,

[TABLE]

Combining (A.2) and (A.3) we see that the summands containing $H^{\top}$ on the right-hand side of (A.2) cancel out. We get finally

[TABLE]

which implies the statement, because by Corollary 4 (b) it holds $\operatorname{\mathsf{E}}_{X}\tilde{s}(a,b,S;X)=0.$

Proof of Lemma 6

The proof is similar to the proof of Lemmas 6 and 7 from [7] and is based on Lyapunov’s Central Limit Theorem. We just notice that due to condition (xii) the matrix components of $W_{i},$ namely $a_{0i}\tilde{c}_{i}^{\top}$ and $\tilde{c}_{i}\tilde{c}_{i}^{\top}-\sigma^{2}S_{i},$ are uncorrelated, and this implies the independence of matrix components $\Gamma_{1}$ and $\Gamma_{2}$ in (3.2).

Proof of Theorem 7

We follow the line of [7], see there the proof of Theorem 8(a). By Corollary 4 (a), it holds with probability tending to 1:

[TABLE]

Denote

[TABLE]

Using Taylor’s formula around $X_{0}$ (see [2], Theorem 5.6.2), we obtain from (A.4) that

[TABLE]

Here $O_{p}(1)$ is a multiplier of the form

[TABLE]

with positive $\varepsilon_{0}$ chosen such that $\mathrm{rank}(Z_{J})=d$ for all $X$ with $||X-X_{0}||\leq\varepsilon_{0};$ the choice is possible due to condition (iv), and expression (A.6) is indeed $O_{p}(1)$ (i.e., stochastically bounded), because $s_{X}^{\prime\prime}$ is quadratic in $c_{i}$ and the averaged second moments of $c_{i}$ are assumed bounded. Thus, the relation (A.5) holds true due to the consistency of $\hat{X}$ stated in Theorem 2.

We have $||\mathrm{rest}_{1}||\leq||\hat{\Delta}||\cdot o_{p}(1).$ Now, by Lemma 5 and conditions (x) and (xi) it holds

[TABLE]

and we derive from (A.5) the relation

[TABLE]

The summands in $y_{m}$ have zero expectation by Corollary 4 (b). Recall that $c_{0i}^{\top}Z_{0}=0$ and the projector $P_{a}$ is introduced in Theorem 7. Then, see (2.9),

[TABLE]

Here $W_{ij}$ are components of (3.1). By Lemma 6 it holds, see (3.4) and condition (xi):

[TABLE]

Now, relations (A.7), (A.8) and nonsingularity of $V_{A}$ imply $\hat{\Delta}=O_{p}(1)$ and by Slutsky’s lemma

[TABLE]

This implies the desired convergence (3.3) – (3.5).

Let $u\in\mathbb{R}^{d\times 1},$ $u\neq 0.$ By Lemma 6 the components $\Gamma_{1}$ and $\Gamma_{2}$ are independent. We have

[TABLE]

and the latter matrix is positive definite, because $V_{A}$ and $Z_{0}^{\top}S_{\infty}Z_{0}$ are positive definite under the conditions of Theorem 7. Therefore, $\operatorname{\mathbf{cov}}(X_{\infty}u)$ is a positive definite matrix as well.

Proof of Lemma 8

We have

[TABLE]

Relation (A.9) and the convergence $\hat{Z}\stackrel{{\scriptstyle\text{\rm P}}}{{\rightarrow}}Z_{0}$ imply the desired convergence $\hat{\sigma}^{2}\stackrel{{\scriptstyle\text{\rm P}}}{{\rightarrow}}\sigma^{2},$ as $m\to\infty.$

Next,

[TABLE]

Proof of Lemma 9

Denote $\tilde{s}_{i}=\tilde{s}(a_{i},b_{i},S_{i};X_{0}),$ $i=1,2,\dots$ Then expansion (A.7) implies that

[TABLE]

and by Lemma 8

[TABLE]

Then $\hat{S}_{u}-S_{u}\stackrel{{\scriptstyle\text{\rm P}}}{{\rightarrow}}0,$ as $m\to\infty,$ because $\hat{Z}\stackrel{{\scriptstyle\text{\rm P}}}{{\rightarrow}}Z_{0}$ and $\overline{cc^{\top}}=O_{p}(1)$ (see formulas (2.9), (2.10) and (4.5)). Lemma 9 is proven.

Bibliography

[1] R. J. Carroll, D. Ruppert, L. A. Stefanski, and C. M. Crainiceanu, Measurement Error in Nonlinear Models: A Modern Perspective, 2nd ed., Chapman and Hall/CRC, Boca Raton, 2006.
[2] H. Cartan, Differential Calculus, Hermann/Houghton Mifflin Co., Paris/Boston, MA, 1971. Translated from French.
[3] L. J. Gleser, Estimation in a multivariate “errors in variables” regression model: large sample results, Ann. Stat. 9 (1981), no. 1, 24–44.
[4] G. H. Golub and C. F. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal. 17 (1980), no. 6, 883–893.
[5] S. Jazaeri, A. R. Amiri-Simkooei, and M. A. Sharifi, Iterative algorithm for weighted total least squares adjustment, Survey Review 46 (2014), no. 334, 19–27.
[6] A. Kukush and S. Van Huffel, Consistency of elementwise-weighted total least squares estimator in a multivariate errors-in-variables model $AX=B$, Metrika 59 (2004), no. 1, 75–97.
[7] A. Kukush and Ya. Tsaregorodtsev, Asymptotic normality of total least squares estimator in a multivariable errors-in-variables model $AX=B$, Modern Stochastics: Theory and Applications 3 (2016), no. 1, 47–57.
[8] A. Kukush and Ya. Tsaregorodtsev, Goodness-of-fit test in a multivariate errors-in-variables model $AX=B$, Modern Stochastics: Theory and Applications 3 (2016), no. 4, 287–302.
