Improving Power System State Estimation Based on Matrix-Level Cleaning

Haosen Yang; Robert C. Qiu; Lei Chu; Tiebin Mi; Xin Shi; Chaoyuan Mary Liu

arXiv:1904.06479·eess.SP·April 7, 2020

Open Access

TL;DR

This paper introduces a novel data-driven, matrix-level cleaning method based on random matrix theory to improve power system state estimation accuracy by effectively reducing measurement errors.

Contribution

It proposes a new eigenvalue-shrinking approach for measurement error cleaning using random matrix theory, enhancing robustness and accuracy in large power grids.

Findings

01

Significantly reduces measurement error impact.

02

Improves accuracy of state estimation in large systems.

03

Demonstrates robustness across various noise models.

Abstract

Power system state estimation is strongly affected by measurement error, which arises from instrument noise, communication noise, and other poorly understood randomness. Traditional weighted least squares (WLS), the most widely used state estimation method, minimizes the residual between the measurements and the estimates of the measured variables, but it cannot handle the measurement error itself. To solve this problem, this paper proposes a data-driven approach, based on random matrix theory, that cleans measurement error at the matrix level. Our method significantly reduces the negative effect of measurement error and forms a two-stage state estimation scheme when combined with WLS. In this method, a Hermitian matrix is constructed to establish an invertible relationship between the eigenvalues of measurements and their covariance matrix. Random matrix tools, combined with an…

Tables

Table 1. TABLE I: The cleaning results of the eigenvalues of $\bf{E}$, $\bf{F_{Z}}$ and $\bf{D_{Z}}$. The error is measured by MAE.

Matrices	Measurement error	Estimated error
$𝐄$	1623.3825	102.3944
$𝐅_{𝐙}$	1448.2478	93.0455
$𝐃_{𝐙}$	62.4250	4.0329

Table 2. TABLE II: The MAE of the estimation error of measured variables (Units: p.u.).

Variables	Measurements	WLS	R-WLS
Pt	0.0440	3.5306	0.1206
Pf	0.0469	3.6494	0.1315
Pb	0.0390	2.9567	0.1287
Qt	0.0441	3.6455	0.1360
Qf	0.0473	3.9787	0.1324
Qb	0.0350	2.4066	0.0978
Vm	0.0062	0.0039	0.0012

Table 3. TABLE III: The matrix sizes of different systems.

Systems	Sizes of matrices
European 1354-bus	12015 $\times$ 14418
Polish 3120-bus	24117 $\times$ 28940
French 6468-bus	55389 $\times$ 66467
European 9241-bus	91904 $\times$ 110285

Table 4. TABLE IV: The MAE of cases of different systems and various magnitudes of measurement error (Units: p.u.)

Error of power flows	2.5%		5%		7.5%		10%
Error of Vm	0.5%		1%		1.5%		2%
Methods	WLS	R-WLS	WLS	R-WLS	WLS	R-WLS	WLS	R-WLS
IEEE 30-bus	0.0039	0.0010	0.0082	0.0015	0.0154	0.0015	0.0174	0.0014
IEEE 57-bus	0.0049	0.0010	0.0108	0.0016	0.0166	0.0021	0.0195	0.0022
IEEE 118-bus	0.0052	0.0007	0.0146	0.0013	0.0186	0.0017	0.0174	0.0018
IEEE 300-bus	0.0082	0.0006	0.0184	0.0008	0.0233	0.0014	0.0239	0.0017
European 1354-bus	0.0063	0.0011	0.0183	0.0011	0.0245	0.0014	0.0273	0.0019
Polish 3120-bus	0.0332	0.0013	0.0482	0.0015	0.0547	0.0015	0.0674	0.0018
French 6468-bus	0.0186	0.0010	0.0315	0.0013	0.0424	0.0015	0.0470	0.0018
European 9241-bus	0.0271	0.0005	0.0494	0.0007	0.0631	0.0009	0.0780	0.0013

Table 5. TABLE V: The P.d.f and coefficients of various noise models.

Distributions	P.d.f	Coefficients
Laplace	$f(x) = \frac{1}{2b}\exp\left(-\frac{|x-\mu|}{b}\right)$	$\mu = 0$, $b = \sigma_{i}/\sqrt{2}$
SC	$f(x) = \frac{2}{\pi a^{2}}\sqrt{a^{2}-x^{2}}$	$a = 2\sigma_{i}$
SL	$y = \frac{1}{a^{2}}(a - |x|)$	$a = \sqrt{6}\sigma_{i}$
NIG	$f(x) = \mathrm{NIG}(x)$	see Appendix C
Gaussian	$f(x) = \frac{1}{\sqrt{2\pi}\,b}\exp\left(-\frac{(x-\mu)^{2}}{2b^{2}}\right)$	$\mu = 0$, $b = \sigma_{i}$

Table 6. TABLE VI: The MAE of the estimation error in the case of different noise distributions (MAE, Units: p.u.).

Distributions	WLS	R-WLS	Inc.Rat
Laplace	0.0246	0.0102	58.5%
SC	0.0427	0.0292	31.6%
SL	0.0280	0.0141	49.7%
NIG	0.0195	0.0032	83.6%
Gaussian	0.0183	0.0011	94.0%

Table 7. TABLE VII: The cleaning effects of divided matrices (MAE, Units: p.u.).

Variables	Size	Measurement	Cleaned
Pt	$1991 \times 2389$	0.0440	0.0117
Pf	$1991 \times 2389$	0.0469	0.0115
Pb	$1354 \times 1625$	0.0390	0.0106
Qt	$1991 \times 2389$	0.0441	0.0102
Qf	$1991 \times 2389$	0.0473	0.0146
Qb	$1354 \times 1625$	0.0350	0.0083
Vm	$1354 \times 1625$	0.0062	0.0016

Table 8. TABLE VIII: The Comparison results of different cleaning methods (MAE, Units: p.u.)

Methods	WLS	RBEC	NLS	CVC	iso-CVC
IEEE 30-bus	0.0082	0.0015	0.0019	0.0074	0.0016
IEEE 57-bus	0.0108	0.0016	0.0019	0.0089	0.0017
IEEE 118-bus	0.0146	0.0013	0.0015	0.0102	0.0012
IEEE 300-bus	0.0184	0.0008	0.0012	0.0098	0.0007

Equations

$$\mathbf{z} = \mathbf{h}(\mathbf{x}) + \mathbf{e}$$

$$J = \left\|(\mathbf{z}-\mathbf{h}(\mathbf{x}))^{T}\,\mathbf{W}\,(\mathbf{z}-\mathbf{h}(\mathbf{x}))\right\|^{2}$$

$$\mathbf{H}^{T}\mathbf{W}\mathbf{H}\,\Delta\mathbf{x} = \mathbf{H}^{T}\mathbf{W}\,(\mathbf{z}-\mathbf{h}(\mathbf{x}))$$

$$\mathbf{r} = \mathbf{z} - \mathbf{h}(\hat{\mathbf{x}})$$

$$\mathbf{e} = \mathbf{z} - \mathbf{h}(\mathbf{x})$$

$$\mathbf{R_{e}} = \mathbf{h}(\hat{\mathbf{x}}) - \mathbf{h}(\mathbf{x})$$

$$\begin{aligned} L(\mathbf{W}\mathbf{R_{e}}) &= L\big(\mathbf{W}(\mathbf{h}(\hat{\mathbf{x}}) - \mathbf{h}(\mathbf{x}))\big) \\ &= L\big(\mathbf{W}(\mathbf{h}(\hat{\mathbf{x}}) - \mathbf{z} + \mathbf{z} - \mathbf{h}(\mathbf{x}))\big) \\ &\leq L\big(\mathbf{W}(\mathbf{z} - \mathbf{h}(\hat{\mathbf{x}}))\big) + L\big(\mathbf{W}(\mathbf{z} - \mathbf{h}(\mathbf{x}))\big) \\ &= L(\mathbf{W}\mathbf{r}) + L(\mathbf{W}\mathbf{e}) \end{aligned}$$

$$\tilde{z}_{i} = \frac{z_{i}-b_{i}}{\sigma_{i}} = (z_{i}-b_{i})\sqrt{w_{i}}$$

$$\mathbf{Z} = [\tilde{\mathbf{z}}_{k-N+1},\ \tilde{\mathbf{z}}_{k-N+2},\ \ldots,\ \tilde{\mathbf{z}}_{k-1},\ \tilde{\mathbf{z}}_{k}]$$

$$\mathbf{H} = [\widetilde{\mathbf{h}(\mathbf{x})}_{k-N+1},\ \widetilde{\mathbf{h}(\mathbf{x})}_{k-N+2},\ \ldots,\ \widetilde{\mathbf{h}(\mathbf{x})}_{k-1},\ \widetilde{\mathbf{h}(\mathbf{x})}_{k}]$$

$$\mathbf{Z} = \mathbf{H} + \mathbf{G}$$

$$\mathbf{D_{Z}} = \begin{bmatrix}\mathbf{0} & \mathbf{Z}\\ \mathbf{Z}^{H} & \mathbf{0}\end{bmatrix}$$

$$\mathbf{F_{Z}} = \begin{bmatrix}\mathbf{Z}\mathbf{Z}^{H} & \mathbf{0}\\ \mathbf{0} & \mathbf{Z}^{H}\mathbf{Z}\end{bmatrix}$$

$$\lambda_{D_{Z}} = \pm\sqrt{\lambda_{F_{Z}}}$$

$$\lambda_{F_{Z}} = \begin{cases}\lambda_{ZZ^{H}}, & \lambda_{F_{Z}} \neq 0\\ 0, & \lambda_{F_{Z}} = 0\end{cases}$$

$$\mathbf{D_{H}} = \begin{bmatrix}\mathbf{0} & \mathbf{H}\\ \mathbf{H}^{H} & \mathbf{0}\end{bmatrix},\qquad \mathbf{F_{H}} = \begin{bmatrix}\mathbf{H}\mathbf{H}^{H} & \mathbf{0}\\ \mathbf{0} & \mathbf{H}^{H}\mathbf{H}\end{bmatrix}$$

$$\begin{aligned}\lambda_{D_{H}} &= \pm\sqrt{\lambda_{F_{H}}}\\ \lambda_{F_{H}} &= \begin{cases}\lambda_{HH^{H}}, & \lambda_{F_{H}} \neq 0\\ 0, & \lambda_{F_{H}} = 0\end{cases}\end{aligned}$$

$$\mathbf{E} = (\mathbf{H}+\mathbf{G})(\mathbf{H}+\mathbf{G})^{H} = \mathbf{C} + \mathbf{G}\mathbf{G}^{H} + \mathbf{H}\mathbf{G}^{H} + \mathbf{G}\mathbf{H}^{H} = \mathbf{C} + \mathbf{B}$$

$$\begin{aligned}\min_{\Gamma(\mathbf{E})}\ &\left\|\Gamma(\mathbf{E})-\mathbf{C}\right\|^{2}\\ &= \mathrm{Tr}\left[(\Gamma(\mathbf{E})-\mathbf{C})^{2}\right]\end{aligned}$$

$$\Gamma(\mathbf{E}) = \sum_{i=1}^{N}\xi_{i}\,\mathbf{u}_{li}\mathbf{u}_{ri}^{T}$$

$$\xi_{i} = \sum_{j=1}^{N} c_{j}\,(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}$$

$$\xi_{i} = \sum_{j=1}^{N} c_{j}\,\mathbb{E}\left[(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}\right]$$

$$\mathbb{E}\left[(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}\right] = \frac{q\,\alpha(\lambda_{i})^{2}\,\big(\lambda_{i}\alpha(\lambda_{i}) + c_{i}\big)}{\beta(\lambda_{i})^{2} + \gamma(\lambda_{i})^{2}}$$

$$\begin{aligned}\alpha(\lambda_{i}) &= (1 - q h_{E}(\lambda_{i}))^{2} + q^{2}\pi^{2}\rho_{E}^{2}(\lambda_{i})\\ \beta(\lambda_{i}) &= (\lambda_{i}\alpha(\lambda_{i}) + c_{i})(1-q)\,h_{E}(\lambda_{i}) - \alpha(\lambda_{i})(1-q)\\ \gamma(\lambda_{i}) &= (\lambda_{i}\alpha(\lambda_{i}) + c_{i})\,q\pi\rho_{E}(\lambda_{i})\end{aligned}$$

$$\xi_{i} = (1 - q h_{E}(\lambda_{i}))\big(\lambda_{i} - (1-q) - 2q\lambda_{i}h_{E}(\lambda_{i})\big) + q\,\varphi(\lambda_{i})$$

$$\varphi(\lambda_{i}) = 1 - h_{E}(\lambda_{i})\big(\lambda_{i} - (1-q)\big) - q\lambda_{i}\big(\pi^{2}\rho_{E}^{2}(\lambda_{i}) - h_{E}^{2}(\lambda_{i})\big)$$

$$\Gamma(\mathbf{D_{Z}}) = \sum_{i=1}^{N+T}\xi_{Di}\,\mathbf{u}_{lDi}\mathbf{u}_{rDi}^{T}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\xi_{i} - \lambda_{i}\right|$$

$$\lambda \sim \frac{1}{2\pi\lambda q\sigma^{2}}\sqrt{(b-\lambda)(\lambda-a)}$$

$$\mathbf{G_{E}}(s) = (s\mathbf{I}_{N} - \mathbf{E})^{-1}$$

Taxonomy

Topics: Tensor decomposition and applications · Matrix Theory and Algorithms · Power System Optimization and Stability

Full text

Improving Power System State Estimation Based on Matrix-Level Cleaning

Haosen Yang1, Robert C. Qiu1, Lei Chu1, Tiebin Mi1, Xin Shi1, Chaoyuan Mary Liu2
1 Department of Electrical Engineering, Center for Big Data and Artificial Intelligence, Shanghai Jiaotong University, Shanghai 200240, China.
2 Department of Mathematics and Statistics, Eastern Kentucky University, Richmond, KY 40475, USA.
Email: Robert C. Qiu: [email protected]; Haosen Yang: [email protected].

Abstract

Power system state estimation is strongly affected by measurement error, which arises from instrument noise, communication noise, and other poorly understood randomness. Traditional weighted least squares (WLS), the most widely used state estimation method, minimizes the residual between the measurements and the estimates of the measured variables, but it cannot handle the measurement error itself. To solve this problem, this paper proposes a data-driven approach, based on random matrix theory, that cleans measurement error at the matrix level. Our method significantly reduces the negative effect of measurement error and forms a two-stage state estimation scheme when combined with WLS. In this method, a Hermitian matrix is constructed to establish an invertible relationship between the eigenvalues of measurements and their covariance matrix. Random matrix tools, combined with an optimization scheme, are used to clean measurement error by shrinking the eigenvalues of the covariance matrix. Robust and general, our approach is particularly suitable for large interconnected power grids. The method is evaluated numerically on different test systems, multiple measurement-noise models, and several matrix size ratios.

Index Terms:

state estimation, two-stage, measurement error, random matrix, Hermitian matrix construction, eigenvalues

I Introduction

Power system state estimation aims to estimate state variables from measurement data corrupted with noise, and it plays an important role in power system operations, such as optimal power flow, stability analysis, and economic dispatch. With the development of energy management systems (EMS) and smart grids, the requirement for accurate operating parameters has been increasing greatly [1].

Conventional state estimation is mainly based on WLS, which solves the normal equation iteratively by the Gauss-Newton method [2]. Despite the long and widespread use of WLS, there is increasing concern about its accuracy and robustness. The objective function of WLS is the residual between the measurements and the estimates of the measured variables. Minimizing the residual brings the estimates close to the measurements. However, approaching the measurements is not the same as approaching the true values, since measurements are corrupted by instrument noise, communication noise and random fluctuations. Minimizing only the residual while disregarding measurement error therefore leaves a certain amount of estimation error. In addition, WLS suffers from ill-conditioned gain matrices and bad data, and it is sensitive to the initialization of the state variables.

In recent years, many researchers have searched for novel approaches to improve power system state estimation [3, 4, 5, 6, 7, 8, 9, 10]. In [6], a least-absolute-value (LAV) guided estimator was proposed, which is more robust and offers several advantages for phasor measurements. In [7], an iterative $l_{1}$ - $l_{2}$ mixed convex program was used for state estimation by linearizing the nonlinear physical equations. In [8], autoencoder-based pre-filtering was proposed to clean measurement noise and remove gross errors; as a deep learning method, however, the autoencoder requires lengthy off-line training. In [9] and [10], a two-stage state estimation method was designed, in which the measured variables are first transformed into a new group of variables so that the measurement model in the second stage is linear.

Although many studies have investigated new approaches to improve state estimation, few have processed measurement error directly and systematically. Our work aims to fill this gap and improve the accuracy of state estimation. Because of its strong randomness and the unclear influence of multiple noise sources, measurement error has been difficult to handle in vector or single-value form. However, the well-established research line of random matrix theory (RMT) shows that some deterministic properties emerge when fully stochastic measurement error is gathered in matrix form. For instance, the M-P law, proposed in [11], reveals that the eigenvalues of a Gaussian covariance matrix asymptotically converge to a deterministic probability distribution. Inspired by this, and based on RMT and an optimization scheme, this paper proposes a two-stage state estimation method in which measurements are first processed by RMT and WLS is then used to estimate the state variables. RMT, which aims to extract insightful information from the eigenvalue distributions of large covariance matrices, has emerged as a particularly useful framework for many theoretical questions in high-dimensional big data analytics [12][13]. It has been successfully applied in quantum physics [14], wireless communication [15], and signal processing [16], owing to its effectiveness in dealing with measurement noise at the matrix level. The large amounts of measurement data collected from monitoring systems also provide a new opportunity for applying RMT in power systems, including event detection [17] and correlation analysis [18]. In this work, an optimization framework derived from RMT is used to clean measurement error by filtering the eigenvalues of the covariance matrix, after forming a Hermitian matrix that establishes an invertible relationship between the eigenvalues of the measurements and those of their covariance matrix. To the best of our knowledge, this is the first time RMT has been applied to power system state estimation.

The contributions of this paper are listed as follows:

(1) A crucial drawback of traditional state estimation methods is analyzed: WLS does not take the effect of measurement error into account. The relationship among measurement error, residual and estimated error is discussed quantitatively.

(2) We propose a Hermitian matrix construction method to extend the application scope of previous noise-cleaning methods. Most previous RMT-based noise-cleaning papers (e.g., [19, 20, 21]) only eliminate noise in covariance matrices and do not attempt to clean errors in the original data of power systems. The Hermitian matrix construction is designed to adapt the RMT-based error-cleaning scheme to power system measurements.

(3) A two-stage state estimation framework is proposed, in which a matrix-level cleaning method is first used to obtain more reasonable measured values, and WLS is then employed to calculate the state vector. This framework addresses what previous state estimation methods neglect and greatly improves the accuracy of state estimation.

The rest of this paper is organized as follows. Section II introduces the problem statement and current drawbacks. Section III describes the proposed methodology. Section IV presents case studies, and Section V summarizes our work.

II Problem Statement

II-A State Estimation

The measurement model of a power system is:

$$\mathbf{z} = \mathbf{h}(\mathbf{x}) + \mathbf{e}\tag{1}$$

where $\bf{z}$ denotes the measurement vector. When $\bf{z}$ comes from SCADA, it usually contains power flows, nodal power injections and nodal voltage magnitudes. $\bf{x}$ represents the state vector, which includes nodal voltage magnitudes and voltage angles. $\bf{h}(\cdot)$ denotes the nonlinear function relating $\bf{z}$ to $\bf{x}$ , and $\bf{e}$ is the vector of measurement errors, whose elements are generally assumed to follow Gaussian distributions. State estimation is usually treated as a typical WLS problem, in which the objective function is:

$$J = \left\|(\mathbf{z}-\mathbf{h}(\mathbf{x}))^{T}\,\mathbf{W}\,(\mathbf{z}-\mathbf{h}(\mathbf{x}))\right\|^{2}\tag{2}$$

where $||\cdot||^{2}$ is the squared $l_{2}$ norm. ${\bf{W}}=\mathrm{diag}\{1/\sigma_{1}^{2},1/\sigma_{2}^{2},1/\sigma_{3}^{2},\cdots,1/\sigma_{n}^{2}\}$ is a diagonal weight matrix whose elements are the reciprocals of the variances of the measurement errors. $\sigma_{i}^{2}$ denotes the variance of the Gaussian error for the $i$ - $th$ measured variable.

Function (2) can be minimized by iteratively solving the well-known normal equation:

$$\mathbf{H}^{T}\mathbf{W}\mathbf{H}\,\Delta\mathbf{x} = \mathbf{H}^{T}\mathbf{W}\,(\mathbf{z}-\mathbf{h}(\mathbf{x}))\tag{3}$$

where $\bf{H=\partial h(x)/\partial x}$ is the Jacobian matrix of $\bf{h(x)}$ with respect to $\bf{x}$ . Upon convergence, an estimated state vector $\bf{\hat{x}}$ is obtained, after which gross error detection techniques are applied.
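The Gauss-Newton iteration on the normal equation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the measurement function `h_toy`, its Jacobian `jac_toy`, and all numerical values are hypothetical stand-ins for a real power-flow model, and the right-hand side is taken as $\mathbf{H}^{T}\mathbf{W}(\mathbf{z}-\mathbf{h}(\mathbf{x}))$, the standard WLS form.

```python
import numpy as np

def wls_gauss_newton(z, h, jac, sigma, x0, tol=1e-10, max_iter=50):
    """Minimize (z - h(x))^T W (z - h(x)) with W = diag(1 / sigma^2)."""
    W = np.diag(1.0 / sigma**2)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        H = jac(x)                  # Jacobian of h at the current iterate
        r = z - h(x)                # current residual
        # Normal equation: (H^T W H) dx = H^T W r
        dx = np.linalg.solve(H.T @ W @ H, H.T @ W @ r)
        x += dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Hypothetical toy model (NOT a power-flow model): 3 measurements, 2 states.
def h_toy(x):
    return np.array([x[0] * x[1], x[0] ** 2, x[0] + x[1]])

def jac_toy(x):
    return np.array([[x[1], x[0]], [2.0 * x[0], 0.0], [1.0, 1.0]])

x_true = np.array([1.02, 0.97])
sigma = np.array([0.01, 0.01, 0.02])
z_clean = h_toy(x_true)             # noise-free measurements
x_hat = wls_gauss_newton(z_clean, h_toy, jac_toy, sigma, x0=[1.0, 1.0])
```

With noise-free measurements the iteration recovers `x_true`; once noise is added to `z_clean`, the estimate follows the noisy measurements instead, which is the issue discussed in the next subsection.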

II-B Issue of Residual

It is clear that we have three vectors of measured variables: measurements $\bf{z}$ , true values $\bf{h(x)}$ and estimated values $\bf{\hat{z}=h(\hat{x})}$ . The residual is defined as the difference vector between measurements $\bf{z}$ and estimated values $\bf{h(\hat{x})}$ :

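$$\mathbf{r} = \mathbf{z} - \mathbf{h}(\hat{\mathbf{x}})\tag{4}$$

The vector of measurement error is the difference between measurements $\bf{z}$ and real values $\bf{h(x)}$ :

$$\mathbf{e} = \mathbf{z} - \mathbf{h}(\mathbf{x})\tag{5}$$

The vector of estimated error is the difference between real values $\bf{h(x)}$ and estimated values $\bf{h(\hat{x})}$ :

$$\mathbf{R_{e}} = \mathbf{h}(\hat{\mathbf{x}}) - \mathbf{h}(\mathbf{x})\tag{6}$$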

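To guide the optimization process, a distance function $L(\cdot)$ is required to measure the size of these difference vectors, such as the $l_{2}$ norm or the $l_{1}$ norm. The distance function $L(\cdot)$ must satisfy three requirements: (1) the triangle inequality, i.e., $L(a+b)\leq L(a)+L(b)$ ; (2) symmetry, $L(-a)=L(a)$ ; (3) non-negativity, $L(\cdot)\geq 0$ . The estimated error in (6) can then be bounded:

$$\begin{aligned} L(\mathbf{W}\mathbf{R_{e}}) &= L\big(\mathbf{W}(\mathbf{h}(\hat{\mathbf{x}}) - \mathbf{h}(\mathbf{x}))\big) \\ &= L\big(\mathbf{W}(\mathbf{h}(\hat{\mathbf{x}}) - \mathbf{z} + \mathbf{z} - \mathbf{h}(\mathbf{x}))\big) \\ &\leq L\big(\mathbf{W}(\mathbf{z} - \mathbf{h}(\hat{\mathbf{x}}))\big) + L\big(\mathbf{W}(\mathbf{z} - \mathbf{h}(\mathbf{x}))\big) \\ &= L(\mathbf{W}\mathbf{r}) + L(\mathbf{W}\mathbf{e}) \end{aligned}\tag{7}$$

where $\bf{W}$ is defined as in function (2). According to function (7), the estimated error $\bf{R_{e}}$ is governed by two components: the residual $\bf{r}$ and the measurement error $\bf{e}$ . However, WLS fails to consider the influence of $\bf{e}$ , so its estimation accuracy is inherently limited.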
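The bound on the estimated error can be checked numerically. The sketch below uses synthetic vectors as stand-ins for $\mathbf{h}(\mathbf{x})$ and $\mathbf{h}(\hat{\mathbf{x}})$ (all values are hypothetical) and verifies $L(\mathbf{W}\mathbf{R_e}) \leq L(\mathbf{W}\mathbf{r}) + L(\mathbf{W}\mathbf{e})$ for the $l_1$ and $l_2$ norms.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 8                                        # number of measured variables
sigma = rng.uniform(0.01, 0.05, size=m)
W = np.diag(1.0 / sigma**2)

h_true = rng.normal(size=m)                  # stands in for h(x)
e = sigma * rng.normal(size=m)               # measurement error
z = h_true + e                               # measurements
h_est = h_true + 0.01 * rng.normal(size=m)   # stands in for h(x_hat)

r = z - h_est                                # residual
R_e = h_est - h_true                         # estimated error, equal to e - r

bound_holds = {}
for p in (1, 2):
    lhs = np.linalg.norm(W @ R_e, p)
    rhs = np.linalg.norm(W @ r, p) + np.linalg.norm(W @ e, p)
    bound_holds[p] = bool(lhs <= rhs + 1e-12)
```

Because $\mathbf{R_e} = \mathbf{e} - \mathbf{r}$, driving $\mathbf{r}$ to zero alone leaves $\mathbf{R_e} = \mathbf{e}$: the estimate inherits the full measurement error.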

To support the above analysis, a quick test using WLS is run on the IEEE 30-bus system, which contains 254 measured variables (including power flows, nodal power injections, and voltage magnitudes) and 60 state variables (nodal voltage magnitudes and angles) in total. In the simulation, zero-mean Gaussian measurement errors are added, whose variance is 5% of the original power flows and 1% of the original voltage magnitudes. The initial values of voltage magnitudes and angles are randomly sampled from the Gaussian distributions $\mathcal{N}(1,0.05)$ and $\mathcal{N}(0,0.157)$ , respectively.

The residual and the estimated error of every state variable are plotted in Fig. 1. Although the residuals of most variables have been reduced well, the estimates of some measured variables are still unsatisfactory, showing that the existence of measurement error cannot be ignored. To address this issue, we propose a data-driven method to process measurement error.

III Proposed Methodology

Our method to clean measurements in matrix-level involves two main parts: Hermitian matrix construction and RMT based error cleaning (RBEC). The Hermitian matrix construction is responsible for recovering the eigenvalues of the measurement matrix from those of its covariance matrix. And the purpose of RBEC is to clean the covariance matrix through shrinking its eigenvalues.

Assumptions: (1) The topology and parameters of the estimated power system are available, which is a necessary condition for estimating state variables.

(2) The variance of measurement error can be estimated. This assumption, shared with most studies on power system state estimation, rests on the many available approaches for estimating the variance of measurement noise from pseudo-measurements (historical data), including the direct method [22], statistical inference [23], empirical Bayes estimation [24] and covariance matrix analysis [25][26]. For instance, the variance of the individual model $z_{i}=h(x_{i})+e_{i}$ can be written as $\sigma^{2}_{z}=\sigma^{2}_{h}+\sigma^{2}_{e}$ , where $\sigma^{2}_{h}$ is the variance of normal fluctuations in the operating state, and $\sigma^{2}_{e}$ denotes the variance of measurement error. The most common method is therefore to take a large number of independent repeated samples over a time period during which the operating state does not change much ( $\sigma_{h}\approx 0$ ). The bias of measurement error can likewise be estimated by averaging the signed error (i.e., $b_{i}=\mathbb{E}(z_{i}-\hat{z}_{i})$ ). Readers can refer to [22] for more information about methods to estimate the variance and bias of measurement error.

(3) The measurement noise is Gaussian distributed. This assumption is based on the central limit theorem: the sum of many independent random variables tends to a Gaussian variable even if the individual variables are not Gaussian distributed. Measurement error is caused by the accumulation of uncertainties of measuring instruments, communication noise and other unclear randomness, so it is generally modeled as a Gaussian distribution. If measurement error is strictly Gaussian distributed, our method obtains the optimal result. In practice, however, measurement error may not strictly follow a Gaussian distribution, because some unmodeled factors may slightly change it. Such a violation of the Gaussian hypothesis reduces the effectiveness of our approach. To illustrate the performance empirically for non-Gaussian distributions, experiments with different distributions of measurement error are reported in the case studies.
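The repeated-sampling idea in assumption (2) can be illustrated with a short simulation. This is only a sketch under simplifying assumptions: the operating state is held exactly constant ( $\sigma_{h}=0$ ), a known reference value `z_ref` stands in for $\hat{z}_{i}$, and all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
z_ref = 0.85                          # hypothetical reference value (p.u.)
sigma_true, bias_true = 0.02, 0.01    # hypothetical noise std and bias

# Repeated samples of one measured variable at a fixed operating state.
samples = z_ref + bias_true + sigma_true * rng.normal(size=5000)

sigma_hat = samples.std(ddof=1)       # estimate of sigma_e
bias_hat = np.mean(samples - z_ref)   # estimate of b_i = E(z_i - z_ref)
```

In a real system the operating state drifts, so the sampling window has to be short enough for $\sigma_{h}\approx 0$ to hold.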

III-A Normalization and Matrix Formation

Our method cleans measurement noise at the matrix level, because Gaussian measurement error in matrix form possesses useful analytical properties in the distribution of its eigenvalues. Before forming a matrix, we normalize the measurement error of each measured variable by:

$$\tilde{z}_{i} = \frac{z_{i}-b_{i}}{\sigma_{i}} = (z_{i}-b_{i})\sqrt{w_{i}}\tag{8}$$

where $z_{i}$ denotes the $i$ - $th$ variable in the measurement vector, $\sigma_{i}$ represents the standard deviation of its measurement error $e_{i}$ , $b_{i}$ is the bias of $e_{i}$ , and $w_{i}=1/\sigma_{i}^{2}$ is the $i$ - $th$ weight coefficient in WLS. Then we use a split sample window to form an $N\times T$ ( $N<T$ ) measurement matrix ${\bf{Z}}\in\mathbb{R}^{N\times T}$ :

$$\mathbf{Z} = [\tilde{\mathbf{z}}_{k-N+1},\ \tilde{\mathbf{z}}_{k-N+2},\ \ldots,\ \tilde{\mathbf{z}}_{k-1},\ \tilde{\mathbf{z}}_{k}]\tag{9}$$

where $\bf{\tilde{z}_{k}}$ , by slight abuse of notation, is the normalized current measurement vector. $N$ is the number of rows (the number of sampled variables); each row represents one variable sampled from SCADA. The number of columns $T$ is the number of consecutive samples over a period of time.

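Assume that the matrix of real values ${\bf{H}}\in\mathbb{R}^{N\times T}$ corresponding to $\bf{Z}$ is:

$$\mathbf{H} = [\widetilde{\mathbf{h}(\mathbf{x})}_{k-N+1},\ \widetilde{\mathbf{h}(\mathbf{x})}_{k-N+2},\ \ldots,\ \widetilde{\mathbf{h}(\mathbf{x})}_{k-1},\ \widetilde{\mathbf{h}(\mathbf{x})}_{k}]\tag{10}$$

where $\bf{\widetilde{h(x)}_{k}}$ is the vector of real values after normalization. Then we have the model:

$$\mathbf{Z} = \mathbf{H} + \mathbf{G}\tag{11}$$

where $\bf{G}$ is the normalized Gaussian matrix whose entries are independent and identically distributed (i.i.d.) with zero mean and unit variance.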
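The normalization and matrix formation steps can be sketched with synthetic data. The function name `build_measurement_matrix` and all values are hypothetical; the sketch assumes the per-variable standard deviation `sigma` and bias `bias` are already known, as the assumptions above require.

```python
import numpy as np

def build_measurement_matrix(samples, sigma, bias):
    """samples: (T, N) array, one snapshot per row, most recent last.
    Returns the N x T matrix of normalized measurements."""
    samples = np.asarray(samples, dtype=float)
    z_tilde = (samples - bias) / sigma   # normalize each measured variable
    return z_tilde.T                     # rows = variables, columns = time

rng = np.random.default_rng(1)
N, T = 50, 60                            # N < T, as required
sigma = rng.uniform(0.01, 0.05, size=N)
bias = rng.uniform(-0.03, 0.03, size=N)
truth = rng.normal(1.0, 0.1, size=(T, N))        # synthetic true values
samples = truth + bias + sigma * rng.normal(size=(T, N))

Z = build_measurement_matrix(samples, sigma, bias)
H = (truth / sigma).T                    # normalized true values
G = Z - H                                # should be close to i.i.d. N(0, 1)
```

After normalization every entry of `G` has approximately zero mean and unit variance regardless of the original `sigma`, which is what allows random matrix results for standard Gaussian matrices to be applied.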

III-B Hermitian Matrix Construction

We construct a Hermitian matrix in order to use the following property: the absolute values of the eigenvalues of a Hermitian matrix are equal to its singular values. Here the Hermitian matrix ${\bf{D_{Z}}}\in\mathbb{R}^{(N+T)\times(N+T)}$ is constructed by:

$$\mathbf{D_{Z}} = \begin{bmatrix}\mathbf{0} & \mathbf{Z}\\ \mathbf{Z}^{H} & \mathbf{0}\end{bmatrix}\tag{12}$$

where ${\bf{Z}}^{H}$ is the conjugate transpose of $\bf{Z}$ . Then the covariance matrix of $\bf{D_{Z}}$ is:

$$\mathbf{F_{Z}} = \begin{bmatrix}\mathbf{Z}\mathbf{Z}^{H} & \mathbf{0}\\ \mathbf{0} & \mathbf{Z}^{H}\mathbf{Z}\end{bmatrix}\tag{13}$$

Another important property of the matrix $\bf{D_{Z}}$ is that if $\lambda_{D}$ is an eigenvalue of $\bf{D_{Z}}$ , its negative $-\lambda_{D}$ is also an eigenvalue. From this and the property of Hermitian matrices introduced above, the relationship between the eigenvalues of $\bf{D_{Z}}$ and $\bf{F_{Z}}$ is:

$$\lambda_{D_{Z}} = \pm\sqrt{\lambda_{F_{Z}}}\tag{14}$$

where $\lambda_{D_{Z}}$ and $\lambda_{F_{Z}}$ denote the eigenvalues of $\bf{D_{Z}}$ and $\bf{F_{Z}}$ , respectively. ${\bf{ZZ}}^{H}$ and ${\bf{Z}}^{H}{\bf{Z}}$ have the same nonzero eigenvalues, and ${\bf{Z}}^{H}{\bf{Z}}$ has $T-N$ additional zero eigenvalues when $N<T$ , so the eigenvalues of $\bf{F_{Z}}$ and ${\bf{ZZ}}^{H}$ are invertibly related:

$$\lambda_{F_{Z}} = \begin{cases}\lambda_{ZZ^{H}}, & \lambda_{F_{Z}} \neq 0\\ 0, & \lambda_{F_{Z}} = 0\end{cases}\tag{15}$$

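In the same way as for $\bf{Z}$ , for the matrix of real values $\bf{H}$ we have:

$$\mathbf{D_{H}} = \begin{bmatrix}\mathbf{0} & \mathbf{H}\\ \mathbf{H}^{H} & \mathbf{0}\end{bmatrix},\qquad \mathbf{F_{H}} = \begin{bmatrix}\mathbf{H}\mathbf{H}^{H} & \mathbf{0}\\ \mathbf{0} & \mathbf{H}^{H}\mathbf{H}\end{bmatrix}\tag{16}$$

and:

$$\begin{aligned}\lambda_{D_{H}} &= \pm\sqrt{\lambda_{F_{H}}}\\ \lambda_{F_{H}} &= \begin{cases}\lambda_{HH^{H}}, & \lambda_{F_{H}} \neq 0\\ 0, & \lambda_{F_{H}} = 0\end{cases}\end{aligned}\tag{17}$$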

where $\lambda_{D_{H}}$ , $\lambda_{F_{H}}$ and $\lambda_{HH^{H}}$ denote the eigenvalues of $\bf{D_{H}}$ , $\bf{F_{H}}$ and ${\bf{HH}}^{H}$ , respectively.

The overall computing flow of eigenvalues is shown in Fig. 2, in which we aim to connect the eigenvalues of $\bf{Z}$ and $\bf{H}$ . For now, except for the calculation from ${\bf{ZZ}}^{H}$ to ${\bf{HH}}^{H}$ (the red arrow in Fig. 2), all procedures in this eigenvalues flow are available.
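The eigenvalue relationships of this subsection can be verified numerically on a small random matrix. The sketch assumes the block anti-diagonal form of $\bf{D_{Z}}$ with $\bf{Z}$ in the upper-right block and a real-valued $\bf{Z}$, so the conjugate transpose is the ordinary transpose.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 4, 6
Z = rng.normal(size=(N, T))

# Hermitian construction and its covariance matrix.
D = np.block([[np.zeros((N, N)), Z], [Z.T, np.zeros((T, T))]])
F = D @ D.T

sv = np.linalg.svd(Z, compute_uv=False)    # the N singular values of Z
eig_D = np.sort(np.linalg.eigvalsh(D))
eig_F = np.sort(np.linalg.eigvalsh(F))
eig_ZZt = np.sort(np.linalg.eigvalsh(Z @ Z.T))

# Eigenvalues of D come in +/- pairs equal to the singular values of Z,
# plus T - N zeros; eigenvalues of F are their squares.
expected_D = np.sort(np.concatenate([sv, -sv, np.zeros(T - N)]))
expected_F = np.sort(np.concatenate([sv**2, sv**2, np.zeros(T - N)]))
```

Here `eig_ZZt` equals `sv**2`, so knowing the eigenvalues of ${\bf{ZZ}}^{H}$ is enough to recover those of $\bf{F_{Z}}$ and, up to sign, those of $\bf{D_{Z}}$.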

III-C RMT Based Error Cleaning

We now describe how to estimate the eigenvalues of ${\bf{HH}}^{H}$ from ${\bf{ZZ}}^{H}$ by RBEC and how to estimate the real values of the measured variables. For ease of reading, the related mathematical equations of RMT are presented in Appendix A.

We denote the covariance matrices by ${\bf{E=ZZ}}^{H}$ and ${\bf{C=HH}}^{H}$ . Then model (11) can be expressed in the form of covariance matrices:

$$\mathbf{E} = (\mathbf{H}+\mathbf{G})(\mathbf{H}+\mathbf{G})^{H} = \mathbf{C} + \mathbf{G}\mathbf{G}^{H} + \mathbf{H}\mathbf{G}^{H} + \mathbf{G}\mathbf{H}^{H} = \mathbf{C} + \mathbf{B}\tag{18}$$

where we denote the last three terms ${\bf{GG}}^{H}+{\bf{HG}}^{H}+{\bf{GH}}^{H}$ by $\bf{B}$ . Let $c_{i}$ , $\bf{v_{li}}$ and $\bf{v_{ri}}$ denote the $i$ - $th$ eigenvalue, left eigenvector and right eigenvector of $\bf{C}$ , respectively. $\lambda_{i}$ , $\bf{u_{li}}$ and $\bf{u_{ri}}$ denote the $i$ - $th$ eigenvalue, left eigenvector and right eigenvector of $\bf{E}$ , respectively. $\xi_{i}$ is the $i$ - $th$ estimated eigenvalue corresponding to $\lambda_{i}$ . The objective function of our RBEC is:

$$\begin{aligned}\min_{\Gamma(\mathbf{E})}\ &\left\|\Gamma(\mathbf{E})-\mathbf{C}\right\|^{2}\\ &= \mathrm{Tr}\left[(\Gamma(\mathbf{E})-\mathbf{C})^{2}\right]\end{aligned}\tag{19}$$

where $\Gamma(\bf{E})$ is the estimate of $\bf{C}$ from $\bf{E}$ . Since $\Gamma(\bf{E})$ and $\bf{C}$ are both symmetric matrices, the squared $l_{2}$ (Frobenius) norm of $\Gamma\bf{(E)-C}$ is equal to the trace of its square, as the second line in function (19) shows. RMT establishes the asymptotically deterministic behavior of the eigenvalue distribution of Gaussian error matrices by using multiple transforms (see Appendix A). The eigenvalue distributions of the matrices in $\bf{B}$ can therefore be connected analytically with those of $\bf{E}$ and $\bf{C}$ , and the trace (equal to the sum of the eigenvalues) of the matrix $\bf{C}$ can be approximated.

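To solve this optimization problem, similar to the family of rotational invariant methods [27], we assume that $\Gamma(\bf{E})$ and $\bf{E}$ share the same eigenvectors. Here we have:

$$\Gamma(\mathbf{E}) = \sum_{i=1}^{N}\xi_{i}\,\mathbf{u}_{li}\mathbf{u}_{ri}^{T}\tag{20}$$

In other words, the eigenvectors are fixed while the eigenvalues are free variables in this optimization. Therefore, it is easy to find the optimal solution:

$$\xi_{i} = \sum_{j=1}^{N} c_{j}\,(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}\tag{21}$$

where $(\cdot)$ denotes the inner product between two vectors. Then the expectation of $(v_{lj}\cdot u_{li})^{2}$ is considered as a more reasonable estimator:

$$\xi_{i} = \sum_{j=1}^{N} c_{j}\,\mathbb{E}\left[(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}\right]\tag{22}$$

Although this framework only involves cleaning the eigenvalues, the eigenvectors are also taken into account in this optimization problem, whose optimal solution (22) can be regarded as converting the task of cleaning the eigenvectors into cleaning the eigenvalues. The $\mathbb{E}[{\bf{(v_{lj}\cdot u_{li}}})^{2}]$ can be expressed analytically as in [28]:

$$\mathbb{E}\left[(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}\right] = \frac{q\,\alpha(\lambda_{i})^{2}\,\big(\lambda_{i}\alpha(\lambda_{i}) + c_{i}\big)}{\beta(\lambda_{i})^{2} + \gamma(\lambda_{i})^{2}}\tag{23}$$

where:

$$\begin{aligned}\alpha(\lambda_{i}) &= (1 - q h_{E}(\lambda_{i}))^{2} + q^{2}\pi^{2}\rho_{E}^{2}(\lambda_{i})\\ \beta(\lambda_{i}) &= (\lambda_{i}\alpha(\lambda_{i}) + c_{i})(1-q)\,h_{E}(\lambda_{i}) - \alpha(\lambda_{i})(1-q)\\ \gamma(\lambda_{i}) &= (\lambda_{i}\alpha(\lambda_{i}) + c_{i})\,q\pi\rho_{E}(\lambda_{i})\end{aligned}\tag{24}$$

where $h_{E}(\lambda_{i})$ is the real part of the Stieltjes transform (31), and $\rho_{E}(\lambda_{i})$ is obtained from (32).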
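The optimal solution $\xi_{i}=\sum_{j}c_{j}(\mathbf{v}_{lj}\cdot\mathbf{u}_{li})^{2}$ can be checked on synthetic data where the clean matrix $\bf{C}$ is known. This "oracle" check is only an illustration: in practice $\bf{C}$ is unknown, which is why the expectation of the overlaps is used instead. The low-rank signal `H` below is a hypothetical choice, and for real symmetric matrices the left and right eigenvectors coincide.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 30, 45
H = rng.normal(size=(N, 3)) @ rng.normal(size=(3, T))  # hypothetical signal
G = rng.normal(size=(N, T))                            # unit-variance noise
C = H @ H.T
E = (H + G) @ (H + G).T

lam, U = np.linalg.eigh(E)        # eigenpairs of the noisy covariance
c, V = np.linalg.eigh(C)          # eigenpairs of the clean covariance

overlaps = (V.T @ U) ** 2         # overlaps[j, i] = (v_j . u_i)^2
xi = c @ overlaps                 # xi_i = sum_j c_j (v_j . u_i)^2

def frob_err(vals):
    """Frobenius error of U diag(vals) U^T as an estimate of C."""
    return np.linalg.norm(U @ np.diag(vals) @ U.T - C, "fro")

err_oracle = frob_err(xi)         # eigenvectors of E, oracle eigenvalues
err_raw = frob_err(lam)           # eigenvectors of E, raw eigenvalues
```

`xi` coincides with the diagonal of `U.T @ C @ U`, so no other choice of eigenvalues on the fixed eigenvectors of `E` gives a smaller Frobenius error than `err_oracle`.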

Combining (22) and (23), the cleaning function for the eigenvalues is obtained (derivations are shown in Appendix B):

$$\xi_{i} = (1 - q h_{E}(\lambda_{i}))\big(\lambda_{i} - (1-q) - 2q\lambda_{i}h_{E}(\lambda_{i})\big) + q\,\varphi(\lambda_{i})\tag{25}$$

where:

$$\varphi(\lambda_{i}) = 1 - h_{E}(\lambda_{i})\big(\lambda_{i} - (1-q)\big) - q\lambda_{i}\big(\pi^{2}\rho_{E}^{2}(\lambda_{i}) - h_{E}^{2}(\lambda_{i})\big)\tag{26}$$

By functions (25) and (26), we obtain the eigenvalues of the cleaned covariance matrix $\Gamma(\bf{E})$ ; the cleaned eigenvalues of the matrices $\bf{F_{Z}}$ and $\bf{D_{Z}}$ are then obtained successively by (17). In the same way as $\bf{C}$ is reconstructed from $\bf{E}$ , we reconstruct $\bf{D_{H}}$ from $\bf{D_{Z}}$ :

$$\Gamma(\mathbf{D_{Z}}) = \sum_{i=1}^{N+T}\xi_{Di}\,\mathbf{u}_{lDi}\mathbf{u}_{rDi}^{T}\tag{27}$$

where $\xi_{Di}$ , $\bf{u_{lDi}}$ and $\bf{u_{rDi}}$ denote the estimated $i$ - $th$ eigenvalue of $\bf{D_{H}}$ and the corresponding left and right eigenvectors of $\bf{D_{Z}}$ . Finally, the cleaned matrix $\Gamma(\bf{Z})$ is a block of $\Gamma(\bf{D_{Z}})$ , from which we obtain the cleaned measurements $\gamma(\bf{z})$ . The whole calculation flow of RBEC is shown in Fig. 3.

Remark 1: RBEC works at the matrix level, so the measurement noise in every entry of the matrix $\bf{Z}$ is cleaned jointly. In this paper, historical data is used to form the matrix $\bf{Z}$ and is cleaned together with the current data, so the method is an ensemble processing of measurement data. For convenience, only the current measurements are extracted for the next stage.

Remark 2: Although we construct the Hermitian matrix $\bf{F_{Z}}$ , we clean the eigenvalues of $\bf{E}$ rather than those of $\bf{F_{Z}}$ , because the R-transform of $\bf{F_{Z}-F_{H}}$ has no analytical form.
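The reconstruction step, replacing the eigenvalues of $\bf{D_{Z}}$ while keeping its eigenvectors and then reading the cleaned $\bf{Z}$ from a block of the result, can be sketched as follows. The shrinkage rule used here is a simple soft threshold chosen only as a placeholder; it is not the RBEC cleaning function, which requires the Stieltjes transform quantities from the appendix. The function name `rebuild_from_hermitian` and the threshold `tau` are hypothetical, and $\bf{Z}$ is assumed real-valued and placed in the upper-right block of $\bf{D_{Z}}$.

```python
import numpy as np

def rebuild_from_hermitian(Z, clean_sv):
    """Replace the singular values of Z through the eigen-decomposition of
    D_Z = [[0, Z], [Z^T, 0]]. clean_sv must map 0 to 0."""
    N, T = Z.shape
    D = np.block([[np.zeros((N, N)), Z], [Z.T, np.zeros((T, T))]])
    lam, Q = np.linalg.eigh(D)
    # Each eigenvalue +/- s is mapped to +/- clean_sv(s).
    xi = np.sign(lam) * clean_sv(np.abs(lam))
    D_clean = Q @ np.diag(xi) @ Q.T
    return D_clean[:N, N:]        # upper-right block is the cleaned Z

rng = np.random.default_rng(6)
Z = rng.normal(size=(5, 8))
tau = 0.5                         # hypothetical threshold

def soft(s):
    return np.maximum(s - tau, 0.0)   # placeholder shrinkage rule

Z_clean = rebuild_from_hermitian(Z, soft)

# Same result obtained directly from the SVD of Z.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
Z_svd = U @ np.diag(soft(s)) @ Vt
```

The agreement between `Z_clean` and `Z_svd` shows that cleaning the eigenvalues of the Hermitian matrix is equivalent to cleaning the singular values of the measurement matrix itself.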

III-D Two-stage State Estimation

After RBEC, the cleaned measurement vector $\gamma(\bf{z})$ is passed to WLS to estimate the state variables. The result is a two-stage state estimation method, named RBEC-WLS (R-WLS), in which RBEC and WLS operate in succession. The entire process of our method is shown in Algorithm 1.

III-E Boundary Condition

We now list the boundary conditions of our matrix-level cleaning method. The first condition is that the size of $\bf{Z}$ must follow the Kolmogorov limit: (1) $N\rightarrow\infty$ and $T\rightarrow\infty$ : the number of rows $N$ and columns $T$ should be sufficiently large. (2) $N\sim O(T)$ with $N<T$ : $N$ must be comparable to $T$ [29]. If $N>T$ or $T\gg N$ , the performance of our method is unsatisfactory, as discussed in the case studies.

The second boundary condition is that the variance of measurement error can be estimated. Since our method relies on the deterministic behavior of the eigenvalues of large-dimensional Gaussian random matrices, measurement errors with different variances need to be normalized before cleaning.

In addition, the other common settings of static state estimation are required, such as knowledge of the topology and system parameters and the Gaussian assumption on measurement error.

III-F More Discussions

The underlying mechanism of our method is the well-known M-P law (see Appendix A). The M-P law reveals that the eigenvalue distribution of a Gaussian covariance matrix ${\bf{GG}}^{H}/T$ converges asymptotically to a deterministic probability distribution. The Stieltjes transform, R-transform and S-transform (see Appendix A) are tools with which the eigenvalues can be analyzed theoretically. Specifically, the S-transform (35) allows us to approximate the theoretical eigenvalue distribution of matrix products, such as ${\bf{GG}}^{H}$ , ${\bf{HG}}^{H}$ and ${\bf{GH}}^{H}$ , and the R-transform (33) allows us to compute analytically the eigenvalue distribution of the sum of these matrices. Through these transforms, the eigenvalues of the population matrix $\bf{C}$ are connected with those of $\bf{E}$ , and the overlaps between arbitrary eigenvectors of $\bf{E}$ and $\bf{C}$ are computable [28]. An optimization scheme is then designed to convert the task of cleaning the whole covariance matrix into cleaning its eigenvalues. The solution to this optimization problem involves only the eigenvalues and the overlaps of the eigenvectors, which are available through these transforms, and this yields the cleaning equation (25). Because this scheme cleans the covariance matrix, we propose a Hermitian matrix construction approach that enables it to clean the original measurements of power systems.
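The deterministic behavior described by the M-P law can be observed in a short simulation. The sketch below assumes unit-variance entries and the ratio convention `ratio = N / T` (less than 1), under which the limiting support is $[(1-\sqrt{N/T})^{2},\ (1+\sqrt{N/T})^{2}]$; the matrix sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 400, 800
ratio = N / T                      # assumed convention: rows / columns

G = rng.normal(size=(N, T))        # i.i.d. zero-mean, unit-variance entries
eigs = np.linalg.eigvalsh(G @ G.T / T)

# Edges of the Marchenko-Pastur support for unit variance.
a = (1.0 - np.sqrt(ratio)) ** 2
b = (1.0 + np.sqrt(ratio)) ** 2

inside = np.mean((eigs >= a) & (eigs <= b))   # fraction inside [a, b]
mean_eig = eigs.mean()                        # should be close to 1
```

Although every entry of `G` is random, nearly all eigenvalues fall inside the deterministic interval `[a, b]`. Eigenvalues of a measurement covariance matrix that lie far outside this interval therefore carry signal rather than noise, which is the property the cleaning scheme exploits.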

IV Case Study

In this section, the accuracy and effectiveness of the proposed approach are verified. The first case, on the European 1354-bus high voltage transmission system, elaborates the details of the calculation process. In the second case, we test our approach on multiple test systems with various magnitudes of measurement error. The third case shows the performance of our method for different size ratios $q=T/N$ of the measurement matrix $\bf{Z}$ . In the fourth case, the measurement noise follows non-Gaussian distributions. The fifth case demonstrates that our method remains valid when the sample variables are divided into matrices of appropriate size, in order to cope with a very large number of measured variables. In the sixth case, other noise-cleaning methods are compared with our approach. The seventh case demonstrates that our method depends little on the operating state of power systems, and the final case examines its performance when the variance of measurement error is estimated inaccurately.

IV-A Detailed Process

This case is tested on the European 1354-bus high voltage transmission system, which has 1354 buses, 1991 branches and 260 generators. In the simulation, Gaussian measurement errors with randomly selected bias $b_{i}\sim\mathcal{U}(-0.03,0.03)$ are added, whose variance is 5% of the original values for power flows and 1% for voltage magnitudes. The initial values of voltage magnitudes for the iteration of WLS are randomly sampled from the Gaussian distribution $\mathcal{N}(1,0.05)$ , while the initial values of voltage angles are chosen from $\mathcal{N}(0,0.157)$ . The results are the average of ten identical experiments. The measurements from SCADA include branch active and reactive power from bus (1991+1991 variables), branch active and reactive power to bus (1991+1991 variables), nodal injective active and reactive power (1354+1354 variables) as well as nodal voltage magnitudes (1354 variables). Thus the size of the measurement vector is $12015\times 1$ , and we choose $1.2\times 12015=14418$ recent historical measurement vectors to construct a $12015\times 14418$ matrix.
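The construction of the measurement matrix from recent history can be sketched as follows. This is a minimal illustration under stated assumptions: the history is stored as an array with one row per measured variable and one column per snapshot, with the most recent snapshot last. The names `window_length` and `build_measurement_matrix` are hypothetical.

```python
import numpy as np

def window_length(n_vars, q=1.2):
    """Number of snapshots T for a given size ratio q = T / N."""
    return int(round(q * n_vars))

def build_measurement_matrix(history, q=1.2):
    """Keep the T most recent snapshots (columns) of an (N, n_snapshots) history."""
    history = np.asarray(history, dtype=float)
    n_vars = history.shape[0]
    T = window_length(n_vars, q)
    if history.shape[1] < T:
        raise ValueError("not enough historical snapshots for the requested q")
    return history[:, -T:]

# Toy example: 10 measured variables, 20 stored snapshots, q = 1.2.
toy_history = np.arange(10 * 20, dtype=float).reshape(10, 20)
Z = build_measurement_matrix(toy_history, q=1.2)
print(Z.shape)
print(window_length(12015, 1.2))
```

With $N=12015$ and $q=1.2$ the helper returns the window length of 14418 snapshots used in this case.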

Using Algorithm 1, the eigenvalues of $\bf{D_{Z}}$ , $\bf{F_{Z}}$ , $\bf{E}$ and the eigenvectors of $\bf{D_{Z}}$ can be calculated. Then we clean the eigenvalues of $\bf{E}$ by function (25), followed by obtaining the cleaned eigenvalues of $\bf{F_{Z}}$ and $\bf{D_{Z}}$ .

The cleaned results of the eigenvalues of $\bf{E}$ , $\bf{F_{Z}}$ and $\bf{D_{Z}}$ are shown in Fig. 4(a), Fig. 4(b) and Fig. 4(c) respectively. References [30] and [31] have demonstrated empirically that the measurement matrix in power systems is low-rank, which means that only a few eigenvalues (usually one or two) are far greater than zero, as the red lines reveal in Fig. $4\sim 6$ . The blue dashed lines, denoting the eigenvalues of the measurement matrix in Fig. $4\sim 6$ , are smoother curves (i.e., more eigenvalues are much greater than zero). Therefore, measurement noise breaks the low-rank property of the monitoring data matrix, yielding many disturbing eigenvalues (components). The main task of RBEC is to greatly reduce these disturbing eigenvalues without any knowledge of the true values, as the green lines reveal in Fig. $4\sim 6$ . The explicit results of the cleaned eigenvalues are listed in Table I. The mean absolute error (MAE) is used to measure the difference between two vectors:

$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|\xi_{i}-\lambda_{i}\right|\qquad(28)$$

where $\xi_{i}$ and $\lambda_{i}$ are the $i$-th elements of the two vectors and $n$ is their length. The root mean square error (RMSE) is not selected because it puts too much emphasis on large outliers: if one or two values have very large estimation errors, squaring them yields a very large RMSE, which then fails to reflect the overall difference between the two vectors. The mean absolute percent error (MAPE) is also dropped because it is severely influenced by small values: if one or two real values are very small, dividing by them results in a large MAPE even though the difference between the real values and the estimates is not great.
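The behaviour of the three metrics can be checked on toy vectors. The sketch below uses the standard definitions (MAE as the mean of absolute differences, RMSE as the root of the mean squared difference, MAPE as the mean absolute difference relative to the true value); the vectors are made up for illustration and are not data from the paper.

```python
import numpy as np

def mae(true, est):
    """Mean absolute error between two vectors."""
    return float(np.mean(np.abs(true - est)))

def rmse(true, est):
    """Root mean square error between two vectors."""
    return float(np.sqrt(np.mean((true - est) ** 2)))

def mape(true, est):
    """Mean absolute percent error, relative to the true values."""
    return float(np.mean(np.abs((true - est) / true)))

# One large outlier: RMSE is pulled towards the outlier much more than MAE.
true_a = np.array([1.0, 1.0, 1.0, 1.0])
est_a = np.array([1.01, 0.99, 1.01, 2.0])
mae_a, rmse_a = mae(true_a, est_a), rmse(true_a, est_a)

# One very small true value: MAPE explodes although every absolute error is 0.01.
true_b = np.array([1.0, 1.0, 1.0, 0.001])
est_b = np.array([1.01, 0.99, 1.01, 0.011])
mae_b, mape_b = mae(true_b, est_b), mape(true_b, est_b)

print(mae_a, rmse_a)
print(mae_b, mape_b)
```

In the first pair a single outlier roughly doubles the RMSE relative to the MAE; in the second pair every absolute error is 0.01, yet the MAPE exceeds 100% because of one small true value.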

As shown in Table I, the errors of the eigenvalues of $\bf{E}$ , $\bf{F_{Z}}$ and $\bf{D_{Z}}$ are greatly reduced. Then we construct the matrix $\Gamma(\bf{D_{Z}})$ by function (27), followed by obtaining $\Gamma(\bf{Z})$ and $\gamma(\bf{z})$ . Compared with the measurement vector $\bf{z}$ , whose MAE is $0.0395$ , the MAE of the cleaned vector $\gamma(\bf{z})$ is only $0.0082$ , showing that our RBEC is effective at cleaning measurement error.

The MAE of the state vector decreases from $0.0183$ to $0.0011$ , demonstrating that our method significantly improves the classic WLS based state estimation.

Then the estimation of measured variables is calculated by $\bf{\hat{z}=h(\hat{x})}$ . The estimation error of the measurement vector is shown in Table II, where Pt, Pf, Pb, Qt, Qf, Qb, Vm denote active power to bus, active power from bus, nodal injective active power, reactive power to bus, reactive power from bus, nodal injective reactive power and voltage magnitudes, respectively. Note that the estimation of measured variables $\bf{\hat{z}}$ , which is obtained after WLS, is not the cleaned measurement vector $\gamma(\bf{z})$ . Although the MAE of $\gamma(\bf{z})$ is smaller than that of $\bf{\hat{z}}$ in this case, $\bf{\hat{z}}$ is the more reasonable estimation result, because it is more consistent with the physical equations of the test system.

IV-A1 Problem of Residual

Our original intention in designing RBEC was to address a problem of WLS: the residual is only a part of the estimation error. As shown in Fig. 1, some estimation errors remain significant even when most residuals are shrunk to small values. To show that our R-WLS overcomes this problem, we employ the same simulation settings as the quick fact in the IEEE 30-bus system. As shown in Fig. 5, both residuals and estimation errors become very small.

IV-B Numerous Tests

The above case clarifies the detailed process of our approach. In this case, numerous tests on different power systems are conducted using different magnitudes of measurement error. The size ratio $q$ is set to $1.2$ in all tests. The initial values of voltage magnitudes are randomly sampled from the Gaussian distribution $\mathcal{N}(1,0.05)$ , while the initial values of voltage angles are chosen from $\mathcal{N}(0,0.157)$ . The IEEE 30-bus, 57-bus, 118-bus and 300-bus systems, the European 1354-bus high voltage transmission system, the Polish 3120-bus system at the summer 2008 morning peak, the French 6468-bus very high voltage and high voltage transmission network, and the European 9241-bus system are used to test our method. The explicit parameters of these systems can be found in [32]. In terms of measurement errors, 2.5%, 5%, 7.5% and 10% error for power flows and 0.5%, 1%, 1.5% and 2% for voltage magnitudes are set. The sizes of the matrices are listed in Table III, where the number of rows corresponds to the number of sample variables, and the number of columns represents the length of the split window. The explicit results are shown in Table IV.

According to Table IV, two important properties of our method are clearly revealed. First, as the system scale increases, the performance of our approach improves, whereas the performance of traditional WLS becomes worse. This is because some equations of RMT on which our method is based, such as the inverse Stieltjes transform (32) and the R-transform of Gaussian covariance matrices, hold only asymptotically as the matrix size increases to infinity. RMT derives these equations by assuming that the matrix size converges to the Kolmogorov limit, so a larger matrix leads to a result that is closer to the theoretical conclusion. Therefore, our approach is particularly suitable for large interconnected systems.

Secondly, the results degrade as the measurement error increases. A greater error makes it more difficult for WLS to find the optimal solution, and for our RBEC to clean the measurements. The numerous tests, using different systems and various magnitudes of measurement error, demonstrate the effectiveness and generality of our method.

IV-C Case with Different Size Ratios

As introduced above, the measurement matrix $\bf{Z}$ requires the number of rows $N$ to be comparable to the number of columns $T$ , i.e., $N\sim O(T)(N<T)$ or $q=T/N>1$ . It is therefore meaningful to test our approach with different values of $q$ , both when the size requirements are met and when they are not. In this simulation, $q$ ranges from 0.3 to 50, and other settings are the same as in the first case. The results are shown in Fig. 6.

According to Fig. 6, if $T\gg N$ , the performance of our RBEC becomes unsatisfactory, since many analytical equations of RMT, such as the M-P law (29) and the inverse Stieltjes transform (32), do not hold. See Appendix D for more information about the role this condition plays. Additionally, when $N>T$ , the performance is also unacceptable because the $N$ -dimensional noise space of measurement error cannot be reconstituted. Specifically, reconsidering models (11) and (18), the noise space is $N$ -dimensional since the stochastic measurement error of every sensor or substation (every row) is independent. However, the number of non-zero eigenvalues is $\min\{T,N\}$ . So if $T<N$ , RBEC can only span a $T$ -dimensional subspace instead of the entire noise space, which is not conducive to our approach.
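The statement about the number of non-zero eigenvalues can be verified numerically. The sketch below is an illustration on synthetic Gaussian matrices, not a reproduction of the experiment in Fig. 6; the function name `count_nonzero_eigs` and the matrix sizes are hypothetical.

```python
import numpy as np

def count_nonzero_eigs(N, T, seed=0, tol=1e-10):
    """Count eigenvalues of the N x N matrix G G^T / T that exceed a small tolerance."""
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(N, T))
    eigs = np.linalg.eigvalsh(G @ G.T / T)
    return int(np.sum(eigs > tol))

wide = count_nonzero_eigs(N=30, T=60)   # T > N: all N directions are excited
tall = count_nonzero_eigs(N=60, T=30)   # T < N: only T directions are excited
print(wide, tall)
```

Both calls report 30 non-zero eigenvalues: with $T>N$ the noise fills the whole $N$ -dimensional space, while with $T<N$ only a $T$ -dimensional subspace is spanned.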

Roughly speaking, the results are satisfactory and remain steady when $q$ ranges approximately from 1 to 8. Since a larger matrix $\bf{Z}$ consumes more computing resources, the most reasonable value of $q$ is slightly greater than 1.

IV-D Case with Different Noise Models

This subsection discusses the effectiveness of our method when measurement error is not Gaussian distributed. The mathematical derivation of our method is based on the assumption that measurement error follows a Gaussian distribution, since Gaussian noise is the most common model of unknown measurement error [22]. However, some unclear randomness may violate this assumption, leading to non-Gaussian noise. It is therefore necessary to discuss the performance of our approach in the case of non-Gaussian distributions of measurement error.

In this case, the Laplace distribution, semi-circle distribution (SC), symmetric-linear error (SL) and normal-inverse Gaussian distribution (NIG) are simulated, since they are somewhat similar to realistic noise models. The probability density functions (p.d.f.) and coefficients are listed in Table V. The coefficients are set so that the variance of these distributions equals $\sigma_{i}$ . The Monte Carlo method is used to generate samples from these distributions. This case uses the European 1354-bus system, and the averaged results of five tests are shown in Table VI. The Inc.Rat in Table VI denotes the increasing ratio of MAE, i.e., $Inc.Rat=(MAE_{W}-MAE_{R})/MAE_{W}$ .
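As an illustration of matching the variance of a non-Gaussian noise model to a target value, the sketch below draws zero-mean Laplace samples. It relies on the fact that a Laplace distribution with scale $b$ has variance $2b^{2}$ ; the helper name `laplace_noise` and the target variance are hypothetical, and the coefficients actually used in the paper are those of Table V.

```python
import numpy as np

def laplace_noise(variance, size, rng):
    """Draw zero-mean Laplace samples whose variance equals `variance`."""
    scale = np.sqrt(variance / 2.0)  # Var[Laplace(0, b)] = 2 * b**2
    return rng.laplace(loc=0.0, scale=scale, size=size)

rng = np.random.default_rng(42)
target_var = 0.05 ** 2               # hypothetical target variance
samples = laplace_noise(target_var, size=200_000, rng=rng)
sample_var = samples.var()
print(target_var, sample_var)
```

The same idea applies to the other distributions: the free coefficients are chosen so that the analytical variance equals the target, and samples are then generated by Monte Carlo.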

The results agree with our analysis: the improvement over traditional WLS is highest in the case of Gaussian noise. Moreover, although the MAE of WLS in the NIG case ( $0.0134$ ) is smaller than in the Gaussian case ( $0.0164$ ), the MAE of R-WLS in the NIG case ( $0.0026$ ) is not as small as in the Gaussian case ( $0.0013$ ), which reveals that the cleaning effect for NIG noise is weaker than for Gaussian error. Furthermore, in all cases, R-WLS successfully improves the performance of WLS, showing that our approach has a certain generality and can be effective in practice.
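The comparison between the Gaussian and NIG cases can be made explicit by evaluating the Inc.Rat formula on the four MAE values quoted in the text. This is a worked arithmetic sketch; the function name `inc_rat` is hypothetical, and reading $MAE_{W}$ and $MAE_{R}$ as the MAE of WLS and of R-WLS is an assumption drawn from the surrounding discussion.

```python
def inc_rat(mae_wls, mae_rwls):
    """Relative reduction of MAE achieved by R-WLS over WLS."""
    return (mae_wls - mae_rwls) / mae_wls

gaussian = inc_rat(0.0164, 0.0013)   # MAE values quoted for the Gaussian case
nig = inc_rat(0.0134, 0.0026)        # MAE values quoted for the NIG case
print(round(gaussian, 4), round(nig, 4))
```

With these values the ratio is about 0.92 for Gaussian noise and about 0.81 for NIG noise, consistent with the statement that the improvement is largest in the Gaussian case.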

IV-E Case with Divided Matrices

According to the boundary condition introduced in Section III, the size of $\bf{Z}$ is constrained by $N<T$ . In practice, the number of measured variables can easily be tens of thousands, so the measurement samples should span a time window of commensurate size. This may take a long time, during which the system topology may have changed. To address this problem, we divide the measured variables into sets of appropriate size if there are too many measured variables, or if the equipment for running the algorithms is limited. For the sake of brevity, this case uses the European 1354-bus high voltage transmission network, which contains 12015 measured variables, ordered as active power to bus Pt, active power from bus Pf, injective active power Pb, reactive power to bus Qt, reactive power from bus Qf, injective reactive power Qb and nodal voltage magnitudes Vm. We divide them into seven groups according to their physical meaning. After cleaning all respective matrices, we gather them to operate WLS.
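The divide-and-gather bookkeeping can be sketched as follows. This is a minimal illustration on a toy matrix with 4 branches and 3 buses; the function names and sizes are hypothetical, and the per-group cleaning step itself is deliberately left out.

```python
import numpy as np

def split_by_group(Z, group_sizes):
    """Split the rows of Z into consecutive blocks of the given sizes."""
    sizes = list(group_sizes)
    if sum(sizes) != Z.shape[0]:
        raise ValueError("group sizes must sum to the number of rows of Z")
    boundaries = np.cumsum(sizes)[:-1]
    return np.split(Z, boundaries, axis=0)

def merge_groups(blocks):
    """Stack the (cleaned) blocks back in their original row order."""
    return np.vstack(blocks)

# Toy system: 4 branches and 3 buses, rows ordered Pt, Pf, Pb, Qt, Qf, Qb, Vm.
names = ["Pt", "Pf", "Pb", "Qt", "Qf", "Qb", "Vm"]
sizes = [4, 4, 3, 4, 4, 3, 3]
Z = np.arange(25 * 6, dtype=float).reshape(25, 6)

blocks = split_by_group(Z, sizes)
# Each block would be cleaned separately here; this sketch passes them through unchanged.
restored = merge_groups(blocks)
print({name: block.shape for name, block in zip(names, blocks)})
```

Each block has far fewer rows than the full matrix, so a shorter time window satisfies $N<T$ for every group.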

The explicit sizes of the matrices and the results of cleaning are shown in Table VII. After cleaning, the averaged MAE of the measurement vector decreases from 0.0395 to 0.0097. The MAE of the state vector after R-WLS is 0.0014, compared with 0.0011 if the measured variables are cleaned in one matrix. Therefore, by dividing the variables, we can speed up the operation with little loss of effectiveness.

IV-F Comparison of Different Cleaning Methods

It is necessary to compare our approach with other RMT based cleaning methods. In this case, the nonlinear shrinkage (NLS) [33], leave-one-out cross-validation covariance (CVC) [34] and isotonic regression based CVC (iso-CVC) [34] are compared. The simulation settings are the same as those in Section IV-B, and the results are shown in Table VIII.

According to Table VIII, the comparison results differ across power systems of different sizes. In the IEEE 30-bus system, our method is superior to the other methods, which shows that our method has a weaker requirement for matrix size (namely, that $N$ and $T$ should be large enough). This requirement for matrix size, or more broadly the Kolmogorov limit, is general and fundamental for all RMT based methods, so our method is more tolerant of this crucial condition. For the IEEE 57-bus, 118-bus and 300-bus systems, our method also obtains comparable results, demonstrating the effectiveness and competitiveness of our approach.

IV-G Case with Time-Varying Operating State

In this case, we aim to illustrate the performance of the proposed method under a time-varying operating state. We use the IEEE 300-bus system, and the same settings of measurement error and initialization method are adopted. The active loads of node 6 and node 21 are selected to be time-varying, while the others remain constant. The details of the load settings are shown in Fig. 7(a).

The MAE curve over time is shown in Fig. 7(b). According to the results, the effectiveness of our approach remains almost independent of the time-varying operating state. This is because our cleaning method rebuilds the measurement model at the matrix level, and it is based on the determinacy of the eigenvalues of $\bf{G}$ in the limit of large dimension. Therefore, whatever the matrix $\bf{H}$ is, as long as $\bf{H}$ follows the Kolmogorov limit, our method works normally. Besides, the results fluctuate between approximately 0.00075 and 0.00095, because the measurement noise imposed in each experiment is randomly generated.

IV-H Case of Inaccurate Variance

As stated in the second boundary condition, the variance of measurement error is assumed to be known. However, in practice, the variance estimation may fail to obtain accurate results, due to the lack of historical data, sudden events and some systematic errors. Thus in this case, we examine the performance of our approach when the estimated variance of measurement error deviates from the accurate variance. We use the European 1354-bus, Polish 3120-bus and IEEE 300-bus systems, and adopt the same hyper-parameters and simulation settings as in the above cases. The error ratios of variance estimation range from 0 to 50% of the real variance. The testing results are shown in Fig. 8.

From the results, the influence of inaccurate variance estimation decreases as the system scale increases, especially when the variance error ratio is large. Besides, even though inaccurate variance estimation generally reduces the effectiveness of our method, the MAE of the final results is still better than that obtained without cleaning.

V Conclusion

In this paper, a new perspective was put forward on the key problem of state estimation: estimation error comes from both the residual and measurement error, but traditional WLS does not take the effect of measurement error into account. Therefore, we proposed a data-driven method to overcome this problem by cleaning measurement error first. Combined with WLS, a two-stage state estimation model was constructed. Our method is based on the deterministic property of the eigenvalue distribution of the fully stochastic measurement error at the matrix level, and it is the first use of an RMT based noise-cleaning method to improve state estimation. Another innovation is the Hermitian matrix construction, which extends previous cleaning models for covariance matrices so that they can be applied to clean measurement matrices of power systems. Our method not only has rigorous mathematical derivations and precise theoretical support, but also performs well in practical applications. The numerous tests, using different test systems and various hyper-parameters, demonstrated the effectiveness and advantages of our method. In the future, we will attempt to clean measurement noise within the iterative process of WLS.

Appendix A Random Matrices Basics

Theorem 1: M-P Law Let $\bf{G}$ be an $N\times T$ matrix whose entries are independent and identically distributed Gaussian variables $\mathcal{N}(0,\sigma^{2})$ , with $N\rightarrow\infty$ , $T\rightarrow\infty$ and $T\sim O(N)$ . Then the spectral distribution of its covariance matrix ${\bf{GG}}^{T}/T$ asymptotically follows the M-P distribution [11]:

[TABLE]

where $a<\lambda<b$ , $q=T/N$ , $a=\sigma^{2}(1-\sqrt{q})^{2}$ and $b=\sigma^{2}(1+\sqrt{q})^{2}$ .

Theorem 2: Stieltjes Transform The resolvent of the covariance matrix $\bf{E}$ is:

$$G_{E}(s)=(s{\bf{I_{N}}}-{\bf{E}})^{-1}\qquad(30)$$

where $s\in\mathbb{C^{+}}$ is a complex variable, and ${\bf{I_{N}}}\in\mathbb{R}^{N\times N}$ is the identity matrix. The Stieltjes transform is defined as [29]:

$$g_{E}(s)=\frac{1}{N}Tr(G_{E}(s))=\frac{1}{N}\sum_{i=1}^{N}\frac{1}{s-\lambda_{i}}\qquad(31)$$

where $Tr(\cdot)$ denotes the trace, $N$ denotes the number of rows of $\bf{E}$ , and $\lambda_{i}$ is the $i$-th eigenvalue of $\bf{E}$ .

Theorem 3: Inverse Stieltjes Transform The inverse of the Stieltjes transform is:

[TABLE]

where $\eta$ is a small positive real number close to zero, and $Im[\cdot]$ denotes the imaginary part.
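The Stieltjes transform and its inversion can be illustrated numerically. The sketch below assumes the common convention $g_{E}(s)=\frac{1}{N}\sum_{i}1/(s-\lambda_{i})$ and recovers a smoothed density as $\frac{1}{\pi}Im[g_{E}(\lambda-i\eta)]$ with a small $\eta>0$ ; under a different sign convention the sign of the imaginary shift changes. The function names and the toy eigenvalues are hypothetical.

```python
import numpy as np

def stieltjes(eigs, s):
    """Empirical Stieltjes transform g(s) = mean(1 / (s - lambda_i))."""
    eigs = np.asarray(eigs, dtype=float)
    return np.mean(1.0 / (s - eigs))

def density_from_stieltjes(eigs, lam, eta=1e-2):
    """Smoothed eigenvalue density at lam, read off the imaginary part of g."""
    return float(np.imag(stieltjes(eigs, lam - 1j * eta)) / np.pi)

eigs = np.array([1.0, 2.0, 3.0])        # toy spectrum
g_at_4 = stieltjes(eigs, 4.0)           # evaluated outside the spectrum
rho_near = density_from_stieltjes(eigs, 2.0)   # on top of an eigenvalue
rho_far = density_from_stieltjes(eigs, 10.0)   # far from all eigenvalues
print(g_at_4, rho_near, rho_far)
```

The recovered density is a sum of narrow peaks centred on the eigenvalues, so it is large at $\lambda=2$ and almost zero at $\lambda=10$ ; as $\eta\rightarrow 0$ and $N\rightarrow\infty$ these peaks merge into the limiting spectral density.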

Theorem 4: R-Transform The R-transform is defined as:

[TABLE]

where $<\cdot>^{-1}$ represents the inverse function. The R-transform of a sum of matrices is equal to the sum of their respective R-transforms [29]:

$$R_{A+B}(s)=R_{A}(s)+R_{B}(s)\qquad(34)$$

The R-transform is a great tool to analyze additive Gaussian errors, since by the R-transform we can obtain the spectrum distribution of $\bf{A+B}$ from $\bf{A}$ and $\bf{B}$ .

Theorem 5: S-Transform The S-transform allows us to compute the eigenvalue distribution of matrix products. The S-transform is defined as:

[TABLE]

where $\Gamma(s)=sg_{E}(s)-1$ . The S-transform of a product of matrices is identical to the product of their respective S-transforms:

$$S_{AB}(s)=S_{A}(s)S_{B}(s)\qquad(36)$$

From the above basic equations, the eigenvalue distribution of a sum or a product of different or identical matrices can be fully computed. So reconsidering the model (18), the eigenvalues of each term can be analytically connected.

Appendix B Proof of Function (25)

According to function (22) and (32), and the equation $G_{E}(z)=(z(1-qg_{E}(z))-(1-q)-(1-qg_{E}(z))^{-1}{\bf{C}})^{-1}$ , we have:

[TABLE]

And the right part is:

[TABLE]

One also has $Tr(G_{E}(s){\bf{C}})=N(Z(s)g_{E}(s)-1)$ , where $Z(s)=1-qg_{E}(s)$ . So substitution of this and (24) into (38) yields:

[TABLE]

Then by substituting function (39) into (37), we obtain the function (25).

Appendix C Normal-inverse Gaussian Distribution

The p.d.f of a NIG distribution is [35]:

[TABLE]

where $K_{1}(\cdot)$ denotes the modified Bessel function, and in this case $\alpha=1$ , $\beta=1$ , $\delta=1$ , $\gamma=1$ and $\mu=-1$ . The coefficients make the mean and variance of this NIG distribution become zero and $\sigma_{i}$ .

Appendix D Supplementary Material

This supplementary material explains the role the Kolmogorov limit plays in our work. Normally, the covariance matrix ${\bf{E=ZZ}}^{H}={\bf{(H+G)(H+G)}}^{H}$ is estimated by $\mathbb{E}({\bf{E}})=\mathbb{E}({\bf{HH}}^{H}+{\bf{GH}}^{H}+{\bf{HG}}^{H}+{\bf{GG}}^{H})={\bf{HH}}^{H}+{\bf{I_{N}}}$ . However, this well-known model is only accurate under an almost impossible condition: the number of samples is much greater than the number of variables, i.e., $T\gg N$ . In the case of $N\sim O(T)$ , this model can no longer be fully trusted, since the eigenvalue distribution of the noise covariance converges to the M-P law (29) rather than concentrating at unity [36]. RMT derives precisely some deterministic properties of random matrices in the case of $N\sim O(T)$ , by which we improve the WLS based state estimation.
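The contrast between $T\gg N$ and $N\sim O(T)$ can be observed on simulated unit-variance Gaussian noise. In the sketch below, the support edges are written in the standard form $(1\pm\sqrt{N/T})^{2}$ , i.e., in terms of the ratio $N/T$ , which is a convention chosen for this illustration; the function names and the sizes are hypothetical.

```python
import numpy as np

def noise_covariance_eigs(N, T, seed=0):
    """Eigenvalues of G G^T / T for an N x T matrix of standard Gaussian entries."""
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(N, T))
    return np.linalg.eigvalsh(G @ G.T / T)

def mp_support(N, T):
    """Edges of the M-P support for unit-variance entries, with c = N / T <= 1."""
    c = N / T
    return (1.0 - np.sqrt(c)) ** 2, (1.0 + np.sqrt(c)) ** 2

N = 200
summary = {}
for q in (1.2, 50.0):                    # comparable sizes versus T much larger than N
    T = int(round(q * N))
    eigs = noise_covariance_eigs(N, T)
    lo, hi = mp_support(N, T)
    summary[q] = {"min": eigs.min(), "max": eigs.max(), "mp_lo": lo, "mp_hi": hi}
    print(q, summary[q])
```

For $q=50$ the eigenvalues stay close to 1, so replacing ${\bf{GG}}^{H}/T$ by the identity is reasonable; for $q=1.2$ they spread over most of the M-P support, which is the regime that RMT describes.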

Bibliography

[1] Y. Huang, S. Werner, J. Huang, N. Kashyap, and V. Gupta, “State estimation in electric power grids: Meeting new challenges presented by the requirements of the future grid,” IEEE Signal Processing Magazine, vol. 29, no. 5, pp. 33–43, 2012.
[2] A. Monticelli, “Electric power system state estimation,” Proceedings of the IEEE, vol. 88, no. 2, pp. 262–282, 2000.
[3] W. Zheng, W. Wu, A. Gomez-Exposito, B. Zhang, and Y. Guo, “Distributed robust bilinear state estimation for power systems with nonlinear measurements,” IEEE Transactions on Power Systems, vol. 32, no. 1, pp. 499–509, 2017.
[4] Y. Weng, R. Negi, C. Faloutsos, and M. D. Ilic, “Robust data-driven state estimation for smart grid,” IEEE Transactions on Smart Grid, vol. 8, no. 4, pp. 1956–1967, 2017.
[5] V. Kekatos and G. B. Giannakis, “Distributed robust power system state estimation,” IEEE Transactions on Power Systems, vol. 28, no. 2, pp. 1617–1626, 2013.
[6] M. Gol and A. Abur, “LAV based robust state estimation for systems measured by PMUs,” IEEE Transactions on Smart Grid, vol. 5, no. 4, pp. 1808–1814, 2014.
[7] W. Xu, M. Wang, J. Cai, and A. Tang, “Sparse error correction from nonlinear measurements with applications in bad data detection for power networks,” IEEE Transactions on Signal Processing, vol. 61, no. 24, pp. 6175–6187, 2013.
[8] V. M. Marco A.M. Saran, “State estimation pre-filtering with overlapping tiling of autoencoders,” Electric Power Systems Research, vol. 157, pp. 261–271, 2018.