# Performance and Resilience of Cyber-Physical Control Systems with Reactive Attack Mitigation

**Authors:** Subhash Lakshminarayana, Jabir Shabbir Karachiwala, Teo Zhan Teng, Rui Tan, David K.Y. Yau

arXiv: 1904.09445 · 2019-04-23

## TL;DR

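This paper analyzes the performance and resilience of power grid control systems that combine attack detection with reactive mitigation. It derives the optimal sequence of false data injection attacks through a Markov decision process (MDP), applies Q-learning where the MDP solution does not scale, and evaluates the impact of detection errors through extensive PowerWorld simulations.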

## Contribution

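The paper introduces a Markov decision process (MDP) framework for designing the optimal false data injection attack sequence, which characterizes the limit of the attack impact. For high-dimensional systems where exact MDP solutions do not scale, it applies Q-learning with linear and neural-network function approximators.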
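To make the MDP idea concrete, the following is a minimal value iteration sketch on a toy attacker problem. It is not the paper's model: the two states, the three attack magnitudes, the detection and recovery probabilities, the discount factor, and the reward (attack magnitude as a stand-in for estimation error) are all illustrative assumptions.

```python
# Toy attacker MDP solved by value iteration.
# Every number below is a made-up assumption, not taken from the paper.
STATES = ["undetected", "mitigated"]
ACTIONS = [0.0, 1.0, 2.0]                      # hypothetical attack magnitudes
DETECT_PROB = {0.0: 0.0, 1.0: 0.1, 2.0: 0.9}   # larger attacks are less stealthy
RECOVER_PROB = 0.2                             # chance that mitigation ends in a step
GAMMA = 0.9                                    # discount factor


def transitions(state, action):
    """Return a list of (probability, next_state, reward) triples."""
    if state == "undetected":
        p = DETECT_PROB[action]
        reward = action  # proxy for the estimation error caused by the attack
        return [(p, "mitigated", reward), (1.0 - p, "undetected", reward)]
    # While mitigation is active the attack has no effect in this toy model.
    return [(RECOVER_PROB, "undetected", 0.0),
            (1.0 - RECOVER_PROB, "mitigated", 0.0)]


def action_value(values, state, action):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + GAMMA * values[s2])
               for p, s2, r in transitions(state, action))


def value_iteration(tol=1e-9, max_iter=10_000):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        new_values = {s: max(action_value(values, s, a) for a in ACTIONS)
                      for s in STATES}
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(ACTIONS, key=lambda a: action_value(values, s, a))
              for s in STATES}
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(values, policy)
```

With these assumed numbers the greedy policy in the undetected state picks the moderate magnitude rather than the largest one, which mirrors the trade-off between attack magnitude and stealthiness in a very simplified form.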

## Key findings

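- The optimal attack sequence balances attack magnitude against stealthiness to maximize the state estimation error when detection and mitigation are both present.
- False positives and false negatives in attack detection affect system performance, which matters for the joint design of detection and mitigation.
- Because MDP solutions do not scale to high-dimensional systems, Q-learning with linear and neural-network function approximators is applied there and the two approximators are compared.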

## Abstract

This paper studies the performance and resilience of a linear cyber-physical control system (CPCS) with attack detection and reactive attack mitigation in the context of power grids. It addresses the problem of deriving an optimal sequence of false data injection attacks that maximizes the state estimation error of the power system. The results provide basic understanding about the limit of the attack impact. The design of the optimal attack is based on a Markov decision process (MDP) formulation, which is solved efficiently using the value iteration method. We apply the proposed framework to the voltage control system of power grids and run extensive simulations using PowerWorld. The results show that our framework can accurately characterize the maximum state estimation errors caused by an attacker who carefully designs the attack sequence to strike a balance between the attack magnitude and stealthiness, due to the simultaneous presence of attack detection and mitigation. Moreover, based on the proposed framework, we analyze the impact of false positives and negatives in detecting attacks on the system performance. The results are important for the system defenders in the joint design of attack detection and mitigation to reduce the impact of these attack detection errors. Finally, as MDP solutions are not scalable for high-dimensional systems, we apply Q-learning with linear and non-linear (neural-network-based) function approximators to solve the attacker's problem in these systems and compare their performances.
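As a rough illustration of the Q-learning step mentioned at the end of the abstract, the sketch below runs Q-learning with a linear function approximator on a toy attacker environment. Everything in it is an assumption made for illustration: the two states, the attack magnitudes, the detection and recovery probabilities, the reward, the hyperparameters, and the one-hot features (with which the linear approximator reduces to a table). The paper's actual state space, features, and neural-network variant are not reproduced here.

```python
import random

# Toy attacker environment; all values are illustrative assumptions.
STATES = ["undetected", "mitigated"]
ACTIONS = [0.0, 1.0, 2.0]                      # hypothetical attack magnitudes
DETECT_PROB = {0.0: 0.0, 1.0: 0.1, 2.0: 0.9}   # larger attacks are less stealthy
RECOVER_PROB = 0.2
GAMMA = 0.9


def step(state, action, rng):
    """Sample (next_state, reward) from the toy environment."""
    if state == "undetected":
        detected = rng.random() < DETECT_PROB[action]
        return ("mitigated" if detected else "undetected"), action
    recovered = rng.random() < RECOVER_PROB
    return ("undetected" if recovered else "mitigated"), 0.0


def features(state, action):
    """One-hot feature vector over (state, action) pairs."""
    phi = [0.0] * (len(STATES) * len(ACTIONS))
    phi[STATES.index(state) * len(ACTIONS) + ACTIONS.index(action)] = 1.0
    return phi


def q_value(weights, state, action):
    """Linear approximation Q(s, a) = w . phi(s, a)."""
    return sum(w * x for w, x in zip(weights, features(state, action)))


def greedy_action(weights, state):
    return max(ACTIONS, key=lambda a: q_value(weights, state, a))


def q_learning(steps=100_000, alpha=0.02, epsilon=0.3, seed=0):
    """Epsilon-greedy Q-learning with a semi-gradient weight update."""
    rng = random.Random(seed)
    weights = [0.0] * (len(STATES) * len(ACTIONS))
    state = "undetected"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore
        else:
            action = greedy_action(weights, state)  # exploit
        next_state, reward = step(state, action, rng)
        target = reward + GAMMA * max(
            q_value(weights, next_state, a) for a in ACTIONS)
        td_error = target - q_value(weights, state, action)
        phi = features(state, action)
        # For a linear model the gradient of Q with respect to w is phi.
        weights = [w + alpha * td_error * x for w, x in zip(weights, phi)]
        state = next_state
    return weights


if __name__ == "__main__":
    w = q_learning()
    print({s: greedy_action(w, s) for s in STATES})
```

Swapping `features` for a lower-dimensional feature map, or replacing `q_value` with a neural network, would be the natural next step for larger systems; the update rule keeps the same temporal-difference form.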

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.09445/full.md

## Figures

34 figures with captions in the complete paper: https://tomesphere.com/paper/1904.09445/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/1904.09445/full.md

---
Source: https://tomesphere.com/paper/1904.09445