Optimal scheduling strategy for networked estimation with energy harvesting

Marcos M. Vasconcelos, Mukul Gagrani, Ashutosh Nayyar, and Urbashi Mitra

arXiv:1908.06070·eess.SY·August 19, 2019

TL;DR

This paper develops an optimal scheduling and estimation strategy for a networked system with energy-harvesting sensors transmitting over bandwidth-limited wireless channels, ensuring minimal mean-squared error.

Contribution

It introduces a globally optimal scheduling and estimation framework for energy-harvesting sensors with a threshold-based policy structure under symmetric unimodal data distributions.

Findings

01

Optimal policies are derived under symmetry and unimodality assumptions.

02

The scheduling policy is characterized by a threshold function depending on time and energy.

03

A recursive algorithm for computing the threshold is provided.

Equations

\mathbb{I}\big(\mathfrak{S}\big)\operatorname{\overset{def}{=}}\begin{cases}1&\text{if}\ \mathfrak{S}\ \text{is true}\\ 0&\text{otherwise}.\end{cases}

\mathbb{U}(E_{t})\operatorname{\overset{def}{=}}\begin{cases}\{0,1,2\}&\text{if}\ E_{t}>0\\ \{0\}&\text{if}\ E_{t}=0.\end{cases}

E_{t+1}=\mathcal{F}(E_{t},U_{t},Z_{t}),\ \ t\in\{1,\dots,T-1\},

\mathcal{F}(E_{t},U_{t},Z_{t})\operatorname{\overset{def}{=}}\min\big\{E_{t}-\mathbb{I}\big(U_{t}\neq 0\big)+Z_{t},B\big\},

h^{i}(X_{t}^{i},U_{t})\operatorname{\overset{def}{=}}\begin{cases}X_{t}^{i}&\text{if}\ U_{t}=i\\ \varnothing&\text{if}\ U_{t}\neq i.\end{cases}

U_{t}=f_{t}\big(\mathbf{X}_{1:t},E_{1:t},\mathbf{Y}_{1:t-1}\big).

\hat{X}_{t}^{i}=g_{t}^{i}\big(Y_{1:t}^{i}\big).

\mathcal{J}\big(\mathbf{f},\mathbf{g}^{1},\mathbf{g}^{2}\big)\operatorname{\overset{def}{=}}\sum_{t=1}^{T}\mathbb{E}\Bigg[\sum_{i\in\{1,2\}}\|X^{i}_{t}-\hat{X}^{i}_{t}\|^{2}+c\,\mathbb{I}(U_{t}\neq 0)\Bigg].

\hat{X}_{2}=\mathbb{E}\big[X_{2}\ \big|\ Y_{2}=\varnothing\big]=\mathbb{E}\big[X_{2}\ \big|\ (X_{1},X_{2})\in\mathbb{A}_{0}\cup\mathbb{A}_{1}\big]

\|x-a\|\leq\|y-a\|\Rightarrow\pi(x)\geq\pi(y),\ \ x,y\in\mathbb{R}^{n}.

f^{\star}_{t}(\mathbf{x},e)\operatorname{\overset{def}{=}}\begin{cases}0&\text{if}\ \max_{i\in\{1,2\}}\|x^{i}-a^{i}\|\leq\tau^{\star}_{t}(e)\\ \arg\max_{i\in\{1,2\}}\|x^{i}-a^{i}\|&\text{otherwise,}\end{cases}

g^{i\star}_{t}\big(y^{i}\big)\operatorname{\overset{def}{=}}\begin{cases}x^{i}&\text{if}\ y^{i}=x^{i}\\ a^{i}&\text{if}\ y^{i}=\varnothing.\end{cases}

U_{t}=f_{t}\big(\mathbf{X}_{t},E_{1:t},\mathbf{Y}_{1:t-1}\big).

\mathbb{P}(S_{t+1}\mid S_{1:t},U_{1:t})=\mathbb{P}(S_{t+1}\mid S_{t},U_{t}).

\rho(S_{t},U_{t})=\sum_{i\in\{1,2\}}\big\|X_{t}^{i}-\hat{X}_{t}^{i}\big\|^{2}+c\,\mathbb{I}(U_{t}\neq 0)\stackrel{(a)}{=}\sum_{i\in\{1,2\}}\big\|X_{t}^{i}-g_{t}^{i}(Y_{1:t}^{i})\big\|^{2}+c\,\mathbb{I}(U_{t}\neq 0)\stackrel{(b)}{=}\sum_{i\in\{1,2\}}\big\|X_{t}^{i}-g_{t}^{i}\big(Y_{1:t-1}^{i},h^{i}(X_{t}^{i},U_{t})\big)\big\|^{2}+c\,\mathbb{I}(U_{t}\neq 0),

\mathcal{I}_{t}^{\mathcal{S}}=\big\{\mathbf{X}_{t},E_{1:t},\mathbf{Y}_{1:t-1}\big\}

\mathcal{I}_{t}^{\mathcal{E}^{i}}=\big\{Y_{1:t}^{i}\big\},\ \ i\in\{1,2\}

\bar{\mathcal{I}}_{t}^{\mathcal{E}^{1}}\operatorname{\overset{def}{=}}\big\{E_{1:t},\mathbf{Y}_{1:t-1},Y_{t}^{1}\big\}

\bar{\mathcal{I}}_{t}^{\mathcal{E}^{2}}\operatorname{\overset{def}{=}}\big\{E_{1:t},\mathbf{Y}_{1:t-1},Y_{t}^{2}\big\}.

\mathcal{I}^{\mathrm{com}}_{t}\operatorname{\overset{def}{=}}\big\{E_{1:t},\mathbf{Y}_{1:t-1}\big\}.

g_{t}^{i}\big(E_{1:t},\mathbf{Y}_{1:t-1},Y_{t}^{i}\big)=\begin{cases}X_{t}^{i}&\text{if}\ Y_{t}^{i}=X_{t}^{i}\\ \tilde{g}_{t}^{i}(E_{1:t},\mathbf{Y}_{1:t-1})&\text{otherwise}.\end{cases}

\inf_{g^{i}_{t}}\ \ \mathbb{E}\big[\|X_{t}^{i}-\hat{X}_{t}^{i}\|^{2}\big]+\tilde{\mathcal{J}},

\tilde{\mathcal{J}}\operatorname{\overset{def}{=}}\mathbb{E}\Bigg[\sum_{k=1}^{T}c\,\mathbb{I}(U_{k}\neq 0)+\sum_{k=1}^{T}\sum_{j\neq i}\|X_{k}^{j}-\hat{X}_{k}^{j}\|^{2}+\sum_{k\neq t}\|X_{k}^{i}-\hat{X}_{k}^{i}\|^{2}\Bigg].

\inf_{g^{i}_{t}}\ \ \mathbb{E}\big[\|X_{t}^{i}-\hat{X}_{t}^{i}\|^{2}\big].

\hat{X}_{t}^{i}=\mathbb{E}\big[X_{t}^{i}\ \big|\ \bar{\mathcal{I}}_{t}^{\mathcal{E}^{i}}\big].

g_{t}^{i\star}(\bar{\mathcal{I}}_{t}^{\mathcal{E}^{i}})=\begin{cases}X_{t}^{i}&\text{if}\ Y_{t}^{i}=X_{t}^{i}\\ \mathbb{E}\big[X_{t}^{i}\ \big|\ E_{1:t},\mathbf{Y}_{1:t-1},Y^{i}_{t}=\varnothing\big]&\text{otherwise}.\end{cases}

\tilde{g}_{t}^{i}(E_{1:t},\mathbf{Y}_{1:t-1})\operatorname{\overset{def}{=}}\mathbb{E}\big[X_{t}^{i}\ \big|\ E_{1:t},\mathbf{Y}_{1:t-1},Y^{i}_{t}=\varnothing\big].

U_{t}=\Gamma_{t}(\mathbf{X}_{t}).

\hat{X}_{t}^{i}=\begin{cases}X_{t}^{i}&\text{if}\ Y_{t}^{i}=X_{t}^{i}\\ \tilde{X}_{t}^{i}&\text{otherwise}.\end{cases}

Full text

Optimal scheduling strategy for networked estimation with energy harvesting

Marcos M. Vasconcelos, Mukul Gagrani, Ashutosh Nayyar and Urbashi Mitra

M. M. Vasconcelos, M. Gagrani, A. Nayyar and U. Mitra are with the Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089 USA. E-mails: {mvasconc,mgagrani,ashutoshn,ubli}@usc.edu. The work of M. M. Vasconcelos and U. Mitra was supported in part by the following grants: ONR N00014-15-1-2550, NSF CCF-1718560, NSF CCF-1410009, NSF CPS-1446901, NSF CCF-1817200 and ARO W911NF1910269.

Abstract

Joint optimization of scheduling and estimation policies is considered for a system with two sensors and two non-collocated estimators. Each sensor produces an independent and identically distributed sequence of random variables, and each estimator forms estimates of the corresponding sequence in the mean-squared error sense. The data generated by the sensors is transmitted to the corresponding estimators over a bandwidth-constrained wireless network that can support a single packet per time slot. Access to the limited communication resources is determined by a scheduler, who decides which sensor measurement to transmit based on both observations. The scheduler has an energy-harvesting battery of limited capacity, which couples the decision-making problem in time. Despite the overall lack of convexity of the team decision problem, it is shown that this system admits globally optimal scheduling and estimation strategies under the assumption that the distributions of the random variables at the sensors are symmetric and unimodal. Additionally, the optimal scheduling policy has a structure characterized by a threshold function that depends on the time index and energy level. A recursive algorithm for threshold computation is provided.

I Introduction

Reliable real-time wireless networking is an important requirement of modern cyber-physical and networked control systems [1, 2]. Due to their large scale, these systems are typically formed by multiple physically distributed subsystems that communicate over a wireless network of limited capacity. One way to model this communication constraint is to assume that, at any time instant, only one packet can be reliably transmitted over the network to its destination. This constraint forces the system designer to use strategies that allocate the shared communication resources among multiple transmitting nodes. In addition to degrading the performance of the overall system, the imperfect communication among the different agents in cyber-physical systems often leads to team-decision problems with nonclassical information structures. Such problems are usually non-convex and are, in general, difficult to solve.

We consider a sequential remote estimation problem over a finite time horizon with non-collocated sensors and estimators. The system comprises multiple sensors, each of which has a stochastic process associated with it. Each sensor is paired with an estimator, which is interested in forming real-time estimates of its corresponding source process. The sensors communicate with their estimators via a shared communication network. However, at most one of the sensors' observations can be transmitted at each time due to the limited capacity of the network. In order to avoid collisions [3, 4], the communication is mediated by a scheduler, acting as a network manager, who observes the realization of each source and decides at each time which one, if any, gets transmitted over the communication network. In addition to the communication constraint, the framework also assumes that the scheduler operates under an energy constraint by means of a finite battery, which is capable of harvesting additional energy from the environment.

The designer’s goal is to find scheduling and estimation strategies that jointly minimize an objective functional consisting of a mean-squared estimation error criterion and a communication cost. This is a team-decision problem with a non-classical information structure, for which obtaining globally optimal solutions is, in general, a challenging task [5]. However, under certain assumptions on the underlying probabilistic model, and in spite of the difficulties imposed by the lack of convexity, this problem admits an explicit globally optimal solution, whose derivation is the centerpiece of this article.

This problem is also motivated by applications such as the Internet of Things (IoT), where access to limited communication resources by multiple heterogeneous devices must be coordinated in real time. Moreover, in IoT applications the network is expected to support a massive number of devices, for which the traditional scheduling techniques based on random access, collision resolution and retransmission are not feasible to implement. Therefore, new scheduling schemes where decisions are driven by data, such as the one proposed herein, are becoming increasingly relevant. This framework is also applicable to Wireless Body Area Networks, which are systems where multiple biometric sensors mounted on humans communicate with remote sensing stations over a wireless network [6, 7, 8]. In order to coordinate access to the network among multiple sensors, a mobile phone is used as a hub, collecting data and choosing in real time which one of the measurements is transmitted over the network, thereby acting as a scheduler.

I-A Related literature

Sequential remote estimation with a single sensor and estimator is a well-studied problem [9, 10, 11], for which jointly optimal sensor scheduling and estimation strategies have been derived under different structural assumptions. We are interested in problems where multiple sensors and estimators share the communication network. Systems with multiple control loops sharing a common communication network arise frequently in networked control and estimation. The authors of [12] proposed a scheduling protocol for a networked control system with multiple sensors and actuators sharing a common communication network and analyzed its stability. The performance of event-triggered control loops closed over a shared communication network was studied in [13] under different medium access control (MAC) protocols. The performance of an estimation problem using contention-based MAC schemes, where each sensor can listen to the channel before it communicates, was studied in [14]. The design of a wireless control system using random access was considered in [15], where communication policies at the sensors are designed to guarantee control performance by mitigating the effect of packet collisions. The concept of a scheduler (or network manager) that observes the state of multiple control loops and mediates access to the network was introduced in [16] and [17]. The problem of joint scheduling and remote estimation of two random variables with a single estimator was considered in [18], where person-by-person optimal solutions were obtained for the independent and symmetrically correlated Gaussian cases.

The problem of state estimation over a shared communication medium with multiple plants and estimators was considered in [19]. Our problem setup is similar to the one considered in [19]. However, the observations of each sensor in [19] are Gauss-Markov processes and the performance metric is a long term average cost. More importantly, [19] fixed the estimation strategy a priori and compared the performance of specific scheduling strategies (one of them consists in transmitting the state of the plant with the largest magnitude).

In a preliminary effort to obtain jointly optimal policies for the problem setup in [19], the work in [20] considered a one-shot problem of networked estimation and characterized jointly optimal scheduling and estimation strategies under certain assumptions on the probabilistic model of the sources. In this paper, we extend the optimality result in [20] to the sequential case over a finite horizon, when the sources observed by the sensors are mutually independent and each one is an independent and identically distributed (i.i.d.) process. However, the decision-making problem is coupled across the multiple stages due to the presence of an energy-harvesting battery of limited capacity. Our sequential problem is a team decision problem with a non-classical information structure. Although such team problems are difficult to solve in general, under the condition that the sensors' observations have symmetric and unimodal probability density functions, we obtain a pair of jointly optimal scheduling and estimation strategies.

There exists an extensive literature on energy-harvesting transmitters in communications. This class of communication problems was introduced in [21] and captures a communication feature present in several mobile systems. However, the emphasis in that line of work is on maximizing information rates. The goal of this and related papers is to minimize a combination of estimation error and communication cost, which indirectly affects the communication rate. In remote estimation, energy-harvesting sensors have been considered in [10, 22, 11]. Related work in the field of energy-harvesting communications includes [23, 24, 25, 26]. A recent survey of energy harvesting in communication and remote estimation can be found in [27].

I-B Contributions

The main contributions of this work are:

• We establish the joint optimality of a pair of scheduling and estimation strategies for a sequential problem formulation with i.i.d. sources and an energy-harvesting scheduler under symmetry and unimodality assumptions on the observations’ pdfs.

• We provide a proof strategy that uses a combination of expansion of information structures and the common information approach. We show that the optimal solution of the relaxed problem also solves the original problem, and therefore, it is optimal.

• We illustrate our theoretical results with numerical examples.

I-C Organization

This paper is organized into 10 sections, including the introduction. We provide the preliminary definitions for the problem formulation in Section II, and the main results in Section III. We define a relaxation of the original problem and use the common information approach to write an equivalent POMDP in Sections IV and V, respectively. The solution to the coordinator’s POMDP is derived in Section VI. A numerical procedure for computing the optimal scheduling policies is provided in Section VII, and examples are provided in Section VIII. Our results are extended to an arbitrary number of sensors and to the case of unequal weights and communication costs in Section IX. We conclude in Section X, where we also point out open research directions.

I-D Notation

We adopt the following notation: random variables and random vectors are represented using upper case letters, such as $X$ . Realizations of random variables and random vectors are represented by the corresponding lower case letter, such as $x$ . We use $X_{a:b}$ to denote the collection of random variables $(X_{a},X_{a+1},\cdots,X_{b})$ . The probability density function (pdf) of a continuous random variable $X$ , provided that it is well defined, is denoted by $\pi$ . When a random variable is distributed according to a pdf $\pi$ Functions and functionals are denoted using calligraphic letters such as $\mathcal{F}$ . We use $\mathcal{N}(m,\sigma^{2})$ to represent the Gaussian probability distribution of mean $m$ and variance $\sigma^{2}$ , respectively. The real line is denoted by $\mathbb{R}$ . The set of natural numbers is denoted by $\mathbb{N}$ . The set of nonnegative integers is denoted by $\mathbb{Z}_{\geq 0}$ . Sets are represented in blackboard bold font, such as $\mathbb{A}$ . The probability of an event $\mathfrak{E}$ is denoted by $\mathbb{P}(\mathfrak{E})$ ; the expectation of a random variable $Z$ is denoted by $\mathbb{E}[Z]$ . The indicator function of a statement $\mathfrak{S}$ is defined as follows:

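$$\mathbb{I}\big(\mathfrak{S}\big)\operatorname{\overset{def}{=}}\begin{cases}1&\text{if}\ \mathfrak{S}\ \text{is true}\\ 0&\text{otherwise}.\end{cases}\tag{1}$$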

We also adopt the following convention:

• Suppose that the set $\mathbb{W}\operatorname{\overset{def}{=}}\{1,2,\cdots,N\}$ and a function ${\mathcal{F}:\mathbb{W}\rightarrow\mathbb{R}}$ are given. If $\overline{\mathbb{W}}$ is the subset of elements that maximize $\mathcal{F}$ , then $\arg\max_{\alpha\in\mathbb{W}}\mathcal{F}(\alpha)$ is defined as the smallest number in $\overline{\mathbb{W}}$ .
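The tie-breaking convention above can be illustrated with a small sketch. The function name `argmax_smallest` is hypothetical and only serves this illustration; indices are 1-based to match the set $\mathbb{W}$.

```python
def argmax_smallest(values):
    """Return the smallest 1-based index at which `values` attains its maximum.

    `values[k - 1]` plays the role of F(k) for k in W = {1, ..., N}.
    """
    best = max(values)
    # list.index returns the first position of `best`, i.e. the smallest maximizer.
    return values.index(best) + 1


# Indices 2 and 3 both attain the maximum; the convention selects 2.
print(argmax_smallest([3.0, 7.0, 7.0]))
```

Resolving ties deterministically makes the scheduling decision a well-defined function of the observations even when two sources are equally far from their modes.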

II Problem statement

II-A Basic definitions

Consider a system with two sensor-estimator pairs and one energy-harvesting scheduler. All the subsequent results hold for an arbitrary number of sensor-estimator pairs, a fact that will be formally stated in Section IX-A. Therefore, the focus on two sensor-estimator pairs is without loss of generality.

The system operates sequentially over a finite time horizon $T\in\mathbb{N}$ . The role of the scheduler is to mediate the communication between the sensors and estimators such that, at any given time step, at most one sensor-estimator pair is allowed to communicate. We proceed to define the stochastic processes observed at the sensors. Let $X^{i}_{t}\in\mathbb{R}^{n_{i}}$ denote the random vector observed at the $i$ -th sensor, $t\in\{1,\cdots,T\}$ , $i\in\{1,2\}$ . Let $n_{1}+n_{2}=n$ . We shall refer to $X^{i}_{t}$ , $i\in\{1,2\}$ , as outputs of information sources at time $t$ . Throughout the paper we assume that the sources are independent and identically distributed in time. Moreover, the random variables $X^{i}_{t}$ admit a pdf $\pi_{i}$ for all $i\in\{1,2\}$ and $t\in\{1,\cdots,T\}$ . We assume that the stochastic processes $\{X^{1}_{t},t\geq 1\}$ and $\{X^{2}_{t},t\geq 1\}$ are independent.

The scheduler operates with a battery of finite capacity denoted by $B\in\mathbb{N}$ such that $B<T$ . Let the state of the battery, $E_{t}$ , be defined as the number of energy units available at time step $t$ . At each time $t$ , the scheduler makes a decision $U_{t}\in\{0,1,2\}$ , where $U_{t}=0$ denotes that no transmission is scheduled; $U_{t}=1$ denotes that the scheduler transmits $X^{1}_{t}$ ; and $U_{t}=2$ denotes that the scheduler transmits $X^{2}_{t}$ . Each transmission depletes the battery by one energy unit, and no transmission can be scheduled if the battery is empty, i.e., if $E_{t}=0$ . Thus, the scheduling decision satisfies $U_{t}\in\mathbb{U}(E_{t})$ , where:

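$$\mathbb{U}(E_{t})\operatorname{\overset{def}{=}}\begin{cases}\{0,1,2\}&\text{if}\ E_{t}>0\\ \{0\}&\text{if}\ E_{t}=0.\end{cases}\tag{2}$$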

At time $t$ , the scheduler harvests $Z_{t}$ units of energy from the environment. The random variable $Z_{t}$ is i.i.d. in time according to a probability mass function $p_{Z}(z),z\in\mathbb{Z}_{\geq 0}$ , and is independent of the information source processes. The state of the battery evolves according to the following equation:

$$E_{t+1}=\mathcal{F}(E_{t},U_{t},Z_{t}),\ \ t\in\{1,\dots,T-1\},\tag{3}$$

where

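$$\mathcal{F}(E_{t},U_{t},Z_{t})\operatorname{\overset{def}{=}}\min\big\{E_{t}-\mathbb{I}\big(U_{t}\neq 0\big)+Z_{t},B\big\},\tag{4}$$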

and initial energy $E_{1}=B$ .
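The battery model can be sketched in a few lines. This is a minimal illustration rather than code from the paper; the function names `feasible_actions`, `battery_step` and `simulate_battery` are hypothetical, and the action and harvest sequences in the usage line are made-up inputs.

```python
def feasible_actions(energy):
    """Feasible set U(E_t): any decision when energy is available, only 0 when empty."""
    return (0, 1, 2) if energy > 0 else (0,)


def battery_step(energy, action, harvest, capacity):
    """One step of E_{t+1} = min{E_t - I(U_t != 0) + Z_t, B}."""
    if action not in feasible_actions(energy):
        raise ValueError("cannot transmit with an empty battery")
    spent = 1 if action != 0 else 0
    return min(energy - spent + harvest, capacity)


def simulate_battery(actions, harvests, capacity):
    """Battery levels E_1, E_2, ... starting from a full battery, E_1 = B."""
    energy = capacity
    levels = [energy]
    for action, harvest in zip(actions, harvests):
        energy = battery_step(energy, action, harvest, capacity)
        levels.append(energy)
    return levels


# Two transmissions drain a battery of capacity 2; one harvested unit refills it partially.
print(simulate_battery(actions=[1, 1, 0], harvests=[0, 0, 1], capacity=2))
```

Energy harvested beyond the capacity $B$ is discarded by the `min`, which is what couples the scheduling decisions across time.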

We will assume that the communication between the scheduler and the estimators occurs over a so-called unicast network, where only the intended estimator receives the transmitted packet. For $i\in\{1,2\}$ , the observation of the estimator $\mathcal{E}^{i}$ at time $t$ is denoted by $Y^{i}_{t}$ , which is determined according to $Y^{i}_{t}=h^{i}(X^{i}_{t},U_{t})$ , where:

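$$h^{i}(X_{t}^{i},U_{t})\operatorname{\overset{def}{=}}\begin{cases}X_{t}^{i}&\text{if}\ U_{t}=i\\ \varnothing&\text{if}\ U_{t}\neq i.\end{cases}\tag{5}$$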

Remark 1

One way to think about the unicast network model is that there are independent point-to-point links between different sensor and estimator pairs. At each time instant the scheduler chooses at most one of these links to be active and the others remain idle. Unicast is one of the modes of operation in the current version of the internet protocol, IPv6.

II-B Information and strategies

Let $\mathbf{X}_{t}\operatorname{\overset{def}{=}}(X^{1}_{t},X^{2}_{t})$ and $\mathbf{Y}_{t}\operatorname{\overset{def}{=}}(Y^{1}_{t},Y^{2}_{t})$ . The scheduler decides what to transmit based on its available information at time $t$ , which is $\mathcal{I}^{\mathcal{S}}_{t}\operatorname{\overset{def}{=}}\{\mathbf{X}_{1:t},E_{1:t},\mathbf{Y}_{1:t-1}\}$ . The decision variable $U_{t}$ is computed according to a function $f_{t}$ as follows:

$$U_{t}=f_{t}\big(\mathbf{X}_{1:t},E_{1:t},\mathbf{Y}_{1:t-1}\big).\tag{6}$$

We refer to the collection $\mathbf{f}\operatorname{\overset{def}{=}}\{f_{1},\cdots,f_{T}\}$ as the scheduling strategy of the network manager.

Let $i\in\{1,2\}$ . The estimator $\mathcal{E}^{i}$ computes the state estimate based on the entire history of its observations, $\mathcal{I}_{t}^{\mathcal{E}^{i}}\operatorname{\overset{def}{=}}\{Y_{1:t}^{i}\}$ , according to a function $g^{i}_{t}$ as follows:

$$\hat{X}_{t}^{i}=g_{t}^{i}\big(Y_{1:t}^{i}\big).\tag{7}$$

We refer to the collection $\mathbf{g}^{i}\operatorname{\overset{def}{=}}\{g^{i}_{1},\cdots,g^{i}_{T}\}$ as the estimation strategy of estimator $\mathcal{E}^{i}$ .

Remark 2

From now on, we assume that $f_{t}$ , $g^{1}_{t}$ and $g^{2}_{t}$ , $t\in\{1,\cdots,T\}$ , are measurable functions with respect to the appropriate sigma-algebras.

II-C Cost

We consider a performance index which penalizes the mean squared estimation error and a communication cost for every transmission made by the scheduler. The cost functional and optimization problem are defined as follows:

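$$\mathcal{J}\big(\mathbf{f},\mathbf{g}^{1},\mathbf{g}^{2}\big)\operatorname{\overset{def}{=}}\sum_{t=1}^{T}\mathbb{E}\Bigg[\sum_{i\in\{1,2\}}\|X^{i}_{t}-\hat{X}^{i}_{t}\|^{2}+c\,\mathbb{I}(U_{t}\neq 0)\Bigg].\tag{8}$$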

Problem 1

For the model described in this section, given the statistics of the sensors’ observations, the statistics of the energy-harvesting process, the battery storage limit $B$ , communication cost $c$ , and the horizon $T$ , find scheduling and estimation strategies $\mathbf{f},\mathbf{g}^{1}$ and $\mathbf{g}^{2}$ that jointly minimize the cost $\mathcal{J}(\mathbf{f},\mathbf{g}^{1},\mathbf{g}^{2})$ in Eq. 8.
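To make the objective concrete, the cost of a given strategy profile can be approximated by Monte Carlo simulation. The sketch below is an illustration under simplifying assumptions that are not part of the problem statement: the sources are scalar, the scheduler only looks at the current observations and battery level, and each estimator only looks at its current observation. All function names (`observe`, `empirical_cost`) and the callables passed to them are hypothetical.

```python
import random


def observe(x_i, action, i):
    """Unicast observation h^i: estimator i receives x_i only when action == i."""
    return x_i if action == i else None  # None stands for the no-transmission symbol


def empirical_cost(scheduler, estimators, sample_sources, sample_harvest,
                   capacity, horizon, comm_cost, runs=1000, seed=0):
    """Monte Carlo estimate of the finite-horizon cost J for scalar sources."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        energy = capacity  # E_1 = B
        for t in range(1, horizon + 1):
            x = sample_sources(rng)  # pair (x1, x2)
            # With an empty battery the only feasible decision is 0.
            action = scheduler(t, x, energy) if energy > 0 else 0
            for i in (1, 2):
                y = observe(x[i - 1], action, i)
                x_hat = estimators[i - 1](t, y)
                total += (x[i - 1] - x_hat) ** 2
            total += comm_cost * (action != 0)
            energy = min(energy - (action != 0) + sample_harvest(rng), capacity)
    return total / runs


# Usage: standard Gaussian sources, no harvesting, always try to send source 1.
estimate = lambda t, y: 0.0 if y is None else y
cost = empirical_cost(
    scheduler=lambda t, x, e: 1,
    estimators=(estimate, estimate),
    sample_sources=lambda rng: (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)),
    sample_harvest=lambda rng: 0,
    capacity=2, horizon=5, comm_cost=0.1,
)
print(cost)
```

Such a simulator only evaluates a fixed strategy profile; it does not by itself search for the minimizing strategies that Problem 1 asks for.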

II-D Signaling

In problems of decentralized control and estimation with non-classical information structures, the optimal solutions typically involve a form of implicit communication known as signaling. Signaling is the effect of conveying information through actions [28], and it is the reason why problems within this class are difficult to solve, e.g. [29].

In order to illustrate the fundamental difficulty imposed by signaling, consider an instance of Problem 1 with two zero-mean independent scalar sources and $T=1$ . Assume that the scheduler makes its decision according to the partition of the observation space shown in Fig. 2, where $\mathbb{A}_{0}\cup\mathbb{A}_{1}\cup\mathbb{A}_{2}=\mathbb{R}^{2}$ and $(x_{1},x_{2})\in\mathbb{A}_{i}$ implies that $U=i$ , $i\in\{0,1,2\}$ . Suppose that the scheduler observes $(x_{1},x_{2})\in\mathbb{A}_{1}$ , which implies that $U=1$ and consequently, $Y_{1}=x_{1}$ and $Y_{2}=\varnothing$ . The optimal estimate used by $\mathcal{E}_{2}$ in this case is

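$$\hat{X}_{2}=\mathbb{E}\big[X_{2}\ \big|\ Y_{2}=\varnothing\big]=\mathbb{E}\big[X_{2}\ \big|\ (X_{1},X_{2})\in\mathbb{A}_{0}\cup\mathbb{A}_{1}\big],\tag{9}$$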

which may correspond to a different numerical value than if we simply used a naive estimate $\mathbb{E}[X_{2}]=0$ .

Therefore, the optimal estimation strategy depends on the scheduling strategy being used. This coupling between scheduling and estimation strategies is what makes Problem 1 nontrivial even when $T=1$ .
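The signaling effect can be checked numerically. The sketch below uses a deliberately asymmetric scheduler that is an assumption made for this illustration only (it is not the partition of Fig. 2): source 2 is transmitted whenever $x_{2}$ exceeds a cutoff, so silence tells the second estimator that $x_{2}$ was at most the cutoff. Both sources are taken to be standard Gaussian, and all function names are hypothetical.

```python
import math
import random


def one_sided_scheduler(x1, x2, cutoff=1.0):
    """Hypothetical asymmetric rule: send source 2 if x2 > cutoff, else source 1 if |x1| > cutoff."""
    if x2 > cutoff:
        return 2
    if abs(x1) > cutoff:
        return 1
    return 0


def silent_mean_of_x2(num_samples=200_000, seed=0):
    """Empirical mean of X2 over the samples where estimator 2 receives nothing."""
    rng = random.Random(seed)
    pairs = ((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(num_samples))
    silent = [x2 for x1, x2 in pairs if one_sided_scheduler(x1, x2) != 2]
    return sum(silent) / len(silent)


def truncated_mean(cutoff=1.0):
    """Closed form of E[X2 | X2 <= cutoff] for a standard Gaussian X2."""
    pdf = math.exp(-0.5 * cutoff ** 2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(cutoff / math.sqrt(2.0)))
    return -pdf / cdf


print(silent_mean_of_x2(), truncated_mean())
```

Under this rule the conditional mean given silence is strictly negative, so the naive estimate $\mathbb{E}[X_{2}]=0$ is not optimal: the absence of a packet carries information about the source.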

III Main result

The following definition will be used to state our main result.

Definition 1 (Symmetric and unimodal probability density functions)

Let $\pi:\mathbb{R}^{n}\rightarrow\mathbb{R}$ be a probability density function (pdf). The pdf $\pi$ is symmetric and unimodal around $a\in\mathbb{R}^{n}$ if it satisfies the following property:

$$\|x-a\|\leq\|y-a\|\Rightarrow\pi(x)\geq\pi(y),\ \ x,y\in\mathbb{R}^{n}.\tag{10}$$
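Definition 1 can be probed numerically by sampling pairs of points and looking for a counterexample. The sketch below is a randomized check, so finding no violation is only evidence, not a proof. The two test densities (an isotropic Gaussian and a two-component mixture) and all function names are assumptions chosen for illustration.

```python
import math
import random


def isotropic_gaussian(center):
    """Density of N(center, I); it depends on x only through the distance to center."""
    norm = (2.0 * math.pi) ** (-len(center) / 2.0)

    def pdf(x):
        return norm * math.exp(-0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, center)))

    return pdf


def bimodal_mixture(x):
    """Scalar mixture with modes near -2 and +2: symmetric around 0 but not unimodal."""
    bump = lambda v, m: math.exp(-0.5 * (v - m) ** 2) / math.sqrt(2.0 * math.pi)
    return 0.5 * bump(x[0], -2.0) + 0.5 * bump(x[0], 2.0)


def find_violation(pdf, center, trials=20_000, radius=3.0, seed=0):
    """Search for x, y with ||x - a|| <= ||y - a|| but pdf(x) < pdf(y); None if not found."""
    rng = random.Random(seed)
    center = tuple(center)
    for _ in range(trials):
        x = tuple(c + rng.uniform(-radius, radius) for c in center)
        y = tuple(c + rng.uniform(-radius, radius) for c in center)
        # Small tolerance guards against floating-point ties.
        if math.dist(x, center) <= math.dist(y, center) and pdf(x) < pdf(y) - 1e-12:
            return x, y
    return None


print(find_violation(isotropic_gaussian((0.0, 0.0)), (0.0, 0.0)))  # no counterexample found
print(find_violation(bimodal_mixture, (0.0,)) is not None)         # counterexample exists
```

Note that the property is stated in terms of the Euclidean distance to $a$, so in this check a density must be non-increasing in $\|x-a\|$ to pass.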

Theorem 1

Provided that $\pi_{1}$ and $\pi_{2}$ are symmetric and unimodal around $a^{1}\in\mathbb{R}^{n_{1}}$ and $a^{2}\in\mathbb{R}^{n_{2}}$ , respectively, the strategy profile $\big{(}\mathbf{f}^{\star},\mathbf{g}^{1\star},\mathbf{g}^{2\star}\big{)}$ is globally optimal for Problem 1, where $\mathbf{f}^{\star}$ is a vector whose $t$ -th component is defined as:

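$$f^{\star}_{t}(\mathbf{x},e)\operatorname{\overset{def}{=}}\begin{cases}0&\text{if}\ \max_{i\in\{1,2\}}\|x^{i}-a^{i}\|\leq\tau^{\star}_{t}(e)\\ \arg\max_{i\in\{1,2\}}\|x^{i}-a^{i}\|&\text{otherwise,}\end{cases}\tag{11}$$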

where $\tau^{\star}_{t}:\mathbb{Z}\rightarrow\mathbb{R}$ ; and $\mathbf{g}^{i\star}$ is a vector whose $t$ -th component is defined as:

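$$g^{i\star}_{t}\big(y^{i}\big)\operatorname{\overset{def}{=}}\begin{cases}x^{i}&\text{if}\ y^{i}=x^{i}\\ a^{i}&\text{if}\ y^{i}=\varnothing.\end{cases}\tag{12}$$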
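The structure of the strategies in Theorem 1 can be sketched as follows. The threshold value $\tau^{\star}_{t}(e)$ is treated as a given input, since its computation is the subject of the recursive algorithm mentioned in the abstract and is not reproduced here. The function names and the explicit empty-battery check are assumptions of this illustration; vectors are represented as tuples.

```python
import math


def threshold_scheduler(x, modes, threshold, energy):
    """Sketch of f*_t: stay silent if every source is within `threshold` of its mode,
    otherwise transmit the source that deviates the most.

    x and modes are pairs of tuples, (x1, x2) and (a1, a2); `threshold` stands in for
    tau*_t(e), which is assumed to be supplied by the caller.
    """
    if energy == 0:
        return 0  # only U_t = 0 is feasible with an empty battery
    deviations = [math.dist(xi, ai) for xi, ai in zip(x, modes)]
    largest = max(deviations)
    if largest <= threshold:
        return 0
    # list.index picks the smallest index among ties, matching the arg max convention.
    return deviations.index(largest) + 1


def mode_estimator(y, mode):
    """Sketch of g^{i*}_t: report the received value, or the mode a^i when nothing arrives."""
    return mode if y is None else y


modes = ((0.0,), (0.0,))
print(threshold_scheduler(((0.5,), (2.0,)), modes, threshold=1.0, energy=1))  # source 2 deviates most
print(mode_estimator(None, modes[1]))
```

The estimator does not depend on the threshold or on past observations, which is what allows the solution obtained under the expanded information structure to be implemented with the original one.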

IV Information Structures

Problem 1 can be understood as a sequential stochastic team with three decision makers: the scheduler and the two estimators. One key aspect to note is that Problem 1 has a non-classical information structure. Such team problems are usually non-convex and their solutions are found on a case-by-case basis. Our analysis relies on the common information approach [30], where the idea is to transform the decentralized problem into an equivalent centralized one where the information for decision-making is the common information among all the decision makers in the decentralized system.

We begin by establishing a structural result for the optimal scheduling strategy. The following lemma states that the scheduler may ignore the past state observations at each sensor and the past states of the battery without any loss of optimality.

Lemma 1

Without loss of optimality, the scheduler can be restricted to strategies of the form:

$$U_{t}=f_{t}\big(\mathbf{X}_{t},E_{1:t},\mathbf{Y}_{1:t-1}\big).\tag{13}$$

Proof:

Let the strategy profile of the estimators $\mathbf{g}^{1}$ and $\mathbf{g}^{2}$ be arbitrarily fixed. The problem of selecting the best scheduling policy (for the fixed estimation strategy profiles $\mathbf{g}^{1}$ and $\mathbf{g}^{2}$ ) simplifies to a Markov Decision Process (MDP), whose state is defined as $S_{t}\operatorname{\overset{def}{=}}(\mathbf{X}_{t},E_{1:t},\mathbf{Y}_{1:t-1})$ . Using simple arguments involving conditional probabilities and the basic definitions of Section II-A, we can show that the state process $\{S_{t},t\geq 1\}$ is a controlled Markov chain, i.e.,

$$\mathbb{P}(S_{t+1}\mid S_{1:t},U_{1:t})=\mathbb{P}(S_{t+1}\mid S_{t},U_{t}).\tag{14}$$

The cost incurred at time $t$ of the equivalent MDP is:

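$$\begin{aligned}\rho(S_{t},U_{t})&=\sum_{i\in\{1,2\}}\big\|X_{t}^{i}-\hat{X}_{t}^{i}\big\|^{2}+c\,\mathbb{I}(U_{t}\neq 0)\\ &\stackrel{(a)}{=}\sum_{i\in\{1,2\}}\big\|X_{t}^{i}-g_{t}^{i}(Y_{1:t}^{i})\big\|^{2}+c\,\mathbb{I}(U_{t}\neq 0)\\ &\stackrel{(b)}{=}\sum_{i\in\{1,2\}}\big\|X_{t}^{i}-g_{t}^{i}\big(Y_{1:t-1}^{i},h^{i}(X_{t}^{i},U_{t})\big)\big\|^{2}+c\,\mathbb{I}(U_{t}\neq 0),\end{aligned}$$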

where $(a)$ follows from Eq. 7 and $(b)$ follows from Eq. 5.

Thus, the problem of finding the optimal scheduling strategy to minimize the cost $\mathcal{J}\big{(}\mathbf{f},\mathbf{g}^{1},\mathbf{g}^{2}\big{)}$ becomes equivalent to finding the optimal decision strategy for an MDP with state process $S_{t}$ and instantaneous cost $\rho(S_{t},U_{t})$ . Standard results for MDPs [31] imply that there exists an optimal scheduling strategy of the form stated in the lemma. Since this is true for any arbitrary $\mathbf{g}^{1}$ and $\mathbf{g}^{2}$ , it is also true for the globally optimal $\mathbf{g}^{1\star}$ and $\mathbf{g}^{2\star}$ . ∎

Under the structural result in Lemma 1, the information sets available at the network manager and estimators can be reduced to:

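$$\mathcal{I}_{t}^{\mathcal{S}}=\big\{\mathbf{X}_{t},E_{1:t},\mathbf{Y}_{1:t-1}\big\},\tag{15}$$

$$\mathcal{I}_{t}^{\mathcal{E}^{i}}=\big\{Y_{1:t}^{i}\big\},\ \ i\in\{1,2\},\tag{16}$$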

without any loss of optimality. However, the information sets described by Eqs. 15 and 16 do not share any common information. In other words, the information sets $\mathcal{I}_{t}^{\mathcal{S}}$ , $\mathcal{I}_{t}^{\mathcal{E}^{1}}$ and $\mathcal{I}_{t}^{\mathcal{E}^{2}}$ have no common random variables, a fact that limits the utility of the common information approach. We resort to a technique which consists of judiciously expanding the information available at the decision makers such that the common information approach can be more profitably employed.

IV-A Information structure expansion

We expand the estimators’ information sets to the following:

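$$\bar{\mathcal{I}}_{t}^{\mathcal{E}^{1}}\operatorname{\overset{def}{=}}\big\{E_{1:t},\mathbf{Y}_{1:t-1},Y_{t}^{1}\big\},\tag{17}$$

$$\bar{\mathcal{I}}_{t}^{\mathcal{E}^{2}}\operatorname{\overset{def}{=}}\big\{E_{1:t},\mathbf{Y}_{1:t-1},Y_{t}^{2}\big\}.\tag{18}$$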

The optimal cost for Problem 1 under an expanded information structure is at least as good as the optimal cost under the original information structure (having more information at each estimator cannot worsen its performance). Moreover, if the optimal solution under the expanded information structure is adapted to the original information structure, then this solution is also optimal under the original information structure [5, Proposition 3.5.1].

We proceed by defining another problem identical to Problem 1 but with expanded information sets at the estimators.

Problem 2

Consider the model of Section II with the expanded information sets of Eqs. 17 and 18 at the estimators $\mathcal{E}^{1}$ and $\mathcal{E}^{2}$ , respectively. Given the statistics of the sensors’ observations, the statistics of the energy harvested at each time, the battery storage limit $B$ , communication cost $c$ , and the horizon $T$ , find the scheduling and estimation strategies $\mathbf{f},\mathbf{g}^{1}$ and $\mathbf{g}^{2}$ that jointly minimize the cost $\mathcal{J}\big{(}\mathbf{f},\mathbf{g}^{1},\mathbf{g}^{2}\big{)}$ in Eq. 8.

Under the expanded information structure, the common information among the decision makers is:

[TABLE]

Notice that the common information contains several variables which were not originally available to the estimators. However, we will show that the optimal estimation strategy for Problem 2 does not depend on this additional information.

The following lemma provides a structural result for the estimation strategies under the expanded information sets.

Lemma 2

Without loss of optimality, the search for optimal strategies for estimator $\mathcal{E}^{i}$ can be restricted to functions of the form:

[TABLE]

Proof:

Let the strategy of the scheduler be fixed to some arbitrary $\mathbf{f}$ . We can view Problem 2 from the perspective of the estimator $\mathcal{E}^{i}$ at time $t$ as follows:

[TABLE]

where

[TABLE]

Since $g_{t}^{i}$ does not affect the term $\tilde{\mathcal{J}}$ , the optimal estimate can be computed by solving:

[TABLE]

This is the standard MMSE estimation problem whose solution is the conditional mean, i.e.,

[TABLE]

Therefore, the optimal estimation strategy is of the form:

[TABLE]

Notice that $(E_{1:t},\mathbf{Y}_{1:t-1})$ is known to $\mathcal{E}^{i}$ in Problem 2. Thus,

[TABLE]

Since Eq. 25 holds for any $\mathbf{f}$ , it also holds for the globally optimal scheduling strategy $\mathbf{f^{\star}}$ . Therefore, the optimal estimate is of the form given in the lemma. ∎
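The MMSE step in the proof above can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it draws samples from a hypothetical conditional distribution (the Gaussian parameters and all names are assumptions chosen for illustration) and checks that, among constant estimates, the empirical mean-squared error is smallest at the sample mean.

```python
import numpy as np

def mse(samples, estimate):
    """Empirical mean-squared error of a constant estimate."""
    return float(np.mean((samples - estimate) ** 2))

rng = np.random.default_rng(0)
# Hypothetical conditional distribution of X given the estimator's information.
samples = rng.normal(loc=1.5, scale=2.0, size=10_000)

cond_mean = float(samples.mean())
# Candidate estimates: the sample mean plus a grid of perturbations.
candidates = cond_mean + np.linspace(-1.0, 1.0, 21)
errors = [mse(samples, c) for c in candidates]
best = float(candidates[int(np.argmin(errors))])
```

Because the empirical error decomposes as the sample variance plus the squared distance between the estimate and the sample mean, `best` coincides with `cond_mean` up to rounding.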

V An equivalent problem with a coordinator

In this section, we will formulate a problem which will be used to solve Problem 2. We consider the model of Section II and introduce a fictitious decision maker referred to as the coordinator, which has access to the common information $\mathcal{I}^{\mathrm{com}}_{t}$ . The coordinator is the only decision maker in the new problem. The scheduler and the estimators act as “passive decision makers” to which strategies chosen by the coordinator are prescribed.

The equivalent system operates as follows: At each time $t$ , based on $\mathcal{I}_{t}^{\mathrm{com}}$ , the coordinator chooses a map $\Gamma_{t}:\mathbb{R}^{n_{1}}\times\mathbb{R}^{n_{2}}\rightarrow\{0,1,2\}$ for the network manager, and a vector $\tilde{X}_{t}^{i}\in\mathbb{R}^{n_{i}}$ for each estimator $\mathcal{E}^{i}$ , $i\in\{1,2\}$ . The function $\Gamma_{t}$ and vectors $\tilde{X}_{t}^{1}$ and $\tilde{X}_{t}^{2}$ are referred to as the scheduling and estimation prescriptions. The scheduler uses its prescription to evaluate $U_{t}$ according to:

[TABLE]

The estimator $\mathcal{E}^{i}$ uses its prescription to compute the estimate $\hat{X}_{t}^{i}$ according to:

[TABLE]

The coordinator selects its prescriptions for the scheduler and the estimators using strategies $d_{t},\ell_{t}^{1}$ and $\ell_{t}^{2}$ as follows:

[TABLE]

and

[TABLE]

We refer to the collections $\mathbf{d}\operatorname{\overset{def}{=}}\{d_{1},\cdots,d_{T}\}$ and $\boldsymbol{\ell}^{i}\operatorname{\overset{def}{=}}\{\ell_{1}^{i},\cdots,\ell_{T}^{i}\}$ as the prescription strategies for the scheduler and the estimator $\mathcal{E}^{i}$ , respectively. The strategies $\boldsymbol{\ell}^{1}$ and $\boldsymbol{\ell}^{2}$ must be valid estimation strategies in Problem 2. The strategy $\mathbf{d}$ must be such that

[TABLE]

is a valid scheduling strategy in Problem 2. The cost incurred by the prescription strategies $\mathbf{d},\boldsymbol{\ell}^{1}$ and $\boldsymbol{\ell}^{2}$ is identical to that in Eq. 8, that is,

[TABLE]

Problem 3

Find prescription strategies $\mathbf{d},\boldsymbol{\ell}^{1}$ , and $\boldsymbol{\ell}^{2}$ that jointly minimize $\hat{\mathcal{J}}(\mathbf{d},\boldsymbol{\ell}^{1},\boldsymbol{\ell}^{2})$ .

Problem 3 is equivalent to Problem 2 in the sense that for every scheduling strategy $\mathbf{f}$ and estimation strategies $\mathbf{g}^{1},\mathbf{g}^{2}$ in Problem 2 there exist prescription strategies $\mathbf{d},\boldsymbol{\ell}^{1}$ and $\boldsymbol{\ell}^{2}$ such that $\mathcal{J}(\mathbf{f},\mathbf{g}^{1},\mathbf{g}^{2})=\hat{\mathcal{J}}(\mathbf{d},\boldsymbol{\ell}^{1},\boldsymbol{\ell}^{2})$ and vice-versa. Thus, solving Problem 3 allows us to obtain optimal $\mathbf{f}^{\star},\mathbf{g}^{1\star}$ and $\mathbf{g}^{2\star}$ for Problem 2. The same technique is used in [10] to prove a similar equivalence in a problem involving a single sensor-estimator pair.

Problem 3 can be described as a centralized POMDP as follows:

( $i$ ) State process: The state is $S_{t}\operatorname{\overset{def}{=}}(\mathbf{X}_{t},E_{t})$ .

( $ii$ ) Action process: Let the set $\mathbb{A}(E_{t})$ be defined as the collection of all measurable functions from $\mathbb{R}^{n_{1}}\times\mathbb{R}^{n_{2}}\rightarrow\mathbb{U}(E_{t})$ , where $\mathbb{U}$ is defined in Eq. 2. The coordinator selects the prescription for the network manager, $\Gamma_{t}\in\mathbb{A}(E_{t})$ , and the prescriptions for the estimators $\tilde{X}_{t}^{1}\in\mathbb{R}^{n_{1}}$ and $\tilde{X}_{t}^{2}\in\mathbb{R}^{n_{2}}$ .

( $iii$ ) Observations: After choosing its action at time $t$ , the coordinator observes $Y_{t}$ and $E_{t+1}$ .

( $iv$ ) Instantaneous cost: Let $\tilde{\mathbf{X}}_{t}\operatorname{\overset{def}{=}}(\tilde{X}^{1}_{t},\tilde{X}^{2}_{t})$ . The instantaneous cost incurred is given by

[TABLE]

( $v$ ) Markovian dynamics: Since $\mathbf{X}_{t}$ is an i.i.d. process, $\mathbf{X}_{t+1}$ is independent of $S_{t}$ . The evolution of the energy $E_{t+1}$ is given by:

[TABLE]

Noticing that Eq. 34 can be written as a function of the state $S_{t}$ , action $\gamma_{t}$ and the noise $Z_{t}$ , the state $S_{t}$ satisfies Eq. 14 and forms a controlled Markov chain.
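To make the Markovian dynamics concrete, the battery update can be sketched as below. Eq. 34 is not reproduced in this excerpt, so the recursion is an assumption: each transmission is taken to consume one energy unit, and the harvested energy is added up to the storage limit $B$ . The function name `next_energy` and its arguments are hypothetical.

```python
def next_energy(e, u, z, capacity):
    """One step of the assumed battery recursion.

    e: current energy level, u: scheduling decision in {0, 1, 2},
    z: energy units harvested in this slot, capacity: storage limit B.
    """
    if u != 0 and e == 0:
        raise ValueError("cannot transmit with an empty battery")
    spent = 1 if u != 0 else 0          # assumed: one unit per transmission
    return min(e - spent + z, capacity)  # assumed: overflow is discarded
```

Under this assumed form, the next energy level depends only on the current level, the decision and the harvested energy, which is the property used to argue that $S_{t}$ is a controlled Markov chain.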

V-A Dynamic program

Having established that Problem 3 is a POMDP, the optimal prescriptions can be computed by solving a dynamic program whose information state is the belief of the state process given the common information. However, since $E_{t}$ is perfectly observed, the coordinator only needs to form a belief on $\mathbf{X}_{t}$ . Let $\mathbf{x}=(x^{1},x^{2})$ . We define the belief state at time $t$ as:

[TABLE]

Since the sources are i.i.d. and independent of the energy process, we have:

[TABLE]

where, due to the independence of the sources,

[TABLE]

Lemma 3

Define the functions $\mathcal{V}_{t}^{\pi}:\mathbb{Z}\rightarrow\mathbb{R}$ for $t\in\{0,1,\cdots,T+1\}$ as follows:

[TABLE]

and

[TABLE]

where $\tilde{\mathbf{x}}_{t}\in\mathbb{R}^{n}$ , $\gamma_{t}\in\mathbb{A}(e)$ .

If the infimum in Eq. 39 is achieved, then at each time $t\in\{1,\cdots,T\}$ and for each $e\in\{0,1,\cdots,B\}$ , the minimizing $\gamma_{t}$ and $\tilde{\mathbf{x}}_{t}$ in Eq. 39 determine the optimal prescriptions for the network manager and the estimators, respectively. Furthermore, $\mathcal{V}_{1}^{\pi}(B)$ is the optimal cost for Problem 3.

Proof:

This result follows from standard dynamic programming arguments for POMDPs. ∎

VI Solving the dynamic program

In this section, we will find the optimal prescriptions using the dynamic program in Lemma 3. For the remainder of this section, we will assume that $\pi_{1}$ and $\pi_{2}$ are symmetric and unimodal around $0$ . The same arguments apply for general $a^{i}\in\mathbb{R}^{n_{i}}$ , $i\in\{1,2\}$ .

Note that each step of the dynamic program in Eq. 39 is an optimization problem with respect to $\mathbf{\tilde{x}}_{t}$ and $\gamma_{t}$ . This is an infinite-dimensional optimization problem since $\gamma_{t}$ is a mapping which lies in $\mathbb{A}(E_{t})$ . The next lemma will describe the structure of the optimal prescription for the scheduler and show that the infinite dimensional optimization in Eq. 39 can be reduced to a finite dimensional problem with respect to the vector $\mathbf{\tilde{x}}_{t}$ . For that purpose, we define the functions $\mathcal{C}_{t+1}^{0},\mathcal{C}_{t+1}^{1}:\mathbb{Z}\rightarrow\mathbb{R}$ as follows:

[TABLE]

Lemma 4

Suppose the prescriptions to the estimators are $\tilde{x}_{t}^{1},\tilde{x}_{t}^{2}$ at time $t$ . Then, the optimal prescription to the scheduler has the following form when $e>0$ :

[TABLE]

where $\tau^{\star}_{t}(e)\operatorname{\overset{def}{=}}\sqrt{\mathcal{C}_{t+1}^{1}(e)-\mathcal{C}_{t+1}^{0}(e)}$ . (The function $\mathcal{C}^{1}_{t+1}(e)$ is larger than $\mathcal{C}^{0}_{t+1}(e)$ ; therefore, the threshold $\tau^{\star}_{t}(e)$ is a real number for all $e\in\{1,\cdots,B\}$ and $t\in\{1,\cdots,T\}$ .) Moreover, the value function $\mathcal{V}_{t}^{\pi}$ of Lemma 3 can be obtained by solving the finite dimensional optimization in Eq. 43.

[TABLE]

Proof:

If $e=0$ , there is only one feasible scheduling policy:

[TABLE]

Therefore,

[TABLE]

If $e>0$ , the value function in Eq. 39 can be written as in Eq. 46.

[TABLE]

For any fixed $\tilde{x}^{i}_{t}\in\mathbb{R}^{n_{i}}$ , $i\in\{1,2\}$ , the scheduling prescription that achieves the minimum in the inner optimization problem in Eq. 46 is determined as follows:

•

$\gamma^{\star}_{t}(\mathbf{x}_{t})=0$ if and only if

[TABLE]

•

$\gamma^{\star}_{t}(\mathbf{x}_{t})=1$ if and only if

[TABLE]

and

[TABLE]

•

$\gamma^{\star}_{t}(\mathbf{x}_{t})=2$ if and only if

[TABLE]

and

[TABLE]

Therefore,

[TABLE]

Using the optimal scheduling prescription in Eq. 52, the value function becomes:

[TABLE]

∎
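The case analysis in the proof can be checked numerically. The sketch below is illustrative only and rests on an assumed form of the inner cost in Eq. 46 (not reproduced in this excerpt): sending nothing costs both squared errors plus $\mathcal{C}^{0}_{t+1}(e)$ , while sending one observation costs the other sensor's squared error plus $\mathcal{C}^{1}_{t+1}(e)$ . The names `pointwise_prescription`, `threshold_prescription`, `c0` and `c1` are hypothetical.

```python
import numpy as np

def _sq_dist(x, center):
    """Squared Euclidean distance; works for scalars and vectors."""
    diff = np.atleast_1d(np.asarray(x, dtype=float) - center)
    return float(diff @ diff)

def pointwise_prescription(x1, x2, xt1, xt2, c0, c1):
    """Decision with the smallest assumed cost for one observation pair."""
    d1 = _sq_dist(x1, xt1)
    d2 = _sq_dist(x2, xt2)
    costs = [d1 + d2 + c0,  # u = 0: nothing is transmitted
             d2 + c1,       # u = 1: sensor 1 is transmitted
             d1 + c1]       # u = 2: sensor 2 is transmitted
    return int(np.argmin(costs))

def threshold_prescription(x1, x2, xt1, xt2, tau):
    """Threshold rule: stay silent inside the tau-region, else send the larger deviation."""
    n1 = np.sqrt(_sq_dist(x1, xt1))
    n2 = np.sqrt(_sq_dist(x2, xt2))
    if max(n1, n2) <= tau:
        return 0
    return 1 if n1 >= n2 else 2
```

With `tau = sqrt(c1 - c0)`, the two functions agree on every observation pair except at exact ties, which is the content of the threshold structure stated in Lemma 4.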

Lemma 4 implies that the optimal solution to Problem 3 can be found by solving the finite dimensional optimization problem in Eq. 43. We will show that Eq. 43 admits a globally optimal solution under certain conditions on the probabilistic structure of the problem.

Lemma 5

Let $X^{1}_{t}$ and $X^{2}_{t}$ be independent continuous random vectors with pdfs $\pi_{1}$ and $\pi_{2}$ . Provided that $\pi_{1}$ and $\pi_{2}$ are symmetric and unimodal around $0$ , then $\tilde{\mathbf{x}}_{t}^{\star}=0$ is a global minimizer in Eq. 43 for all $e\in\{0,1,\cdots,B\}$ .

Proof:

The proof is in Appendix B. ∎

We are now ready to provide the proof of Theorem 1.

Proof:

We will first show that $(\mathbf{f}^{\star},\mathbf{g}^{1\star},\mathbf{g}^{2\star})$ as defined in Theorem 1 is globally optimal for Problem 2.

The optimal prescriptions for Problem 3 are obtained using Lemmas 4 and 5. The optimal prescription for the scheduler is given by:

[TABLE]

whose threshold functions $\tau^{\star}_{t}(e)$ can be computed recursively (see Section VII); and the optimal prescriptions for the estimators are:

[TABLE]

Therefore, using the equivalence between Problems 2 and 3, the optimal strategy profiles for Problem 2 are

[TABLE]

and

[TABLE]

Moreover, since the solution $(\mathbf{f}^{\star},\mathbf{g}^{1\star},\mathbf{g}^{2\star})$ to Problem 2 does not depend on the additional information provided to the estimators and is adapted to the original information structure of the estimators in Problem 1, it is also a globally optimal strategy profile for Problem 1.

∎

VII Computation of optimal thresholds

Once the structural result in Theorem 1 is established, the optimal scheduling strategy is completely specified by the sequence of optimal threshold functions $\tau^{\star}_{t}$ , $t\in\{1,\cdots,T\}$ . The thresholds $\tau^{\star}_{t}(e)$ are obtained using the functions $\mathcal{C}_{t+1}^{0}(e),\mathcal{C}_{t+1}^{1}(e)$ in Eqs. 40 and 41. The functions $\mathcal{C}_{t}^{0}(\cdot),\mathcal{C}_{t}^{1}(\cdot)$ can in turn be obtained by computing the value functions $\mathcal{V}^{\pi}_{t}$ via backward induction. Note that we can simplify the expression for the value function using Lemma 5 and Eq. 43 to:

[TABLE]

and

[TABLE]

The following algorithm outlines the recursive computation of the threshold function $\tau_{t}^{\star}$ :

Remark 3

The expectations in the algorithm are with respect to the random vectors $X^{1}_{t}$ and $X^{2}_{t}$ . Computing these expectations for high dimensional random vectors may be computationally intensive for some source distributions, but in practice they can be approximated using Monte Carlo methods. The remaining operations in the algorithm admit efficient implementations.
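A backward induction of this kind can be sketched as follows for scalar sources, with the expectations replaced by Monte Carlo averages as suggested in Remark 3. This is a hedged sketch rather than the paper's algorithm listing, which is not reproduced in this excerpt. It assumes that $\mathcal{C}^{0}_{t+1}(e)$ is the expected next value function when no energy is spent, that $\mathcal{C}^{1}_{t+1}(e)$ is the communication cost $c$ plus the expected next value function after spending one unit, that the battery saturates at $B$ , and that the value function is zero after the horizon. The function name and arguments are hypothetical.

```python
import numpy as np

def optimal_thresholds(T, B, c, z_pmf, x1_samples, x2_samples):
    """Backward induction for tau[t, e], t = 1..T, e = 1..B (scalar sources).

    z_pmf[k] is the assumed probability of harvesting k energy units per slot.
    """
    z_pmf = np.asarray(z_pmf, dtype=float)
    z_vals = np.arange(len(z_pmf))
    s1 = np.asarray(x1_samples, dtype=float) ** 2  # squared magnitudes, source 1
    s2 = np.asarray(x2_samples, dtype=float) ** 2  # squared magnitudes, source 2
    V = np.zeros((T + 2, B + 1))    # assumed terminal condition: V[T + 1, e] = 0
    tau = np.zeros((T + 1, B + 1))  # row 0 and column 0 are unused
    for t in range(T, 0, -1):
        for e in range(B + 1):
            # Expected cost-to-go if no energy is spent at time t.
            c0 = float(z_pmf @ V[t + 1, np.minimum(e + z_vals, B)])
            if e == 0:
                V[t, e] = np.mean(s1 + s2) + c0
                continue
            # Communication cost plus expected cost-to-go after spending one unit.
            c1 = c + float(z_pmf @ V[t + 1, np.minimum(e - 1 + z_vals, B)])
            tau[t, e] = np.sqrt(max(c1 - c0, 0.0))
            # Monte Carlo average of the best of: stay silent, or send the larger one.
            V[t, e] = np.mean(np.minimum(s1 + s2 + c0, np.minimum(s1, s2) + c1))
    return tau, V
```

For vector-valued sources the squared magnitudes would be replaced by squared norms. The double loop costs on the order of $T\cdot B$ sample averages, so the sample size is the main tuning knob.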

VIII Illustrative examples

VIII-A Optimal blind scheduling

Before we provide a few numerical examples it is useful to introduce a scheduling strategy which is based exclusively on the statistics of the sources, and not on the observations. Consider the following blind scheduling strategy: if the battery is not empty, transmit the source whose variance is the largest, i.e.,

[TABLE]

The estimation strategies associated with blind scheduling are:

[TABLE]

The performance of the blind scheduling and estimation strategies is given by:

[TABLE]

where the probabilities $\big{\{}\mathbb{P}(E_{t}=0)$ , $t\in\{1,\cdots,T\}\big{\}}$ are computed recursively using Eqs. 3 and 4 and assuming $E_{1}=B>0$ with probability $1$ .
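The recursive computation of the battery distribution under blind scheduling can be sketched as below. The expression for the blind cost is not reproduced in this excerpt, so the per-slot cost used here is an assumption derived from the description above: when the battery is empty both sources are estimated by their means (cost equal to the sum of the variances), and otherwise the higher-variance source is sent (cost equal to the smaller variance plus $c$ ). The battery recursion again assumes one unit per transmission and saturation at $B$ . All names are hypothetical.

```python
import numpy as np

def blind_cost(T, B, c, var1, var2, z_pmf):
    """Expected cost of sending the higher-variance source whenever e > 0."""
    z_pmf = np.asarray(z_pmf, dtype=float)
    p = np.zeros(B + 1)
    p[B] = 1.0  # E_1 = B with probability 1
    total = 0.0
    for _ in range(T):
        p_empty = p[0]
        total += p_empty * (var1 + var2)
        total += (1.0 - p_empty) * (min(var1, var2) + c)
        # Propagate the battery distribution one slot forward.
        nxt = np.zeros(B + 1)
        for e in range(B + 1):
            spent = 1 if e > 0 else 0  # the blind policy transmits whenever it can
            for z, pz in enumerate(z_pmf):
                nxt[min(e - spent + z, B)] += p[e] * pz
        p = nxt
    return total
```

For instance, `blind_cost(100, 53, 0.0, 1.0, 1.0, [1.0])` evaluates the setting of Example 1 (unit variances, no harvesting, no communication cost) with a battery of 53 units.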

Example 1 (Limited number of transmissions)

Consider the scheduling of two i.i.d. zero mean scalar Gaussian sources with variances $\sigma_{1}^{2}=\sigma_{2}^{2}=1$ . Assume that the total system deployment time is $T$ , and that during that time the scheduler is only allowed to transmit $B<T$ times. Furthermore, assume that during that time, there is no energy being harvested, i.e., $Z_{t}=0$ with probability $1$ , and there are no additional communication costs.

The algorithm outlined in Section VII is used to compute the optimal thresholds, which are functions of the time index, and the energy level at the battery. Figure 4 displays the optimal thresholds computed for this example with $T=100$ and $B=30$ .

Notice that when the energy level is greater than the remaining deployment time, the optimal threshold is zero, that is, the observation with the largest magnitude is always transmitted. On the other hand, if the energy level is below the remaining deployment time, the optimal threshold is strictly positive and increases as the energy level decreases. In other words, as the battery depletes, the scheduler transmits only observations of increasingly large magnitude.

Example 2 (Energy harvesting scheduler)

Consider a setup identical to that in Example 1, but in addition assume that the energy harvesting process $Z_{t}$ is distributed according to two possible probability mass functions:

[TABLE]

yielding on average $0.2$ and $0.4$ energy units per time step, respectively.

The optimal thresholds obtained for the energy harvesting system under $p^{1}_{Z}$ are shown in Fig. 5, and they are uniformly smaller than the ones of the system without harvesting. We also note a change in the “curvature” of the threshold function for a fixed $t$ .

Figure 6 shows the performance of the optimal strategy and the blind scheduling scheme as a function of the battery capacity $B$ for the three systems: no harvesting, harvesting with $p_{Z}^{1}$ and $p_{Z}^{2}$ . The optimal scheme proposed in this paper leads to a significant improvement upon the blind scheduling strategy of Eq. 60. The gap in performance between optimal open-loop and closed-loop strategies is defined as the “Value of information” ( $\mathrm{VoI}$ ). Mathematically, the VoI is given by:

[TABLE]

Figure 7 illustrates the VoI as a function of the battery capacity $B$ . Notice that this function is not monotonically increasing in the battery capacity. Therefore, there exists a nontrivial optimal value $B^{\star}$ at which the $\mathrm{VoI}$ is maximized. In Example 1, the value of $B$ that maximizes the VoI in the system without energy harvesting is $B^{\star}=55$ .

For $B=10$ , without energy harvesting, the optimal performance is $\mathcal{J}^{\star}\approx 147.37$ . However, in order to achieve a comparable performance using blind scheduling, a battery of capacity equal to $53$ energy units would be required. Therefore, the energy savings in this case are approximately $81.13\%$ .
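The arithmetic behind these figures can be reproduced with a short sketch. It assumes the setting of Example 1 ( $T=100$ , unit variances, no harvesting, no communication cost), under which blind scheduling incurs a cost of $1$ in each of the $B$ slots with a transmission and $2$ in each of the remaining $T-B$ slots. This closed form is inferred from the description of blind scheduling and is not quoted from the paper; the variable names are hypothetical.

```python
T = 100
optimal_cost_B10 = 147.37  # optimal cost reported in the text for B = 10

def blind_cost_no_harvest(T, B):
    """Assumed blind cost: B slots with one source missing, T - B with both."""
    return 1.0 * B + 2.0 * (T - B)

# Smallest battery for which blind scheduling is at least as good.
B_blind = next(B for B in range(T + 1)
               if blind_cost_no_harvest(T, B) <= optimal_cost_B10)

# Relative reduction in battery capacity offered by the optimal strategy.
savings = (B_blind - 10) / B_blind
```

Under these assumptions `B_blind` evaluates to 53 and `savings` to 43/53, in agreement with the values quoted above.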

IX Extensions

IX-A The $N$ sensor case

Theorem 1 holds for any number of sensors ( $N\geq 2$ ). Let $\mathbf{x}_{t}=(x_{t}^{1},x_{t}^{2},\cdots,x_{t}^{N})$ , where $x_{t}^{i}\in\mathbb{R}^{n_{i}}$ is the observation at the $i$ -th sensor. Provided that the observations are mutually independent and their pdfs are symmetric and unimodal around $a^{1},a^{2},\cdots,a^{N}$ , where $a^{i}\in\mathbb{R}^{n_{i}}$ , $i\in\{1,2,\cdots,N\}$ , the jointly optimal scheduling and estimation strategies are:

[TABLE]

and

[TABLE]
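One plausible reading of the $N$ -sensor strategies, by analogy with the two-sensor threshold rule, is sketched below. The exact expressions are not reproduced in this excerpt, so this is an assumption: the scheduler sends the observation that deviates most from its center $a^{i}$ when that deviation exceeds the threshold and the battery is not empty, and each estimator outputs the received observation or otherwise its center. The function names and arguments are hypothetical.

```python
import numpy as np

def schedule_n_sensors(observations, centers, tau, energy):
    """Return 0 (stay silent) or the 1-based index of the sensor to transmit."""
    if energy == 0:
        return 0
    deviations = [
        float(np.linalg.norm(np.atleast_1d(np.asarray(x, dtype=float) - a)))
        for x, a in zip(observations, centers)
    ]
    i = int(np.argmax(deviations))
    return i + 1 if deviations[i] > tau else 0

def estimate(received, center):
    """Assumed estimator: the received observation if any, else the center."""
    return center if received is None else received
```

For example, with scalar observations `[0.1, -2.0, 1.0]`, centers at zero, `tau = 1.5` and a nonempty battery, the sketch schedules sensor 2.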

IX-B Unequal weights and communication costs

In certain applications each sensor may be assigned a different weight in the expected distortion metric. This is done to emphasize the importance of the observations made by one sensor relative to another. Additionally, different sensors may also have different communication costs, which may reflect the dimension of the measurements or be used to preserve battery power, for instance. These cases are captured by the following cost functional:

[TABLE]

The globally optimal scheduling and estimation strategies for the more general cost functional in Eq. 67 are given by Eqs. 68 and 69. In order to illustrate the main difference from the case with uniform weights, consider Fig. 8, which shows a “no-transmission region” characterized by a rectangle defined by threshold functions $(\tau^{1}_{t}(e),\tau^{2}_{t}(e))$ and a hyperbola that separates the regions associated with scheduling sensors 1 and 2. In contrast, the “no-transmission region” for the uniform case is characterized by a square defined by a single threshold $\tau_{t}(e)$ and a hyperplane which separates the transmission regions for sensors 1 and 2.

X Conclusions

This paper studies the problem of optimal scheduling in a sequential remote estimation system where non-collocated sensors and estimators communicate over a shared medium. The access to the communication resources is granted by an energy-harvesting scheduler, which implements an observation-driven medium access control scheme in order to avoid packet collisions. The underlying assumption is that the sensors make measurements that are independent and identically distributed in time, but the energy level at the scheduler has stochastic dynamics, which couples the decision-making process in time. The optimal solutions to such remote estimation problems are typically very difficult to find due to the presence of signaling between the scheduler and estimators.

The main result herein is to establish, under certain assumptions on the probabilistic model of the sources, the joint optimality of a pair of scheduling and estimation strategies. More importantly, the globally optimal solution is obtained in spite of the lack of convexity in the objective function introduced by signaling. The overarching proof consists of a judicious expansion of the information sets at the estimators, which enables the use of the common information approach to solve a single dynamic program from the perspective of a fictitious coordinator. Finally, by noticing that the optimal solution to this “relaxed” problem does not depend on the additional information introduced in the expansion, it is also shown to be optimal for the original optimization problem. As a byproduct, our proof technique also applies to more general settings with an arbitrary number of sensors, unequal weights and communication costs.

Future work on this problem includes the scheduling of sources that are correlated with each other but independent in time; independent Gauss-Markov sources (some progress in this area was reported in [32]); and networks prone to packet drops.

Appendix A Auxiliary results

The following two definitions and theorem can be found in [33] and in [34].

Definition 2 (Symmetric rearrangement)

Let $\mathbb{A}$ be a measurable set of finite volume in $\mathbb{R}^{n}$ . Its symmetric rearrangement $\mathbb{A}^{*}$ is defined as the open ball centered at $0\in\mathbb{R}^{n}$ whose volume agrees with $\mathbb{A}$ .

Definition 3 (Symmetric decreasing rearrangement)

Let $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ be a nonnegative measurable function that vanishes at infinity. The symmetric decreasing rearrangement $f^{\downarrow}$ of $f$ is

[TABLE]

Theorem 2 (Hardy-Littlewood Inequality)

If $f$ and $g$ are two nonnegative measurable functions defined on $\mathbb{R}^{n}$ which vanish at infinity, then the following holds:

$$\int_{\mathbb{R}^{n}}f(x)g(x)\,dx\leq\int_{\mathbb{R}^{n}}f^{\downarrow}(x)g^{\downarrow}(x)\,dx,$$

where $f^{\downarrow}$ and $g^{\downarrow}$ are the symmetric decreasing rearrangements of $f$ and $g$ , respectively.
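A discrete analogue of this inequality can be checked numerically. In the sketch below, which is an illustration rather than part of the original appendix, functions on a finite grid are represented by arrays of samples. Placing both arrays in symmetric decreasing order on the same grid pairs the largest values with each other, so the rearranged inner product equals the inner product of the two sorted arrays. The function name is hypothetical.

```python
import numpy as np

def rearranged_inner_product(f, g):
    """Inner product after sorting both sample arrays in the same order."""
    return float(np.sort(f) @ np.sort(g))

rng = np.random.default_rng(2)
f = rng.random(200)  # nonnegative samples of a hypothetical function f
g = rng.random(200)  # nonnegative samples of a hypothetical function g

plain = float(f @ g)
rearranged = rearranged_inner_product(f, g)
```

The value of `plain` never exceeds `rearranged`, which is the finite-dimensional rearrangement inequality underlying the Hardy-Littlewood inequality.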

Appendix B Proof of Lemma 5

B-A Empty battery

Let $e=0$ . The value function in Eq. 43 is given by

[TABLE]

The infimum in the expression above is achieved by

[TABLE]

Since $\pi_{1}$ and $\pi_{2}$ are symmetric around $0$ ,

[TABLE]

Therefore, if $e=0$ , the infimum in Eq. 43 is achieved by:

[TABLE]

B-B Nonempty battery

Let $e>0$ . The value function in Eq. 43 is given by

[TABLE]

The function $\mathcal{C}^{1}_{t+1}(e)$ is always larger than $\mathcal{C}^{0}_{t+1}(e)$ for all $e\in\{1,\cdots,B\}$ and $t\in\{1,\cdots,T\}$ . In order to establish this fact, notice that:

[TABLE]

where the inequality $(a)$ follows from the fact that $c\geq 0$ ; and inequality $(b)$ follows from the value functions $\mathcal{V}^{\pi}_{t}(e)$ being non-increasing in $e$ . The latter holds because, at any time step, having more energy available for transmission cannot increase the optimal cost of the system.

The optimization problem in Eq. 76 is equivalent to:

[TABLE]

where

[TABLE]

Consider the auxiliary cost function $\mathcal{J}^{e}_{t}:\mathbb{R}^{n_{1}}\times\mathbb{R}^{n_{2}}\rightarrow\mathbb{R}$ defined as

[TABLE]

where the expectation is taken with respect to the random vectors $X^{1}_{t}$ and $X^{2}_{t}$ .

The remainder of the proof consists of solving the following optimization problem:

[TABLE]

Define the function $\mathcal{G}^{e}_{t}:\mathbb{R}^{n}\times\mathbb{R}^{n}\rightarrow\mathbb{R}$ such that

[TABLE]

Using the fact that $X_{t}^{1}$ and $X_{t}^{2}$ are independent, and the function $\mathcal{G}^{e}_{t}$ defined in Eq. 81, we can rewrite the function $\mathcal{J}^{e}_{t}(\tilde{\mathbf{x}}_{t})$ in integral form as:

[TABLE]

The function $\mathcal{G}^{e}_{t}$ can be alternatively represented as:

[TABLE]

The function in Eq. 83 is sketched in Fig. 9 as a function of $x_{t}^{1}$ while keeping $x_{t}^{2}$ and $\tilde{\mathbf{x}}_{t}$ fixed.

Finally, let the function $\mathcal{H}^{e}_{t}:\mathbb{R}^{n}\times\mathbb{R}^{n}\rightarrow\mathbb{R}$ be defined as:

[TABLE]

Notice that the function $\mathcal{H}^{e}_{t}$ vanishes as the norm of $x^{1}_{t}$ tends to infinity, i.e.,

[TABLE]

From the Hardy-Littlewood inequality (see Appendix A), we have:

[TABLE]

where $\pi_{1}^{\downarrow}$ and $\mathcal{H}^{e\downarrow}_{t}$ denote the symmetric decreasing rearrangements of $\pi_{1}$ and $\mathcal{H}^{e}_{t}$ , respectively. The following facts hold:

1. Since ${\pi_{1}}$ is symmetric and unimodal around $0$ ,

[TABLE]

2. Since $\mathcal{H}^{e}_{t}(\tilde{\mathbf{x}}_{t};\mathbf{x}_{t})$ , as a function of $x^{1}_{t}$ , is symmetric and unimodal around $\tilde{x}^{1}_{t}$ (a fact that can be verified by inspection), we have:

[TABLE]

Therefore, the Hardy-Littlewood inequality implies that:

[TABLE]

which is equivalent to:

[TABLE]

Therefore,

[TABLE]

Fixing $\tilde{x}^{1\star}_{t}=0$ and following the same sequence of arguments exchanging the roles of $x^{1}_{t}$ and $x^{2}_{t}$ , we show that $\tilde{x}^{2\star}_{t}=0$ . Therefore,

[TABLE]
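The conclusion of the proof can be sanity-checked numerically for scalar standard Gaussian sources. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the auxiliary cost to be the expected minimum over the three scheduling decisions, uses hypothetical values `c0 = 0` and `c1 = 1` for $\mathcal{C}^{0}_{t+1}(e)$ and $\mathcal{C}^{1}_{t+1}(e)$ , and approximates the expectation by quadrature on a truncated grid.

```python
import numpy as np

def auxiliary_cost(xt1, xt2, c0, c1, grid, weights):
    """Quadrature estimate of the expected minimum over the three decisions."""
    x1, x2 = np.meshgrid(grid, grid, indexing="ij")
    w = np.outer(weights, weights)  # independent sources: product weights
    d1 = (x1 - xt1) ** 2
    d2 = (x2 - xt2) ** 2
    # Stay silent, or send the observation with the larger squared error.
    cost = np.minimum(d1 + d2 + c0, np.minimum(d1, d2) + c1)
    return float(np.sum(w * cost))

grid = np.linspace(-6.0, 6.0, 241)
pdf = np.exp(-0.5 * grid ** 2)  # unnormalized standard Gaussian density
weights = pdf / pdf.sum()       # normalized quadrature weights

candidates = np.linspace(-1.0, 1.0, 5)  # trial values for the first prescription
costs = [auxiliary_cost(a, 0.0, 0.0, 1.0, grid, weights) for a in candidates]
best = float(candidates[int(np.argmin(costs))])
```

On this grid the smallest cost is attained at the candidate equal to zero, consistent with Lemma 5. Repeating the experiment with an asymmetric or multimodal density is a simple way to see why the symmetry and unimodality assumptions matter.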

Bibliography


[1] K.-D. Kim and P. R. Kumar, “Cyber–physical systems: A perspective at the centennial,” Proceedings of the IEEE, vol. 100, pp. 1287–1308, May 2012.
[2] A. Bemporad, M. Heemels, and M. Johansson, Networked control systems. Heidelberg: Springer, 2010, vol. 406.
[3] M. M. Vasconcelos and N. C. Martins, “Optimal estimation over the collision channel,” IEEE Transactions on Automatic Control, vol. 62, no. 1, pp. 321–336, January 2017.
[4] ——, “Optimal remote estimation of discrete random variables over the collision channel,” IEEE Transactions on Automatic Control, vol. 64, no. 4, pp. 1519–1534, April 2019.
[5] S. Yuksel and T. Basar, Stochastic Networked Control Systems. Springer, 2013.
[6] U. Mitra, A. Emken, S. Lee, M. Li, V. Rozgic, G. Thatte, H. Vathsangam, D.-S. Zois, M. Annavaram, S. Narayanan, M. Levorato, D. Spruijt-Metz, and G. S. Sukhatme, “KNOW-ME: a case study in wireless body area sensor network design,” IEEE Communications Magazine, vol. 50, no. 5, pp. 116–125, May 2012.
[7] D.-S. Zois, M. Levorato, and U. Mitra, “Energy-efficient, heterogeneous sensor selection for physical activity detection in wireless body area networks,” IEEE Transactions on Signal Processing, vol. 61, no. 7, pp. 1581–1594, April 2013.
[8] D.-S. Zois, “Sequential decision-making in healthcare IoT: Real-time health monitoring, treatments and interventions,” in IEEE 3rd World Forum on Internet of Things, 2016.
