On Bayesian Fisher Information Maximization for Distributed Vector Estimation
Mojtaba Shirazi, Azadeh Vosoughi

TL;DR
This paper develops a Bayesian Fisher Information Maximization framework for distributed Gaussian vector estimation, optimizing sensor power allocation under various receiver models and demonstrating near-optimal performance compared to MSE minimization.
Contribution
It derives Bayesian FIM and WWB for different receiver types, formulates and solves power allocation optimization problems, and compares FIM-max schemes with MSE-min schemes in distributed estimation.
Findings
FIM-max power allocation improves estimation accuracy.
Solutions are distributed and depend on sensor quality and network constraints.
FIM-max schemes perform close to MSE-min schemes in practice.
Parts of this research were presented at the IEEE 25th Annual International Symposium on Personal, Indoor, and Mobile Radio Communication, 2014, and the 48th Asilomar Conference on Signals, Systems and Computers, 2014 [1], [2]. This research is supported by the NSF under grants CCF-1341966 and CCF-1319770.
Abstract
In this paper we consider the problem of bandwidth-constrained distributed estimation of a Gaussian vector with a linear observation model. Each sensor makes a scalar noisy observation of the unknown vector, employs a multi-bit scalar quantizer to quantize its observation, and maps it to a digitally modulated symbol. Sensors transmit their symbols over orthogonal power-constrained fading channels to a fusion center (FC). The FC is tasked with fusing the received signals from sensors and estimating the unknown vector. We derive the Bayesian Fisher Information Matrix (FIM) for three types of receivers: (i) coherent receiver, (ii) noncoherent receiver with known channel envelopes, (iii) noncoherent receiver with known channel statistics only. We also derive the Weiss-Weinstein bound (WWB). We formulate two constrained optimization problems, namely maximizing the trace and the log-determinant of the Bayesian FIM under a network transmit power constraint, with sensors’ transmit powers being the optimization variables (which we refer to as FIM-max schemes). We show that for the coherent receiver, these problems are concave. However, for noncoherent receivers, they are not necessarily concave. The solution to the trace of Bayesian FIM maximization problem can be implemented in a distributed fashion, in the sense that each sensor calculates its own transmit power using its local parameters. On the other hand, the solution to the log-determinant of Bayesian FIM maximization problem cannot be implemented in a distributed fashion, and the FC needs to find the powers (using parameters of all sensors) and inform the active sensors of their transmit powers. We numerically investigate how the FIM-max power allocation across sensors depends on the sensors’ observation qualities and physical layer parameters as well as the network transmit power constraint.
Moreover, we evaluate the system performance in terms of MSE using the solutions of FIM-max schemes, and compare it with the solution obtained from minimizing the MSE of the LMMSE estimator (MSE-min scheme), and that of uniform power allocation. These comparisons illustrate that, although the WWB is tighter than the inverse of Bayesian FIM, it is still suitable to use FIM-max schemes, since the performance loss in terms of the MSE of the LMMSE estimator is not significant. Furthermore, comparing the performance of different receivers, our numerical results reveal that coherent receiver and noncoherent receiver with known channel statistics have the best and the worst performance, respectively.
Index Terms:
Bayesian Fisher information matrix, coherent versus noncoherent receiver, distributed estimation, Gaussian vector, LMMSE estimator, power allocation, multi-bit quantization, Weiss-Weinstein bound, classical Cramér-Rao bound, best linear unbiased estimator.
I Introduction
The plethora of wireless sensor network (WSN) applications, with practical constraints on network power and bandwidth, raises a series of challenging technical problems for system-level engineers [3, 4]. One of these problems is the bandwidth-constrained distributed parameter estimation problem, where geographically distributed battery-powered sensors are deployed over a sensing field to monitor physical or environmental conditions [5]. Each sensor makes a noisy observation of the unobservable parameter to be estimated, and transmits its locally processed observation to a fusion center (FC). The FC is tasked with estimating the unknown parameter by fusing the data received from the sensors in the WSN.
In this work, we consider bandwidth-constrained distributed estimation of a Gaussian vector, where each sensor makes a scalar observation that is a linear function of the unknown vector, defined by a known observation vector, corrupted by additive scalar observation noise. We model the bandwidth constraint as limiting the number of quantization bits per observation period that a sensor can send to the FC. Each sensor applies a multi-bit scalar quantizer to quantize its observation, and maps it to a digitally modulated symbol. Sensors transmit their symbols to the FC over orthogonal power-constrained fading channels.
The bandwidth-constrained distributed estimation problem has a long and rich history in both the signal processing and information theory literature. Depending on how the bandwidth constraint is modeled, these works can be classified into two classes: the works in the first class model the bandwidth constraint as limiting the number of quantization bits per observation period that a sensor can send to the FC. On the other hand, the works in the second class model the bandwidth constraint as limiting the number of real-valued messages per observation period that a sensor can send to the FC [6, 7, 8, 9, 10]. (Footnote 1: In these works, each sensor makes a noisy observation vector of the entire unknown vector, or part of it, and locally compresses its observation vector. The focus in these works is finding the optimal compression matrices such that the mean square error (MSE) of reconstruction of the unknown vector at the FC is minimized.) While quantization is important in the works of the first class, compression is the critical component in the works of the second class. With respect to this classification, our work belongs to the first class.
The works in the first class mentioned above can be further categorized into several subclasses. The two most related subclasses to our work are the works that consider optimal quantization design strategies (dubbed subclass I) and the works that, given quantizers, optimize a network performance metric with respect to energy or power consumption during transmission (dubbed subclass II). Most of the works in subclass I assume that sensors’ quantized observations are sent over bandwidth constrained error-free communication channels. For example, [11, 12, 13, 14] studied this problem for estimating a deterministic scalar unknown parameter. The authors in [15, 16, 17, 18] studied this problem for erroneous bandwidth constrained channels. In particular, this problem was investigated for estimating a deterministic scalar in [15, 17, 18] and for estimating a zero-mean Gaussian scalar in [16]. When addressing the problem, these works have focused on the linear estimator at the fusion center (FC) and studied the MSE distortion pertaining to this linear estimator.
Among the works in subclass II, [15, 19] explored the optimal power allocation scheme that minimizes network transmission power subject to a target MSE constraint. In contrast, for estimating a deterministic scalar, [20, 21] minimized the MSE of the best linear unbiased estimator (BLUE) subject to a network transmit power constraint. The authors in [16, 22] proposed joint transmit power and rate allocation schemes for estimating a random scalar [16] and a random vector [22], where they minimized an upper bound on the MSE of the LMMSE estimator.
As an alternative to the MSE of the best linear estimator (BLUE and LMMSE for estimating deterministic and random unknowns, respectively), one can consider the Cramér-Rao bound (CRB) and its inverse Fisher information, which are widely employed to explore the fundamental limits of a parameter estimation problem, to optimize the power consumption of a resource constrained WSN tasked with distributed estimation. According to the Cramér-Rao inequality [23], maximizing Fisher information minimizes the CRB and Bayesian (classic) CRB sets a lower bound on the MSE of any Bayesian (unbiased) estimator [24]. Within the context of distributed estimation, maximizing Bayesian Fisher information has been adopted before to address sensor selection [25] and optimal quantization design [26, 27]. In particular, [25] investigated the optimal sensor activation strategy with linear observation model, via maximizing trace of Bayesian Fisher information matrix (FIM) subject to energy constraints. [26] derived the optimality conditions of quantizers that maximize the Bayesian Fisher information for conditionally independent and dependent observations. [27] studied the quantizer designs that minimize the MSE of minimum mean square error (MMSE) and maximum a posteriori (MAP) estimators, and compared their performances with the quantizer design that maximizes Fisher information. In [1][2], we presented our preliminary results on deriving Bayesian CRB and studied its behavior with respect to the system parameters for distributed estimation of a Gaussian vector with linear and nonlinear observation models.
Our Contributions: Considering the distributed estimation of a Gaussian vector with linear observation model [22, 28], we formulate two constrained optimization problems, namely, maximization of the trace and the log-determinant of the Bayesian FIM, subject to a network transmit power constraint, where sensors’ transmit powers are the optimization variables. We link the log-determinant of the Bayesian FIM to the mutual information between the unknown vector and its Bayesian estimator. We derive the Bayesian FIM and the Weiss-Weinstein bound (WWB), which is known to be one of the tightest Bayesian bounds [29]. We develop two transmit power allocation schemes from solving the two formulated problems (which we refer to as FIM-max schemes). We derive the MSE corresponding to the LMMSE estimator at the FC for coherent and noncoherent receivers. Our numerical results demonstrate the effectiveness of FIM-max schemes, as these power allocations perform close to the power allocation obtained from minimizing the MSE of the LMMSE estimator, and outperform uniform power allocation. Based on these results, we draw the conclusion that although the WWB is tighter than the Bayesian CRB in our problem (and the Bayesian CRB is not attainable), it is still appropriate to use FIM-max schemes, since the performance loss in terms of the MSE of the LMMSE estimator is not significant.
Notations: Matrices are denoted by bold uppercase letters, vectors by bold lowercase letters, and scalars by normal letters. denotes the mathematical expectation operator, and represent the norm of a vector and the matrix-vector transpose operation, respectively. tr(.) and indicate trace and determinant of a matrix, respectively, and is the cardinality of set . () means that is a positive (semi-)definite matrix. The definition of Q-function is , the Marcum-Q function of nonnegative real numbers and , denoted as , is defined as [30] , and the two dimensional Gaussian Q-function, denoted as , is defined as [31] . The notations and represent Gaussian distribution and complex Gaussian distribution, respectively.
II System Model and Problem Formulation
Suppose there are spatially-distributed and inhomogeneous sensors, each making a noisy observation of a common unobservable zero-mean Gaussian vector = with covariance matrix . Let denote the scalar noisy observation of sensor (see Fig. 1). Our linear observation model is:
[TABLE]
where is the known observation vector and denotes zero-mean Gaussian observation noise with variance . We assume that ’s are uncorrelated across the sensors and also are uncorrelated with . Sensor employs a scalar quantizer with quantization levels where is the index of the quantization level. In particular, the quantizer maps to one of the quantization levels as the following:
[TABLE]
where , are the quantization boundaries. Following quantization, sensor employs a fixed length encoder, which encodes the index corresponding to the quantization level to a binary sequence of length according to natural binary encoding [16, 22] (footnote 2: natural binary encoding is needed for the derivations of Bayesian FIM), and finally modulates these bits into binary symbols. Let denote the average transmit power corresponding to symbols from sensor , which is equally distributed among symbols. We consider two types of modulators, Binary Phase Shift Keying (BPSK) modulator, which maps each bit of -bit sequence into one symbol with transmit power , and On-Off Keying (OOK) modulator, which maps each “1” bit of -bit sequence into one symbol with transmit power and sends no carrier for “0” bit.
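The quantize-then-encode step described above can be sketched as follows. This is a minimal illustration, assuming a uniform mid-rise quantizer with a hypothetical clipping range `[-clip, clip]`; the function names and the clipping parameter are illustrative choices and are not taken from the paper.

```python
def uniform_midrise_quantize(x, num_bits, clip):
    """Return the index (0 .. 2**num_bits - 1) of the quantization cell containing x.

    Assumption: uniform cells of equal width on [-clip, clip]; values outside
    that range are assigned to the nearest outer cell.
    """
    num_levels = 2 ** num_bits
    step = 2.0 * clip / num_levels
    index = int((x + clip) // step)
    return max(0, min(num_levels - 1, index))


def level_value(index, num_bits, clip):
    """Reconstruction value: the midpoint of the selected cell."""
    step = 2.0 * clip / 2 ** num_bits
    return -clip + (index + 0.5) * step


def natural_binary(index, num_bits):
    """Natural binary encoding of the level index, most significant bit first."""
    return [(index >> (num_bits - 1 - b)) & 1 for b in range(num_bits)]


idx = uniform_midrise_quantize(0.3, num_bits=2, clip=2.0)
bits = natural_binary(idx, num_bits=2)
```

With two bits and a clipping range of 2.0, the observation 0.3 falls in the third cell (index 2), which is reconstructed at 0.5 and encoded as the bit sequence `[1, 0]`.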
Sensors send their modulated symbols to the FC over orthogonal flat fading channels, with fading coefficient . We assume that channel remains constant during the transmission of symbols. Denote as communication channel noise during the transmission of -th symbol of symbols corresponding to sensor . We assume ’s are independent across channels and independent and identically distributed (i.i.d.) across transmitted symbols, . We further assume that there is a constraint on the network average transmit power, i.e., .
To describe the estimation operation at the FC, let denote the recovered quantization level corresponding to sensor , where in general, due to communication channel errors. The FC processes the channel output corresponding to sensor to recover the transmitted quantization levels . We consider coherent and noncoherent receivers, corresponding to BPSK and OOK modulation schemes, respectively. For noncoherent receiver, we consider two scenarios: a) channel envelopes ’s are available at the FC [32], b) only statistics of complex Gaussian channel ’s are available at the FC [33]. Having , the FC applies a Bayesian estimator to form the estimate . We define vector which consists of transmitted quantization levels, and vector that includes recovered quantization levels at the FC. Let denote the joint probability distribution function (pdf) of the recovered quantization levels and the unknown vector . Under certain regularity conditions that are satisfied by Gaussian vectors, the Bayesian FIM, denoted as , is defined based on the joint pdf as [23, 24, 34]:
[TABLE]
where the expectation is taken over .
Our goals are to characterize and study the transmit power allocation schemes that maximize either tr() [25] or [35], subject to the network average transmit power constraint (which we refer to as FIM-max schemes). In other words, we are interested in solving the following constrained optimization problems (footnote 3: Let CRB denote the Bayesian CRB matrix. We have tr [22] and . Therefore, maximizing tr(J) is equivalent to minimizing the lower bound on tr and maximizing is equivalent to minimizing .):
[TABLE]
and
[TABLE]
Interestingly, the constrained log-determinant maximization problem in (II) can be linked to the constrained maximization of mutual information between the unknown and its Bayesian estimator . Let , where is the corresponding estimation error vector. Suppose and are the error mean vector and the MSE matrix, respectively. According to inequality (6) in [24] and using the fact that is Gaussian, we can write:
[TABLE]
On the other hand, under the regularity conditions [23], the inverse of Bayesian FIM establishes a lower bound on the MSE matrix . The Bayesian Cramér-Rao inequality states that [23]. Using the concavity of the function log on the cone of positive definite Hermitian matrices [36], we conclude that . Therefore, the lower bound on is maximized if we substitute in (5) with . In other words:
[TABLE]
Based on (6), we observe that the log-determinant maximization problem in (II) is equivalent to constrained maximization of the mutual information lower bound.
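The chain of inequalities behind this link can be written out as a sketch. The symbols are assumptions, since the extraction lost the original notation: $\boldsymbol{\theta}$ is the unknown vector, $\hat{\boldsymbol{\theta}}$ its Bayesian estimator, $\mathbf{e}$ the estimation error, $\mathbf{C}_{\theta}$ the prior covariance, $\mathbf{D}$ the MSE matrix, and $\mathbf{J}$ the Bayesian FIM. The constants and arrangement may differ from the paper's (5) and (6).

```latex
\begin{align*}
I(\boldsymbol{\theta};\hat{\boldsymbol{\theta}})
  &= h(\boldsymbol{\theta}) - h(\boldsymbol{\theta}\mid\hat{\boldsymbol{\theta}})
   = h(\boldsymbol{\theta}) - h(\mathbf{e}\mid\hat{\boldsymbol{\theta}})
   \;\ge\; h(\boldsymbol{\theta}) - h(\mathbf{e}) \\
  &\ge \tfrac{1}{2}\log\det(\mathbf{C}_{\theta}) - \tfrac{1}{2}\log\det(\mathbf{D}), \\[4pt]
\mathbf{D}\succeq\mathbf{J}^{-1}
  &\;\Rightarrow\; \log\det(\mathbf{D}) \ge -\log\det(\mathbf{J}) \\
  &\;\Rightarrow\; \tfrac{1}{2}\log\det(\mathbf{C}_{\theta}) - \tfrac{1}{2}\log\det(\mathbf{D})
   \;\le\; \tfrac{1}{2}\log\det(\mathbf{C}_{\theta}) + \tfrac{1}{2}\log\det(\mathbf{J}).
\end{align*}
```

The second line uses the fact that a Gaussian vector maximizes differential entropy for a given covariance, together with the observation that the determinant of the error covariance does not exceed that of the MSE matrix. The last line shows that the largest value the lower bound can take is reached when the MSE matrix equals the inverse of the Bayesian FIM, which is why increasing the log-determinant of the Bayesian FIM raises that value.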
III Characterization of Bayesian FIM
In this section, we characterize in terms of the optimization parameters . The matrix in (2) can be expressed as [24, 34]:
[TABLE]
where the first and second expectations are taken over the pdf of , denoted as and the conditional distribution , respectively. Using the Bayes’ rule , we can decompose into two terms:
[TABLE]
in which the outer expectations are taken over . The matrix only depends on () [24]. In particular, let denote the -th entry of matrix . We have [23]:
[TABLE]
Since is Gaussian with covariance matrix , we obtain . Let represent the -th entry of matrix . We can write [23]:
[TABLE]
We note that the entries depend on the parameters of the observation model as well as the physical layer parameters (e.g., modulation scheme, receiver type, channel gain, channel noise, transmit power, and quantization bits). To find in (8), we need Lemma 1 below, which shows that, given , the entries of vector are conditionally independent.
Lemma 1.
Given our system model we have .
Proof.
See Appendix A-A. ∎
Combining the result of Lemma 1 and (8) and recalling that the expectation in (8) is taken with respect to , we reach:
[TABLE]
Using the following two facts:
[TABLE]
[TABLE]
where index indicates the quantization level corresponding to , we find that reduces to:
[TABLE]
Examining (10) we realize that we need to find two terms in order to fully characterize : the probability term , and its first derivative with respect to , i.e., . In the following, we derive these two terms. According to the Bayes’ rule and the fact that form a Markov chain, we have:
[TABLE]
Considering in (11) we realize that each term inside the sum is the product of two probabilities: the first probability does not depend on ; it depends on the modulation scheme (BPSK or OOK) and the receiver type at the FC (coherent or noncoherent) as well as the physical layer parameters, i.e., channel errors due to fading and noise, transmit power , and number of transmitted bits . On the other hand, the second probability depends on , the observation model and its parameters as well as the quantizer. In other words, the contributions of the observation model and quantization in each term inside the sum in (11) are decoupled from those of the communication system.
The probability in (11) becomes:
[TABLE]
in which () follows from the fact that the conditional pdf of given is .
Next, we find in (10). Since does not depend on , from (11) we have:
[TABLE]
Now we characterize in . As we mentioned before, depends on the modulation scheme and the receiver type at the FC. In this section we derive for BPSK modulation with coherent receiver and OOK modulation with noncoherent receiver. For OOK modulation with noncoherent receiver, we consider two scenarios: a) channel envelopes are available at the FC, b) only channel statistics are available at the FC. We assume that the FC performs a symbol-by-symbol demodulation. To enable derivations of , we let indices and , respectively, indicate the quantization levels corresponding to and , and and , respectively, be the transmitted bit sequence and recovered (received) bit sequence of sensor .
III-A Coherent Receiver
Suppose the Hamming distance between two bit sequences and is , in which is the Boolean sum operator. We define as the channel signal to noise ratio (SNR) of sensor , where:
[TABLE]
We can model the channel between sensor and the FC as a binary symmetric channel (BSC) with the probability of flipping a bit , where does not depend on the bit index. Hence, the probability in (11) becomes:
[TABLE]
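The BSC model for the coherent receiver can be sketched numerically as below. This is an illustration under stated assumptions: the bit-flip probability uses the textbook coherent BPSK expression Q(sqrt(2·SNR)), which may differ from the paper's normalization of the channel SNR in (14), and quantization levels are represented by the integer value of their natural binary code. All function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bpsk_flip_probability(snr):
    """Assumed textbook bit error probability of coherent BPSK at the given SNR."""
    return q_function(math.sqrt(2.0 * snr))


def hamming_distance(i, j):
    """Number of bit positions in which the codes of levels i and j differ."""
    return bin(i ^ j).count("1")


def bsc_transition(i, j, num_bits, flip_prob):
    """P(level j recovered | level i sent) over num_bits independent BSC uses."""
    d = hamming_distance(i, j)
    return flip_prob ** d * (1.0 - flip_prob) ** (num_bits - d)


# Transition probabilities from level 5 (bits 101) to every 3-bit level.
row = [bsc_transition(5, j, num_bits=3, flip_prob=0.1) for j in range(8)]
```

Each row of the resulting transition matrix sums to one, and the probability depends on the two levels only through the Hamming distance between their bit sequences.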
III-B Noncoherent Receiver
The channel between sensor and the FC can no longer be modeled as a BSC. Instead, we can model it as a binary asymmetric channel, where is the probability that “0” bit is flipped into “1” bit, and is the probability that “1” bit is flipped into “0” bit. Therefore, the probability in (11) becomes:
[TABLE]
where is indicator function with subscript describing the event of inclusion. Next, we compute probabilities and in (III-B). Note that and do not depend on the bit index. The problem of demodulating symbols (bits) sent by sensor , based on received signals, can be cast into binary hypothesis testing problems, in which the channel output corresponding to each problem is:
[TABLE]
for , where is transmitted signal amplitude for sensor . Denoting as the test statistics, the optimal likelihood ratio test (LRT) at the FC can be expressed as:
[TABLE]
where the probabilities and . Lemma 2 shows that for our system model, .
Lemma 2.
We have under the following two assumptions:
- the pdf of noisy observation is smooth and symmetric,
- sensor uses a symmetric mid-rise quantizer and encodes the quantization level according to natural binary encoding rule.
Both assumptions hold true for our system model.
Proof. See Appendix A-B.
According to Lemma 2, we can state that , where is the average transmit power of sensor . In the following, we find probabilities and for our two types of noncoherent receivers.
Noncoherent Receiver with Known Channel Envelopes: For this receiver, the test statistics of LRT at the FC is the envelope of channel output, i.e., and is known to the FC. Hence, given , the two conditional pdfs of the test statistics under hypotheses and are [37]:
[TABLE]
[TABLE]
where is defined in (14) and is the zeroth-order modified Bessel function of the first kind. Since ’s are independent across transmitted symbols, the random variables conditioned on each hypothesis and are i.i.d. for . Therefore, the probabilities and do not depend on bit index . Based on equations (7-4-7) and (7-4-11) in [37], probabilities and are:
[TABLE]
where the decision threshold depends on and . For , [37] provides an accurate approximation of as .
Finally, by substituting (18) in (III-B), we compute for noncoherent receiver with known channel envelopes.
Noncoherent Receiver with Known Channel Statistics: For this receiver, the test statistics of LRT at the FC is the power of channel output, i.e., . The FC only knows the channel statistics . Let denote the average channel SNR of sensor , where:
[TABLE]
in which we have used the knowledge of channel statistics to obtain . Since is complex Gaussian, we have [33]:
[TABLE]
[TABLE]
Note that ’s conditioned on each hypothesis are i.i.d. for and therefore the probabilities and do not depend on bit index . Hence:
[TABLE]
in which the decision threshold for is .
Finally, by substituting (20) in (III-B), we compute for noncoherent receiver with known channel statistics. (Footnote 4: When the ratio in (17), the expressions for the decision threshold change. For noncoherent receiver with known channel envelopes, one can analytically find for each the value of which minimizes the average error probability corresponding to demodulating the symbols of sensor given as . Equivalently, satisfies . For noncoherent receiver with known channel statistics, we obtain .)
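For the receiver with known channel statistics, the two bit-flip probabilities and the resulting binary asymmetric channel can be sketched as follows. This is a standard energy-detection calculation under assumptions that are not confirmed by the extracted text: equiprobable bits, a zero-mean complex Gaussian channel, the test statistic normalized by the noise power, and an average SNR defined as received signal power over noise power (the paper's definition in (19) may carry a different constant). Under these assumptions the normalized statistic is exponential with mean 1 for a "0" bit and mean 1 + SNR for a "1" bit.

```python
import math


def energy_detector_error_probs(avg_snr):
    """Return (P(0 -> 1), P(1 -> 0), threshold) for the assumed exponential model."""
    # Likelihood ratio test with equal priors: decide "1" if the normalized
    # output power exceeds this threshold.
    threshold = (1.0 + 1.0 / avg_snr) * math.log(1.0 + avg_snr)
    p_0_to_1 = math.exp(-threshold)
    p_1_to_0 = 1.0 - math.exp(-threshold / (1.0 + avg_snr))
    return p_0_to_1, p_1_to_0, threshold


def bac_transition(tx_bits, rx_bits, p_0_to_1, p_1_to_0):
    """P(rx_bits | tx_bits) for independent uses of a binary asymmetric channel."""
    prob = 1.0
    for tx, rx in zip(tx_bits, rx_bits):
        if tx == 0:
            prob *= p_0_to_1 if rx == 1 else 1.0 - p_0_to_1
        else:
            prob *= p_1_to_0 if rx == 0 else 1.0 - p_1_to_0
    return prob


p01, p10, tau = energy_detector_error_probs(avg_snr=1.0)
all_rx = [[a, b] for a in (0, 1) for b in (0, 1)]
row_sum = sum(bac_transition([1, 0], rx, p01, p10) for rx in all_rx)
```

At an average SNR of 1 the two error probabilities are 0.25 and 0.5, which illustrates why the channel is asymmetric: a faded "1" is mistaken for silence more often than noise alone is mistaken for a carrier.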
III-C Finding Bayesian FIM in (III)
At this point, we have all the components to write the entries in (8). Combining (10)-(13), we find the following compact form representation of :
[TABLE]
where the scalar is:
[TABLE]
Finally, we compute and substitute it in (III) to obtain matrix as:
[TABLE]
where the columns of are observation vectors in (1) and the expectations over in (23) are computed using numerical integration.
For in (23) there exist two baselines. For the first baseline, suppose all sensors’ observations ’s are available at the FC with full precision (centralized estimation) and let be the corresponding Bayesian FIM. To find , we start from (8) and replace with . Following the same procedure as we described to obtain (10) from (8), we reach:
[TABLE]
Since , it is straightforward to show . Therefore:
[TABLE]
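The centralized baseline can be illustrated with a short computation. The sketch below assumes the standard result for a linear Gaussian model, namely that the Bayesian FIM equals the prior precision matrix plus the sum of the outer products of the observation vectors scaled by the inverse noise variances; the lost display equation is presumed, not confirmed, to have this form. The helper names and the numerical values are hypothetical.

```python
def centralized_fim(prior_precision, obs_vectors, noise_vars):
    """Prior precision plus sum_k a_k a_k^T / sigma_k^2 (assumed standard form).

    prior_precision: inverse of the prior covariance, as a list of rows.
    obs_vectors:     one observation vector a_k per sensor.
    noise_vars:      observation noise variance sigma_k^2 per sensor.
    """
    q = len(prior_precision)
    fim = [row[:] for row in prior_precision]
    for a, var in zip(obs_vectors, noise_vars):
        for i in range(q):
            for j in range(q):
                fim[i][j] += a[i] * a[j] / var
    return fim


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


# Two sensors observing a 2-dimensional vector with identity prior covariance.
J_c = centralized_fim(
    prior_precision=[[1.0, 0.0], [0.0, 1.0]],
    obs_vectors=[[1.0, 0.0], [1.0, 1.0]],
    noise_vars=[0.5, 1.0],
)
```

Each sensor adds a rank-one term, so a sensor with small noise variance or a large observation vector contributes more information, independent of any quantization or channel effect.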
For the second baseline, suppose communication channels between sensors and the FC are error-free and hence vector is available at the FC. Let be the corresponding Bayesian FIM. To find for entries using (21) we note that for and otherwise, since the channel error probabilities ( for coherent receiver, for noncoherent receivers) are zero. Therefore, from (21) we find . Clearly, .
Remark 1.
If has a known nonzero-mean , sensor subtracts from its observation , before quantization. At the FC, is first added to to generate and then the Bayesian estimator is formed using . Thus, the corresponding Bayesian FIM matrix becomes:
[TABLE]
where the joint pdf . Noting that , we follow the same procedure as we conducted before to obtain in (23) and we find that has the same expression as with the only difference that for nonzero-mean .
IV WWB Bound: Derivation and Computation
The MSE matrix of any Bayesian estimator of random vector satisfies the following inequality [29, 38]:
[TABLE]
where the columns of matrix , so-called test points, lie in the parameter space and their choices are left to the user [29, 38]. The matrix is defined by its entries , which are computed as follows [29]:
[TABLE]
The inequality in (25) holds for any such that is invertible [29, 38]. Maximizing the right side of (25) with respect to leads to the tightest WWB, denoted as . In other words:
[TABLE]
where the supremum operation is taken with respect to Loewner partial ordering [38]. To find in our problem, first we need to derive the entries , or equivalently scalar in (26). After deriving , we discuss how to compute the supremum in (27).
IV-A Deriving in (26) Based on Our System Model
Using equation (43) in [29] and the Bayes’ rule to write and we find:
[TABLE]
where denotes the -dimensional volume over which we take integral and is the square root of the joint pdf. To characterize in (IV-A) we need to find , , and . Let index indicate the quantization level corresponding to . According to Lemma 1, the following are evident:
[TABLE]
where is given in (11), and can be computed with a simple substitution of by in (11). Moreover, some easy manipulations yield:
[TABLE]
Substituting (29) and (30) in (IV-A) and some straightforward manipulations produce:
[TABLE]
where .
IV-B Computation of the Tightest WWB
In the following, we explain how we compute the supremum in (27). We note that the method to compute the supremum in (27) does not depend on the system model (it only depends on the parameter space). Therefore, we adopt the same method as in [38]. Let and define set:
[TABLE]
Then is the supremum of set , where the supremum operation is taken with respect to Loewner partial ordering [38]. It is worth mentioning the difference between the maximum and the supremum of the set . The largest element of , if it exists, is defined as . On the other hand, the supremum of is a minimal-upper bound on that is not necessarily contained in . This implies that the largest element of may not exist, but if it exists, it is also the supremum.
According to Lemma 3 of [39] for any two positive definite matrices and we have if and only if , in which the hyper-ellipsoid centered at the origin can be represented by the set . Consequently, the supremum in (27) can be computed by finding the minimum volume hyper-ellipsoid containing the set , where the set itself consists of the hyper-ellipsoids generated by all matrices in . The problem of finding the minimum volume ellipsoid that contains the ellipsoids (and therefore the convex hull of their union) has been formulated as a convex problem in [40]:
[TABLE]
where and is the cardinality of the set . This problem can be solved efficiently using semidefinite programming. In particular, we solve this problem using CVX.
V Power Constrained Bayesian Fisher Information Maximization
In this section, we address the trace and log-determinant maximization problems formulated in Section II. We denote the solutions obtained from solving these two power constrained Fisher information maximization problems as FIM-max schemes. Note that due to the cap on the network average transmit power, only a subset of the sensors might be active during each task period, which we refer to as the set of active sensors .
V-A Solving the Trace Maximization Problem in (II)
We adopt the Lagrange multipliers method to solve the problem . The Lagrangian of this problem is:
[TABLE]
The Karush-Kuhn-Tucker (KKT) optimality conditions are:
[TABLE]
where ’s are the Lagrange multipliers. According to (23) we find:
[TABLE]
Thus, to show , we need to show . Although we were not able to prove analytically, our extensive simulations for various system parameters indicate that and thus . Fig. 2 summarizes our extensive simulations to demonstrate , for coherent receiver. To obtain this figure, we let and consider a zero-mean Gaussian vector with . We assume , , and vary , , and use the uniform quantizer described in Section IX. Let . For coherent receiver, Fig. 2(a) and Fig. 2(b) depict versus for different values of and , respectively. We observe that, for all different values of and , we have , . Similar observations were made for both types of noncoherent receivers. However, due to lack of space we have omitted those plots.
Since tr is an increasing function of ’s, the Lagrange multiplier in (32) should be determined such that it satisfies the network average transmit power constraint with equality, that is, . Furthermore, for the set of active sensors the Lagrange multiplier . Hence, we can reformulate the KKT optimality conditions in (32) as:
[TABLE]
Let be the vector of sensors’ transmit powers. The Hessian of with respect to is a diagonal matrix, since using (33) we find . Fig. 3(a) and Fig. 3(b) depict versus for different values of and , respectively, for coherent receiver, showing that , which implies the Hessian matrix is negative definite. The negative definiteness of the Hessian matrix means that is jointly concave over ’s. Moreover, the constraints are linear, and thus, the problem in (II) is concave. For noncoherent receivers, unlike coherent receiver, our simulations show that the sign of for various system parameters changes, and thus, is not necessarily a concave function over ’s. The optimal solutions for and for cannot be obtained in closed-form expressions. Therefore, we resort to Newton-Raphson algorithm to solve the set of nonlinear equations in (V-A). For coherent receiver, since the problem is concave, it is guaranteed that the numerical solution obtained via the algorithm is globally optimal. Therefore, only one (carefully chosen) initial point suffices to run the algorithm. However, for noncoherent receivers, since the problem is not concave, we consider multiple initial points to run the algorithm. The description of this algorithm for noncoherent receivers follows.
Let be the vector that contains the vector of sensors’ transmit powers as well as the Lagrange multiplier . We let and , respectively be the gradient vector and the Jacobian matrix of the right side of the equality in (31) with respect to . We have:
[TABLE]
Let be the total number of initial points. We choose initial points (solutions), where is the index of the initial points. The Newton-Raphson algorithm is carried out to obtain and , which respectively are the final solution and the final value of the objective function obtained when the algorithm terminates, corresponding to the initial point . Suppose the algorithm runs for the initial point . We initialize the iteration index and the initial point . We denote as the solution at -th iteration, and , , respectively, as the gradient vector and the Jacobian matrix evaluated at . At iteration , if the Jacobian matrix becomes singular, or , the algorithm terminates. Otherwise, we let . As the stopping criterion, we check whether , where is a predetermined error tolerance, or whether the number of iterations exceeds a predetermined maximum . Let be the optimal solution to this constrained optimization problem. After finding all , is associated with the largest value among .
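The multi-start procedure above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-sensor objective `a_k * log(1 + b_k * p_k)` is a hypothetical concave stand-in for the actual trace expression, and the names `kkt_residual`, `kkt_jacobian`, `newton_raphson`, and `multi_start` are assumptions introduced here. Only the structure follows the text: iterate on the stacked vector of powers and Lagrange multiplier, stop on a singular Jacobian, a small step, or an iteration cap, and keep the best feasible result over all initial points.

```python
import numpy as np

def kkt_residual(z, a, b, p_tot):
    """Stationarity and power-constraint residuals for the toy objective."""
    p, lam = z[:-1], z[-1]
    stationarity = a * b / (1.0 + b * p) - lam
    return np.append(stationarity, p.sum() - p_tot)

def kkt_jacobian(z, a, b):
    """Jacobian of kkt_residual with respect to z = [p, lam]."""
    p = z[:-1]
    k = p.size
    jac = np.zeros((k + 1, k + 1))
    jac[:k, :k] = np.diag(-a * b**2 / (1.0 + b * p) ** 2)
    jac[:k, -1] = -1.0  # derivative of each stationarity residual w.r.t. lam
    jac[-1, :k] = 1.0   # derivative of the power constraint w.r.t. each p_k
    return jac

def newton_raphson(z0, a, b, p_tot, tol=1e-10, max_iter=100):
    """Return the converged z, or None if the run fails."""
    z = np.asarray(z0, dtype=float)
    for _ in range(max_iter):
        try:
            step = np.linalg.solve(kkt_jacobian(z, a, b),
                                   kkt_residual(z, a, b, p_tot))
        except np.linalg.LinAlgError:
            return None  # singular Jacobian: terminate this run
        z_new = z - step
        if not np.all(np.isfinite(z_new)):
            return None
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return None  # iteration cap reached without convergence

def multi_start(a, b, p_tot, num_starts=20, seed=0):
    """Run Newton-Raphson from several initial points and keep the best one."""
    rng = np.random.default_rng(seed)
    k = a.size
    # First start: uniform allocation with an arbitrary multiplier guess.
    starts = [np.append(np.full(k, p_tot / k), 0.5)]
    for _ in range(num_starts - 1):
        starts.append(np.append(rng.uniform(0.0, p_tot, k),
                                rng.uniform(0.1, 1.0)))
    best_p, best_val = None, -np.inf
    for z0 in starts:
        z = newton_raphson(z0, a, b, p_tot)
        if z is None or np.any(z[:-1] < 0.0):
            continue  # discard failed or infeasible runs
        val = float(np.sum(a * np.log1p(b * z[:-1])))
        if val > best_val:
            best_p, best_val = z[:-1], val
    return best_p, best_val
```

For a concave objective a single start is enough, as noted in the text; the loop over starts matters only when the objective may have several stationary points.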
V-B Solving Optimization Problem in (II)
We follow the same procedure as we described in Section V-A to solve (II). Specifically, we have:
[TABLE]
where we have used (23) and the fact to reach (36). Since and we conclude and thus is an increasing function of ’s. The Lagrangian of this problem is . The corresponding KKT optimality conditions are:
[TABLE]
For coherent receiver our simulations show that the Hessian of with respect to is a diagonal and negative definite matrix, and thus, is a jointly concave function over ’s. However, for noncoherent receivers the sign of varies for different system parameters and hence is not necessarily a concave function of ’s. We employ the Newton-Raphson algorithm with multiple initial points, as described in Section V-A, to solve the set of equations in (V-B). A remark on the difference between power allocation schemes based on maximization of tr and follows.
Remark 2.
Regarding the solution of (V-A) on constrained maximization of tr, we note that is common and fixed for all active sensors and thus this power allocation scheme can be implemented in a distributed fashion, i.e., the FC sends to the set of active sensors and each sensor calculates its own power using its local parameters. Unlike the solution of (V-A), the solution of (V-B) on constrained maximization of log cannot be implemented in a distributed fashion. In other words, the FC needs to find and inform the active sensors of their transmit powers.
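The distributed implementation described in Remark 2 can be illustrated with a small sketch. It assumes a hypothetical separable concave utility `a_k * log(1 + b_k * p_k)` per sensor, which is not the paper's trace expression; the function names are likewise introduced here for illustration. The point is only the division of labor: the FC searches for one scalar multiplier, and each sensor maps that broadcast scalar to its own power using local parameters.

```python
import numpy as np

def sensor_power(lam, a_k, b_k):
    """Sensor side: power computed locally from the broadcast multiplier.

    For the toy utility a_k * log(1 + b_k * p), the stationarity condition
    a_k * b_k / (1 + b_k * p) = lam gives p = a_k / lam - 1 / b_k, clipped at
    zero for inactive sensors.
    """
    return max(0.0, a_k / lam - 1.0 / b_k)

def fc_find_multiplier(a, b, p_tot, iters=200):
    """FC side: bisection on lam so that the sensors' powers sum to p_tot."""
    lo, hi = 1e-12, float(np.max(a * b))  # total power is huge at lo, zero at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(sensor_power(mid, ak, bk) for ak, bk in zip(a, b))
        if total > p_tot:
            lo = mid  # too much power requested: raise the multiplier
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With a small power budget some sensors receive zero power, which mirrors the notion of an active sensor set used in the text.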
VI LMMSE Estimator and its MSE
Given , finding the optimal MMSE estimate of in a closed form is mathematically intractable, since it requires dimensional integrals that cannot be simplified. To curb computational complexity, we assume that the FC employs the LMMSE estimator to process and forms the estimate . We derive the LMMSE estimator and its corresponding MSE matrix . Let vector . We have:
[TABLE]
Since is zero-mean, we obtain . The -th column of the cross-covariance matrix describes the correlation between and . Using the Bayes’ rule we obtain:
[TABLE]
[TABLE]
where denotes the -dimensional volume over which we take integral, and in the first equality we have used the fact that , , form a Markov chain and thus, given , and are conditionally independent. Since and , we reach:
[TABLE]
and the expression for vector is given in (42). By definition, the -th entry of matrix is:
[TABLE]
Similar to what we did in (39), to obtain and the diagonal entries of (i.e., ), we condition on ; however, for the non-diagonal entries of (i.e., ), we condition on . Then using (11), we obtain:
[TABLE]
where and are scalars. We find these integrals (see Appendix A-C for derivations) as below:
[TABLE]
in which:
[TABLE]
Substituting (39)-(43) in (VI), the MSE matrix is computed.
For in (VI) there exist two baselines. For the first baseline, we consider the centralized estimation case in Section III-C with the LMMSE estimator at the FC and let denote the corresponding MSE matrix. We have:
[TABLE]
where and are, respectively, the auto-covariance matrix of the noisy observations and the cross-covariance matrix between and . For the linear observation model in (1) we get:
[TABLE]
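As a rough numerical illustration of the centralized baseline, the sketch below computes the LMMSE gain and MSE matrix for a generic linear Gaussian model x = H θ + n. The symbols `H`, `C_theta`, and `C_n` are assumptions standing in for the observation gain matrix, the prior covariance, and the observation noise covariance; they are not copied from the paper's equations.

```python
import numpy as np

def lmmse_centralized(H, C_theta, C_n):
    """LMMSE gain and MSE matrix for x = H @ theta + n with zero-mean theta.

    H       : (K, q) observation gain matrix (assumed name)
    C_theta : (q, q) prior covariance of theta
    C_n     : (K, K) observation noise covariance
    """
    C_x = H @ C_theta @ H.T + C_n   # auto-covariance of the noisy observations
    C_theta_x = C_theta @ H.T       # cross-covariance between theta and x
    # Gain G = C_theta_x @ inv(C_x), computed with a solve (C_x is symmetric).
    G = np.linalg.solve(C_x, C_theta_x.T).T
    mse = C_theta - G @ C_theta_x.T
    return G, mse
```

For this linear Gaussian model the MSE matrix coincides with the inverse of `inv(C_theta) + H.T @ inv(C_n) @ H`, which gives a convenient consistency check.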
For the second baseline, suppose communication channels between sensors and the FC are error-free and hence vector is available at the FC. Let vector . Then, the corresponding MSE matrix is . Since is zero-mean, we obtain . We let and , respectively, be the -th column of matrix , and the -th entry of matrix . Taking steps similar to the ones we took to obtain (39)-(41), we find , , , in which for , and for . Clearly, .
Remark 3.
If has a known nonzero-mean , the expressions for the LMMSE estimator and its corresponding MSE matrix change as follows:
[TABLE]
where .
VII Discussion on Appropriateness and Achievability of Bayesian CRB
One may wonder how the FIM-max schemes in Section V compare with the power allocation obtained from constrained minimization of the MSE of the LMMSE estimator derived in Section VI. In addition, the literature [29] suggests that the WWB in Section IV is a tighter bound than the Bayesian CRB. This raises the question of whether the WWB would be a more appropriate optimization metric. This section answers both questions.
VII-A Appropriateness of Bayesian FIM as the Optimization Metric
Let , where is the MSE matrix of the LMMSE estimator given in (VI). We consider the following constrained optimization problem:
[TABLE]
In the absence of an analytical solution, we resort to an exhaustive search to find the solution of the problem in (VII-A). We refer to this solution as the MSE-min scheme. For all three types of receivers, our extensive simulations show that ; however, the sign of for various system parameters changes, and hence, tr is not necessarily a convex function over ’s. Furthermore, the cost function in (VII-A) cannot be decoupled over the optimization variables ’s and thus ’s across sensors are related to each other. Because of this, finding MSE-min is computationally complex, and the solution cannot be implemented in a distributed fashion (i.e., sensor cannot find relying on its own local information only). This contrasts with the FIM-max scheme obtained from solving the problem in (II), where the cost function in (II) can be decoupled over ’s and thus ’s across sensors are not related to each other. Because of this, finding FIM-max is computationally simple, and the solution can be implemented in a distributed fashion. Figures 8 and 9 in Section IX illustrate the numerical evaluations of (i) trace of at power allocation obtained from solving the problem in (VII-A), denoted as and (ii) trace of at power allocation obtained from solving the problem in (II), denoted as , given . The figures show that:
[TABLE]
where means that is less than , but very close to . Obviously, from the estimation theory we know . What our numerical results reveal is that in our problem they are very close to each other. This indicates the appropriateness of using Bayesian FIM as the optimization metric, since the loss in terms of the MSE performance is not significant.
VII-B Tightness and Achievability of Bayesian CRB
Although the WWB is a tighter bound than the Bayesian CRB [29], we note that finding the WWB matrix is computationally much more expensive than finding the Bayesian FIM, due to the matrix inversions required for each test point in (27). Consequently, finding the power allocation that minimizes the trace or log-determinant of the WWB is computationally much more expensive than finding the solutions for the problems in (II) or (II). Furthermore, (46) indicates that by not using the power allocation obtained from minimizing the trace of the WWB matrix (which is tighter than the Bayesian CRB) we are not at a disadvantage in terms of the MSE performance.
According to [23], the Bayesian CRB is attainable if and only if the posterior probability density of given “observation” is Gaussian. In that case, the MMSE and MAP estimators coincide and both are efficient (i.e., their MSE matrices are equal to the Bayesian CRB matrix) [23]. This bound is attained in the limit as becomes infinite [29]. In our work, the recovered quantization levels for all sensors at the FC, denoted as vector , play the role of “observation”. Since the posterior probability density of given is not Gaussian, the Bayesian CRB is not attainable. However, as increases, we expect that the MSE of the MMSE estimator approaches the Bayesian CRB. Let $\mathrm{tr}(\text{CRB(FIM-max)})$ denote the trace of the Bayesian CRB matrix evaluated at FIM-max power allocation, and let $\mathrm{tr}(\text{CRB(MSE-min)})$ denote the trace of the Bayesian CRB matrix evaluated at MSE-min power allocation. From the estimation theory we know:
[TABLE]
Combining (47) and (46) we reach:
[TABLE]
This suggests that, although Bayesian CRB is not attainable, it is still proper to use Bayesian FIM for transmit power optimization, since the loss in terms of the MSE performance is not significant.
VIII Classical CRB and BLUE for Estimating Deterministic Vector
In this section, we derive the classical FIM (assuming vector to be estimated is deterministic), the BLUE and its corresponding MSE matrix. We also discuss the behavior of the classical FIM and the MSE of BLUE in low-region and high-region of . Finally, we discuss optimizing transmit power considering the classical FIM and the MSE of BLUE as the optimization metric.
VIII-A Characterization of Classical FIM
Let denote the classical FIM and let represent the -th entry of . We have [23]:
[TABLE]
where is the joint probability distribution of parameterized by . Notice that in (48) is similar to in (8), with the difference that for Bayesian FIM we deal with the conditional pdf . Therefore, has the same expression as in (23), which depends on . That is:
[TABLE]
in which is defined in (22), and the probabilities and have the same expressions as for Bayesian FIM.
VIII-B Characterization of BLUE and its MSE Matrix
Recall is the data at the FC based on which we wish to form the BLUE. To satisfy the unbiasedness requirement for BLUE, we need to have , for a known matrix [41]. The unbiasedness requirement is not satisfied in general for our system model. However, under three conditions (coherent receiver at the FC, uniform quantizer, and natural binary encoder at the sensors to map quantization levels to information bits), we can establish a linear relationship between and , that is , where is a zero-mean vector with covariance , and show that for this linear model the unbiasedness requirement is met, i.e., . (Footnote 5, on the uniform quantizer: For sensor , we define the quantization noise . Since ’s in (1) are uncorrelated Gaussian, ’s are uncorrelated Gaussian. [42] shows that when uncorrelated Gaussian are quantized with uniform quantizers of quantization step sizes ’s, ’s are independent zero mean uniform random variables with variance . Also, ’s and ’s are uncorrelated.) Then using this linear model, we derive BLUE and its corresponding MSE matrix as follows [41]:
[TABLE]
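For a generic linear model y = A θ + ε with zero-mean ε of covariance R, the textbook BLUE and its MSE matrix can be computed as in the sketch below. The symbols `A`, `R`, and `y` are placeholders chosen here, not the paper's notation, and the sketch does not reproduce the specific matrices derived in this section.

```python
import numpy as np

def blue(A, R, y):
    """Textbook BLUE for y = A @ theta + eps, with eps zero-mean, covariance R.

    Returns the estimate (A^T R^-1 A)^-1 A^T R^-1 y and the MSE matrix
    (A^T R^-1 A)^-1. R is assumed symmetric positive definite.
    """
    Ri_A = np.linalg.solve(R, A)         # R^-1 A
    mse = np.linalg.inv(A.T @ Ri_A)      # (A^T R^-1 A)^-1
    theta_hat = mse @ (Ri_A.T @ y)       # uses symmetry of R
    return theta_hat, mse
```

Replacing `R` by an upper bound in the positive semidefinite sense, as is done for the quasi BLUE later in this section, keeps the estimator unbiased for the same `A` and can only enlarge the reported MSE matrix.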
First we verify the unbiasedness requirement under the three stated conditions. Under these three conditions, we can use the approximations given in [43] and write:
[TABLE]
Equation (VIII-B) shows that the unbiasedness constraint is satisfied. Next, we establish the linear relationship , where is a zero-mean vector with covariance , and we find . Knowing and we can then use (VIII-B) to express BLUE and its corresponding MSE. To establish the linear relationship, suppose:
[TABLE]
where is zero-mean with variance . The equivalent vector-matrix representation of (52) becomes , in which , , and denotes the covariance matrix of vector . Hence, to find we need to find . Let be the -th entry of matrix . Starting with the diagonal entries of , we find . Under the three stated conditions, we can use the approximations given in [43] and write:
[TABLE]
where . Next, we compute the non-diagonal elements , where the mean is given in (VIII-B). Hence, we need to find as the following:
[TABLE]
in which () follows from the fact that, given , then are independent, () comes from (VIII-B), () is obtained from the fact that the quantization noises ’s are uncorrelated from each other, and ’s and ’s are uncorrelated, and () follows from (VIII-B). Recall according to (VIII-B) . Let be a diagonal matrix. Clearly by the construction of we have and thus . Replacing with its upper bound and substituting in (VIII-B), we find quasi BLUE and its corresponding MSE matrix as shown in (54). The notion of quasi BLUE in the context of distributed estimation of an unknown deterministic scalar has been used before in [15, 20], where an upper bound on the variance of the data at the FC (based on which BLUE is formed) is utilized, instead of the variance of the data itself, to derive the unbiased estimator and its corresponding MSE.
VIII-C Behavior of the classical FIM and the MSE of BLUE in low-region and high-region of
Consider coherent receiver where we model the channel between sensor and the FC as a BSC with the probability of flipping a bit and , defined in (14), depends on . In low-region of (when ) we have (worst communication channel effect). Then (15) implies that and one can show that . Therefore . On the contrary, in high-region of (when ) we have . This implies that
[TABLE]
Then one can show that and , where is given in Section III-C and is obtained from (49) after substituting with . Similar discussions can be made and similar conclusions can be reached for both types of noncoherent receivers. For coherent receiver in low-region of (when ) we have . Examining (54) we realize that this implies . On the contrary, in high-region of (when ) we have and , where denotes when communication channels between sensors and the FC are error free.
VIII-D Transmit Power Optimization Using MSE of Quasi BLUE and Classical FIM
One can consider the following constrained transmit power optimization problem, where trace (or log-determinant) of is minimized, subject to the network transmit power constraint as follows:
[TABLE]
It is straightforward to show . This implies tr is a decreasing function of ’s and the constraint holds with equality. Furthermore, we have , implying that the Hessian is a positive definite matrix and is jointly convex over ’s. Moreover, the constraints are linear, and thus, the problem in (55) is convex. We could not find a closed-form solution for ’s. One needs to solve (55) numerically to find the optimal ’s. Since the problem is convex, it is guaranteed that the numerical solution (obtained via the numerical search algorithm) is globally optimal. Since the cost function in (55) can be decoupled over ’s the solution can be implemented in a distributed fashion.
On the other hand, a constrained optimization problem based on maximizing the trace (or log-determinant) of the classical FIM in (49) is not meaningful, since depends on and thus the power allocation is not realizable.
IX Numerical Results
In this section, we corroborate our analytical results through simulations. Our analytical results are valid as long as sensors use symmetric mid-rise quantizers. We consider uniform quantizer [16, 22, 28], and Lloyd-Max quantizer [44]. For the uniform quantizer, quantization levels are for and quantization boundaries are for , where denotes the quantization step size. Similar to [16], we assume lies in the interval with a high probability for some reasonably large , i.e., . (Footnote 6: Consider quantizing a zero-mean Gaussian . For we have and for we have , where is the cumulative distribution function of the standard Gaussian random variable. On the other hand, can be decided by the sensor’s sensing dynamic range, considering its hardware limitation and sensing capability [15].) To this end, we assume where is defined in (43). Hence, we choose [16, 22]. For the Lloyd-Max quantizer, quantization levels are for and quantization boundaries are for that can be found via iterative design.
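A symmetric mid-rise uniform quantizer of the kind described above can be sketched as follows. The exact level and boundary formulas are not recoverable from the text here, so the convention used below is an assumption: `M = 2**num_bits` levels spread symmetrically over `[-tau, tau]`, step size `2 * tau / (M - 1)`, and boundaries placed midway between adjacent levels. The names `num_bits` and `tau` are illustrative.

```python
import numpy as np

def uniform_midrise_quantizer(num_bits, tau):
    """Levels and boundaries of a symmetric mid-rise uniform quantizer.

    Assumed convention: M = 2**num_bits levels (2l - 1 - M) * delta / 2 for
    l = 1..M, with delta = 2 * tau / (M - 1), so the outer levels sit at
    -tau and +tau and zero is a boundary rather than a level.
    """
    M = 2 ** num_bits
    delta = 2.0 * tau / (M - 1)
    levels = (2 * np.arange(1, M + 1) - 1 - M) * delta / 2.0
    boundaries = 0.5 * (levels[:-1] + levels[1:])  # M - 1 interior boundaries
    return levels, boundaries

def quantize(x, levels, boundaries):
    """Map each sample to the level of the cell it falls in (outer cells saturate)."""
    idx = np.searchsorted(boundaries, x)
    return levels[idx]
```

Samples beyond the outermost boundaries saturate to the extreme levels, which is why the dynamic range parameter should be chosen large enough that such overloads are rare.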
IX-A Comparison of WWB, Bayesian CRB, and MSE of LMMSE Estimator
We numerically compare traces of the MSE matrix of LMMSE estimator, the WWB matrix and the Bayesian CRB matrix in Fig. 4 for various , assuming is uniformly distributed among sensors, and uniform quantization and coherent receiver are employed. The figure suggests that the WWB is a tighter bound, compared to the Bayesian CRB. Similar observations can be made for two types of noncoherent receivers, and also when we compare the determinant of these three matrices. Due to lack of space, we have omitted those plots.
IX-B Behavior of tr and in terms of and Quantizer
Without loss of generality and for the simplicity of presentation, we let and consider a zero-mean Gaussian vector with . We assume , , bits, .
Assuming , Fig. 5 depicts tr and versus for coherent receiver, considering both uniform and Lloyd-Max quantizers. Fig. 5 shows that as increases, both metrics increase and asymptotically approach their corresponding baselines (i.e., centralized estimation when full precision observations are used to derive Bayesian FIM and form ). There is also a gap between each metric and its corresponding baseline, which is due to quantization. Note that this gap is smaller for the Lloyd-Max quantizer than for the uniform quantizer. Comparing Lloyd-Max and uniform quantizers, we observe that when is less than a certain threshold (which depends on the network setup parameters), the latter slightly outperforms the former, and when is greater than the threshold, the former outperforms the latter. As increases, this threshold becomes larger and the performances of the two quantizers get closer to each other. The behaviors of tr and for noncoherent receivers are the same as those of coherent receiver, and hence are omitted due to lack of space. Regarding the behaviors of the two metrics with respect to the observation model parameters, we state that tr and increase as the variance of observation noise decreases.
IX-C FIM-max vs. Uniform Power Allocation
We investigate how the behavior of tr changes as communication channel and observation model parameters vary. Let . For coherent receiver, Fig. 6(a) plots tr evaluated at the corresponding optimal power allocation (i.e., ’s are the solutions of the problem in (II)) versus , for both uniform and Lloyd-Max quantizers, when , dB, dB.
Fig. 6(b) plots the same, with the difference that , dB. To demonstrate the effectiveness of the proposed FIM-max schemes, we also include tr evaluated at uniform power allocation in these figures. Overall, Fig. 6(a) and Fig. 6(b) show that for coherent receiver the proposed FIM-max schemes outperform uniform power allocation, for both quantizers and for all ranges of . Moreover, it is evident that Lloyd-Max quantizer outperforms uniform quantizer in moderate-region to high-region of . Similar observations can be made for two types of noncoherent receivers, and also when the optimization metric is (i.e., ’s are the solutions of the problem in (II)). Due to lack of space, we have omitted those plots. Comparing three types of receivers, our simulations demonstrate that for a given , coherent receiver and noncoherent receiver with known channel statistics have the best and the worst performance, respectively, in terms of tr and .
IX-D Behavior of FIM-max Power Allocation Across Sensors
We study the behavior of the FIM-max power allocation across sensors as increases. Recall . We let , , , , bits, . Fig. 7 illustrates versus for coherent receiver, where ’s are the solutions of the problem in (II), for both uniform and Lloyd-Max quantizers. Regarding Fig. 7 we make the following four observations: 1) increases as increases, 2) the power allocations obtained for Lloyd-Max quantizer are very close to those obtained for uniform quantizer, 3) when is small, only sensor 1 is active, and as increases, sensors 2 and 3 become active in sequential order, 4) in low-region of , a sensor with a larger is allotted a larger (water filling), and in high-region of , a sensor with a smaller is allotted a larger (inverse of water filling). Although we do not have a closed-form solution for ’s, our conjecture is that its change of behavior in terms of , can be explained by examining the ’s solution provided in [22], where the authors have considered a related problem. In particular, [22] considered minimizing an upper bound on the MSE of LMMSE estimator, subject to a network transmit power constraint, given quantization bits. For coherent receiver, based on the closed-form solutions of ’s the authors in [22] found the following:
[TABLE]
Equation (58) shows that the behavior of ’s can change, depending on whether is larger or smaller than the threshold . The parameter in (58) depends on the observation vectors and quantization. The optimal value of Lagrange multiplier in (58) is related to according to where are common terms among sensors. Having revisited the results in [22], we now return to Fig. 7. Given the observation vectors and quantization (given ) and given , suppose increases. Increasing implies that and thus the thresholds ’s decrease. Therefore, ’s are being compared against smaller thresholds ’s. In high-region of the thresholds ’s are so small that each exceeds (all channels can be viewed as “strong”). In this case, the allocation of power among sensors is such that, if then (among strong channels, the sensor with the weaker channel is allocated more transmit power). In contrast, given and given suppose decreases. Decreasing implies that and thus the thresholds ’s increase. Hence, ’s are being compared against larger thresholds ’s. In low-region of the thresholds ’s are so large that each is below (all channels can be viewed as “weak”). In this case, the allocation of power among sensors is such that, if then (among weak channels, the sensor with the stronger channel is allocated more transmit power).
Note that the behavior of ’s as the solutions of the problem in (II) with respect to is analogous to that depicted in Fig. 7. Moreover, the behavior of ’s for two types of noncoherent receivers are similar to that of coherent receiver. Due to lack of space, we have omitted those plots.
IX-E FIM-max vs. MSE-min Power Allocation
We explore how the FIM-max schemes compare with the power allocation that can be obtained from constrained minimization of the MSE of the LMMSE estimator derived in Section VI. Let and , denote trace of at ’s obtained from solving the problem in (II) and uniform power allocation, respectively. Fig. 8(a) and Fig. 8(b) illustrate the numerical evaluations of , defined in Section VII-A, as well as , , versus for coherent receiver and two types of noncoherent receivers, respectively, and for the same setup parameters as Fig. 6(a). To fairly compare the performance of different receivers, we obtain the numerical results for coherent receiver and noncoherent receiver with known channel envelopes by taking expectation over fading channel envelope vector , such that . Fig. 8(c) and Fig. 8(d) plot the same as Fig. 8(a) and Fig. 8(b), but with different setup parameters (the same parameters as Fig. 6(b)). These figures show $\mathcal{D}_{m} \leq \mathcal{D}_{l} \approx \mathcal{D}_{t} \leq \mathcal{D}_{unif}$ for all three receivers and all ranges of , i.e., the performance of both FIM-max schemes is very close to that of the MSE-min scheme (when we average over ). We also plot tr(CRB(tr-FIM-max)) versus for coherent receiver. Fig. 8(a) and Fig. 8(c) illustrate the inequality $\mathrm{tr}(\text{CRB(FIM-max)}) < \mathcal{D}_{m}$ in (47). The same observation is made for two types of noncoherent receivers. Due to lack of space, these plots are omitted.
It is worth mentioning that from the estimation theory we know and . What our simulations suggest is that in our problem they are indeed very close to each other. This observation is very important since it indicates that, although Bayesian CRB is not attainable in our problem and the WWB is tighter than Bayesian CRB, it is still proper to use FIM-max power allocation (instead of power allocation that minimizes the WWB or the MSE of the LMMSE estimator), since the differences and are small and not significant. While in low-region and high-region of , and are much closer to , in moderate-region of , there is a small gap between them. Comparing three types of receivers for a given , coherent receiver and noncoherent receiver with known channel statistics have the best and the worst performance, respectively. Similar observations can be made for Lloyd-Max quantizers. Due to lack of space we have omitted those plots.
IX-F Estimation Performance of a Randomly Deployed Network
We investigate the impact of network size on the MSE performance and compare tr(MSE) that is evaluated at different transmit power allocation. We assume sensors are randomly deployed in a field, where the origin is the center of the field, and compare the numerical results with sensors. We consider a zero-mean Gaussian vector with . The distance between each external signal source located at and sensor located at is:
[TABLE]
Let be the distance of source from the origin. Without loss of generality, we assume . To characterize the observation gain vectors in (1) we adopt an isotropic intensity attenuation model, where and is the signal decay exponent which is approximately 2 for distances [45]. We assume . For coherent receiver and noncoherent receiver with known channel envelopes we let , and for noncoherent receiver with known channel statistics we let .
Fig. 9 plots versus for coherent receiver, and both uniform and Lloyd-Max quantizers. Fig. 9 demonstrates the superiority of FIM-max schemes, compared to uniform power allocation for all ranges of . Furthermore, the observation , suggests that log-det-FIM-max power allocation is closer to MSE-min power allocation, compared to tr-FIM-max power allocation (for a given realization of ). This is intuitively appealing, since the Bayesian FIM is not a diagonal matrix and log-det-FIM-max power allocation extracts and utilizes more information from , compared to tr-FIM-max power allocation. Similar observations can be made for two types of noncoherent receivers. Due to lack of space we have omitted those plots.
X Conclusions
We derived the Bayesian FIM and the WWB for distributed estimation of a Gaussian vector, when sensors transmit their digitally modulated quantized observations to the FC over power-constrained orthogonal noisy fading channels. We formulated and addressed constrained maximization of tr and log under the constraint on . We also derived the LMMSE estimator and its corresponding MSE. Through simulations we observed that both tr and increase as increases. Regarding the solutions of the formulated constrained maximization problems, we noticed that in low-region and high-region of , is allotted among sensors in a water filling and inverse of water filling fashion, respectively. We also considered the power allocation solution obtained from minimizing the MSE of the LMMSE estimator (MSE-min scheme). Numerical results demonstrated the effectiveness of FIM-max schemes for different network setup parameters, as the MSE associated with FIM-max schemes is very close to that of the MSE-min scheme and is lower than that of uniform power allocation in all simulation scenarios. These results suggest that, although the WWB is tighter than the Bayesian CRB in our problem (and Bayesian CRB is not attainable), it is still appropriate to use FIM-max schemes, since the performance loss in terms of the MSE of the LMMSE estimator is not significant. Comparing the performance of three types of receivers, our numerical results revealed that coherent receiver and noncoherent receiver with known channel statistics have the best and the worst performance, respectively. Comparing uniform and Lloyd-Max quantizers, we observed that the latter outperforms the former in moderate-region to high-region of for all receivers.
Appendix A
A-A Proof of Lemma 1
By using the Bayes’ rule, we have:
[TABLE]
Since the communication channels are orthogonal and communication channel noises are independent, we can write:
[TABLE]
Moreover, given , depends on communication channel noises and depends on observation noises. However, observation and channel noises are two independent random processes. Hence, given , and are conditionally independent. That is, , and form a Markov chain (footnote 7: we say that random variables form a Markov chain, denoted by , if the Markov property holds [36]) and we conclude:
[TABLE]
Combining (60) and (61), in (59) becomes:
[TABLE]
Let be the observation vector. Since Gaussian observation noises ’s are uncorrelated across the sensors and also uncorrelated with Gaussian , we have . This implies:
[TABLE]
Substituting (62) and (63) in (59), we reach (64) below in which () is obtained from some straightforward mathematical manipulations and is obtained using the Bayes’ rule and the fact that form a Markov chain.
A-B Proof of Lemma 2
Given the assumptions made in Lemma 2 and the number of quantization bits , Fig. 10 illustrates how the noisy observation is quantized and encoded. Define , where ’s are the quantization boundaries specified in Section II. Since has a symmetric pdf and the quantizer is symmetric, we have:
[TABLE]
Define as the number of ones in encoded quantization index . When the quantization indices are encoded using natural binary coding we can show that . Therefore, the prior probability can be computed as:
[TABLE]
Similarly, we can show that .
A-C Calculation of in (39), and and in (41)
We first calculate . We consider the eigenvalue decomposition of where , . We define and therefore [46], and also in which is sensor observation gain vector. Using these definitions and changes of variables along with the definition of in (III), becomes:
[TABLE]
where and denotes the -dimensional volume over which we take integral in the new coordinate. After expanding the argument of the exponential function in the integrand, completing the square, and defining:
[TABLE]
can be obtained as in (65), in which , and for the second equality, we have used the fact that integral of pdf of Gaussian random vector over is equal to 1. The term in the denominator of is absorbed in the integration over , because the effects of change of variable from to on to and to cancel each other. Since $|\boldsymbol{Q}_{k}| = 1/|\boldsymbol{Q}_{k}^{-1}|$, using the Matrix Determinant Lemma which performs a rank-1 update to a determinant [47], we obtain:
[TABLE]
and therefore . One can also use the Binomial Inversion Lemma [47] to compute in (66) as:
[TABLE]
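The two rank-1 identities invoked here can be checked numerically. The sketch below uses generic placeholders `A`, `u`, and `v` rather than the paper's matrices, and it uses the Sherman-Morrison formula, which is the rank-1 special case of the inversion lemma cited above.

```python
import numpy as np

def rank_one_update(A_inv, det_A, u, v):
    """Determinant and inverse of A + u v^T from those of A.

    Matrix determinant lemma: det(A + u v^T) = (1 + v^T A^-1 u) det(A).
    Sherman-Morrison:         (A + u v^T)^-1
                              = A^-1 - (A^-1 u)(v^T A^-1) / (1 + v^T A^-1 u).
    Assumes 1 + v^T A^-1 u is nonzero.
    """
    denom = 1.0 + v @ A_inv @ u
    det_new = det_A * denom
    inv_new = A_inv - np.outer(A_inv @ u, v @ A_inv) / denom
    return det_new, inv_new
```

Both identities avoid a fresh factorization of the updated matrix, which is what makes the closed-form expressions in this appendix possible.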
Substituting (67) in (66) and (65), we obtain:
[TABLE]
From the definition of we have . Having from (43), we conclude:
[TABLE]
Taking a similar approach, we can calculate in (39) and in (41).
- [1] M. Shirazi and A. Vosoughi, “Bayesian Cramer-Rao bound for distributed vector estimation with linear observation model,” in 2014 IEEE 25th Annual International Symposium on Personal, Indoor, and Mobile Radio Communication (PIMRC), Sep 2014, pp. 712–716.
- [2] M. Shirazi and A. Vosoughi, “Bayesian Cramer-Rao bound for distributed estimation of correlated data with non-linear observation model,” in 2014 48th Asilomar Conference on Signals, Systems and Computers, 2014, pp. 1484–1488.
- [3] M. Hosseini, A. S. Maida, M. Hosseini, and G. Raju, “Inception-inspired lstm for next-frame video prediction,” 2019.
- [4] M. J. Moghaddam, M. Hosseini, and R. Safabakhsh, “Traffic light control based on fuzzy q-learning,” in 2015 The International Symposium on Artificial Intelligence and Signal Processing (AISP), 2015, pp. 124–128.
- [5] A. Sani and A. Vosoughi, “Resource allocation optimization for distributed vector estimation with digital transmission,” in 2014 48th Asilomar Conference on Signals, Systems and Computers, 2014.
- [6] A. Amar, A. Leshem, and M. Gastpar, “Recursive implementation of the distributed karhunen-loève transform,” IEEE Transactions on Signal Processing, vol. 58, no. 10, pp. 5320–5330, Oct 2010.
- [7] L. Gispan, A. Leshem, and Y. Be’ery, “Decentralized estimation of regression coefficients in sensor networks,” Digital Signal Processing, vol. 68, pp. 16–23, 2017.
- [8] I. D. Schizas, G. B. Giannakis, and Z. Luo, “Distributed estimation using reduced-dimensionality sensor observations,” IEEE Transactions on Signal Processing, vol. 55, no. 8, pp. 4284–4299, Aug 2007.
