Non-Coherent Joint Transmission in Poisson Cellular Networks Under Pilot Contamination

Stelios Stefanatos; Gerhard Wunder

arXiv:1903.05864·cs.IT·July 2, 2019

TL;DR

This paper analyzes the impact of pilot contamination on non-coherent joint transmission in Poisson cellular networks, providing a simple SNR model that guides system optimization and reveals phase transition phenomena in cooperation strategies.

Contribution

It introduces an easy-to-compute SNR expression considering pilot contamination, enabling efficient optimization of cooperation and training parameters in cellular networks.

Findings

01 Multipoint transmissions can outperform non-cooperative operation under certain conditions.

02 A phase transition phenomenon determines the optimal number of cooperating access points.

03 Minimal training overhead is identified for maintaining target SNR levels.

Equations

$h_{\mathbf{x}}=c_{\mathbf{x}}\sqrt{\ell(\|\mathbf{x}\|)},$

$\ell(r)\triangleq r_{0}^{\alpha}\left(\max(r_{0},r)\right)^{-\alpha},\ r\geq 0,$

$y_{d}=d\,\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}+w_{d},$

$\mathbf{y}_{p}=\mathbf{P}_{\mathcal{C}}\mathbf{h}_{\mathcal{C}}+\mathbf{P}_{\bar{\mathcal{C}}}\mathbf{h}_{\bar{\mathcal{C}}}+\mathbf{w}_{p},$

$\hat{\mathbf{h}}_{\mathcal{C}}\triangleq\left(\frac{N_{a}(\sigma_{w}^{2}+\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}{\sigma_{\mathcal{C}}^{2}}\mathbf{I}_{N_{a}}+\mathbf{P}_{\mathcal{C}}^{H}\mathbf{P}_{\mathcal{C}}\right)^{-1}\mathbf{P}_{\mathcal{C}}^{H}\mathbf{y}_{p},$

$\sigma_{\Phi}^{2}=\frac{\alpha\lambda\pi r_{0}^{2}}{\alpha-2},$

$y_{d}=d\,\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}+d\underbrace{\left(\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}-\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}\right)}_{\triangleq e}+w_{d}.$

$\mathsf{SNR}_{c}\triangleq\frac{|\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}|^{2}}{\sigma_{w}^{2}+\sigma_{e}^{2}},$

$\mathsf{SNR}=\frac{\sigma_{\mathcal{C}}^{2}-\sigma_{e}^{2}}{\sigma_{w}^{2}+\sigma_{e}^{2}},$

$\sigma_{\mathcal{C}}^{2}\approx\sigma_{\Phi}^{2}\left(1-\frac{2}{\alpha}\left(\frac{\lambda\pi r_{0}^{2}}{N_{a}}\right)^{\frac{\alpha}{2}-1}\right),$

$\sigma_{e}^{2}$

$\gamma\gg\frac{1}{N_{a}+1},$

$N_{p}^{*}>\frac{\gamma N_{a}}{1-\gamma}$

$\mathsf{SNR}=\frac{N_{p}\sigma_{\mathcal{C}}^{2}}{N_{a}(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}\left(1-\mathcal{O}\left(\frac{N_{a}}{N_{p}}\right)\right),\ \frac{N_{p}}{N_{a}}\rightarrow\infty,$

$\frac{\sigma_{\mathcal{C}}^{2}}{N_{a}(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}\geq\gamma\max_{N_{a}}\frac{\sigma_{\mathcal{C}}^{2}}{N_{a}(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}$

$\mathbb{E}(\mathbf{h}_{\mathcal{C}}\tilde{\mathbf{w}}_{p}^{H})=\mathbf{0},$

$\hat{\mathbf{h}}_{\mathcal{C}}=\left(\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}^{-1}+\mathbf{P}_{\mathcal{C}}^{H}\mathbf{R}_{\tilde{\mathbf{w}}_{p}}^{-1}\mathbf{P}_{\mathcal{C}}\right)^{-1}\mathbf{P}_{\mathcal{C}}^{H}\mathbf{R}_{\tilde{\mathbf{w}}_{p}}^{-1}\mathbf{y}_{p},$

$\begin{aligned}\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}&\overset{(a)}{=}\sum_{\mathbf{x}\in\mathcal{C}}\sum_{\mathbf{x}^{\prime}\in\mathcal{C}}\mathbb{E}(h_{\mathbf{x}}h_{\mathbf{x}^{\prime}}^{*})\,\mathbb{E}(\mathbf{e}_{\mathbf{x}}\mathbf{e}_{\mathbf{x}^{\prime}}^{T})\\&\overset{(b)}{=}\sum_{\mathbf{x}\in\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})\,\mathbb{E}(\mathbf{e}_{\mathbf{x}}\mathbf{e}_{\mathbf{x}}^{T})\\&\overset{(c)}{=}\frac{1}{N_{a}}\sum_{\mathbf{x}\in\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})\,\mathbf{I}_{N_{a}}\end{aligned}$

$\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}=\frac{\sigma_{\mathcal{C}}^{2}}{N_{a}}\mathbf{I}_{N_{a}}.$

$\begin{aligned}\mathbf{R}_{\tilde{\mathbf{w}}_{p}}&=\mathbb{E}\left(\sum_{\mathbf{x}\in\Phi\setminus\mathcal{C}}\sum_{\mathbf{x}^{\prime}\in\Phi\setminus\mathcal{C}}h_{\mathbf{x}}h_{\mathbf{x}^{\prime}}^{*}\mathbf{p}_{\mathbf{x}}\mathbf{p}_{\mathbf{x}^{\prime}}^{H}\right)+\sigma_{w}^{2}\mathbf{I}_{N_{p}}\\&=\sum_{\mathbf{x}\in\Phi\setminus\mathcal{C}}\sum_{\mathbf{x}^{\prime}\in\Phi\setminus\mathcal{C}}\mathbb{E}(h_{\mathbf{x}}h_{\mathbf{x}^{\prime}}^{*})\,\mathbb{E}(\mathbf{p}_{\mathbf{x}}\mathbf{p}_{\mathbf{x}^{\prime}}^{H})+\sigma_{w}^{2}\mathbf{I}_{N_{p}}\\&=\sum_{\mathbf{x}\in\Phi\setminus\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})\,\mathbb{E}(\mathbf{p}_{\mathbf{x}}\mathbf{p}_{\mathbf{x}}^{H})+\sigma_{w}^{2}\mathbf{I}_{N_{p}}\\&\overset{(a)}{=}\sum_{\mathbf{x}\in\Phi\setminus\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})\,\mathbf{I}_{N_{p}}+\sigma_{w}^{2}\mathbf{I}_{N_{p}}\\&=\left(\sum_{\mathbf{x}\in\Phi}\mathbb{E}(|h_{\mathbf{x}}|^{2})-\sum_{\mathbf{x}\in\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})+\sigma_{w}^{2}\right)\mathbf{I}_{N_{p}}\\&=(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2}+\sigma_{w}^{2})\,\mathbf{I}_{N_{p}},\end{aligned}$

$\begin{aligned}\sigma_{\Phi}^{2}&=\mathbb{E}\left(\sum_{\mathbf{x}\in\Phi}\ell(\|\mathbf{x}\|)\right)\\&=\lambda\int_{\mathbf{z}\in\mathbb{R}^{2}}\ell(\|\mathbf{z}\|)\,d\mathbf{z},\end{aligned}$

Taxonomy

Topics: Advanced MIMO Systems Optimization · Cooperative Communication and Network Coding · Advanced Wireless Network Optimization

Full text

Non-Coherent Joint Transmission in Poisson Cellular Networks Under Pilot Contamination

Stelios Stefanatos and Gerhard Wunder

This work has been performed in the framework of the Horizon 2020 project ONE5G (ICT-760809) receiving funds from the European Union. The authors would like to acknowledge the contributions of their colleagues in the project, although the views expressed in this contribution are those of the authors and do not necessarily represent the project. The work of G. Wunder was also supported by DFG grants WU 598/7-1 and WU 598/8-1 (DFG Priority Program on Compressed Sensing).

S. Stefanatos and G. Wunder are with the Department of Mathematics and Computer Science, Freie Universität Berlin, 14195, Berlin, Germany, e-mail: {stelios.stefanatos, g.wunder}@fu-berlin.de.

Abstract

This paper investigates the performance of downlink cellular networks with non-coherent joint (multipoint) transmissions and practical channel estimation. Under a stochastic geometry framework, the spatial average signal-to-noise-ratio (SNR) is characterized, taking into account the effect of channel estimation error due to pilot contamination. A simple, easy-to-compute SNR expression is obtained under the assumption of randomly generated pilot sequences and minimal prior information about the channels and positions of access points (APs). This SNR expression allows for the efficient joint optimization of critical system design parameters such as the number of cooperating APs and the training overhead. Among other results, it is shown that multipoint transmissions are preferable to conventional (non-cooperative) cellular operation under certain operational conditions. Furthermore, analytical insights are obtained regarding (a) the minimum training overhead required to achieve a given SNR degradation compared to the perfect channel estimation case and (b) the optimal number of cooperating APs when an arbitrarily large training overhead can be afforded. For the latter, in particular, a phase transition phenomenon is identified, where the optimal number of cooperating APs is either finite or infinite, depending on whether the path loss factor is less or equal than a certain value, respectively.

Index Terms:

stochastic geometry, pilot contamination, multipoint transmission, cooperation, channel estimation, training overhead

I Introduction

The densification of the cellular network infrastructure makes it likely that any user equipment (UE) in the system is positioned in the proximity of multiple access points (APs), which naturally promotes multipoint transmission techniques [1]. Considering the downlink of a frequency division duplex (FDD) system, non-coherent joint transmission (NCJT), where the serving APs transmit the same signal without any precoding [2], is particularly attractive as it eliminates the need for channel state information (CSI) at the APs and, in turn, the associated feedback overhead. However, downlink CSI is still required at the UE side in order to (coherently) decode the received signal. Towards estimating the (effective) downlink channel, a training phase preceding the data transmission phase is commonly employed, during which APs transmit known pilot (training) sequences.

Of course, the quality of the channel estimate has an impact on the performance, since the residual channel estimation error effectively increases the noise level [3]. It is therefore important to design the training phase such that channel estimates of sufficient accuracy can be obtained. For the dense cellular setting, this design must take into account not only the effect of the ambient additive noise at the UE side but also the effect of interference from pilot transmissions by nearby APs not serving the UE under consideration. The latter effect, commonly referred to as pilot contamination, is particularly dominant in the massive MIMO setting [4]. Nevertheless, pilot contamination is present in every network with non-orthogonal pilot transmissions, even with single antenna transceivers. In addition, it is desirable to design the training phase such that the overhead associated with the training phase itself, as well as with the acquisition of any prior channel information employed at the UE side, is minimized. This is especially important in scenarios with small channel coherence intervals (mobility). Clearly, towards achieving these challenging design goals, a tractable characterization of the system performance that captures the effects of both channel estimation error and overhead is desirable.

I-A Previous Work

The theoretical benefits of multipoint transmissions under perfect CSI at the UE(s) and, possibly, at the AP(s) side are well documented in the information-theoretic literature [5], under simplistic assumptions on the AP positions. However, as the network infrastructure becomes denser, the AP positions exhibit a random behavior, which is expected to have an impact on the system performance.

Towards incorporating the effects of large scale fading and randomness of AP positions, stochastic geometry (SG) has become a popular modeling tool that allows for tractable performance characterization of cellular networks [6]. Under a perfect CSI assumption, SG has been successfully employed to obtain simple closed form expressions for the signal-to-noise-ratio ( $\mathsf{SNR}$ ) coverage probabilities of the conventional (non-cooperative) cellular network (e.g., see [7] for downlink and [8] for uplink analysis). However, extension of this approach to multipoint transmissions and/or imperfect channel estimation faces significant analytical complications.

Focusing on NCJT systems that are of interest in this paper, [9] provides the exact expression for the $\mathsf{SNR}$ coverage probability under perfect CSI. This expression requires the numerical computation of multiple integrals, restricting its practical application to clusters of at most $3$ cooperative APs. Under the same setting, [10] provides more computationally friendly coverage probability expressions based on various approximations; however, the expressions are still too complicated to offer significant analytical insights. Similar remarks hold also for the NCJT performance analysis under perfect CSI presented in [11].

Regarding the effects of channel estimation error, a surprisingly limited number of works under the SG framework exist. Most of these works consider the non-cooperative massive MIMO setting [12, 13, 14], which allows certain analytical simplifications due to the channel hardening effect present there [15]. However, employing the same analysis in a non-massive MIMO setting results in a crude approximation as channel hardening no longer holds [16]. In addition, [12, 13] consider Bayesian channel estimation with knowledge of every AP-UE distance in the system. This assumption implies a signaling overhead for obtaining this information whose effect is not taken into account in the analysis. The only SG work in a non-massive MIMO setting that explicitly focuses on imperfect CSI effects appears to be [17]. There, the analysis is limited to a non-cooperative transmission scenario, assuming perfect knowledge of the serving AP distance, and the resulting coverage probability expression provides limited analytical insights. The cost of imperfect CSI in NCJT is also briefly treated in [10], assuming, however, the availability of orthogonal (i.e., non-interfering) pilot sequences.

I-B Contributions

This paper attempts to provide a tractable system performance characterization under NCJT and practical channel estimation for a non-massive MIMO setting using the SG modeling framework. The ultimate goal is to obtain a simple metric that can be used to not only numerically optimize important system design parameters such as number of cooperating APs and training overhead, but also provide analytical insights on how their optimal values depend on operational conditions such as path loss factor, average transmit power and ambient noise level. In particular, the following paper contributions can be identified.

• Towards minimum signaling overhead, channel estimation under minimal assumptions on the prior information about AP channels and positions is considered. This renders the channels to be estimated non-Gaussian, so that the specification of the channel estimator is not straightforward. In addition, pilot sequences of small length (overhead) are considered, which are necessarily non-orthogonal and hence introduce pilot contamination effects. By considering randomly generated AP pilot sequences, which require no coordination/overhead for pilot assignment, a closed form expression for the linear minimum mean square error (LMMSE) estimator as well as its mean square error is provided.

• Instead of the standard $\mathsf{SNR}$ coverage probability considered in previous works, the less informative but much more tractable spatial average $\mathsf{SNR}$ is considered as the performance metric of the system. The $\mathsf{SNR}$ computation explicitly takes the effect of residual channel estimation error into account and the resulting expression highlights the dependence of the $\mathsf{SNR}$ on channel estimation accuracy, number of cooperative APs and noise level. It is shown that, even though the conventional, non-cooperative cellular network is sufficient in terms of $\mathsf{SNR}$ under perfect CSI, a cooperative cluster of $2$ or more APs can provide better performance under imperfect channel estimation and operational conditions where the impact of pilot contamination is greater than that of additive noise in the training phase.

• The $\mathsf{SNR}$ expression is numerically optimized with respect to (w.r.t.) critical system design parameters such as number of cooperative APs and pilot sequence length (training overhead), where it is shown that cooperation of multiple APs, potentially many more than $2$, is mostly beneficial under propagation conditions with a large path loss factor. In addition, interesting analytical insights are obtained by manipulation of the $\mathsf{SNR}$ expression regarding (a) the minimum training overhead required to achieve a given $\mathsf{SNR}$ degradation compared to the perfect CSI case and (b) the optimal number of cooperating APs when an arbitrarily large training overhead can be afforded. For the latter, in particular, a phase transition phenomenon is identified, where the optimal number of cooperating APs is either finite or infinite, depending on the path loss factor being less or equal than a certain value, respectively.

I-C Notation

Boldface lower (upper) case letters denote column vectors (matrices). The transposition and Hermitian transposition are denoted by $(\cdot)^{T}$ and $(\cdot)^{H}$ , respectively. A scalar random variable will be said to be distributed as $\mathcal{CN}(0,\sigma^{2})$ if it is distributed as a circularly-symmetric complex Gaussian random variable of zero mean and variance $\sigma^{2}$ . The expectation operator is denoted by $\mathbb{E}(\cdot)$ . The Euclidean norm of a vector $\mathbf{x}$ is denoted by $\|\mathbf{x}\|$ . The $N\times N$ identity matrix is denoted by $\mathbf{I}_{N}$ and $\mathbf{1}$ denotes the all-ones column vector whose dimension will be clear from context. For two positive-valued functions $f,$ $g$ , the notation $f(x)=\mathcal{O}(g(x)),x\rightarrow\infty(x\rightarrow 0)$ , is used to represent the condition $f(x)\leq Mg(x),x>x_{0}(x<x_{0})$ , for a sufficiently large (small) $x_{0}>0$ and some constant $M>0$ that is independent of $x$ .

II System Model

The downlink of a dense cellular network covering a wide geographical area is considered. The locations on the plane of the APs are modeled as a realization of a homogeneous Poisson point process (HPPP) $\Phi\subset\mathbb{R}^{2}$ of density $\lambda>0$ (average number of APs per unit area) [18]. All APs and UEs in the system are equipped with a single antenna. By the stationarity of the HPPP [19], a typical UE located at the origin of the plane will be considered in the following.

The baseband-equivalent flat fading channel $h_{\mathbf{x}}$ , corresponding to the downlink between an AP located at $\mathbf{x}\in\Phi$ and the typical UE, follows the standard model [19]

$h_{\mathbf{x}}=c_{\mathbf{x}}\sqrt{\ell(\|\mathbf{x}\|)},$ (1)

where $c_{\mathbf{x}}\in\mathbb{C}$ represents the small scale fading and $\ell:[0,\infty)\rightarrow(0,1]$ is a path loss function governing the large scale fading. The variables $\{c_{\mathbf{x}}\}_{\mathbf{x}\in\Phi}$ are assumed to be i.i.d. as $\mathcal{CN}(0,1)$ (Rayleigh fading model), while the large scale fading is modeled as [20]

$\ell(r)\triangleq r_{0}^{\alpha}\left(\max(r_{0},r)\right)^{-\alpha},\ r\geq 0,$ (2)

where $\alpha>2$ is the path loss factor and $r_{0}>0$ is the reference distance, i.e., $\ell(r)=1$ , for $r\leq r_{0}$ .

For downlink communications, a cooperative multipoint transmission scheme based on user-centric adaptive clustering is employed [21]. In particular, the typical UE is served by its $N_{a}$ closest APs, with $N_{a}$ a design parameter that depends in principle on the operational conditions. An example of an AP distribution realization and cluster configuration is shown in Fig. 1. The serving cluster employs NCJT, i.e., all APs in the cluster simply transmit the same signal without any (joint) precoding, which has the benefit that the APs require no CSI [9, 10]. In addition, it is assumed that there are no interfering transmissions from out-of-cluster APs during the data transmission phase for the typical UE, by means of system-wide UE scheduling and/or resource allocation. The corresponding system performance serves as an upper bound when interference exists during data transmission.
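The system model above can be exercised numerically. The following is a minimal sketch, not the authors' simulation code: it assumes the channel amplitude is $c_{\mathbf{x}}\sqrt{\ell(\|\mathbf{x}\|)}$ (so that $\mathbb{E}(|h_{\mathbf{x}}|^{2})=\ell(\|\mathbf{x}\|)$), and it uses the standard property of an HPPP of density $\lambda$ that the quantities $\lambda\pi\|\mathbf{x}_{k}\|^{2}$ of the distance-ordered points are cumulative sums of i.i.d. unit-mean exponential variables. All function names and parameter values are illustrative.

```python
import math
import random


def path_loss(r, r0, alpha):
    """Bounded path loss l(r) = r0^alpha * max(r0, r)^(-alpha), with values in (0, 1]."""
    return (r0 / max(r0, r)) ** alpha


def nearest_ap_distances(n, lam, rng):
    """Distances of the n APs closest to the origin for an HPPP of density lam.

    lam * pi * r_k^2 are arrival times of a unit-rate Poisson process,
    i.e. cumulative sums of i.i.d. Exp(1) variables.
    """
    distances, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(1.0)
        distances.append(math.sqrt(t / (lam * math.pi)))
    return distances


def sample_cluster_channels(n_a, lam, r0, alpha, rng):
    """Channels of the n_a closest APs, with Rayleigh fading c_x ~ CN(0, 1)."""
    s = math.sqrt(0.5)
    channels = []
    for r in nearest_ap_distances(n_a, lam, rng):
        c = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        channels.append(c * math.sqrt(path_loss(r, r0, alpha)))
    return channels


rng = random.Random(0)
lam, alpha = 1.0, 3.67
r0 = 0.08 / (2.0 * math.sqrt(lam))
h_c = sample_cluster_channels(4, lam, r0, alpha, rng)
ncjt_channel = sum(h_c)  # effective NCJT channel 1^T h_C for one realization
print(abs(ncjt_channel) ** 2)
```

Sampling only the closest APs through the exponential increments avoids drawing a Poisson number of points in a finite window, which keeps the sketch within the Python standard library.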

With all cluster APs transmitting with the same average power, the received signal at the UE side during data transmission and after normalization with the average transmit power equals [9]

$y_{d}=d\,\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}+w_{d},$ (3)

where $d\in\mathbb{C}$ is a data symbol of zero mean and variance one, $\mathcal{C}\subseteq\Phi$ is the set of locations of the $N_{a}$ cluster APs ($N_{a}$ can become infinite by considering $\mathcal{C}=\Phi$), $\mathbf{h}_{\mathcal{C}}\in\mathbb{C}^{N_{a}}$ is a vector with elements $\{h_{\mathbf{x}}\}_{\mathbf{x}\in\mathcal{C}}$ in some arbitrary order, and $w_{d}\in\mathbb{C}$ is additive noise distributed as $\mathcal{CN}(0,\sigma_{w}^{2})$ with $\sigma_{w}^{2}$ representing the ratio of the additive noise variance to the average transmit power.

For coherent processing of the received signal, the typical UE employs an estimate of the effective NCJT channel $\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}$ appearing in (3), which is obtained by means of a training phase preceding the data transmission phase. During the training phase, all APs in the system transmit their pilot sequences of length $N_{p}$ symbols (channel uses) so that all UEs in the system are able to simultaneously obtain an estimate of their NCJT channels. Assuming, without loss of generality, that all APs in the system transmit during the training phase with the same power as during the data transmission phase (when active), the typical UE observes the $N_{p}$ -dimensional transmit-power-normalized training signal

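$\mathbf{y}_{p}=\mathbf{P}_{\mathcal{C}}\mathbf{h}_{\mathcal{C}}+\mathbf{P}_{\mathcal{\bar{C}}}\mathbf{h}_{\mathcal{\bar{C}}}+\mathbf{w}_{p},$ (4)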

where $\mathbf{p}_{\mathbf{x}}\in\mathbb{C}^{N_{p}}$ is the pilot sequence of the AP located at $\mathbf{x}\in\Phi$ , $\mathbf{w}_{p}\in\mathbb{C}^{N_{p}}$ is an additive noise vector whose elements are i.i.d. as $\mathcal{CN}(0,\sigma_{w}^{2})$ , $\mathbf{P}_{\mathcal{C}}\in\mathbb{C}^{N_{p}\times N_{a}}$ is a matrix with columns $\{\mathbf{p}_{\mathbf{x}}\}_{\mathbf{x}\in\mathcal{C}}$ ordered in accordance with the AP ordering in $\mathbf{h}_{\mathcal{C}}$ , and $\mathbf{h}_{\mathcal{\bar{C}}}$ , $\mathbf{P}_{\mathcal{\bar{C}}}$ are defined similarly to $\mathbf{h}_{\mathcal{C}}$ , $\mathbf{P}_{\mathcal{C}}$ , for the channels and pilot sequences of out-of-cluster APs, respectively. (Note that $\mathbf{h}_{\mathcal{\bar{C}}}$ , $\mathbf{P}_{\mathcal{\bar{C}}}$ have an infinite number of elements and columns, respectively.)

Ideally, the pilot sequences $\{\mathbf{p}_{\mathbf{x}}\}_{\mathbf{x}\in\Phi}$ should be jointly optimized according to the AP and UE locations [22, 23]. However, this approach requires computationally intensive solvers as well as additional signaling overhead to obtain location information. Towards avoiding this, and inspired by the line of works on code-division multiple access with random spreading sequences [24, 25], it is assumed in this paper that the AP pilot sequences are generated randomly and independently, with elements that are i.i.d. as $\mathcal{CN}(0,1)$ .

Note that this approach guarantees with probability $1$ (w.p. $1$ ) that the pilot sequences are unique, in contrast to the common assumption of all APs in the system sharing a limited set of pilot sequences [4], which inherently results in (a) channel identifiability issues due to $\mathbf{P}_{\mathcal{C}}$ not being full column rank when two or more cluster APs are assigned the same sequence, and (b) pilot contamination issues due to out-of-cluster APs having the same training sequences as cluster APs. Even though with the random pilot assignment there are no identifiability issues since $\mathbf{P}_{\mathcal{C}}$ is full rank w.p. $1$ , pilot contamination is still present, since, w.p. $1$ , it holds $\mathbf{p}_{\mathbf{x}}^{H}\mathbf{p}_{\mathbf{x}^{\prime}}\neq 0$ , for all $\mathbf{x}\in\mathcal{C}$ and $\mathbf{x}^{\prime}\in\Phi\setminus\mathcal{C}$ . This non-orthogonality of the pilot sequences results in a degradation of the channel estimate quality for the cluster AP channels due to interference from out-of-cluster APs. One approach to avoid pilot contamination is to arbitrarily increase $N_{p}$ so as to achieve (approximate) orthogonality of pilot sequences by application of the law of large numbers. However, this approach introduces unacceptable overhead. It is therefore of interest to investigate the system performance under finite $N_{p}$ , a topic that is pursued in the following.
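The trade-off between pilot length and (approximate) orthogonality can be illustrated with a short experiment. The sketch below is an illustration under the random-pilot assumption stated above (i.i.d. $\mathcal{CN}(0,1)$ elements); the normalized cross-correlation $|\mathbf{p}_{\mathbf{x}}^{H}\mathbf{p}_{\mathbf{x}^{\prime}}|/N_{p}$ is a metric chosen here for illustration, not one used in the paper, and the function names are hypothetical.

```python
import math
import random


def random_pilot(n_p, rng):
    """Pilot sequence of length n_p with i.i.d. CN(0, 1) elements."""
    s = math.sqrt(0.5)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n_p)]


def normalized_cross_correlation(p, q):
    """|p^H q| / N_p, which would be exactly 0 for orthogonal pilots."""
    inner = sum(a.conjugate() * b for a, b in zip(p, q))
    return abs(inner) / len(p)


def mean_cross_correlation(n_p, trials, rng):
    """Average normalized cross-correlation over independent pilot pairs."""
    total = 0.0
    for _ in range(trials):
        p, q = random_pilot(n_p, rng), random_pilot(n_p, rng)
        total += normalized_cross_correlation(p, q)
    return total / trials


rng = random.Random(0)
for n_p in (4, 16, 64, 256):
    print(n_p, round(mean_cross_correlation(n_p, 500, rng), 3))
```

The printed averages shrink roughly like $1/\sqrt{N_{p}}$, which is the law-of-large-numbers effect mentioned above: orthogonality is approached only slowly, so short pilots remain noticeably correlated.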

III LMMSE Channel Estimation and Data Detection $\mathsf{SNR}$

This section provides a closed-form expression for the (effective) data detection $\mathsf{SNR}$ , taking into account channel estimation errors, which serves as a reasonable metric for performance characterization and design of parameters $N_{a}$ and $N_{p}$ . The $\mathsf{SNR}$ expression depends on two important quantities reflecting the NCJT channel energy and the channel estimation quality, which are investigated in detail, giving also insights on the optimal $N_{a}$ and $N_{p}$ . The first step towards this investigation is the specification of the channel estimator, which is discussed next.

III-A LMMSE Channel Estimation

For data detection (decoding), the UE processes the received signal $y_{d}$ during the data transmission phase under the knowledge of an estimate $\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}$ of the NCJT channel obtained from $\mathbf{y}_{p}$ . In this paper, the standard LMMSE estimator is employed. As the LMMSE estimator is a Bayesian estimator, the prior information available at the UE side on the AP channels and pilot sequences must be specified.

Regarding the AP channel information, the standard assumption made in previous works on Bayesian multipoint channel estimation is that the AP distances from the typical UE are perfectly known [12, 13, 15, 17]. This effectively renders the channel estimation problem equivalent to the estimation of the Gaussian coefficients $\{c_{\mathbf{x}}\}_{\mathbf{x}\in\mathcal{C}}$ and the specification of the estimator is straightforward. However, this location information requires a dedicated signaling overhead to acquire it, which may not be acceptable, at least in the case of channels with small coherence intervals due to, e.g., mobility of UEs.

In this paper, a worst-case assumption is considered in terms of prior information available at the UE side about both the AP channels and the AP pilot sequences, corresponding to a minimum signaling overhead required to obtain it. Clearly, analysis under this approach can be treated as providing a lower performance bound when additional information is available.

Assumption.

The typical UE has no prior information about the small scale fading and location (distance) of any AP in the system, including the APs in the serving cluster. It is only aware of the channel model of (1) and (2), the values of $\lambda$ , $\alpha$ and $\sigma_{w}^{2}$ , and the pilot sequences $\{\mathbf{p}_{\mathbf{x}}\}_{\mathbf{x}\in\mathcal{C}}$ of the $N_{a}$ cluster APs.

With the prior information at the UE side specified, the LMMSE estimator can now be obtained.

Proposition 1.

The LMMSE estimate of the NCJT channel equals $\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}=\mathbf{1}^{T}\hat{\mathbf{h}}_{\mathcal{C}}$ , where

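$\hat{\mathbf{h}}_{\mathcal{C}}\triangleq\left(\frac{N_{a}(\sigma_{w}^{2}+\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}{\sigma_{\mathcal{C}}^{2}}\mathbf{I}_{N_{a}}+\mathbf{P}_{\mathcal{C}}^{H}\mathbf{P}_{\mathcal{C}}\right)^{-1}\mathbf{P}_{\mathcal{C}}^{H}\mathbf{y}_{p},$ (5)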

is the LMMSE estimate of $\mathbf{h}_{\mathcal{C}}$ , $\sigma_{\mathcal{C}}^{2}\triangleq\mathbb{E}(|\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}|^{2})$ is the (average) NCJT channel energy and

$\sigma_{\Phi}^{2}=\frac{\alpha\lambda\pi r_{0}^{2}}{\alpha-2},$ (6)

is the (average) NCJT channel energy when all APs in the system are included in the cluster ( $\mathcal{C}=\Phi$ ).

Proof:

See Appendix A. ∎
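As a sanity check on the all-AP channel energy, the closed form $\sigma_{\Phi}^{2}=\alpha\lambda\pi r_{0}^{2}/(\alpha-2)$ can be compared against a direct numerical evaluation of $\lambda\int_{\mathbb{R}^{2}}\ell(\|\mathbf{z}\|)d\mathbf{z}$. The sketch below assumes $\mathbb{E}(|h_{\mathbf{x}}|^{2})=\ell(\|\mathbf{x}\|)$ and the bounded path loss model of Section II; the integration scheme and function names are choices made here for illustration.

```python
import math


def path_loss(r, r0, alpha):
    """Bounded path loss l(r) = r0^alpha * max(r0, r)^(-alpha)."""
    return (r0 / max(r0, r)) ** alpha


def sigma_phi_sq_closed_form(lam, r0, alpha):
    """Closed form alpha * lam * pi * r0^2 / (alpha - 2), valid for alpha > 2."""
    return alpha * lam * math.pi * r0 ** 2 / (alpha - 2.0)


def sigma_phi_sq_numeric(lam, r0, alpha, r_max_factor=200.0, steps=100000):
    """lam times the integral of l(||z||) over the plane, in polar coordinates.

    Trapezoidal rule on [0, r_max] plus the analytic power-law tail beyond r_max.
    """
    r_max = r_max_factor * r0
    dr = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * dr
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * path_loss(r, r0, alpha) * 2.0 * math.pi * r
    tail = 2.0 * math.pi * r0 ** alpha * r_max ** (2.0 - alpha) / (alpha - 2.0)
    return lam * (total * dr + tail)


lam, r0 = 1.0, 0.04
for alpha in (3.0, 3.67, 4.0):
    print(alpha, sigma_phi_sq_closed_form(lam, r0, alpha), sigma_phi_sq_numeric(lam, r0, alpha))
```

The integral splits into the disc of radius $r_{0}$, contributing $\pi r_{0}^{2}$, and the power-law region, contributing $2\pi r_{0}^{2}/(\alpha-2)$, which is why the condition $\alpha>2$ is needed for a finite value.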

Remark 2.

By the linearity of the LMMSE estimator [26], estimation of the NCJT channel $\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}$ is effectively equivalent to the estimation of the $N_{a}$ AP channels in $\mathbf{h}_{\mathcal{C}}$ . Note that under the assumed prior information available at the UE, $\mathbf{h}_{\mathcal{C}}$ is not Gaussian, therefore, its LMMSE estimator of (5) does not coincide with its (non-linear) minimum mean square error (MMSE) estimator, which is much more complicated to compute.
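To make the structure of the estimator concrete, the following worked derivation sketches how its closed form can follow from the general LMMSE expression. It assumes, in line with the appendix equations accompanying this paper, that the interference-plus-noise term $\tilde{\mathbf{w}}_{p}\triangleq\mathbf{P}_{\mathcal{\bar{C}}}\mathbf{h}_{\mathcal{\bar{C}}}+\mathbf{w}_{p}$ is uncorrelated with $\mathbf{h}_{\mathcal{C}}$, that $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}=(\sigma_{\mathcal{C}}^{2}/N_{a})\mathbf{I}_{N_{a}}$, and that $\mathbf{R}_{\tilde{\mathbf{w}}_{p}}=\sigma_{t}^{2}\mathbf{I}_{N_{p}}$. The shorthand $\sigma_{t}^{2}$ is introduced here only for readability.

```latex
\begin{align*}
\sigma_{t}^{2} &\triangleq \sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2}+\sigma_{w}^{2}, \\
\hat{\mathbf{h}}_{\mathcal{C}}
 &= \left(\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}^{-1}
    +\mathbf{P}_{\mathcal{C}}^{H}\mathbf{R}_{\tilde{\mathbf{w}}_{p}}^{-1}\mathbf{P}_{\mathcal{C}}\right)^{-1}
    \mathbf{P}_{\mathcal{C}}^{H}\mathbf{R}_{\tilde{\mathbf{w}}_{p}}^{-1}\mathbf{y}_{p} \\
 &= \left(\frac{N_{a}}{\sigma_{\mathcal{C}}^{2}}\mathbf{I}_{N_{a}}
    +\frac{1}{\sigma_{t}^{2}}\mathbf{P}_{\mathcal{C}}^{H}\mathbf{P}_{\mathcal{C}}\right)^{-1}
    \frac{1}{\sigma_{t}^{2}}\mathbf{P}_{\mathcal{C}}^{H}\mathbf{y}_{p} \\
 &= \left(\frac{N_{a}\,\sigma_{t}^{2}}{\sigma_{\mathcal{C}}^{2}}\mathbf{I}_{N_{a}}
    +\mathbf{P}_{\mathcal{C}}^{H}\mathbf{P}_{\mathcal{C}}\right)^{-1}
    \mathbf{P}_{\mathcal{C}}^{H}\mathbf{y}_{p}.
\end{align*}
```

The last step multiplies the matrix inside the inverse by $\sigma_{t}^{2}$ and cancels the factor $1/\sigma_{t}^{2}$ outside it. Pilot contamination enters only through $\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2}$, which acts as additional noise power in the regularization term.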

In order to fully specify the LMMSE channel estimator of Proposition 1, the NCJT channel energy $\sigma_{\mathcal{C}}^{2}$ must be obtained. This is done after the specification of the data detection $\mathsf{SNR}$ presented next, as $\sigma_{\mathcal{C}}^{2}$ also affects the $\mathsf{SNR}$ and, therefore, has strong implications on the data transmission performance.

III-B Data Detection $\mathsf{SNR}$

For coherent detection purposes, the UE effectively treats $y_{d}$ as equal to [3]

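$y_{d}=d\,\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}+d\underbrace{\left(\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}-\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}\right)}_{\triangleq e}+w_{d}.$ (7)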

This is merely a rewriting of (3), however, the form of (7) highlights the assumption made for data detection purposes that data is transmitted via a channel $\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}$ (and not $\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}$ , whose value is unknown) and affected by a noise term $de+w_{d}$ combining the effects of additive noise and residual channel estimation error. By standard results from linear estimation theory [26], this noise term is uncorrelated with the useful signal $d\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}$ . Therefore, the effective data detection SNR for the signal model of (7), conditioned on the LMMSE channel estimate, can be defined as

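$\mathsf{SNR}_{c}\triangleq\frac{|\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}|^{2}}{\sigma_{w}^{2}+\sigma_{e}^{2}},$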

where $\sigma_{e}^{2}\triangleq\mathbb{E}(\left|e\right|^{2})$ is the mean square error of the LMMSE estimate that is independent of its actual value [26]. By averaging $\mathsf{SNR}_{c}$ over the channel estimate statistics, an expression for the unconditioned effective $\mathsf{SNR}$ is immediately obtained as

$\mathsf{SNR}=\frac{\sigma_{\mathcal{C}}^{2}-\sigma_{e}^{2}}{\sigma_{w}^{2}+\sigma_{e}^{2}},$ (8)

after using the property $\mathbb{E}(|\chi|^{2})=\mathbb{E}(|\hat{\chi}|^{2})+\mathbb{E}(|\chi-\hat{\chi}|^{2})$ , where $\chi\in\mathbb{C}$ is a random variable and $\hat{\chi}$ its LMMSE estimate. By the ergodicity of the HPPP [20], $\mathsf{SNR}$ equals the spatial average $\mathsf{SNR}_{c}$ for any realization of the AP point process $\Phi$ , which can be interpreted as the value of $\mathsf{SNR}_{c}$ averaged over all UEs in the system when user-centric NCJT with the same $N_{a}$ is considered for every UE. Note that $\mathsf{SNR}\leq\sigma_{\mathcal{C}}^{2}/\sigma_{w}^{2}$ , with the upper bound achieved when $\hat{\mathbf{h}}_{\mathcal{C}}=\mathbf{h}_{\mathcal{C}}$ (perfect channel estimation).
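The effect of the estimation error on the effective $\mathsf{SNR}$ can be seen with a few numbers. The sketch below evaluates $\mathsf{SNR}=(\sigma_{\mathcal{C}}^{2}-\sigma_{e}^{2})/(\sigma_{w}^{2}+\sigma_{e}^{2})$, the form implied by the orthogonality property quoted above; the numerical values of $\sigma_{\mathcal{C}}^{2}$, $\sigma_{w}^{2}$ and $\sigma_{e}^{2}$ are hypothetical and chosen only for illustration.

```python
import math


def effective_snr(sigma_c_sq, sigma_e_sq, sigma_w_sq):
    """Unconditioned effective SNR (sigma_C^2 - sigma_e^2) / (sigma_w^2 + sigma_e^2)."""
    if not 0.0 <= sigma_e_sq <= sigma_c_sq:
        raise ValueError("the LMMSE error variance must lie in [0, sigma_C^2]")
    return (sigma_c_sq - sigma_e_sq) / (sigma_w_sq + sigma_e_sq)


def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


sigma_c_sq, sigma_w_sq = 1.0, 0.01  # hypothetical values, for illustration only
for fraction in (0.0, 0.01, 0.1, 0.5):
    snr = effective_snr(sigma_c_sq, fraction * sigma_c_sq, sigma_w_sq)
    print(f"sigma_e^2 = {fraction:.2f} * sigma_C^2 -> SNR = {to_db(snr):.1f} dB")
```

The error variance hurts twice: it is subtracted from the useful signal energy in the numerator and added to the noise in the denominator, so even an error as small as the noise variance roughly halves the $\mathsf{SNR}$.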

Towards further specification of the $\mathsf{SNR}$ formula and obtaining performance insights, quantities $\sigma_{\mathcal{C}}^{2}$ and $\sigma_{e}^{2}$ appearing in (8) are investigated next.

III-C Computation of NCJT Channel Energy

The following result provides an integral-form and an approximate closed-form expression for $\sigma_{\mathcal{C}}^{2}$ , required by the LMMSE channel estimator as well as for computing the $\mathsf{SNR}$ .

Proposition 3.

The energy of the NCJT channel equals

[TABLE]

where $\Gamma(a,z)\triangleq\int_{z}^{\infty}t^{a-1}e^{-t}dt$ is the (upper) incomplete gamma function and the approximation of (10) holds for asymptotically large $N_{a}$ .

Proof:

See Appendix B. ∎

Note that the approximate asymptotic expression of (10) is tight in the sense that it reaches the limit $\sigma_{\Phi}^{2}$ as $N_{a}\rightarrow\infty$ ( $\mathcal{C}\rightarrow\Phi$ ). In addition, (10) shows that (a) $\sigma_{\mathcal{C}}^{2}$ is larger for smaller $\alpha$ , as expected, due to the smaller propagation losses, and (b) increasing $N_{a}$ increases $\sigma_{\mathcal{C}}^{2}$ , implying that, under perfect CSI, NCJT with $N_{a}>1$ provides an $\mathsf{SNR}$ gain compared to conventional, non-cooperative transmission ( $N_{a}=1$ ).

Figure 2 shows $(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})/\sigma_{\Phi}^{2}$ as a function of $N_{a}$ for various values of $\alpha$ , using both the exact and approximate expressions for $\sigma_{\mathcal{C}}^{2}$ . This quantity measures how much smaller the NCJT channel energy of a finite AP cluster is than that of a cluster including all APs. For this example, as well as all the numerical examples in this paper, the reference distance was set to $r_{0}=0.08/(2\sqrt{\lambda})$ , i.e., $8/100$ of the average distance to the closest AP. This corresponds to a probability $1-e^{-\lambda\pi r_{0}^{2}}\approx 5\times 10^{-3}$ of at least one AP in the system lying closer to the typical UE than $r_{0}$ . Note that (10) suggests that $\sigma_{\mathcal{C}}^{2}$ becomes conveniently independent of $\lambda$ when $r_{0}=\text{const.}/\sqrt{\lambda}$ , which can actually be shown to be the case also for the exact expression of (9).
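The arithmetic behind the choice of reference distance can be reproduced directly. The sketch below uses the void probability of an HPPP, $e^{-\lambda\pi r^{2}}$, and the mean distance to the closest AP, $1/(2\sqrt{\lambda})$; the function names are illustrative.

```python
import math


def reference_distance(lam, fraction=0.08):
    """r0 as a fraction of 1 / (2 sqrt(lam)), the mean distance to the closest AP."""
    return fraction / (2.0 * math.sqrt(lam))


def prob_ap_within(lam, r):
    """Probability that at least one AP of the HPPP lies within distance r of the UE."""
    return 1.0 - math.exp(-lam * math.pi * r ** 2)


for lam in (0.1, 1.0, 10.0):
    print(lam, prob_ap_within(lam, reference_distance(lam)))
```

The product $\lambda\pi r_{0}^{2}=\pi(0.04)^{2}$ does not depend on $\lambda$, so the same probability of about $5\times 10^{-3}$ is obtained for every density.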

It can be seen from Fig. 2 that the asymptotic expression of (10) is a very good indicator of the actual $\sigma_{\mathcal{C}}^{2}$ , even for moderate values of $N_{a}$ . As predicted by (10), increasing $N_{a}$ increases $\sigma_{\mathcal{C}}^{2}$ towards $\sigma_{\Phi}^{2}$ , however, this increase is only marginal. This is because, in order to capture most of the maximum possible NCJT channel energy, a value of $N_{a}=1$ is sufficient when $\alpha$ is large, as the average received energy from even the second closest AP is much smaller than that of the closest AP, whereas a very large $N_{a}$ is required when $\alpha$ is small, since the signals of even distant APs are strongly received. This observation suggests that the conventional, non-cooperative, cellular network operation ( $N_{a}=1$ ) is practically sufficient under perfect CSI when the $\mathsf{SNR}$ is the figure of merit for the system. However, as will be shown in the following, $N_{a}>1$ can provide significant $\mathsf{SNR}$ gains in the presence of channel estimation errors.
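The comparison between exact and asymptotic channel energy can be approximated numerically without the paper's integral expression. The sketch below is a derivation made here, not the authors' formula: with $t=\lambda\pi r^{2}$ the HPPP has unit intensity in $t$, an AP at normalized area $t$ lies outside the cluster exactly when at least $N_{a}$ other APs are closer, and therefore $\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2}=\int_{0}^{\infty}\ell\,\mathbb{P}(\mathrm{Poisson}(t)\geq N_{a})\,dt$, assuming $\mathbb{E}(|h_{\mathbf{x}}|^{2})=\ell(\|\mathbf{x}\|)$. The asymptotic value is the one implied by the large-$N_{a}$ approximation, $(2/\alpha)(\lambda\pi r_{0}^{2}/N_{a})^{\alpha/2-1}$.

```python
import math


def prob_at_least(n, t):
    """P(Poisson(t) >= n): at least n APs are closer than a point at normalized area t."""
    if t <= 0.0:
        return 0.0
    cdf = sum(math.exp(j * math.log(t) - t - math.lgamma(j + 1)) for j in range(n))
    return min(1.0, max(0.0, 1.0 - cdf))


def energy_deficit_numeric(n_a, t0, alpha, steps=4000):
    """(sigma_Phi^2 - sigma_C^2) / sigma_Phi^2, with t0 = lam * pi * r0^2.

    Trapezoidal rule on a logarithmic grid in t; beyond t_max the
    probability is taken as 1 and the power-law tail is added analytically.
    """
    half = alpha / 2.0
    t_max = n_a + 12.0 * math.sqrt(n_a) + 40.0
    u_lo, u_hi = math.log(1e-12), math.log(t_max)
    du = (u_hi - u_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = math.exp(u_lo + i * du)
        gain = 1.0 if t <= t0 else (t0 / t) ** half  # l(r) expressed in t
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gain * prob_at_least(n_a, t) * t  # dt = t du
    tail = t0 ** half * t_max ** (1.0 - half) / (half - 1.0)
    sigma_phi_sq = alpha * t0 / (alpha - 2.0)
    return (total * du + tail) / sigma_phi_sq


def energy_deficit_approx(n_a, t0, alpha):
    """Relative deficit implied by the large-N_a approximation."""
    return (2.0 / alpha) * (t0 / n_a) ** (alpha / 2.0 - 1.0)


t0 = math.pi * (0.08 / 2.0) ** 2  # lam * pi * r0^2 for r0 = 0.08 / (2 sqrt(lam))
for n_a in (1, 5, 20):
    print(n_a, energy_deficit_numeric(n_a, t0, 3.67), energy_deficit_approx(n_a, t0, 3.67))
```

Working in the normalized variable $t$ removes $\lambda$ from the computation, consistent with the observation that the result is independent of $\lambda$ when $r_{0}$ scales as $1/\sqrt{\lambda}$.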

III-D Computation of LMMSE Channel Estimation Error Variance

The following result provides a closed form asymptotic approximation for the channel estimation error variance $\sigma_{e}^{2}$ required for the $\mathsf{SNR}$ computation.

Proposition 4.

For $N_{p},N_{a}\rightarrow\infty$ with the ratio $N_{a}/N_{p}$ constant, the channel estimation mean square error approximately equals

[TABLE]

where $f_{a}(b)\triangleq\frac{a-1+\sqrt{(1+a-ab)^{2}-4a^{2}b}}{2a^{2}b}-\frac{1}{2a}$ .

Proof:

See Appendix C. ∎

As noted in the proof of this result, an exact, non-asymptotic expression for $\sigma_{e}^{2}$ is available, which, however, is a function of the random pilot sequence matrix $\mathbf{P}_{\mathcal{C}}$. The asymptotic expression of (11) is much more convenient as it applies to every realization of $\mathbf{P}_{\mathcal{C}}$ and is also very accurate even for small values of $N_{p}$ and $N_{a}$. This is verified in Fig. 3 where $\sigma_{e}^{2}$ is plotted as a function of $N_{p}$, evaluated both using Monte Carlo simulation and the expression of (11). The simulation results are obtained by averaging the channel estimation square error $|e|^{2}$ over independent realizations of $\Phi$, $\{h_{\mathbf{x}}\}_{\mathbf{x}\in\Phi}$, and $\{\mathbf{p}_{\mathbf{x}}\}_{\mathbf{x}\in\Phi}$. As (11) is independent of $\lambda$, a value of $\lambda=1$ was arbitrarily chosen for generating the AP distribution realizations. Various values of $N_{a}$ were considered and $\sigma_{w}^{2}$ was set such that $\mathsf{SNR}_{0}\in\{0,40\}$ dB, where $\mathsf{SNR}_{0}\triangleq\mathbb{E}(|h_{\mathbf{x}_{1}}|^{2})/\sigma_{w}^{2}$ is the $\mathsf{SNR}$ achieved with $N_{a}=1$ and perfect CSI ($h_{\mathbf{x}_{1}}$ is the channel of the closest AP). Note that $\mathbb{E}(|h_{\mathbf{x}_{1}}|^{2})$ can be computed from Proposition 3 as $\sigma_{\mathcal{C}}^{2}$ with $N_{a}=1$. These two $\mathsf{SNR}_{0}$ values were chosen as representative of operational conditions where the effect of additive noise is dominant (power limited regime) and the effect of interference is dominant (pilot contamination regime) in the channel estimation procedure, respectively. The path loss factor was set to $\alpha=3.67$, a value commonly considered by cellular standards [27], with similar results observed for other values of $\alpha$.

As expected, the channel estimation mean square error decreases with increasing $N_{p}$ in all cases. In the power limited regime ($\mathsf{SNR}_{0}=0$ dB), the NCJT channel estimate degrades with increasing $N_{a}$, irrespective of the pilot sequence length $N_{p}$. This is because attempting to estimate more parameters (in this case, AP channels) in the presence of additive noise can only degrade performance [26], even more so since the channels corresponding to distant APs have (very) small energy. However, when the channel estimation is mostly affected by pilot contamination ($\mathsf{SNR}_{0}=40$ dB), the optimal $N_{a}$ has to balance two conflicting requirements: (a) estimate only a few channels corresponding to strongly received AP signals, and (b) consider a large cluster size that results in a statistically small out-of-cluster interference. As can be seen in Fig. 3, the optimal $N_{a}$ for $\mathsf{SNR}_{0}=40$ dB depends on $N_{p}$ and becomes greater than $1$ for large $N_{p}$. Given that a greater $N_{a}$ also corresponds to a greater NCJT channel energy $\sigma_{\mathcal{C}}^{2}$, it directly follows from (8) that $N_{a}>1$ is optimal w.r.t. $\mathsf{SNR}$ for sufficiently large $N_{p}$ in the pilot contamination regime.

IV On the Optimal Pilot Sequence Length and AP Cluster Size

Propositions 3 and 4 allow for a simple computation of the $\mathsf{SNR}$ without the need to resort to computationally intensive Monte Carlo simulations. This, in turn, allows for efficient (numerical) optimization of the two fundamental system design parameters $N_{p}$ and $N_{a}$. Beyond numerical optimization, some interesting analytical insights can also be obtained under specific operational conditions.

IV-A Minimum Required Pilot Sequence Length

Consideration of the $\mathsf{SNR}$ as a metric for identifying the optimal $N_{p}$ leads to impractical designs since, clearly, the maximum $\mathsf{SNR}$ performance corresponding to perfect CSI is achieved with $N_{p}\rightarrow\infty$ . Towards a practical design rule, it is reasonable to look for the minimum $N_{p}$ required to achieve an $\mathsf{SNR}$ that is a given fraction of the perfect CSI $\mathsf{SNR}$ . The following result provides a simple expression for this value.

Proposition 5.

For fixed cluster size $N_{a}$ and $0<\gamma<1$ such that it holds

[TABLE]

the (minimum) pilot sequence length required to achieve $\mathsf{SNR}\geq\gamma\sigma_{\mathcal{C}}^{2}/\sigma_{w}^{2}$ approximately equals

[TABLE]

Proof:

See Appendix D. ∎

Note that condition (12) holds when $\gamma$ is selected sufficiently close to $1$ and/or $N_{a}$ is sufficiently small, which is typically the case in system design towards minimizing degradation due to channel estimation error and minimizing the physical resources (APs) dedicated to each UE.

The simple bound of (14) suggests that, as expected, $N_{p}^{*}$ must be greater than a value that is proportional to $N_{a}$. However, the proportionality constant $\gamma/(1-\gamma)$ grows without bound, as $1/(1-\gamma)$, when $\gamma\rightarrow 1$, i.e., achieving performance extremely close to the perfect CSI case requires excessively large overhead irrespective of the operational conditions. In addition, the closed-form expression of (13) reveals that $N_{p}^{*}$ is a convex function of the noise variance $\sigma_{w}^{2}$, with its extremal behavior following by simple inspection.
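To give a feel for how quickly the constant $\gamma/(1-\gamma)$ grows, the sketch below evaluates the quantity $\gamma N_{a}/(1-\gamma)$ discussed above for a few target $\mathsf{SNR}$ losses expressed in dB. The function names `gamma_from_db_loss` and `pilot_length_lower_bound` are hypothetical and introduced only for illustration; the sketch evaluates the lower bound alone, not the closed-form expression of (13).

```python
def gamma_from_db_loss(loss_db: float) -> float:
    """Fraction gamma of the perfect-CSI SNR corresponding to a loss in dB."""
    return 10.0 ** (-loss_db / 10.0)

def pilot_length_lower_bound(n_a: int, gamma: float) -> float:
    """Value gamma * N_a / (1 - gamma) that the minimum pilot length must exceed."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return gamma * n_a / (1.0 - gamma)

# Smaller tolerated losses (gamma closer to 1) push the bound up sharply.
for loss_db in (3.0, 1.0, 0.1):
    g = gamma_from_db_loss(loss_db)
    print(loss_db, g, pilot_length_lower_bound(1, g), pilot_length_lower_bound(4, g))
```

For a $1$ dB loss and $N_{a}=1$ the bound is roughly $3.9$ pilot symbols, and it scales linearly with $N_{a}$.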

Corollary 6.

In the power limited ( $\sigma_{w}^{2}\rightarrow\infty$ ) and pilot contamination ( $\sigma_{w}^{2}\rightarrow 0$ ) operational regimes, $N_{p}^{*}\rightarrow\infty$ .

The unbounded pilot overhead as $\sigma_{w}^{2}\rightarrow\infty$ and $\sigma_{w}^{2}\rightarrow 0$ is required in order to average the noise effects in the power limited regime and orthogonalize the AP pilot sequences in the pilot contamination regime. Noting that the perfect CSI $\mathsf{SNR}$ in the power limited regime is fundamentally small, it follows that operation in this regime is highly undesirable both from a pilot overhead and achieved $\mathsf{SNR}$ perspective. This is essentially a system level manifestation of the well known fact that pilot-aided coherent processing in point-to-point communications is highly suboptimal in the power limited regime [3]. Although the pilot overhead requirements are also very large in the pilot contamination regime, it is noted that, since the perfect CSI $\mathsf{SNR}$ is extremely large there (approaching infinity), there is no practical need to consider a large $\gamma$ as in (12), and a moderate value of $N_{p}$ would be sufficient.

IV-B Optimal Cluster Size with Arbitrarily Large Training Overhead

Another important design question is the identification of the optimal cluster size $N_{a}$ given that a pilot sequence length $N_{p}$ can be afforded. As was discussed in Sec. III-D, a cluster with $N_{a}>1$ APs is $\mathsf{SNR}$-optimal in the pilot contamination regime with sufficiently large $N_{p}$. Intuitively, this is because a larger $N_{p}$ allows for the accurate estimation of multiple AP channels, which, in turn, also results in a larger cluster and a reduced out-of-cluster interference. One may then wonder whether this implies that $N_{a}$ becomes arbitrarily large when an arbitrarily large $N_{p}$ can be afforded. The following result shows that this is the case only for a sufficiently large path loss factor.

Proposition 7.

When an arbitrarily large (but finite) pilot sequence length $N_{p}$ can be afforded, the AP cluster size $N_{a}$ that maximizes $\mathsf{SNR}$ with $\sigma_{w}^{2}=0$ (pilot contamination regime) is bounded for $\alpha\leq 4$ and arbitrarily large for $\alpha>4$ .

Proof:

See Appendix E. ∎

This result can be explained by noting that, for small $\alpha$ , the total out-of-cluster interference will always be very strong, irrespective of the cluster size, rendering the estimation of too many AP channels, most of them of small energy, highly inaccurate. Interestingly, the path loss factor $\alpha=4$ is identified as a phase transition threshold from a bounded to an unbounded value for the optimal $N_{a}$ .

Remark 8.

It follows from the proof of Proposition 7 that, when $\sigma_{w}^{2}=0$ , the $\mathsf{SNR}$ admits the simple expression

[TABLE]

that is valid for all $\alpha$ , clearly showing that, for any fixed cluster size $N_{a}$ , the $\mathsf{SNR}$ increases proportionally to $N_{p}$ .

V Numerical Examples

This section considers a number of representative numerical examples providing insights on the $\mathsf{SNR}$ properties and the optimization of $N_{p}$ and/or $N_{a}$ . Unless stated otherwise, the $\mathsf{SNR}$ is computed using the analytical expression of (8) with the exact $\sigma_{\mathcal{C}}^{2}$ expression of (9) and with $\sigma_{e}^{2}$ as in (11). As in Secs. III-C and III-D, the reference distance was set to $r_{0}=0.08/(2\sqrt{\lambda})$ , which renders the results independent of $\lambda$ (a value of $\lambda=1$ was arbitrarily used for the Monte Carlo simulations).

1) $\mathsf{SNR}$ dependence on $N_{p}$ and $N_{a}$ : Figure 4 shows the $\mathsf{SNR}$ as a function of $N_{p}$ for a few representative values of $N_{a}$ and $\alpha$ . The noise variance was set such that $\mathsf{SNR}_{0}=50$ dB (pilot contamination regime). It can be seen that in all cases, increasing $N_{p}$ increases $\mathsf{SNR}$ due to the corresponding improvement of the NCJT channel estimate. For the parameters considered, a trivial cluster size with $N_{a}=1$ is the best only for very small values of $N_{p}$ . Larger values of $N_{p}$ favor greater $N_{a}$ , in line with the observations made in Sec. III-D regarding channel estimation error. In addition, the $\mathsf{SNR}$ gain achieved by using $N_{a}>1$ instead of $N_{a}=1$ is more pronounced for larger $\alpha$ . For the parameters considered in this example, a gain of about $10$ dB and $3.5$ dB is achieved when $\alpha=5$ and $3.67$ , respectively, for all $N_{p}>10$ .

2) Optimality of multipoint transmissions: As observed previously, a sufficiently large $N_{p}$ will render a multipoint cluster size ($N_{a}>1$) $\mathsf{SNR}$-optimal in the pilot contamination regime. In order to gain insights on what values of $N_{p}$ promote multipoint transmissions, Fig. 5 plots the value of $N_{p}$ above which $N_{a}=2$ provides a larger $\mathsf{SNR}$ than $N_{a}=1$. A wide range of $\mathsf{SNR}_{0}$ values is considered, covering the pilot contamination as well as the power limited regime. For visualization purposes, the plotted $N_{p}$ values were obtained as the real number for which the $\mathsf{SNR}$ for $N_{a}=1$ and $N_{a}=2$ is the same. It can be seen that both the operational regime and the path loss factor $\alpha$ critically affect the optimality of multipoint transmissions and the required $N_{p}$. In the pilot contamination regime (large $\mathsf{SNR}_{0}$), the minimum $N_{p}$ for which $N_{a}=2$ becomes optimal ranges from extremely large to as small as $1$ with increasing $\alpha$. Essentially, multipoint transmission is always optimal in this regime for values of $\alpha$ greater than $4$. On the contrary, in the power limited regime (small $\mathsf{SNR}_{0}$), smaller values of $N_{p}$ for which $N_{a}=2$ is optimal are favored by smaller $\alpha$, although the corresponding $N_{p}$ never falls below $40$ and becomes arbitrarily large as $\mathsf{SNR}_{0}$ further decreases. This suggests that $N_{a}=1$ is optimal in the power limited regime for all $\alpha$.

3) Dependence of minimum $N_{p}$ on $\sigma_{w}^{2}$ and $N_{a}$: As a design problem example, consider the identification of the minimum pilot sequence length, $N_{p}^{*}$, required to achieve an $\mathsf{SNR}$ that is $1$ dB less than $\mathsf{SNR}_{0}$, the $\mathsf{SNR}$ achieved with $N_{a}=1$ and perfect CSI. Note that this design entails solving the inequality $\mathsf{SNR}\geq 10^{-1/10}\mathsf{SNR}_{0}$ w.r.t. $N_{p}$, which can be efficiently done numerically using the analytical $\mathsf{SNR}$ expression. Figure 6 shows the solution as a function of $\mathsf{SNR}_{0}$ for two cases of $\alpha$ and for $N_{a}=1,2$, as well as $N_{a}=N_{a}^{*}$, where $N_{a}^{*}$ is the value of $N_{a}$, depending on $\mathsf{SNR}_{0}$, that leads to a minimum $N_{p}^{*}$. The value of $N_{a}^{*}$ is also indicated over the $\mathsf{SNR}_{0}$ range. In addition, the approximate $N_{p}^{*}$ expression of (13) with $\gamma=10^{-1/10}$ and $\gamma=10^{-1/10}\sigma_{1}^{2}/\sigma_{2}^{2}$, for $N_{a}=1$ and $2$, respectively, is also shown ($\sigma_{i}^{2}$, $i=1,2$, denotes the NCJT channel energy with $N_{a}=i$, so that $\gamma\sigma_{2}^{2}=10^{-1/10}\sigma_{1}^{2}$ in the second case).

It can be seen that the approximate closed-form expression for $N_{p}^{*}$ is a very good match to the actual $N_{p}^{*}$. In accordance with the discussion in Sec. IV-A, $N_{p}^{*}$ is a convex function of $\mathsf{SNR}_{0}$, increasing arbitrarily as either $\mathsf{SNR}_{0}$ decreases (power limited regime) or increases (pilot contamination regime). It can also be seen that for all $\mathsf{SNR}_{0}$ less than about $20$ dB, $N_{a}^{*}=1$ for both values of $\alpha$. In contrast, for increasing $\mathsf{SNR}_{0}$, clusters of $N_{a}>1$ APs are preferable with $N_{a}^{*}$ also increasing as $\mathsf{SNR}_{0}$ increases. Similar to the $\mathsf{SNR}$ behavior discussed in the first example, the gain in pilot overhead reduction by using $N_{a}>1$ instead of $N_{a}=1$ is more prominent for greater $\alpha$. Even with a cluster of only $N_{a}=2$ APs, over an order of magnitude smaller $N_{p}$ is required when $\mathsf{SNR}_{0}>40$ dB and $\alpha=5$.

4) $\mathsf{SNR}$ -optimal $N_{a}$ for $N_{p}\gg 1$ and $\sigma_{w}^{2}=0$ : In the pilot contamination regime ( $\sigma_{w}^{2}=0$ ), it directly follows from (15), that, for any $N_{p}\gg 1$ , the $\mathsf{SNR}$ -optimal cluster size $N_{a}^{*}$ is the one that maximizes the term $\frac{\sigma_{\mathcal{C}}^{2}}{N_{a}(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}$ (note that $\sigma_{\mathcal{C}}^{2}$ depends on $N_{a}$ ). The result of this maximization is shown in Fig. 7, where the optimal $N_{a}$ is depicted as a function of the path loss factor $\alpha$ . The phase transition threshold of $\alpha=4$ identified in Proposition 7 is also verified numerically, with values of $\alpha>4$ corresponding to an arbitrarily large optimal $N_{a}$ . The optimal $N_{a}$ becomes finite for $\alpha\leq 4,$ although it is still very large for values of $\alpha$ close to $4$ , and decreases to $N_{a}=1$ as $\alpha$ becomes closer to $2$ . For the practical value of $\alpha=3.67$ , $N_{a}^{*}=8$ .

Towards reducing the physical resources dedicated to a UE, it is of interest to investigate a potential reduction of $N_{a}$ when a suboptimal $\mathsf{SNR}$ performance is acceptable. This $N_{a}$ can be obtained as the minimum $N_{a}$ satisfying

[TABLE]

for some pre-selected $0<\gamma<1$ . The result is shown in Fig. 7 for $\gamma=10^{-3/10}$ and $\gamma=10^{-10/10}$ , corresponding to a $3$ dB and $10$ dB $\mathsf{SNR}$ degradation from the optimal. For values of $\alpha$ corresponding to an extremely large $\mathsf{SNR}$ -optimal $N_{a}$ , the right-hand side of (16) was approximated as equal to the value of $\frac{\gamma\sigma_{\mathcal{C}}^{2}}{N_{a}(\sigma_{\Phi}^{2}-\sigma_{\mathcal{C}}^{2})}$ with $N_{a}=1000$ . As expected, allowing for suboptimal $\mathsf{SNR}$ performance results in a decrease of $N_{a}^{*}$ , even for $\alpha>4$ . This decrease is greater the larger the allowed $\mathsf{SNR}$ degradation is.

5) Symbol error rate performance: To illustrate the applicability of the $\mathsf{SNR}$-based design also w.r.t. other communication-theoretic metrics, Fig. 8 shows the symbol error rate (SER) experienced at the typical UE when uncoded quadrature phase shift keying (QPSK) modulation is considered. The receiver first equalizes the received signal using the LMMSE channel estimate of (5), i.e., it computes $y_{d}/\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}$, and declares the closest in Euclidean distance QPSK symbol as the transmitted one. Note that this approach is not optimal for minimization of the SER; however, it is attractive due to its simplicity. The SER is obtained by Monte Carlo simulation, over a range of moderate to large $\mathsf{SNR}_{0}$ values corresponding to operation in the pilot contamination regime. The following schemes were considered regarding cluster size $N_{a}$ and pilot sequence length $N_{p}$: (a) $N_{a}=1$ and $N_{p}=50$, (b) $N_{a}=1$ and $N_{p}$ equal to the value $N_{p}^{*}$ resulting in an $\mathsf{SNR}$ loss of $1$ dB compared to the perfect CSI $\mathsf{SNR}$, and (c) for the same $N_{p}$ as case (b), the cluster size $N_{a}^{*}$ that maximizes $\mathsf{SNR}$ was selected. The path loss factor was set to $\alpha=3.67$.
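The equalize-then-slice receiver described above can be sketched in a few lines. This is only an illustrative sketch: the constellation labeling in `QPSK`, the function name `detect_qpsk`, and the scalar argument `h_hat` (standing in for the estimate $\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}$) are assumptions made here, and the sketch does not reproduce the Monte Carlo setup of the paper.

```python
import cmath
import math

# Unit-energy QPSK constellation; this particular labeling is an assumption.
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

def detect_qpsk(y_d: complex, h_hat: complex) -> int:
    """Equalize y_d by the channel estimate h_hat and return the index of the
    QPSK symbol closest in Euclidean distance. h_hat must be nonzero."""
    z = y_d / h_hat
    return min(range(4), key=lambda k: abs(z - QPSK[k]))

# Toy usage: a known channel, one transmitted symbol, and a small perturbation.
h = 0.8 * cmath.exp(1j * 0.3)
y = h * QPSK[2] + (0.05 + 0.02j)
print(detect_qpsk(y, h))
```

A symbol error occurs whenever the returned index differs from the transmitted one; with an imperfect `h_hat`, the residual phase and amplitude mismatch acts as additional distortion on top of the noise.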

It can be seen that the first scheme results in a significant error floor due to insufficient training for compensating the pilot contamination effect in the high $\mathsf{SNR_{0}}$ regime. In contrast, adapting $N_{p}$ to $\mathsf{SNR}_{0}$ as per the second scheme eliminates this effect. Of course, a (very) large fixed value for $N_{p}$ could be considered that would eliminate the error floor as well, however, it would have to be chosen by a trial-and-error procedure and would result in unnecessary signaling overhead under small and moderate $\mathsf{SNR}_{0}$ conditions. When a cluster of more than one AP is allowed for the same $N_{p}$ , SER improves in the high $\mathsf{SNR}_{0}$ regime. For this setup, $N_{a}^{*}$ becomes greater than $1$ for $\mathsf{SNR}_{0}\approx 20$ dB and increases up to the value of $6$ at $\mathsf{SNR}_{0}=50$ dB. Finally, note that the SER performance curves have the same slope for $N_{a}=1$ and $N_{a}=N_{a}^{*}$ in the high $\mathsf{SNR}_{0}$ regime, implying that no diversity gain is achieved from the multipoint transmission [28]. This is due to the NCJT transmission scheme employed, which can only provide an $\mathsf{SNR}$ gain [9].

VI Conclusion

The performance of downlink NCJT under practical channel estimation was analytically characterized under an SG modeling framework that takes into account randomness of AP positions in dense network deployments. A worst case assumption in terms of prior information at the UE side for channel estimation purposes was considered, corresponding to minimal overhead requirements. The (spatial) average, data detection $\mathsf{SNR}$ was considered as a tractable performance metric, which was characterized by a simple (semi) closed-form expression that allows for efficient evaluation under arbitrary system and operational parameters as well as (numerical) optimization of important system design parameters such as length of pilot sequences and number of APs jointly serving a UE. It was shown that, even though the conventional cellular network operation with a UE associated only to its closest AP is practically sufficient under perfect CSI, under practical channel estimation affected by pilot contamination, coordinated transmissions are optimal with the achieved gain more pronounced under propagation conditions with large path loss factors.

An analytical characterization of the minimum pilot sequence length required to achieve a certain portion of the perfect CSI performance was obtained, revealing that it is a convex function of the additive noise level with arbitrarily large overhead required as the noise level becomes arbitrarily small or large. In addition, an interesting phase transition phenomenon was observed, where the optimal number of cooperating APs under arbitrarily large pilot overhead is either finite or infinite, depending on whether the path loss factor is smaller or greater than $4$ , respectively.

The analysis of this paper is a first step towards understanding the effects of practical channel estimation on cellular system performance under the SG modeling framework with various open topics, including consideration of interference during the data transmission phase and more sophisticated joint transmission schemes.

Appendix A Proof of Proposition 1

Since the LMMSE estimator commutes over linear transformations [26], it follows that $\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}=\mathbf{1}^{T}\hat{\mathbf{h}}_{\mathcal{C}}$ , where $\hat{\mathbf{h}}_{\mathcal{C}}$ is the LMMSE estimate of $\mathbf{h}_{\mathcal{C}}$ based on $\mathbf{y}_{p}$ . Towards specifying $\hat{\mathbf{h}}_{\mathcal{C}}$ , note from (4) that $\mathbf{y}_{p}$ is a linear transformation of $\mathbf{h}_{\mathcal{C}}$ due to the application of the (known at the UE) matrix $\mathbf{P}_{\mathcal{C}}$ plus a noise term $\tilde{\mathbf{w}}_{p}\triangleq\mathbf{P}_{\bar{\mathcal{C}}}\mathbf{h}_{\bar{\mathcal{C}}}+\mathbf{w}_{p}$ . By assumption, the UE does not know $\mathbf{P}_{\bar{\mathcal{C}}}$ , $\mathbf{h}_{\bar{\mathcal{C}}}$ , and $\mathbf{w}_{p}$ , and the AP pilot sequences are independent of the AP channels. Therefore, the correlation of $\mathbf{h}_{\mathcal{C}}$ with $\tilde{\mathbf{w}}_{p}$ equals

[TABLE]

since pilot sequences and additive noise are of zero mean. It follows that the observation $\mathbf{y}_{p}$ coincides with the standard Bayesian linear model considered in estimation theory and the LMMSE estimate of $\mathbf{h}_{\mathcal{C}}$ is given as [26]

[TABLE]

where $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}\triangleq\mathbb{E}(\mathbf{h}_{\mathcal{C}}\mathbf{h}_{\mathcal{C}}^{H})$ and $\mathbf{R}_{\tilde{\mathbf{w}}_{p}}\triangleq\mathbb{E}(\tilde{\mathbf{w}}_{p}\tilde{\mathbf{w}}_{p}^{H})$ are positive definite matrices, as will be verified in the following where they are explicitly specified in terms of $\sigma_{\mathcal{C}}^{2}$ and $\sigma_{w}^{2}$ .

Focusing first on $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}$, note that since the UE knows nothing about the AP channels of its serving APs, the elements of $\mathbf{h}_{\mathcal{C}}$ are a random ordering of the $N_{a}$ channels $\{h_{\mathbf{x}}\}_{\mathbf{x}\in\mathcal{C}}$. Therefore, it holds $\mathbf{h}_{\mathcal{C}}=\sum_{\mathbf{x}\in\mathcal{C}}h_{\mathbf{x}}\mathbf{e}_{\mathbf{x}}$, where $\mathbf{e}_{\mathbf{x}}$ is the unit vector of the standard basis in $\mathbb{C}^{N_{a}}$ (i.e., all-zeros vector except one element equal to $1$), with $\mathbf{e}_{\mathbf{x}}\neq\mathbf{e}_{\mathbf{x}^{\prime}}$ for all $\mathbf{x}\neq\mathbf{x}^{\prime}\in\mathcal{C}$. Note that, for any $\mathbf{x}\in\mathcal{C}$, $\mathbf{e}_{\mathbf{x}}$ can be any one of the $N_{a}$ basis vectors with equal probability $1/N_{a}$. Using this representation for $\mathbf{h}_{\mathcal{C}}$, one computes

[TABLE]

where $(a)$ is due to the independence of AP channels and AP ordering, $(b)$ follows since $\mathbb{E}(h_{\mathbf{x}}h_{\mathbf{x}^{\prime}}^{*})=0$ for any $\mathbf{x}\neq\mathbf{x}^{\prime}\in\mathcal{C}$ , due to the independent and zero mean fast fading and $(c)$ is obtained by noting $\mathbb{E}(\mathbf{e}_{\mathbf{x}}\mathbf{e}_{\mathbf{x}}^{T})=(1/N_{a})\mathbf{I}_{N_{a}}$ , for all $\mathbf{x}\in\mathcal{C}$ . By the independence of the AP channels, it is easy to see that $\sigma_{\mathcal{C}}^{2}\triangleq\mathbb{E}(|\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}|^{2})=\sum_{\mathbf{x}\in\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})$ , therefore $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}$ can be expressed as

[TABLE]
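The effect of the random ordering on $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}$ can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes hypothetical per-AP energies in place of $\mathbb{E}(|h_{\mathbf{x}}|^{2})$, assigns the APs to vector positions uniformly at random, and averages the resulting diagonal. Each diagonal entry should approach $\sigma_{\mathcal{C}}^{2}/N_{a}$, consistent with steps $(a)$ to $(c)$. The function name `empirical_rh_diagonal` is introduced here for illustration only.

```python
import random

def empirical_rh_diagonal(energies, trials=20000, seed=0):
    """Monte Carlo estimate of the diagonal of E(h_C h_C^H) under random ordering.

    energies[i] plays the role of E(|h_x|^2) for the i-th AP of the cluster.
    Off-diagonal entries are not tracked: they vanish because the fading is
    independent and zero mean across APs.
    """
    rng = random.Random(seed)
    n_a = len(energies)
    diag = [0.0] * n_a
    for _ in range(trials):
        slots = list(range(n_a))
        rng.shuffle(slots)  # slots[i] is the vector position assigned to AP i
        for ap, slot in enumerate(slots):
            diag[slot] += energies[ap]
    return [d / trials for d in diag]

energies = [1.0, 0.2, 0.05]  # hypothetical per-AP energies, for illustration only
sigma_c2 = sum(energies)
print(empirical_rh_diagonal(energies), sigma_c2 / len(energies))
```

Although the individual AP energies are very different, the random ordering spreads them evenly over the $N_{a}$ positions, which is why the covariance is a scaled identity.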

Turning to $\mathbf{R}_{\tilde{\mathbf{w}}_{p}}$ and writing $\mathbf{P}_{\bar{\mathcal{C}}}\mathbf{h}_{\bar{\mathcal{C}}}$ as $\sum_{\mathbf{x}\in\Phi\setminus\mathcal{C}}h_{\mathbf{x}}\mathbf{p}_{\mathbf{x}}$ , it holds

[TABLE]

where all steps follow by similar arguments as in the derivation of $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}$ and $(a)$ is due to $\mathbb{E}(\mathbf{p}_{\mathbf{x}}\mathbf{p}_{\mathbf{x}}^{H})=\mathbf{I}_{N_{p}}$, for all $\mathbf{x}\in\Phi$, by pilot design assumption. In the last equality, the fact that $\sigma_{\Phi}^{2}\triangleq\mathbb{E}\left(\left|\sum_{\mathbf{x}\in\Phi}h_{\mathbf{x}}\right|^{2}\right)=\sum_{\mathbf{x}\in\Phi}\mathbb{E}(|h_{\mathbf{x}}|^{2})$, following by the independence of AP channels, has been used. Its value can be obtained as

[TABLE]

where the last equality follows by application of Campbell’s theorem [19]. Substituting $\ell(\cdot)$ with its expression given in (2) and evaluating the integral results in the closed form expression of (6). The result of the proposition now follows after substituting (18) and (19) in (17) and some trivial algebra.

Appendix B Proof of Proposition 3

Let $\rho_{s}>0$ denote the distance from the typical UE of the $s$ -th closest AP whose probability density function equals [29]

[TABLE]

As discussed in the proof of Prop. 1, the average NCJT channel energy equals $\sigma_{\mathcal{C}}^{2}=\sum_{\mathbf{x}\in\mathcal{C}}\mathbb{E}(|h_{\mathbf{x}}|^{2})$ . It holds

[TABLE]

where $(a)$ follows by fundamental properties of the HPPP, which state that, conditioned on $\rho_{N_{a}+1}$ , the locations of the $N_{a}$ serving APs are independent and uniformly distributed over the disk centered at the origin and of radius $\rho_{N_{a}+1}$ [20], and $(b)$ by changing the order of integration. Substituting (20) into (21) results in (9) after some straightforward algebraic manipulations.

An approximate closed-form expression for (21) can be obtained as follows. It can be shown by direct computation that $\mathbb{E}(\rho_{N_{a}})=\sqrt{N_{a}/(\pi\lambda)}+\mathcal{O}(\sqrt{1/N_{a}}),N_{a}\rightarrow\infty$ , whereas the variance of $\rho_{N_{a}}$ equals $1/(4\pi\lambda)+\mathcal{O}(1/N_{a}),N_{a}\rightarrow\infty$ . This suggests that $p(\rho_{N_{a}})$ is highly concentrated around its mean for asymptotically large $N_{a}$ , suggesting the approximation $p(\rho_{N_{a}})\approx\delta(\rho_{N_{a}}-\sqrt{N_{a}/(\pi\lambda)})$ in that regime, with $\delta(\cdot)$ denoting the Dirac delta. Substituting this approximation in (21) results in (10).
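The concentration argument can be checked numerically. The sketch below assumes the standard density of the distance to the $s$-th nearest point of a planar homogeneous PPP, $p(\rho_{s})=2(\lambda\pi)^{s}\rho_{s}^{2s-1}e^{-\lambda\pi\rho_{s}^{2}}/\Gamma(s)$, whose moments are $\mathbb{E}(\rho_{s}^{k})=\Gamma(s+k/2)/(\Gamma(s)(\lambda\pi)^{k/2})$; this form is taken here as an assumption, since the display of (20) is not reproduced in this text. The function name `nth_distance_mean_var` is hypothetical.

```python
import math

def nth_distance_mean_var(s: int, lam: float):
    """Exact mean and variance of the distance to the s-th closest AP
    of a planar homogeneous PPP with density lam."""
    # Ratio Gamma(s + 1/2) / Gamma(s), computed in the log domain for stability.
    ratio = math.exp(math.lgamma(s + 0.5) - math.lgamma(s))
    mean = ratio / math.sqrt(lam * math.pi)
    var = s / (lam * math.pi) - mean ** 2  # uses E(rho^2) = s / (lam * pi)
    return mean, var

lam = 1.0
for s in (1, 10, 100, 1000):
    mean, var = nth_distance_mean_var(s, lam)
    # Compare with the asymptotic mean sqrt(s / (pi lam)) and variance 1 / (4 pi lam).
    print(s, mean, math.sqrt(s / (math.pi * lam)), var, 1.0 / (4.0 * math.pi * lam))
```

The mean grows as $\sqrt{N_{a}}$ while the variance stays bounded, so the relative spread of $\rho_{N_{a}}$ shrinks with $N_{a}$, which is what motivates the Dirac delta approximation. For $s=1$ and $\lambda=1$ the mean equals $1/2$, matching the average distance from the closest AP, $1/(2\sqrt{\lambda})$, used in Sec. III-C.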

Appendix C Proof of Proposition 4

Since $\widehat{\mathbf{1}^{T}\mathbf{h}_{\mathcal{C}}}=\mathbf{1}^{T}\hat{\mathbf{h}}_{\mathcal{C}}$ , it holds

[TABLE]

where $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}-\mathbf{\hat{h}}_{\mathcal{C}}}\triangleq\mathbb{E}\left((\mathbf{h}_{\mathcal{C}}-\mathbf{\hat{h}}_{\mathcal{C}})(\mathbf{h}_{\mathcal{C}}-\mathbf{\hat{h}}_{\mathcal{C}})^{H}\right)$ is the covariance matrix of the zero mean vector $\mathbf{h}_{\mathcal{C}}-\mathbf{\hat{h}}_{\mathcal{C}}$ . With $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}}$ , $\mathbf{R}_{\tilde{\mathbf{w}}_{p}}$ as defined in Appendix A, $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}-\mathbf{\hat{h}}_{\mathcal{C}}}$ equals [26]

[TABLE]

where the second equality follows from (18) and (19). Substituting (23) into (22) gives $\sigma_{e}^{2}$ , which, as expected, is a function of the AP cluster pilot sequences contained in $\mathbf{P}_{\mathcal{C}}$ . However, a simpler approximate expression, independent of $\mathbf{P}_{\mathcal{C}}$ can be obtained as follows. With increasing $N_{p}$ , $\mathbf{p}_{\mathbf{x}}^{H}\mathbf{p}_{\mathbf{x}^{\prime}}\rightarrow 0$ , for all $\mathbf{x}\neq\mathbf{x}^{\prime}\in\mathcal{C},$ suggesting that $\mathbf{P}_{\mathcal{C}}^{H}\mathbf{P}_{\mathcal{C}}$ is approximately diagonal for large $N_{p}$ . This, in turn, implies that $\mathbf{R}_{\mathbf{h}_{\mathcal{C}}-\mathbf{\hat{h}}_{\mathcal{C}}}$ is approximately diagonal for large $N_{p}$ and taking this into account in (22) results in

[TABLE]

For $N_{a},N_{p}\rightarrow\infty$ , with the ratio $N_{a}/N_{p}$ constant, this expression converges to the limit given in (11) for any realization of the AP pilot sequences [30, Eq. (1.16)].
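The approximate orthogonality of random pilot sequences invoked in this proof can be illustrated with a small simulation. The sketch below rests on assumptions not stated in this appendix: pilot symbols are drawn as i.i.d. unit-modulus values with uniformly random phase, and the inner product is normalized by $N_{p}$ so that $\mathbf{p}_{\mathbf{x}}^{H}\mathbf{p}_{\mathbf{x}}/N_{p}=1$. The function name `mean_normalized_crosscorr` is hypothetical.

```python
import cmath
import math
import random

def mean_normalized_crosscorr(n_p: int, trials: int = 200, seed: int = 0) -> float:
    """Average of |p^H q| / n_p over independent pairs of random pilot sequences."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Assumed pilot model: unit-modulus symbols with uniform random phase.
        p = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_p)]
        q = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_p)]
        inner = sum(a.conjugate() * b for a, b in zip(p, q))
        total += abs(inner) / n_p
    return total / trials

for n_p in (16, 64, 256, 1024):
    print(n_p, mean_normalized_crosscorr(n_p))
```

Under this pilot model the normalized cross-correlation decays roughly as $1/\sqrt{N_{p}}$, which is the sense in which $\mathbf{P}_{\mathcal{C}}^{H}\mathbf{P}_{\mathcal{C}}$ becomes approximately diagonal for long pilot sequences.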

Appendix D Proof of Proposition 5

Treating $N_{p}$ as a real number, its minimum required value is obtained by solving the non-linear equation $\mathsf{SNR}=\gamma\sigma_{\mathcal{C}}^{2}/\sigma_{w}^{2}$ w.r.t. $N_{p}$ . Assume that the minimum solution $N_{p}^{*}$ of this equation is much greater than $1$ . Then, this $N_{p}^{*}$ can be obtained approximately by replacing $\mathsf{SNR}$ in the above equation with its expression given in (8) using the large- $N_{p}$ approximation of $\sigma_{e}^{2}$ given in (11). Solving the resulting equation w.r.t. $N_{p}$ gives the solution of (13). However, the formula of (13) should be used with caution since, for a given set of system parameters appearing in the right hand side of (13), it may happen that the approximate $N_{p}^{*}$ is small or even negative, which contradicts the assumption of large $N_{p}^{*}$ based on which (13) was derived. Therefore, conditions resulting in a large approximate $N_{p}^{*}$ as per (13) that is a good approximation of the true (and large) $N_{p}^{*}$ must be found.

One approach to identify such conditions is to obtain a lower bound for the right-hand side of (13) and identify the conditions which guarantee that this bound is large. The resulting conditions will clearly be sufficient although they may not be necessary. Let $g$ denote the right-hand side expression of (13). It is easy to see that $g$ is a convex function of $\sigma_{w}^{2}$ with $\min_{\sigma_{w}^{2}}g=\frac{\gamma N_{a}\sigma_{\Phi}^{2}}{(1-\gamma)\sigma_{\mathcal{C}}^{2}}$ , achieved at $\sigma_{w}^{2}=\sigma_{\mathcal{C}}(\sigma_{\Phi}-\sigma_{\mathcal{C}})$ . Noting that it holds $\sigma_{\mathcal{C}}^{2}\leq\sigma_{\Phi}^{2}$ , it follows that $\min_{\sigma_{\mathcal{C}}^{2},\sigma_{w}^{2}}g=\frac{\gamma N_{a}}{(1-\gamma)}$ resulting in the sufficient condition of (12).

Appendix E Proof of Proposition 7

By setting $\sigma_{w}^{2}=0$ in (8) and using (11), it holds

[TABLE]

where $(a)$ follows by noting that $b/f_{a}(b)=b+a(1-b)+\mathcal{O}(a^{2}),a\rightarrow 0$ . For $N_{a}\gg 1$ , the approximation of (10) for $\sigma_{\mathcal{C}}^{2}$ can be employed in (24) resulting in

[TABLE]

It is easy to see that for $\alpha>4$, $\mathsf{SNR}$ is increasing with increasing $N_{a}$ (assuming always that $N_{p}\gg N_{a}$); therefore, an arbitrarily large $N_{a}$ is $\mathsf{SNR}$-optimal in this regime, whereas $\mathsf{SNR}$ is decreasing with increasing $N_{a}$ for $\alpha\leq 4$, thus the $\mathsf{SNR}$-optimal $N_{a}$ must be finite in this regime.

Bibliography

[1] N. Bhushan et al., “Network densification: The dominant theme for wireless evolution into 5G,” IEEE Commun. Mag., vol. 52, no. 2, pp. 82–89, Feb. 2014.
[2] J. Li, A. Papadogiannis, R. Apelfröjd, T. Svensson, and M. Sternad, “Performance evaluation of coordinated multi-point transmission schemes with predicted CSI,” in IEEE Intl. Symposium on Personal Indoor and Mobile Radio Commun. (PIMRC), 2012, pp. 1055–1060.
[3] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?,” IEEE Trans. Inf. Theory, vol. 49, no. 4, pp. 951–963, Apr. 2003.
[4] T. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Sep. 2010.
[5] D. Gesbert, S. Hanly, H. Huang, S. S. Shitz, O. Simeone, and W. Yu, “Multi-cell MIMO cooperative networks: A new look at interference,” IEEE J. Sel. Areas Commun., vol. 28, no. 9, pp. 1380–1408, Dec. 2010.
[6] H. El Sawy, A. Sultan-Salem, M. S. Alouini, and M. Z. Win, “Modeling and analysis of cellular networks using stochastic geometry: A tutorial,” IEEE Commun. Surveys and Tutorials, vol. 19, no. 1, pp. 167–203, First quarter 2017.
[7] J. G. Andrews, F. Baccelli, and R. K. Ganti, “A tractable approach to coverage and rate in cellular networks,” IEEE Trans. Commun., vol. 59, no. 11, pp. 3122–3134, Nov. 2011.
[8] H. El Sawy and E. Hossain, “On stochastic geometry modeling of cellular uplink transmission with truncated channel inversion power control,” IEEE Trans. Wireless Commun., vol. 13, no. 8, pp. 4454–4469, Aug. 2014.
