Periodic Analog Channel Estimation Aided Beamforming for Massive MIMO Systems

Vishnu V. Ratnam; Andreas F. Molisch

arXiv:1901.04078·cs.IT·January 15, 2019

TL;DR

This paper introduces PACE, a novel analog channel estimation technique for massive MIMO systems that reduces overhead by avoiding digital processing, enabling efficient beamforming especially in sparse, wide-band channels.

Contribution

The paper proposes a new periodic analog channel estimation method (PACE) that operates with analog hardware, significantly lowering estimation overhead compared to digital methods.

Findings

01. PACE achieves comparable beamforming gain with lower overhead.

02. In sparse channels, PACE performs well above certain SNR levels.

03. The method is effective during initial access phases.

Tables

Table I: One PLL and weighted arraying simulation parameters

Parameter	Value	Parameter	Value
$f_{\rm c}$	$30$ GHz	$\epsilon$	$4/T_{\rm s}$
$f_{\rm c}-f_{\rm vco}$	$5$ MHz	$f_{\rm IF}$	$1$ GHz
$T_{\rm s}$	$1\,\mu$s	$f_{\rm c}-f_{\rm IF}-f^{\rm p}_{\rm vco}$	$5$ MHz
$T_{\rm cp}$	$0.1\,\mu$s	$\mathcal{M}$	$\{1, 5, 15\}$
$K_{1}$	$512$	$\mu$	$2\pi/T_{\rm s}$
$K_{2}$	$511$	$G^{\rm p}{|A_{\rm rss}^{(\rm r)}|}^{2}/\mu$	$\pi|f_{\rm c}-f_{\rm IF}-f^{\rm p}_{\rm vco}|$
$G|A_{1}^{(\rm r)}|$	$\pi|f_{\rm c}-f_{\rm vco}|$	$\epsilon^{\rm p}$	$4/T_{\rm s}$

Equations

$$\tilde{\mathbf{s}}^{(\rm r)}_{\rm tx}(t)$$

$$\tilde{\mathbf{s}}^{(\rm d)}_{\rm tx}(t)$$

$$\mathbf{H}(t)=\sum_{\ell=0}^{L-1}\alpha_{\ell}\,\mathbf{a}_{\rm rx}(\ell)\,\mathbf{a}_{\rm tx}(\ell)^{\dagger}\,\delta(t-\tau_{\ell}),$$

$$\bar{\mathbf{a}}_{\rm rx}(\psi^{\rm rx}_{\rm azi},\psi^{\rm rx}_{\rm ele})\triangleq\left[\begin{array}{c}1\\ e^{{\rm j}2\pi\frac{\Delta_{\rm H}\sin(\psi^{\rm rx}_{\rm azi})\sin(\psi^{\rm rx}_{\rm ele})}{\lambda}}\\ \vdots\\ e^{{\rm j}2\pi\frac{\Delta_{\rm H}(M_{\rm H}-1)\sin(\psi^{\rm rx}_{\rm azi})\sin(\psi^{\rm rx}_{\rm ele})}{\lambda}}\end{array}\right]\otimes\left[\begin{array}{c}1\\ e^{{\rm j}2\pi\frac{\Delta_{\rm V}\cos(\psi^{\rm rx}_{\rm ele})}{\lambda}}\\ \vdots\\ e^{{\rm j}2\pi\frac{\Delta_{\rm V}(M_{\rm V}-1)\cos(\psi^{\rm rx}_{\rm ele})}{\lambda}}\end{array}\right],$$

$$\tilde{\mathbf{s}}^{(\cdot)}_{\rm rx}(t)=\sum_{\ell=0}^{L-1}\alpha_{\ell}\,\mathbf{a}_{\rm rx}(\ell)\,\mathbf{a}_{\rm tx}(\ell)^{\dagger}\,\tilde{\mathbf{s}}^{(\cdot)}_{\rm tx}(t-\tau_{\ell})+\sqrt{2}\,\tilde{\mathbf{w}}^{(\cdot)}(t)\,e^{{\rm j}2\pi f_{\rm c}t}$$

$$\tilde{s}^{(\rm r)}_{{\rm rx},m}(t)$$

$$s_{\rm PLL}(t)=s_{\rm vco}(t)=\sqrt{2}\cos\big[2\pi f_{\rm c}t+\bar{\theta}+\theta(t)\big]$$

$$2\pi f_{\rm c}+\frac{{\rm d}\theta(t)}{{\rm d}t}=\mathrm{LF}\Big\{\mathrm{Re}\{\tilde{s}_{{\rm rx},1}(t)\}\sqrt{2}\cos\big[2\pi f_{\rm c}t+\bar{\theta}+\theta(t)\big]\Big\}G+2\pi f_{\rm vco}=\mathrm{LF}\Big\{\mathrm{Re}\big[A_{1}^{(\rm r)}e^{-{\rm j}[\bar{\theta}+\theta(t)]}+\tilde{w}^{(\rm r)}_{1}(t)e^{-{\rm j}[\bar{\theta}+\theta(t)]}\big]\Big\}G+2\pi f_{\rm vco}$$

$$\frac{{\rm d}\theta_{\rm L}(t)}{{\rm d}t}$$

$$s\Theta_{\rm L}(s)$$

$$\mathcal{S}_{\theta_{\rm L}}(f)=\mathbb{E}\big|\Theta_{\rm L}({\rm j}2\pi f)\big|^{2}=\frac{{|G|}^{2}(4\pi^{2}f^{2}+\epsilon^{2})\mathcal{S}_{\rm w}(f)}{2{\big|-4\pi^{2}f^{2}+G({\rm j}2\pi f+\epsilon)|A_{1}^{(\rm r)}|\big|}^{2}}$$

$$R_{\theta_{\rm L}}(t)=\int_{-\infty}^{\infty}\mathcal{S}_{\theta_{\rm L}}(f)\,e^{{\rm j}2\pi ft}\,{\rm d}f\approx\frac{{|G|}^{2}\mathrm{N}_{0}}{4}\Big[\frac{a^{2}-\epsilon^{2}}{a(a^{2}-b^{2})}e^{-a|t|}+\frac{b^{2}-\epsilon^{2}}{b(b^{2}-a^{2})}e^{-b|t|}\Big]$$

$$\mathrm{Var}\{\theta_{\rm L}(t)\}=R_{\theta_{\rm L}}(0)\leq\mathrm{N}_{0}\frac{|A_{1}^{(\rm r)}|G+\epsilon}{4{|A_{1}^{(\rm r)}|}^{2}},$$

$$\tilde{s}_{\rm PLL}(t)=\sqrt{2}\,e^{{\rm j}[2\pi f_{\rm c}t+\bar{\theta}+\theta(t)]}.$$

$$\mathbf{I}_{\rm PACE}\approx\frac{1}{D_{2}}\int_{T_{1}}^{T_{2}}\mathrm{Re}\{\tilde{\mathbf{s}}^{(\rm r)}_{\rm rx}(t)\}\,\tilde{s}^{*}_{\rm PLL}(t)\,{\rm d}t=\frac{1}{D_{2}}\int_{T_{1}}^{T_{2}}\Big[\sqrt{\frac{1}{T_{\rm cs}}}\,\hat{\mathbf{H}}(0)\,\mathbf{t}\sqrt{E^{(\rm r)}}\,e^{-{\rm j}[\bar{\theta}+\theta(t)]}+\hat{\mathbf{w}}^{(\rm r)}(t)\Big]{\rm d}t,$$

$$s^{\rm p}_{\rm vco}(t)$$

$$s^{\rm s}_{{\rm vco},m}(t)$$

$$\frac{{\rm d}\phi_{m}(t)}{{\rm d}t}$$

$$2\pi(f_{\rm c}-f_{\rm IF})+\frac{{\rm d}\theta(t)}{{\rm d}t}$$

$$s\Phi^{\rm L}_{m}(s)=\Big(-|A_{m}^{(\rm r)}|\big[\Phi^{\rm L}_{m}(s)+\Theta_{\rm L}(s)\big]+\frac{\hat{\mathcal{W}}_{m}^{(\rm r)}(s)+{[\hat{\mathcal{W}}_{m}^{(\rm r)}(s^{*})]}^{*}}{2}\Big)\frac{G^{\rm s}_{m}}{\sqrt{2}}$$

$$s\Theta_{\rm L}(s)=\mathrm{LF}(s)\sum_{m\in\mathcal{M}}\Big[-\frac{|A_{m}^{(\rm r)}|}{G_{m}^{\rm s}}\big[\Phi^{\rm L}_{m}(s)+\Theta_{\rm L}(s)\big]+\frac{\hat{\mathcal{W}}_{m}^{(\rm r)}(s)+{[\hat{\mathcal{W}}_{m}^{(\rm r)}(s^{*})]}^{*}}{2G_{m}^{\rm s}}\Big]\frac{G^{\rm p}}{\sqrt{2}}+\frac{2\pi(f_{\rm IF}+f^{\rm p}_{\rm vco}-f_{\rm c})}{s}$$

$$\bigg[s+\sum_{m\in\mathcal{M}}\frac{(s+\epsilon^{\rm p}){|A_{m}^{(\rm r)}|}^{2}G^{\rm p}}{\mu(\sqrt{2}s+\mu)}\bigg]\Theta_{\rm L}(s)=\sum_{m\in\mathcal{M}}\frac{(s+\epsilon^{\rm p})\big(\hat{\mathcal{W}}_{m}^{(\rm r)}(s)+{[\hat{\mathcal{W}}_{m}^{(\rm r)}(s^{*})]}^{*}\big)|A_{m}^{(\rm r)}|G^{\rm p}}{2\mu(\sqrt{2}s+\mu)}+\frac{2\pi(f_{\rm IF}+f^{\rm p}_{\rm vco}-f_{\rm c})}{s}$$

$$\mathcal{S}_{\theta_{\rm L}-\bar{\theta}_{\rm L}}(f)=\frac{\mathrm{N}_{0}{|A_{\rm rss}^{(\rm r)}G^{\rm p}|}^{2}}{2}{\bigg|\frac{s+\epsilon^{\rm p}}{s\mu(\sqrt{2}s+\mu)+(s+\epsilon^{\rm p}){|A_{\rm rss}^{(\rm r)}|}^{2}G^{\rm p}}\bigg|}^{2}_{s={\rm j}2\pi f}$$

$$\mathrm{Var}\{\theta_{\rm L}(t)\}=\frac{\big({|A_{\rm rss}^{(\rm r)}|}^{2}[G^{\rm p}/\mu]+\sqrt{2}\epsilon^{\rm p}\big)[G^{\rm p}/\mu]\mathrm{N}_{0}}{4\sqrt{2}\big(\mu+{|A_{\rm rss}^{(\rm r)}|}^{2}[G^{\rm p}/\mu]\big)}\leq\frac{\big({|A_{\rm rss}^{(\rm r)}|}^{2}[G^{\rm p}/\sqrt{2}\mu]+\epsilon^{\rm p}\big)\mathrm{N}_{0}}{4{|A_{\rm rss}^{(\rm r)}|}^{2}},$$

$$R(t)=\frac{1}{\sqrt{2}}\mathbf{I}_{\rm PACE}^{\dagger}\bigg[\sum_{\ell=0}^{L-1}\sum_{k\in\mathcal{K}}\sqrt{\frac{2}{T_{\rm cs}}}\alpha_{\ell}\mathbf{a}_{\rm rx}(\ell){\mathbf{a}_{\rm tx}(\ell)}^{\dagger}\mathbf{t}x_{k}e^{{\rm j}2\pi(f_{\rm c}+f_{k})(t-\tau_{\ell})}+\sqrt{2}\tilde{\mathbf{w}}^{(\rm d)}(t)e^{{\rm j}2\pi f_{\rm c}t}\bigg]$$

$$Y_{k}$$

Full text

Vishnu V. Ratnam and Andreas F. Molisch

V. V. Ratnam and A. F. Molisch are with the Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA, 90089 USA (e-mail: {ratnam, molisch}@usc.edu). This work was supported by the National Science Foundation under project CIF-1618078.

Abstract

Analog beamforming is an attractive and cost-effective solution to exploit the benefits of massive multiple-input-multiple-output systems, by requiring only one up/down-conversion chain. However, the presence of only one chain imposes a significant overhead in estimating the channel state information required for beamforming, when conventional digital channel estimation (CE) approaches are used. As an alternative, this paper proposes a novel CE technique, called periodic analog CE (PACE), that can be performed by analog hardware. By avoiding digital processing, the estimation overhead is significantly lowered and does not scale with number of antennas. PACE involves periodic transmission of a sinusoidal reference signal by the transmitter, estimation of its amplitude and phase at each receive antenna via analog hardware, and using these estimates for beamforming. To enable such non-trivial operation, two reference tone recovery techniques and a novel receiver architecture for PACE are proposed and analyzed, both theoretically and via simulations. Results suggest that in sparse, wide-band channels and above a certain signal-to-noise ratio, PACE aided beamforming suffers only a small loss in beamforming gain and enjoys a much lower CE overhead, in comparison to conventional approaches. Benefits of using PACE aided beamforming during the initial access phase are also discussed.

Index Terms:

Hybrid beamforming, analog beamforming, massive MIMO, channel estimation, analog channel estimation, initial access, carrier recovery, carrier arraying.

I Introduction

Massive multiple-input-multiple-output (MIMO) systems, enabled by using antenna arrays with many elements at the transmitter (TX) and/or receiver (RX), promise large beamforming gains and improved spectral efficiency, and are therefore a key focus area for 5G systems research and development [1, 2]. Such massive antenna arrays, while also beneficial at sub-$6$ GHz frequencies, are essential at the higher millimeter-wave (mm-wave) frequencies to compensate for the large channel attenuation. However, despite their numerous benefits, full-complexity massive MIMO architectures suffer from increased hardware cost and energy consumption. This is because, though the antenna elements are affordable, the corresponding up/down-conversion chains (which include circuit components such as analog-to-digital converters and digital-to-analog converters) are both expensive and power hungry [3]. A popular solution to reduce this implementation cost is hybrid beamforming [4, 5], where the large antenna array is connected to a small number of up/down-conversion chains via power-efficient and cost-effective analog hardware, such as phase-shifters. By using such analog hardware to focus power into the dominant channel directions, hybrid beamforming exploits the directional nature of wireless channels to minimize loss in system performance. In this paper, we focus on a special case of hybrid beamforming with one up/down-conversion chain (for the in-phase and quadrature-phase components each), referred to as analog beamforming.

A major challenge with analog beamforming (and also hybrid beamforming in general) is the acquisition of the channel state information (CSI) required for beamforming at the TX and RX. In narrow-band (i.e., frequency-flat fading) systems [6, 7, 8, 9, 10, 11, 12], the required CSI usually involves instantaneous channel parameters (iCSI), while in wide-band systems [13, 14, 15, 16, 17, 18] average channel parameters (aCSI) are used for designing the analog beamformer. Here aCSI refers to channel parameters that remain constant over a wide time-frequency range, such as the spatial correlation matrices, while iCSI are parameters that change faster. In either scenario, the required CSI can be obtained by transmitting known signals (pilots) and performing channel estimation (CE) at the RX within each CSI coherence time, i.e., the period over which the CSI remains constant. (Footnote 1: Required CSI at the TX is obtained either via CE on the reverse link, or via CSI feedback from the RX.) Since all RX antennas share one down-conversion chain, multiple temporal pilot transmissions are required for performing such CE [19, 20, 21, 16]. As an illustration, exhaustive CE approaches [20] require ${\rm O}(M_{\rm tx}M_{\rm rx})$ pilots, where $M_{\rm tx},M_{\rm rx}$ are the number of TX and RX antennas, respectively, and ${\rm O}(\cdot)$ represents the scaling behavior in big-O notation. Such a large pilot overhead may consume a significant portion of the time-frequency resources when the CSI coherence time is short, such as in vehicle-to-vehicle channels, in systems using narrow TX/RX beams, e.g., massive MIMO, or in channels with large carrier frequencies and high blocking probabilities, e.g., at mm-wave frequencies [22]. The overhead also increases system latency and makes the initial access (IA) procedure very cumbersome [23, 24, 25]. (Footnote 2: Initial access refers to the phase wherein a user equipment and base station discover each other, synchronize, and coordinate to initiate communication.)
Several fast CE approaches have therefore been suggested to reduce the pilot overhead, which are discussed below assuming $M_{\rm tx}=1$ for convenience. (Footnote 3: For $M_{\rm tx}>1$, the pilot overhead increases further, either multiplicatively or additively, by a function of $M_{\rm tx}$ determined by the CE algorithm used at the TX.) Side information aided narrow-band CE approaches utilize channel statistics and temporal correlation to reduce the iCSI pilot overhead [26, 27, 12, 21]. Compressed sensing based approaches [28, 19, 29, 30] exploit the sparse nature of the massive MIMO channels to reduce the number of pilots to ${\rm O}[L\log(M_{\rm rx}/L)]$ per CSI coherence time, where $L$ is the channel sparsity level. Iterative angular domain CE performs beam sweeping at the RX with progressively narrower search beams to find a good beam direction with ${\rm O}(\log M_{\rm rx})$ pilots [31, 32, 25]. Approaches that utilize side information to improve iterative angular domain CE [33, 34] or perform angle domain tracking [35, 36] have also been considered. Sparse ruler based approaches exploit the possible Toeplitz structure of the spatial correlation matrix to reduce pilots to ${\rm O}(\sqrt{M_{\rm rx}})$ per CSI coherence time [37, 38, 39, 40, 16]. Since the overhead still scales with $M_{\rm rx}$, these approaches are only partially successful in reducing the pilot overhead. Furthermore, some of these CE approaches may not be applicable for IA since they would require the timing and frequency synchronization [41, 42] to be performed without the TX/RX beamforming gain, which may be difficult at the low signal-to-noise ratio (SNR) and high phase noise (i.e., random fluctuations of the instantaneous oscillator frequency) levels expected in mm-wave systems. Some of these CE approaches also require the channel to remain static during the re-transmissions and are only applicable for certain antenna configurations and/or channel models.
Finally, to reduce the impact of the transient effects of analog hardware on CE [43], the multiple pilots may have to be spaced sufficiently far apart [44], thus potentially increasing the latency.
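To make the scalings quoted above concrete, the following Python sketch evaluates each order expression for one example array size. The constants hidden by the big-O notation are all set to one, and the values `m_rx = 256` and `sparsity = 4` are illustrative assumptions rather than figures from the paper, so the output indicates growth trends only and not actual pilot counts.

```python
import math

def pilot_scalings(m_rx: int, sparsity: int, m_tx: int = 1) -> dict:
    """Evaluate the quoted order expressions with every hidden constant set to 1."""
    return {
        "exhaustive": m_tx * m_rx,                                    # O(M_tx * M_rx)
        "compressed_sensing": sparsity * math.log(m_rx / sparsity),   # O(L log(M_rx / L))
        "iterative_angular": math.log(m_rx),                          # O(log M_rx)
        "sparse_ruler": math.sqrt(m_rx),                              # O(sqrt(M_rx))
    }

if __name__ == "__main__":
    # Assumed example: 256 RX antennas, 4 resolvable paths, single TX antenna.
    for name, value in pilot_scalings(256, 4).items():
        print(f"{name:>20s}: {value:8.1f}")
```

All four expressions grow with `m_rx`, which is the limitation the paragraph above points out.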

The main reason for the pilot overhead is that conventional CE approaches require processing in the digital domain, and thus have to time-share the down-conversion chain across the antennas. Inspired by ultra-wideband transmit reference schemes [45, 46, 47] and legacy adaptive antenna array techniques [48, 49, 50, 51], our recent conference papers [52, 53] explore a novel approach that enables CE without digital processing. In this approach, the TX transmits a reference sinusoidal tone simultaneously with the data. The received reference signals (including both amplitude and phase) are then recovered at each RX antenna via analog hardware and are utilized as a homodyne combining filter for the data. In essence, [52, 53] show that a maximal ratio combining (MRC) beamformer built for a reference frequency also provides a good, albeit sub-optimal, beamforming gain at other frequencies in a sparse scattering, wide-band channel. This is because, although they experience frequency selective fading, such channels exhibit a strong coupling across frequency. Since recovering a reference sinusoidal signal, or equivalently estimating its amplitude and phase, is significantly simpler than conventional CE, it can be performed at each RX antenna by analog hardware such as phase locked loops. Thus, by avoiding digital CE, this scheme allows RX beamforming without pilot re-transmissions. We shall henceforth refer to this type of amplitude and phase estimation as analog channel estimation (ACE). Note that due to the limited capabilities of analog hardware and the low SNR before beamforming, performing ACE and exploring new ACE techniques is non-trivial. In the original design in [52], the reference has to be transmitted continuously to enable its recovery at the RX. While this design reduces the estimation overhead and avoids phase-shifters, it requires $M_{\rm rx}$ carrier recovery circuits, which may add to the cost and power consumption of the RX.
Furthermore, the continuous recovery of the reference tone is excessive, and may waste some transmit power and spectral efficiency. In [53], a non-coherent variant of [52] is explored that avoids recovery circuits, but at the expense of a $50\%$ reduction in bandwidth efficiency. The current paper therefore proposes a different ACE scheme, referred to as periodic ACE (PACE), where the reference is transmitted judiciously, and its amplitude and phase are explicitly estimated to drive an RX phase shifter array. Unlike [52], PACE requires one carrier recovery circuit and $M_{\rm rx}$ phase shifters (see Fig. 1) and can support both homodyne and heterodyne reception.
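The observation that an MRC beamformer built at a reference frequency still collects useful gain at other frequencies in a sparse channel can be illustrated numerically. The sketch below is not the paper's simulation: it assumes a half-wavelength uniform linear array, three paths with randomly drawn gains, angles and delays, frequency-flat array responses, and a 1 MHz sub-carrier spacing. All function and variable names are hypothetical.

```python
import numpy as np

def ula_response(m_rx: int, angle: float) -> np.ndarray:
    """Array response of a half-wavelength-spaced uniform linear array (assumed geometry)."""
    return np.exp(1j * np.pi * np.arange(m_rx) * np.sin(angle))

def channel_at(freq_offsets, gains, angles, delays, m_rx, f_c):
    """h(f) = sum_l gain_l * a_rx(l) * exp(-j 2 pi (f_c + f) tau_l), one row per offset."""
    steering = np.stack([ula_response(m_rx, a) for a in angles], axis=1)  # (m_rx, L)
    phases = np.exp(-2j * np.pi * np.outer(f_c + freq_offsets, delays))   # (K, L)
    return (phases * gains) @ steering.T                                   # (K, m_rx)

rng = np.random.default_rng(0)
m_rx, n_paths, f_c = 64, 3, 30e9            # illustrative sizes; 30 GHz carrier
gains = (rng.standard_normal(n_paths) + 1j * rng.standard_normal(n_paths)) / np.sqrt(2)
angles = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
delays = rng.uniform(0.0, 100e-9, n_paths)  # all delays below 0.1 microseconds
freq_offsets = np.arange(-512, 512) * 1e6   # 1 MHz sub-carrier spacing (assumed)

h = channel_at(freq_offsets, gains, angles, delays, m_rx, f_c)
h_ref = channel_at(np.array([0.0]), gains, angles, delays, m_rx, f_c)[0]
w = h_ref / np.linalg.norm(h_ref)            # MRC combiner matched at the reference tone

gain_ref_mrc = np.abs(h @ w.conj()) ** 2     # |w^dagger h(f)|^2 on every sub-carrier
gain_optimal = np.sum(np.abs(h) ** 2, axis=1)  # per-sub-carrier MRC upper bound
print("mean gain ratio:", gain_ref_mrc.mean() / gain_optimal.mean())
```

The printed ratio is the average gain of the single reference-frequency combiner relative to an ideal per-sub-carrier MRC; it cannot exceed one, and how far it falls below one depends on the drawn path powers and delays.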

In PACE, the TX transmits a reference tone at a known frequency during each periodic RX beamformer update phase. One carrier recovery circuit, involving phase-locked loops (PLLs), is used to recover the reference tone from one or more antennas, as shown in Fig. 1. This recovered reference tone, and its quadrature component, are then used to estimate the phase offset and amplitude of the received reference tone at each RX antenna, via a bank of ‘filter, sample and hold’ circuits (represented as integrators in Fig. 1). As shall be shown, these estimates are proportional to the channel response at the reference frequency. These estimates are used to control an array of variable gain phase-shifters, which generate the RX analog beam. During the data transmission phase, the wide-band received data signals pass through these phase-shifters, and are summed and processed as in conventional analog beamforming. As the phase and amplitude estimation is done in the analog domain, ${\rm O}(1)$ pilots are sufficient to update the RX beamformer. Additionally, the power from multiple multi-path components (MPCs) of the channel is accumulated by this approach, increasing the system diversity against MPC blocking. Furthermore, the same variable gain phase-shifts can also be used for transmit beamforming on the reverse link. Finally, by providing an option for digitally controlling the inputs to the phase-shifters, the proposed architecture can also support conventional beamforming approaches.
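The estimation step described above can be mimicked in a discrete-time baseband sketch. It assumes an ideally locked recovered tone with a constant, unknown phase offset `theta_bar` that is common to all antennas, white Gaussian noise, and an averaging operation standing in for the analog integrators. The amplitudes `a_ref` and all other names are hypothetical, and PLL phase noise is ignored.

```python
import numpy as np

rng = np.random.default_rng(1)
m_rx, n_samples = 32, 4000
noise_std = 1.0        # per-sample noise level (assumed)
theta_bar = 0.7        # unknown but common phase offset of the recovered tone (assumed)

# Hypothetical per-antenna reference-tone amplitudes (channel at the reference frequency).
a_ref = (rng.standard_normal(m_rx) + 1j * rng.standard_normal(m_rx)) / np.sqrt(2)

# Received reference tone at complex baseband: constant amplitude plus white noise.
noise = noise_std * (rng.standard_normal((m_rx, n_samples))
                     + 1j * rng.standard_normal((m_rx, n_samples))) / np.sqrt(2)
received = a_ref[:, None] + noise

# Mix with the recovered tone and integrate (here: average) at all antennas in parallel.
i_pace = np.mean(received * np.exp(-1j * theta_bar), axis=1)

# Use the estimates as combining weights; the common phase drops out of the gain.
gain = np.abs(np.vdot(i_pace, a_ref)) ** 2 / np.linalg.norm(i_pace) ** 2
print("fraction of ideal MRC gain:", gain / np.linalg.norm(a_ref) ** 2)
```

Because every antenna is processed simultaneously, the number of reference transmissions in this sketch does not depend on `m_rx`, which mirrors the ${\rm O}(1)$ pilot argument.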

On the flip side, PACE requires some additional analog hardware components, such as mixers and filters, in comparison to conventional digital CE. Additionally, the accumulation of power from multiple MPCs may cause frequency selective fading in a wide-band scenario, which can degrade performance. Finally, the proposed approach in its current form does not support reception of multiple spatial data streams and can only be used for beamforming at one end of a communication link. This architecture is therefore more suitable for use at the user equipments (UEs). The possible extensions to multiple spatial stream reception shall be explored in future work. While the proposed architecture is also applicable in narrow-band scenarios, in this paper we shall focus on the analysis of a wide-band scenario where the repetition interval of PACE and beamformer update is of the order of the aCSI coherence time, i.e., the time over which the aCSI stays approximately constant (also called stationarity time in some literature).

The contributions of this paper are as follows:

1. We propose a novel transmission technique, namely PACE, and a corresponding RX architecture that enable RX analog beamforming with low CE overhead.

2. To enable the RX operation, we also explore two novel reference recovery circuits. These circuits are non-linear, making their analysis non-trivial. We provide an approximate analysis of their phase-noise and the resulting performance that is tight in the high SNR regime.

3. We analytically characterize the achievable system throughput with PACE aided beamforming in a wide-band channel.

4. Simulations with practically relevant channel models are used to support the analytical results and compare performance to existing schemes.

The organization of the paper is as follows: the system model is presented in Section II; two designs for PACE and their respective noise analyses are presented in Section III; the system performance with PACE aided beamforming is characterized in Section IV; the advantages of PACE for transmit beamforming and during the IA phase are discussed in Section V; simulation results are presented in Section VI; and finally, conclusions are drawn in Section VII.

Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, ${\rm j}=\sqrt{-1}$ , $a^{*}$ is the complex conjugate of a complex scalar $a$ , $|\mathbf{a}|$ represents the $\ell_{2}$ -norm of a vector $\mathbf{a}$ and ${\mathbf{A}}^{{\dagger}}$ is the conjugate transpose of a complex matrix $\mathbf{A}$ . Finally, $\mathbb{E}\{\}$ represents the expectation operator, $\otimes$ represents the Kronecker product, $\stackrel{{\scriptstyle\rm d}}{{=}}$ represents equality in distribution, $\mathrm{Re}\{\cdot\}$ / $\mathrm{Im}\{\cdot\}$ refer to the real/imaginary component, respectively, $\mathcal{CN}(\mathbf{a},\mathbf{B})$ represents a circularly symmetric complex Gaussian vector with mean $\mathbf{a}$ and covariance matrix $\mathbf{B}$ , $\mathrm{Exp}\{a\}$ represents an exponential distribution with mean $a$ and $\mathrm{Uni}\{a,b\}$ represents a uniform distribution in range $[a,b]$ .

II General Assumptions and System model

We consider the downlink of a single-cell MIMO system, wherein one base station (BS) with $M_{\rm tx}$ antennas transmits to several UEs with $M_{\rm rx}$ antennas each. Since the focus is on the downlink, we shall use the abbreviations BS and TX, and likewise UE and RX, interchangeably. Each UE is assumed to have one up/down-conversion chain, while no assumptions are made regarding the BS architecture.

Here we assume the communication between the BS and UEs to involve three important phases: (i) initial access (IA), where the BS and UEs find each other, timing/frequency synchronization is attained and spectral resources are allocated; (ii) analog beamformer design, where the BS and UEs obtain the required aCSI to update the analog precoding/combining beams; and (iii) data transmission. The relative time scales of these phases are illustrated in Fig. 2. Through most of this paper (Sections II-IV), we assume that the IA and beamformer design at the BS are already achieved, and we mainly focus on the beamformer design phase at the UE and the data transmission phase. Therefore we assume perfect timing and frequency synchronization between the BS and UE, and assume that the TX beamforming has been pre-designed based on aCSI at the BS. Later in Section V, we also briefly discuss how aCSI can be acquired at the BS, how IA can be performed and how the use of PACE can be advantageous in those phases.

The BS transmits one spatial data-stream to each scheduled UE, and all such scheduled UEs are served simultaneously via spatial multiplexing. Furthermore, the data to the UEs is assumed to be transmitted via orthogonal precoding beams, such that there is no inter-user interference. (Footnote 4: This type of precoding is possible by avoiding transmission to the scatterers common to multiple scheduled UEs [27].) Under these assumptions and given transmit precoding beams and power allocation, we shall restrict the analysis to one representative UE without loss of generality. For convenience, we shall also assume the use of noise-less and perfectly linear antennas, filters, amplifiers and mixers at both the BS and UE. An analysis including the non-linear effects of these components is beyond the scope of this paper. The BS transmits orthogonal frequency division multiplexing (OFDM) symbols with $K$ sub-carriers, indexed as $\mathcal{K}=\{-K_{1},\ldots,K_{2}-1,K_{2}\}$ with $K_{1}+K_{2}+1=K$, to this representative UE. (Footnote 5: While the proposed PACE technique is also applicable to single carrier transmission, a detailed analysis of the same is beyond the scope of this paper.) The BS transmits two kinds of symbols: reference symbols and data symbols. In a reference symbol, only a reference tone, i.e., a sinusoidal signal with a pre-determined frequency known both to the BS and UE, is transmitted on the $0$-th sub-carrier, and the remaining sub-carriers are all empty. On the other hand, in a data symbol all the $K$ sub-carriers are used for data transmission. (Footnote 6: In an actual implementation the data symbols may also have null and pilot sub-carriers, but we ignore them here for simplicity.) The purpose of the reference symbols is to aid PACE and beamformer design at the RX, as shall be explained later. Since the BS can afford an accurate oscillator, we shall assume that the BS suffers negligible phase noise. The $M_{\rm tx}\times 1$ complex equivalent transmit signal for the $0$-th symbol, if it is a reference or data symbol, respectively, can then be expressed as:

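$$\tilde{\mathbf{s}}^{(\rm r)}_{\rm tx}(t)=\sqrt{\frac{2}{T_{\rm cs}}}\,\mathbf{t}\sqrt{E^{(\rm r)}}\,e^{{\rm j}2\pi f_{\rm c}t}\qquad\text{(1a)}$$

$$\tilde{\mathbf{s}}^{(\rm d)}_{\rm tx}(t)=\sqrt{\frac{2}{T_{\rm cs}}}\,\mathbf{t}\sum_{k\in\mathcal{K}}x^{(\rm d)}_{k}\,e^{{\rm j}2\pi(f_{\rm c}+f_{k})t}\qquad\text{(1b)}$$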

for $-T_{\rm cp}\leq t\leq T_{\rm s}$ , where $\mathbf{t}$ is the $M_{\rm tx}\times 1$ unit-norm TX beamforming vector for this UE with $|\mathbf{t}|=1$ , $x^{(\rm d)}_{k}$ is the data signal at the $k$ -th OFDM sub-carrier, ${\rm j}=\sqrt{-1}$ , $f_{\rm c}$ is the carrier/reference frequency, $f_{k}=k/T_{\rm s}$ represents the frequency offset of the $k$ -th sub-carrier, $T_{\rm cs}=T_{\rm cp}+T_{\rm s}$ and $T_{\rm s},T_{\rm cp}$ are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by $\mathbf{s}^{(\cdot)}_{\rm tx}(t)=\mathrm{Re}\{\tilde{\mathbf{s}}^{(\cdot)}_{\rm tx}(t)\}$ . For the data symbols, we assume the use of Gaussian signaling with $E^{(\rm d)}_{k}=\mathbb{E}\{{|x_{k}|}^{2}\}$ , for each $k\in\mathcal{K}$ . The total average transmit OFDM symbol energy (including cyclic prefix) allocated to the UE is defined as $E_{\rm cs}$ , where $E_{\rm cs}\geq E^{(\rm r)}$ and $E_{\rm cs}\geq\sum_{k\in\mathcal{K}}E^{(\rm d)}_{k}$ . For convenience we also assume that $f_{\rm c}$ is a multiple of $1/T_{\rm cs}$ , which ensures that the reference tone has the same initial phase in consecutive reference symbols.
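As a rough numerical check of the symbol definitions above, the following sketch builds the complex envelope of one reference symbol and one data symbol over $-T_{\rm cp}\leq t<T_{\rm s}$. It assumes the envelope normalization $\sqrt{2/T_{\rm cs}}$, a single TX antenna with unit beamforming weight, the reference tone on the $k=0$ sub-carrier, and a small illustrative sub-carrier set; the carrier factor $e^{{\rm j}2\pi f_{\rm c}t}$ is omitted, and the function name is hypothetical.

```python
import numpy as np

T_S, T_CP = 1e-6, 0.1e-6          # symbol and cyclic-prefix durations (Table I values)
T_CS = T_S + T_CP

def complex_envelope(x, k_idx, t):
    """Envelope sqrt(2/T_cs) * sum_k x_k exp(j 2 pi k t / T_s), carrier factor omitted."""
    tones = np.exp(2j * np.pi * np.outer(k_idx, t) / T_S)   # shape (K, len(t))
    return np.sqrt(2.0 / T_CS) * (x @ tones)

n = 1100
dt = T_CS / n
t = -T_CP + np.arange(n) * dt             # samples covering [-T_cp, T_s)
k_idx = np.arange(-8, 8)                  # small illustrative sub-carrier set

# Reference symbol: only the k = 0 sub-carrier carries the energy e_r.
e_r = 1.0
x_ref = np.where(k_idx == 0, np.sqrt(e_r), 0.0)
env_ref = complex_envelope(x_ref, k_idx, t)

# Data symbol: independent complex Gaussian symbols on every sub-carrier.
rng = np.random.default_rng(2)
x_data = (rng.standard_normal(k_idx.size) + 1j * rng.standard_normal(k_idx.size)) / np.sqrt(2)
env_data = complex_envelope(x_data, k_idx, t)

# The real passband signal carries half the envelope energy when f_c >> bandwidth.
print("reference symbol energy:", 0.5 * np.sum(np.abs(env_ref) ** 2) * dt)
```

With this normalization the reference symbol energy, cyclic prefix included, equals `e_r`, and the first $T_{\rm cp}$ of the data envelope repeats its last $T_{\rm cp}$ because every $f_{k}T_{\rm s}$ is an integer.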

The channel to the representative UE is assumed to be sparse with $L$ resolvable MPCs ( $L\ll M_{\rm tx},M_{\rm rx}$ ), and the corresponding $M_{\rm rx}\times M_{\rm tx}$ channel impulse response matrix is given as [22]:

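$$\mathbf{H}(t)=\sum_{\ell=0}^{L-1}\alpha_{\ell}\,\mathbf{a}_{\rm rx}(\ell)\,\mathbf{a}_{\rm tx}(\ell)^{\dagger}\,\delta(t-\tau_{\ell}),\qquad\text{(2)}$$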

where $\alpha_{\ell}$ is the complex amplitude and $\tau_{\ell}$ is the delay and $\mathbf{a}_{\rm tx}(\ell),\mathbf{a}_{\rm rx}(\ell)$ are the TX and RX array response vectors, respectively, of the $\ell$ -th MPC. As an illustration, the $\ell$ -th RX array response vector for a uniform planar array with $M_{\rm H}$ horizontal and $M_{\rm V}$ vertical elements ( $M_{\rm rx}=M_{\rm H}M_{\rm V}$ ) is given by $\mathbf{a}_{\rm rx}(\ell)=\bar{\mathbf{a}}_{\rm rx}\big{(}\psi^{\rm rx}_{\rm azi}(\ell),\psi^{\rm rx}_{\rm ele}(\ell)\big{)}$ , where we define:

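$$\bar{\mathbf{a}}_{\rm rx}(\psi^{\rm rx}_{\rm azi},\psi^{\rm rx}_{\rm ele})\triangleq\left[\begin{array}{c}1\\ e^{{\rm j}2\pi\frac{\Delta_{\rm H}\sin(\psi^{\rm rx}_{\rm azi})\sin(\psi^{\rm rx}_{\rm ele})}{\lambda}}\\ \vdots\\ e^{{\rm j}2\pi\frac{\Delta_{\rm H}(M_{\rm H}-1)\sin(\psi^{\rm rx}_{\rm azi})\sin(\psi^{\rm rx}_{\rm ele})}{\lambda}}\end{array}\right]\otimes\left[\begin{array}{c}1\\ e^{{\rm j}2\pi\frac{\Delta_{\rm V}\cos(\psi^{\rm rx}_{\rm ele})}{\lambda}}\\ \vdots\\ e^{{\rm j}2\pi\frac{\Delta_{\rm V}(M_{\rm V}-1)\cos(\psi^{\rm rx}_{\rm ele})}{\lambda}}\end{array}\right],\qquad\text{(3)}$$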

$\psi^{\rm rx}_{\rm azi}(\ell)$, $\psi^{\rm rx}_{\rm ele}(\ell)$ are the azimuth and elevation angles of arrival for the $\ell$-th MPC, $\Delta_{\rm H},\Delta_{\rm V}$ are the horizontal and vertical antenna spacings and $\lambda$ is the wavelength of the carrier signal. Expressions for $\mathbf{a}_{\rm tx}(\ell)$ can be obtained similarly. Note that in (2) we implicitly assume frequency-flat MPC amplitudes $\{\alpha_{0},\ldots,\alpha_{L-1}\}$ and ignore beam squinting effects [54], which are reasonable assumptions for moderate system bandwidths. To prevent inter-symbol interference, we also let the cyclic prefix be longer than the maximum channel delay: $T_{\rm cp}>\tau_{L-1}$. To model a time varying channel, we treat $\{\alpha_{\ell},\mathbf{a}_{\rm tx}(\ell),\mathbf{a}_{\rm rx}(\ell)\}$ as aCSI parameters that remain constant within an aCSI coherence time and may change arbitrarily afterwards. (Footnote 7: While each MPC may contain several unresolved sub-paths, the corresponding set of scatterers are usually co-located. Therefore the relative sub-path delays and resulting MPC amplitude $\alpha_{\ell}$ are expected to vary slowly with the TX/RX movement.) However, since the channel is more sensitive to delay variations, the MPC delays $\{\tau_{0},\ldots,\tau_{L-1}\}$ are modeled as iCSI parameters that only remain constant within a shorter interval called the iCSI coherence time. Note that this time variation of delays is an equivalent representation of the Doppler spread experienced by the RX. Finally, we do not assume any distribution prior or side information on $\{\alpha_{\ell},\mathbf{a}_{\rm tx}(\ell),\mathbf{a}_{\rm rx}(\ell),\tau_{\ell}\}$.
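The Kronecker-product structure of the planar array response can be sketched in a few lines of NumPy. The half-wavelength spacings, the $4\times 8$ array size and the example angles are assumptions chosen for illustration, the speed of light is rounded to $3\times 10^{8}$ m/s, and `upa_response` is a hypothetical helper name.

```python
import numpy as np

def upa_response(psi_azi, psi_ele, m_h, m_v, delta_h, delta_v, wavelength):
    """RX array response of an m_h x m_v uniform planar array in Kronecker form."""
    horiz = np.exp(2j * np.pi * delta_h * np.arange(m_h)
                   * np.sin(psi_azi) * np.sin(psi_ele) / wavelength)
    vert = np.exp(2j * np.pi * delta_v * np.arange(m_v)
                  * np.cos(psi_ele) / wavelength)
    return np.kron(horiz, vert)       # horizontal factor first, as in the definition

wavelength = 3e8 / 30e9               # carrier at 30 GHz as in Table I
a = upa_response(0.3, 1.2, m_h=4, m_v=8,
                 delta_h=wavelength / 2, delta_v=wavelength / 2,
                 wavelength=wavelength)
print(a.shape, np.linalg.norm(a) ** 2)
```

Every entry has unit modulus, so the squared norm of the response equals the number of elements $M_{\rm rx}=M_{\rm H}M_{\rm V}$.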

The RX front-end is assumed to have a low noise amplifier followed by a band-pass filter at each antenna element that leaves the desired signal un-distorted but suppresses the out-of-band noise. The $M_{\rm rx}\times 1$ filtered complex equivalent received waveform for the $0$-th symbol can then be expressed as:

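$$\tilde{\mathbf{s}}^{(\cdot)}_{\rm rx}(t)=\sum_{\ell=0}^{L-1}\alpha_{\ell}\,\mathbf{a}_{\rm rx}(\ell)\,\mathbf{a}_{\rm tx}(\ell)^{\dagger}\,\tilde{\mathbf{s}}^{(\cdot)}_{\rm tx}(t-\tau_{\ell})+\sqrt{2}\,\tilde{\mathbf{w}}^{(\cdot)}(t)\,e^{{\rm j}2\pi f_{\rm c}t}\qquad\text{(4)}$$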

for $0\leq t\leq T_{\rm s}$ , where $(\cdot)=({\rm r})\big{/}({\rm d})$ , $\tilde{\mathbf{w}}^{(\cdot)}(t)$ is the $M_{\rm rx}\times 1$ complex equivalent, base-band, stationary, additive, vector Gaussian noise process, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density: $\mathcal{S}_{\rm w}(f)=\mathrm{N_{0}}$ for $-f_{K_{1}}\leq f\leq f_{K_{2}}$ . During the data transmission phase, the $M_{\rm rx}\times 1$ received data waveform $\tilde{\mathbf{s}}^{(\rm d)}_{\rm rx}(t)$ is phase shifted by a bank of phase-shifters, whose outputs are summed and fed to a down-conversion chain for data demodulation, as in conventional analog beamforming. However unlike conventional CE based analog beamforming, the control signals to the phase-shifters are obtained using the reference symbols $\tilde{\mathbf{s}}^{(\rm r)}_{\rm rx}(t)$ and using PACE, as shall be discussed in the next section.

III Analog beamformer design at the receiver

During each beamformer design phase, the BS transmits $D$ consecutive reference symbols to facilitate PACE at the RX. This process involves two steps: locking a local RX oscillator to the received reference tone and using this locked oscillator to estimate the amplitude and phase-offsets at each antenna. [Footnote 8: Note that IA based time/frequency synchronization usually involves digital post-processing. Thus prior IA based synchronization does not guarantee that an RX oscillator is locked to the reference tone.] Here locking refers to ensuring that the phase difference between the oscillator and the received reference tone is approximately constant. The first $D_{1}$ reference symbols are used for the former step and the remaining $D_{2}=D-D_{1}$ symbols are used for the latter step. Therefore $D$ is independent of $M_{\rm rx}$ and is mainly determined by the time required for oscillator locking (see Remark III.1). The first step shall be referred to as recovery of the reference tone and is analyzed in Section III-A, while the latter step is discussed in Section III-B. As shall be shown, both steps are significantly impaired by channel noise. Therefore in Section III-C, we propose an improved architecture for reference tone recovery that provides better noise performance, albeit with a slightly higher hardware complexity. For convenience, we shall assume that the MPC delays do not change within the beamformer design phase, and are represented as $\{\hat{\tau}_{0},...,\hat{\tau}_{L-1}\}$ (see also Remark III.2). However the delays may be different during the data transmission phase, as shall be considered in Section IV. Without loss of generality, assuming the first reference symbol to be the $0$ -th OFDM symbol, the complex equivalent RX signal for the $D$ reference symbols at antenna $m$ [Footnote 9: The component of $\tilde{\mathbf{s}}^{(\cdot)}_{\rm rx}(t)$ for $-T_{\rm cp}\leq t\leq 0$ suffers inter-symbol interference and hence is not included here.] can be expressed as:

[TABLE]

for $0\leq t\leq DT_{\rm cs}-T_{\rm cp}$ , where $A_{m}^{(\rm r)}\triangleq\sum_{\ell=0}^{L-1}\sqrt{\frac{1}{T_{\rm cs}}}\alpha_{\ell}{[\mathbf{a}_{\rm rx}(\ell)]}_{m}{\mathbf{a}_{\rm tx}(\ell)}^{{\dagger}}\mathbf{t}\sqrt{E^{(\rm r)}}e^{-{\rm j}2\pi f_{\rm c}\hat{\tau}_{\ell}}$ is the amplitude of the reference tone at antenna $m$ .

III-A Recovery of the reference tone - using one PLL

For locking a local RX oscillator to the reference signal, we first consider the use of a type 2 analog PLL at RX antenna $1$ , as illustrated in Fig. 3. The PLL is a common carrier-recovery circuit, with a mixer, a loop low-pass filter ( $\mathrm{LF}$ ), a variable loop gain ( $G$ ) and a voltage-controlled oscillator ( $\mathrm{VCO}$ ) arranged in a feedback loop, that can filter the noise from a noisy sinusoidal input signal (see [55, 56] for more details).

Here $\mathrm{LF}$ is assumed to be a first-order active low-pass filter with a transfer function $\mathcal{LF}(s)=1+\epsilon/s$ and the loop gain $G$ is assumed to adapt to the amplitude of the input such that $G|A_{1}^{(\rm r)}|=\textrm{constant}$ . [Footnote 10: Such a variable gain can possibly be implemented by using an automatic gain control circuit.] For convenience, we also ignore the VCO’s internal noise [57, 58]. Without loss of generality, let the output of the VCO (i.e. the recovered reference tone) be expressed as:

[TABLE]

where $\theta(t)$ may be arbitrary and we define $\bar{\theta}\in(-\pi,\pi]$ such that $A_{1}^{(\rm r)}e^{-{\rm j}\bar{\theta}}=-{\rm j}|A_{1}^{(\rm r)}|$ . Then the stochastic differential equation governing (14) for $0\leq t\leq DT_{\rm cs}-T_{\rm cp}$ is given by [56]:

[TABLE]

where $f_{\rm vco}$ is the free running frequency of the VCO with no input, we use (13) and assume $f_{\rm c}$ is much larger than the bandwidth of $\mathrm{LF}$ . In this subsection, we are interested in finding the time required for locking ( $D_{1}T_{\rm cs}$ ), i.e., for $\theta(t)$ to (nearly) converge to a constant and characterizing the distribution of the PLL output $s_{\rm PLL}(t)$ , or equivalently $\theta(t)$ , during the last $D_{2}$ reference symbols when the PLL is locked to the reference tone. The first part is answered by the following remark:

Remark III.1.

For the PLL considered, the phase lock acquisition time is $\approx\frac{1}{\epsilon}{\left(\frac{2\pi(f_{\rm c}-f_{\rm vco})}{|A_{1}^{(\rm r)}|G}\right)}^{2}$ in the no noise scenario [55, 56]. Thus $\epsilon$ and $|A_{1}^{(\rm r)}|G$ must be of the orders of $1/T_{\rm s}$ and $2\pi|f_{\rm c}-f_{\rm vco}|$ respectively, to keep $D_{1}$ small.
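As a rough illustration of Remark III.1, the following sketch evaluates the quoted acquisition-time estimate and integrates a noise-free phase-error model of a PLL with loop filter $1+\epsilon/s$. The model used here (a sinusoidal phase detector, forward-Euler integration, normalized units, zero initial phase error) is a common textbook simplification and an assumption of this sketch, not the stochastic differential equation of this paper. The function names and parameter values are hypothetical.

```python
import math


def lock_acquisition_time(eps, freq_offset_hz, loop_gain):
    """Noise-free estimate quoted in Remark III.1: (1/eps) * (2*pi*df / (|A|G))**2."""
    return (2.0 * math.pi * freq_offset_hz / loop_gain) ** 2 / eps


def simulate_pll(eps, freq_offset_hz, loop_gain, t_end, dt=0.01):
    """Forward-Euler integration of an assumed noise-free phase-error model.

    phi is the phase error between the input tone and the VCO, and z is the
    state of the integrator inside the loop filter 1 + eps/s.
    """
    dw = 2.0 * math.pi * freq_offset_hz      # initial frequency offset in rad/s
    phi, z = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        err = math.sin(phi)                   # low-pass filtered mixer output
        dphi = dw - loop_gain * (err + z)     # VCO is steered by the filtered error
        dz = eps * err                        # integrator path of the loop filter
        phi += dt * dphi
        z += dt * dz
    return phi, z


# Normalized, hypothetical units: loop gain 1 rad/s, offset equal to the loop gain.
EPS, GAIN = 0.1, 1.0
OFFSET_HZ = GAIN / (2.0 * math.pi)
t_lock = lock_acquisition_time(EPS, OFFSET_HZ, GAIN)
phi_end, z_end = simulate_pll(EPS, OFFSET_HZ, GAIN, t_end=15.0 * t_lock)
print(f"estimated lock time: {t_lock:.1f}, residual sin(phi): {math.sin(phi_end):.2e}")
```

In this simplified model the integrator state absorbs the frequency offset, so the phase error settles to a constant, which is the sense of locking used above. Choosing the loop gain on the order of the frequency offset keeps the estimate close to $1/\epsilon$.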

Numerous techniques [59, 60] have been proposed to further reduce the lock acquisition time, which are not explored here for brevity. In the locked state, it can be shown that $\theta(t)$ suffers from random fluctuations due to the input noise $\tilde{w}^{(\rm r)}_{1}(t)$ in (15), and that $\theta(t)$ ( ${\rm modulo}\ 2\pi$ ) is approximately a zero mean random process [55, 56]. This fluctuation manifests as phase noise of $s_{\rm PLL}(t)$ . While several attempts have been made to characterize the locked state $\theta(t)$ (see [56, 55] and references therein), closed form results are available only for a few simple scenarios that are not applicable here. Therefore, for analytical tractability, we linearize (15) using the following widely used approximations [56]:

1. We neglect cycle slips and assume that the deviations of $\theta(t)$ about its mean value are small, such that $e^{-{\rm j}\theta(t)}\approx 1-{\rm j}\theta(t)$ in the locked state.

2. We assume that the distribution of the base-band noise process $\tilde{w}^{(\rm r)}_{1}(t)$ is invariant to multiplication with $e^{-{\rm j}[\bar{\theta}+\theta(t)]}$ , i.e., $\hat{w}^{(\rm r)}_{1}(t)\triangleq\tilde{w}^{(\rm r)}_{1}(t)e^{-{\rm j}[\bar{\theta}+\theta(t)]}$ is also a Gaussian noise process with power spectral density $\mathcal{S}_{\rm w}(f)$ .

Approximation 1 is accurate in the locked state and in the large SNR regime, while Approximation 2 is accurate when the noise bandwidth is much larger than the loop filter bandwidth [61, 56]. Using these approximations and the definition of $\bar{\theta}$ , we can linearize (15) as:

[TABLE]

where we replace $\theta(t)$ by $\theta_{\rm L}(t)$ to denote use of the linear approximation. Note that for sufficient SNR, $\theta(t)\stackrel{{\scriptstyle\rm d}}{{\approx}}\theta_{\rm L}(t)$ ( ${\rm modulo}\ 2\pi$ ) during the last $D_{2}$ reference symbols. Assuming $\theta_{\rm L}(0)=0$ and the PLL input to be $0$ for $t\leq 0$ and taking the Laplace transform on both sides of (16), we obtain:

[TABLE]

where $\Theta_{\rm L}(s)$ and $\hat{\mathcal{W}}^{(\rm r)}_{1}(s)$ are the Laplace transforms of $\theta_{\rm L}(t)$ and $\hat{w}^{(\rm r)}_{1}(t)$ , respectively. It can be verified using the final value theorem that the contribution of the last term on the right hand side of (17) vanishes for $t\gg 0$ (i.e., in locked state). Therefore ignoring this term in (17), we observe that $\theta_{\rm L}(t)$ is a zero mean, stationary Gaussian process [57], in the locked state. Furthermore, the locked state power spectral density, auto-correlation function and variance of $\theta_{\rm L}(t)$ can then be computed, respectively, as:

[TABLE]

where $2a=G|A_{1}^{(\rm r)}|+\sqrt{G^{2}{|A_{1}^{(\rm r)}|}^{2}-4G|A_{1}^{(\rm r)}|\epsilon}$ , $2b=G|A_{1}^{(\rm r)}|-\sqrt{G^{2}{|A_{1}^{(\rm r)}|}^{2}-4G|A_{1}^{(\rm r)}|\epsilon}$ , (19)–(20) follow from finding the inverse Fourier transform via partial fraction expansion and the final expressions follow by observing that $\mathcal{S}_{\rm w}(f)\leq\mathrm{N}_{0}$ for all $f$ . Since $\theta_{\rm L}(t)$ is stationary and Gaussian in locked state, note that its distribution is completely characterized by (18)–(20).
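The definitions of $a$ and $b$ above imply $a+b=G|A_{1}^{(\rm r)}|$ and $ab=G|A_{1}^{(\rm r)}|\epsilon$, which is convenient when checking the partial fraction expansion. The short sketch below evaluates both quantities numerically. The function name and the numerical values are hypothetical, and `cmath.sqrt` is used so that the case $G|A_{1}^{(\rm r)}|<4\epsilon$, where $a$ and $b$ form a complex conjugate pair, is also covered.

```python
import cmath


def loop_constants(loop_gain, eps):
    """Return (a, b) with 2a, 2b = G|A| +/- sqrt((G|A|)**2 - 4*G|A|*eps)."""
    root = cmath.sqrt(loop_gain ** 2 - 4.0 * loop_gain * eps)
    return (loop_gain + root) / 2.0, (loop_gain - root) / 2.0


# Hypothetical normalized values of G|A| and epsilon.
a, b = loop_constants(1.0, 0.1)            # real-valued pair (overdamped loop)
a_c, b_c = loop_constants(1.0, 1.0)        # complex-conjugate pair
print("a + b =", a + b, " a * b =", a * b)
print("complex case:", a_c, b_c)
```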

III-B Phase and amplitude offset estimation

This subsection analyzes the procedure for reference signal phase and amplitude offset estimation at each RX antenna. As illustrated in Fig. 1, the PLL signal from antenna $1$ is fed to a $\pi/2$ phase shifter to obtain its quadrature component. From (14), the in-phase and quadrature-phase components of the PLL signal for $D_{1}T_{\rm cs}-T_{\rm cp}\leq t\leq DT_{\rm cs}-T_{\rm cp}$ can be expressed together as:

[TABLE]

At each RX antenna, the received reference signal is multiplied by the in-phase and quadrature-phase components of the PLL signal, and the resulting outputs are fed to ‘filter, sample and hold’ circuits. This circuit involves a low pass filter with a bandwidth of $\approx 1/(D_{2}T_{\rm cs})$ , followed by a sample and hold circuit that samples the filtered output at the end of the $D$ reference symbols. For convenience, in this paper we shall approximate this ‘filter, sample and hold’ by an integrate and hold operation as depicted in Fig. 1. Representing the ‘filter, sample and hold’ outputs corresponding to the in-phase and quadrature-phase components of the PLL output as real and imaginary respectively, the $M_{\rm rx}\times 1$ complex sample and hold vector can be approximated as:

[TABLE]

where $\frac{1}{D_{2}}$ is a scaling factor, $T_{1}\triangleq D_{1}T_{\rm cs}-T_{\rm cp}$ , $T_{2}\triangleq DT_{\rm cs}-T_{\rm cp}$ , $\boldsymbol{\hat{\mathcal{H}}}(f_{k})\triangleq\sum_{\ell=0}^{L-1}\alpha_{\ell}\mathbf{a}_{\rm rx}(\ell){\mathbf{a}_{\rm tx}(\ell)}^{{\dagger}}e^{-{\rm j}2\pi(f_{\rm c}+f_{k})\hat{\tau}_{\ell}}$ is the $M_{\rm rx}\times M_{\rm tx}$ frequency-domain channel matrix for the $k$ -th subcarrier during beamformer design phase and $\hat{\mathbf{w}}^{(\rm r)}(t)\triangleq\tilde{\mathbf{w}}^{(\rm r)}(t)e^{-{\rm j}[\bar{\theta}+\theta(t)]}$ is an $M_{\rm rx}\times 1$ i.i.d. Gaussian noise process vector with power spectral density $\mathcal{S}_{\rm w}(f)$ (see Approximation 2). Note that in locked state ( $T_{1}\leq t\leq T_{2}$ ), we have $\theta(t)\stackrel{{\scriptstyle\rm d}}{{\approx}}\theta_{\rm L}(t)$ ( ${\rm modulo}\ 2\pi$ ), as per approximations 1 and 2. Furthermore from (19), the auto-correlation function of $\theta_{\rm L}(t)$ decays exponentially with a time constant of ${\rm O}(1/G|A_{1}^{(\rm r)}|)$ . Therefore, for $G|A_{1}^{(\rm r)}|\gg 1/(D_{2}T_{\rm cs})$ , $\mathbf{I}_{\rm PACE}$ experiences enough independent realizations of $\theta(t)$ . Therefore replacing the integral in (22) with an expectation over $\mathrm{VCO}$ phase noise, we have:

[TABLE]

where $\stackrel{{\scriptstyle(1)}}{{\approx}}$ follows from the fact that $\theta(t)\stackrel{{\scriptstyle\rm d}}{{\approx}}\theta_{\rm L}(t)$ ( ${\rm modulo}\ 2\pi$ ) in locked state, $\stackrel{{\scriptstyle(2)}}{{=}}$ follows by defining $\hat{\mathbf{W}}^{(\rm r)}\triangleq\frac{1}{D_{2}\sqrt{T_{\rm cs}}}\int_{T_{1}}^{T_{2}}\hat{\mathbf{w}}^{(\rm r)}(t){\rm d}t$ and by using the characteristic function for the stationary Gaussian process $\theta_{\rm L}(t)$ . Since $\hat{\mathbf{w}}^{(\rm r)}(t)$ is i.i.d. Gaussian with a power spectral density $\mathcal{S}_{\rm w}(f)$ , it can be verified that $\hat{\mathbf{W}}^{(\rm r)}\sim\mathcal{CN}[\mathbb{O}_{M_{\rm rx}\times 1},(\mathrm{N}_{0}/D_{2})\mathbb{I}_{M_{\rm rx}}]$ when $\frac{1}{D_{2}}\ll K_{1},K_{2}$ . From (23), note that the signal component of the sample and hold output $\mathbf{I}_{\rm PACE}$ is directly proportional to the channel matrix at the reference frequency. The outputs are used as control signals to the RX phase-shifter array, to generate the RX analog beam to be used during the data transmission phase. From (23) and (20), note that either $D_{2}$ or $|A_{1}^{(\rm r)}|$ can be increased, to reduce the impact of noise $\hat{\mathbf{W}}^{(\rm r)}$ on the analog beam. Since $|A_{1}^{(\rm r)}|$ is a non-decreasing function of $E^{(\rm r)}$ (see (13)), this implies that $E^{(\rm r)}$ should be kept as large as possible while satisfying $E^{(\rm r)}\leq E_{\rm cs}$ and meeting the spectral mask regulations.
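The multiply-and-integrate estimation described in this subsection can be illustrated with an idealized sketch: a real tone with complex amplitude $A$ is correlated with in-phase and quadrature references and averaged over an integer number of periods, which returns $A$. The sketch assumes a noise-free input, a perfectly locked reference with zero phase offset, a low tone frequency and a unit scaling convention, none of which match the scaling or the offset $\bar{\theta}$ of the analysis above. The function name and parameters are hypothetical.

```python
import cmath
import math


def iq_estimate(amplitude, freq_hz, duration_s, n_samples=20000):
    """Correlate a real tone with in-phase/quadrature references and average."""
    dt = duration_s / n_samples
    i_acc = 0.0
    q_acc = 0.0
    for n in range(n_samples):
        t = (n + 0.5) * dt                               # midpoint rule
        arg = 2.0 * math.pi * freq_hz * t
        rx = abs(amplitude) * math.cos(arg + cmath.phase(amplitude))
        i_acc += rx * math.cos(arg) * dt                 # in-phase branch
        q_acc += rx * -math.sin(arg) * dt                # quadrature branch
    # The factor 2/T undoes the 1/2 from the product of two sinusoids.
    return 2.0 * complex(i_acc, q_acc) / duration_s


true_amplitude = 0.8 * cmath.exp(1j * 0.7)               # hypothetical A_m
estimate = iq_estimate(true_amplitude, freq_hz=10.0, duration_s=1.0)
print("true:", true_amplitude, "estimate:", estimate)
```

The integration window spans ten full periods here, so the double-frequency mixing products average out. With a longer window the same averaging also suppresses additive noise, which is the role played by $D_{2}$ above.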

Note that the results in this section are based on several approximations, including the linear phase noise analysis in Section III-A. To test the accuracy of these results, the numerical values of $\big{|}\int_{T_{1}}^{T_{2}}e^{-{\rm j}\theta(t)}{\rm d}t\big{|}\big{/}{D_{2}T_{\rm cs}}$ , obtained by simulating realizations of $\theta(t)$ from (15), are compared to its analytic approximation $e^{-\frac{\mathrm{Var}\{\theta_{\rm L}(t)\}}{2}}$ in Fig. 4. Note that this comparison reflects the accuracy of the approximation in (23). As is evident from Fig. 4, (23) is accurate above a certain SNR. Additionally, since $\mathbf{I}_{\rm PACE}$ decays exponentially with $\mathrm{Var}\{\theta_{\rm L}(t)\}$ (see (23)), we observe from Fig. 4a that the mean integrator output drops drastically below a certain threshold SNR. As shall be shown in Section IV, such a drop in the mean causes a sharp degradation in the system performance below this threshold SNR. Therefore in the next subsection we propose a better reference recovery circuit, called weighted carrier arraying, that reduces the SNR threshold.
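The analytic factor $e^{-\mathrm{Var}\{\theta_{\rm L}(t)\}/2}$ is the characteristic function of a zero mean Gaussian variable evaluated at unity, and this can be checked with a small Monte-Carlo sketch. The sketch draws independent Gaussian phase samples, which is an assumption made for simplicity: it ignores the temporal correlation of $\theta_{\rm L}(t)$ and does not simulate the non-linear PLL. The function name, sample count and variances are hypothetical.

```python
import cmath
import math
import random


def mean_phasor_magnitude(variance, n_samples=100000, seed=0):
    """Magnitude of the sample mean of exp(-j*theta), theta ~ N(0, variance)."""
    rng = random.Random(seed)
    sigma = math.sqrt(variance)
    total = sum(cmath.exp(-1j * rng.gauss(0.0, sigma)) for _ in range(n_samples))
    return abs(total) / n_samples


for var in (0.1, 0.5, 1.0):
    simulated = mean_phasor_magnitude(var)
    analytic = math.exp(-var / 2.0)
    print(f"variance {var}: simulated {simulated:.3f}, analytic {analytic:.3f}")
```

The loss factor shrinks quickly as the phase variance grows, which is consistent with the sharp drop of the mean integrator output below the threshold SNR noted above.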

Remark III.2.

The preceding derivations assumed that the MPC delays are identical for the $D$ reference symbols. However since the PLL continuously tracks the RX signal and phase/amplitude estimation at each antenna is performed simultaneously, these results are valid even if the delays change slowly within the beamformer design phase.

Remark III.3.

The RX phase-shifter array or the down-conversion chain are not utilized during the $D$ reference symbols of the beamformer design phase. Therefore, data reception is also possible during these $D$ reference symbols in parallel, as long as a sufficient guard band between the data sub-carriers and the reference sub-carrier is provided (similar to (35)) to reduce impact on the PLL performance.

Note that in a multi-cell scenario, use of the same reference tone in adjacent cells can cause reference tone contamination, i.e., $\mathbf{I}_{\rm PACE}$ may contain components corresponding to the channel from a neighboring BS. This is analogous to pilot contamination in conventional CE approaches [1], and can be avoided by using different, well-separated reference frequencies in adjacent cells.

III-C Recovery of the reference tone - using weighted carrier arraying

For reducing the PLL SNR threshold and improving performance, in this subsection we propose a new reference recovery technique called weighted carrier arraying, as illustrated in Fig. 5. Apart from a main (primary) PLL, weighted carrier arraying has secondary PLLs at a subset $\mathcal{M}$ of antennas, which compensate for the inter-antenna phase shift. The resulting phase compensated signals from the $\mathcal{M}$ antennas are weighted, combined and tracked by the primary PLL, which operates at a higher SNR and with a wider loop bandwidth than the secondary PLLs. Note that this architecture can be interpreted as a generalization of the carrier recovery process in [62, 63, 50, 51] that allows weighted combining. We shall next analyze the performance of this arrayed PLL in the locked state. However, an analysis of the transient behavior and lock acquisition time of this design is beyond the scope of this paper.

In Fig. 5, $\mathrm{LPF}/\mathrm{BPF}$ refer to low-pass and band-pass filters with wide bandwidths, designed only to remove the unwanted side-band of the mixer outputs. Without loss of generality, we express the outputs of the primary and secondary VCOs [Footnote 11: Another convergence point for $s^{\rm p}_{\rm vco}(t)$ is at a frequency of $(f_{\rm c}+f_{\rm IF})$ . But the final results presented here are also valid for this alternate convergence point.] as:

[TABLE]

respectively, where $\theta(t),\phi_{m}(t)$ are arbitrary, $f_{\rm IF}$ is the common free running frequency of the secondary VCOs, and $\bar{\phi}_{m}$ are such that $A_{m}^{(\rm r)}e^{-{\rm j}[\bar{\phi}_{m}]}=-{\rm j}|A_{m}^{(\rm r)}|$ for all $m\in\mathcal{M}$ . Now similar to Section III-A, from (13) the differential equation governing the secondary PLL at antenna $m\in\mathcal{M}$ can be expressed as:

[TABLE]

where we define $\hat{w}_{m}^{(\rm r)}(t)\triangleq\tilde{w}_{m}^{(\rm r)}(t)e^{-{\rm j}[\bar{\phi}_{m}+\phi_{m}(t)+\theta(t)]}$ and $G^{\rm s}_{m}$ is the loop gain of the secondary VCO at antenna $m$ . Similarly, for the primary VCO we have:

[TABLE]

where $f^{\rm p}_{\rm vco}$ is the free running frequency of the primary VCO, $G^{\rm p}$ is the loop gain and $\mathrm{LF}_{\rm p}$ is an active low pass filter with transfer function $\mathrm{LF}_{\rm p}(s)=(1+\epsilon^{\rm p}/s)$ . Similar to Section III-A, to obtain the locked state distribution of $\theta(t)$ we shall rely on the linear PLL analysis by using: 1) $e^{-{\rm j}[\phi_{m}(t)+\theta(t)]}\approx 1-{\rm j}[\phi_{m}(t)+\theta(t)]$ , which is accurate in the high SNR locked state where $\phi_{m}(t)+\theta(t)\ll 1$ and 2) $\hat{w}_{m}^{(\rm r)}(t)\stackrel{{\scriptstyle\rm d}}{{\approx}}\tilde{w}_{m}^{(\rm r)}(t)$ , which is accurate for a wide noise bandwidth. Using these approximations in (24)–(25) with zero initial conditions and taking Laplace transforms, we obtain:

[TABLE]

where $\hat{\mathcal{W}}^{(\rm r)}_{m}(s)$ , $\Theta_{\rm L}(s)$ and $\Phi^{\rm L}_{m}(s)$ are the Laplace transforms of $\hat{w}^{(\rm r)}_{m}(t)$ , linear approximation $\theta_{\rm L}(t)$ and linear approximation $\phi^{{}_{\rm L}}_{m}(t)$ , respectively. We assume that the loop gains of the PLLs adapt to the amplitudes of the input such that $|A_{m}^{(\rm r)}|G^{\rm s}_{m}=\mu\ \forall m\in\mathcal{M}$ and $\sum_{m\in\mathcal{M}}G^{\rm p}{|A_{m}^{(\rm r)}|}^{2}=\text{constant}$ . Then solving the system of equations in (26), we obtain:

[TABLE]

It can be verified using the final value theorem that the last term in (27) only contributes a constant phase shift for $t\gg 0$ (in locked state), say $\bar{\theta}_{\rm L}$ . [Footnote 12: Simulations suggest this constant phase shift for the actual non-linear system (24)–(25) is noise dependent. However such an arbitrary, but constant, phase shift does not impact the resulting beamforming gain if cycle skipping probability is low.] Thus, using steps similar to Section III-A, we can obtain the locked state power spectral density and variance of the time varying part of $\theta_{\rm L}(t)$ , i.e., $\theta_{\rm L}(t)-\bar{\theta}_{\rm L}$ , as:

[TABLE]

where ${[A^{(\rm r)}_{\rm rss}]}^{2}=\sum_{m\in\mathcal{M}}{|A_{m}^{(\rm r)}|}^{2}$ . Comparing (29) to (20), note that the PLL phase noise is essentially reduced by the maximal ratio combining gain corresponding to the $\mathcal{M}$ antennas. As this variation in $\theta_{\rm L}(t)$ manifests as phase noise of $s_{\rm PLL}(t)$ in Fig. 5, the ‘filter, sample and hold’ outputs with weighted carrier arraying can be obtained by using (29) in (23). The accuracy of the resulting approximation is studied via simulations in Fig. 4.
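To give a feel for the combining gain, the sketch below computes ${[A^{(\rm r)}_{\rm rss}]}^{2}$ and its ratio to the power at a single antenna. Reading this ratio as the improvement of weighted carrier arraying over the one PLL design is an interpretation of the comparison above and not a restatement of the variance expressions. The function names and amplitude values are hypothetical.

```python
def rss_power(amplitudes):
    """Sum of squared magnitudes of the reference tone over the antenna subset."""
    return sum(abs(a) ** 2 for a in amplitudes)


def arraying_gain(amplitudes, primary_index=0):
    """Ratio of the combined power to the power at one chosen antenna."""
    return rss_power(amplitudes) / abs(amplitudes[primary_index]) ** 2


equal = [1.0, 1j, -1.0, -1j]          # same strength, different phases
faded = [0.1, 1.0, 1.0, 1.0]          # first antenna sits in a fading dip
print("equal-strength gain:", arraying_gain(equal))
print("gain when antenna 1 is faded:", arraying_gain(faded))
```

The second case shows why arraying also provides diversity: when the antenna used by the one PLL design is in a deep fade, the combined power is dominated by the other antennas.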

IV Data transmission

During the data transmission phase, OFDM symbols of type (1b) are transmitted and the corresponding received signals are processed via the phase-shifter array with $\mathbf{I}_{\rm PACE}$ as the control signals. Without loss of generality, again assuming the $0$ -th OFDM symbol as a representative data symbol, the combined data signal at the RX for $0\leq t\leq T_{\rm s}$ can be expressed as:

[TABLE]

where the $1/\sqrt{2}$ is a scaling constant for convenience and we assume that the MPC delays for this representative data symbol are $\{\tau_{0},...,\tau_{L-1}\}$ . This phase shifted and combined signal $R(t)$ is then converted to base-band by a separate RX oscillator, and any resulting phase noise is assumed to be mitigated via some digital phase noise compensation techniques [64, 65, 58, 66]. Therefore neglecting the down-conversion phase noise, the resulting base-band signal can be expressed as $R_{\rm BB}(t)=R(t)e^{-{\rm j}2\pi f_{\rm c}t}$ . This signal is then sampled and OFDM demodulation follows. The OFDM demodulation output for the $k$ -th subcarrier ( $k\in\mathcal{K}$ ) is then given by:

[TABLE]

where $\boldsymbol{\mathcal{H}}(f_{k})\triangleq\sum_{\ell=0}^{L-1}\alpha_{\ell}\mathbf{a}_{\rm rx}(\ell){\mathbf{a}_{\rm tx}(\ell)}^{{\dagger}}e^{-{\rm j}2\pi(f_{\rm c}+f_{k})\tau_{\ell}}$ is the $M_{\rm rx}\times M_{\rm tx}$ frequency domain channel matrix for the $k$ -th data subcarrier and $\tilde{\mathbf{W}}^{(\rm d)}[k]\triangleq\frac{\sqrt{T_{\rm cs}}}{K}\sum_{u=0}^{K-1}\tilde{\mathbf{w}}^{(\rm d)}(\frac{uT_{\rm s}}{K})e^{-{\rm j}\frac{2\pi ku}{K}}$ , with $\tilde{\mathbf{W}}^{(\rm d)}[k]$ being independently distributed for each $k\in\mathcal{K}$ as $\tilde{\mathbf{W}}^{(\rm d)}[k]\sim\mathcal{CN}[\mathbb{O}_{M_{\rm rx}\times 1},(\mathrm{N}_{0}T_{\rm cs}/T_{\rm s})\mathbb{I}_{M_{\rm rx}}]$ . Note from (23) that $\mathbf{I}^{{\dagger}}_{\rm PACE}$ is similar (with appropriate scaling), but not identical, to the MRC beamformer for the $k$ -th sub-carrier: $\mathbf{t}^{{\dagger}}\boldsymbol{\mathcal{H}}(f_{k})^{{\dagger}}$ . The mismatch is due to the beamforming noise $\hat{\mathbf{W}}^{(\rm r)}$ and because the reference symbols and the $k$ -th sub-carrier data stream pass through slightly different channels, owing to the difference in sub-carrier frequencies and the MPC delays ( $\hat{\tau}_{\ell}\neq\tau_{\ell}$ ). Consequently, the beamformer $\mathbf{I}_{\rm PACE}$ only achieves imperfect MRC, leading to some loss in performance and causing the effective channel coefficients $\mathbf{I}_{\rm PACE}^{{\dagger}}\boldsymbol{\mathcal{H}}(f_{k})\mathbf{t}$ to vary with the sub-carrier index $k$ , i.e., the system experiences frequency-selective fading. Furthermore, since the MPC delays $\{\tau_{0},..,\tau_{L-1}\}$ change after every iCSI coherence time, so may these channel coefficients. As depicted in Fig. 2, we assume that the TX transmits pilot symbols within each iCSI coherence time to facilitate estimation of these coefficients $\big{\{}\mathbf{I}_{\rm PACE}^{{\dagger}}\boldsymbol{\mathcal{H}}(f_{k})\mathbf{t}\big{|}k\in\mathcal{K}\big{\}}$ at the RX. 
Since these pilots are used only to estimate the effective single-input-single-output (SISO) channel and not the actual MIMO channel, the corresponding overhead is small and shall be neglected here. Assuming perfect estimates of these channel coefficients, from (30) the effective SNR for the $k$ -th sub-carrier, and the instantaneous system spectral efficiency (iSE), respectively, can be expressed as:

[TABLE]

where we neglect the cyclic prefix overhead in (32) for convenience. Note that the iSE maximizing data power allocation $\{E_{k}^{(\rm d)}|k\in\mathcal{K}\}$ can be obtained via water-filling across the sub-carriers. While the exact expressions for (31)–(32) are involved, their expectations with respect to $\mathbf{I}_{\rm PACE}$ can be bounded, as stated by the following theorem.
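The water-filling allocation mentioned above can be sketched with the standard textbook procedure. Here `gain_to_noise[k]` is a hypothetical stand-in for the effective SNR of the $k$ -th sub-carrier per unit transmit energy, and `total_energy` for the energy budget across sub-carriers. The sketch does not use the exact SNR expression of this section.

```python
def water_fill(gain_to_noise, total_energy):
    """Classical water-filling: energy_k = max(level - 1/gain_k, 0)."""
    inv = sorted(1.0 / g for g in gain_to_noise if g > 0)
    if not inv or total_energy <= 0:
        return [0.0] * len(gain_to_noise)
    active = len(inv)
    while True:
        # Water level if only the `active` strongest sub-carriers are used.
        level = (total_energy + sum(inv[:active])) / active
        if level > inv[active - 1]:
            break                      # weakest active sub-carrier gets energy > 0
        active -= 1                    # otherwise drop it and try again
    return [max(level - 1.0 / g, 0.0) if g > 0 else 0.0 for g in gain_to_noise]


allocation = water_fill([4.0, 1.0, 0.1], total_energy=1.0)
print(allocation)
```

Sub-carriers whose effective gain falls below the water level receive no energy, which matters here because imperfect MRC makes the effective channel frequency-selective.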

Theorem IV.1.

If the RX array response vectors for the channel MPCs are mutually orthogonal, i.e., $\mathbf{a}_{\rm rx}(\ell)^{{\dagger}}\mathbf{a}_{\rm rx}(i)=0$ for $\ell\neq i$ , the effective SNR and iSE, averaged over the beamformer noise $\hat{\mathbf{W}}^{(\rm r)}$ , can be bounded as in (33)

where $\beta(\dot{f},\ddot{f})=\sum_{\ell=0}^{L-1}{|\alpha_{\ell}|}^{2}{|{\mathbf{a}_{\rm tx}(\ell)}^{{\dagger}}\mathbf{t}|}^{2}e^{{\rm j}[2\pi\dot{f}(\hat{\tau}_{\ell}-\tau_{\ell})-2\pi\ddot{f}\tau_{\ell}]}$ and $\gtrapprox$ represents a $\geq$ inequality at a high enough SNR such that the approximations in Section III are accurate.

Proof.

Substituting (23) in (30), and by treating the received signal component corresponding to $\hat{\mathbf{W}}^{(\rm r)}$ , i.e., ${[\hat{\mathbf{W}}^{(\rm r)}]}^{{\dagger}}\boldsymbol{\mathcal{H}}(f_{k})\mathbf{t}x_{k}$ , as noise, we can obtain a lower bound to the mean SNR as:

[TABLE]

where $\stackrel{{\scriptstyle(1)}}{{\geq}}$ follows from Jensen’s inequality and $\stackrel{{\scriptstyle(2)}}{{=}}$ from the orthogonality of the array response vectors. Similarly, by treating ${[\hat{\mathbf{W}}^{(\rm r)}]}^{{\dagger}}\boldsymbol{\mathcal{H}}(f_{k})\mathbf{t}x_{k}$ as Gaussian noise independent of $x_{k}$ , a lower bound on the mean iSE can be obtained as:

[TABLE]

where we use similar steps to (34). ∎

The array response orthogonality condition in Theorem IV.1 is satisfied if the scatterers corresponding to different MPCs are well separated and $M_{\rm rx}\gg L$ [67]. Note that even though the RX does not explicitly estimate the array response vectors $\mathbf{a}_{\rm rx}(\ell)$ for the MPCs, we still observe an RX beamforming gain of $M_{\rm rx}$ in (33a). The impact of imperfect MRC combining and the resulting frequency-selective fading is quantified by $\beta(f_{\rm c},f_{k})$ , where we note that $|\beta(f_{\rm c},f_{k})|\leq|\beta(0,0)|$ . Another drawback of the fading is that it may cause a drastic drop in performance of the one PLL architecture in Section III-A if $|A_{1}^{(\rm r)}|$ , the reference signal strength at antenna $1$ , falls in a fading dip, as is evident from (20) and (33). Note however that the weighted arraying architecture in Section III-C enjoys diversity against such fading by recovering the reference tone from multiple antennas $\mathcal{M}$ .
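The loss factor $\beta(\dot{f},\ddot{f})$ from Theorem IV.1 is straightforward to evaluate numerically. The sketch below uses the normalized path powers $\{0.6,0.3,0.1\}$ , the delays and the delay drifts of the sparse channel in Section VI, together with $f_{\rm c}=30$ GHz from Table I. The sub-carrier offset `f_k` and the function name are hypothetical choices made for illustration.

```python
import cmath
import math


def beta(f_dot, f_ddot, powers, tau_hat, tau):
    """beta(f', f'') = sum_l p_l * exp(j[2*pi*f'*(tau_hat_l - tau_l) - 2*pi*f''*tau_l])."""
    return sum(
        p * cmath.exp(1j * (2.0 * math.pi * f_dot * (th - t) - 2.0 * math.pi * f_ddot * t))
        for p, th, t in zip(powers, tau_hat, tau)
    )


powers = [0.6, 0.3, 0.1]                         # |alpha_l a_tx(l)^H t|^2, normalized
tau_hat = [0.0, 20e-9, 40e-9]                    # delays during beamformer design
tau = [th + d for th, d in zip(tau_hat, [30e-12, 25e-12, 25e-12])]
f_c = 30e9                                       # carrier frequency (Table I)
f_k = 50e6                                       # hypothetical sub-carrier offset

reference = abs(beta(0.0, 0.0, powers, tau_hat, tau))
with_mismatch = abs(beta(f_c, f_k, powers, tau_hat, tau))
print(f"|beta(0,0)| = {reference:.3f}, |beta(f_c,f_k)| = {with_mismatch:.3f}")
```

Because every term in the sum has magnitude equal to its path power, the triangle inequality gives $|\beta(f_{\rm c},f_{k})|\leq|\beta(0,0)|$ for any delays, with equality only when all path phases align.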

V Initial access and aCSI estimation at the BS

In this section we suggest how aCSI can be acquired at the BS during the TX beamformer design phase and also propose a sample IA protocol that can utilize PACE. Note that power allocation, user-scheduling and design of the TX beamformer $\mathbf{t}$ requires knowledge of the TX array response vectors and amplitudes $\{|\alpha_{\ell}|,\mathbf{a}_{\rm tx}(\ell)\}$ for the different UEs. Such aCSI can be acquired at the BS either via uplink CE, or by downlink CE with CSI feedback from the RX. Uplink CE can be performed by transmitting an orthogonal pilot from each UE omni-directionally, and using any of the digital CE algorithms from Section I at the BS. Note that PACE cannot be used at the BS since the pilots from multiple UEs need to be separated via digital processing. For downlink CE with feedback, the BS transmits reference signals sequentially along different transmit precoder beams (beam sweeping), with $D$ reference symbols for each beam. The UEs perform PACE for each TX beam, and provide the BS with uplink feedback about the corresponding link strength for data transmission.

The suggested IA protocol is somewhat similar to the downlink CE with feedback, where the BS performs beam sweeping along different angular directions, possibly with different beam widths. For each TX beam, the BS transmits $D$ reference symbols, followed by a sequence of primary (PSS) and secondary synchronization sequences (SSS). The RX performs PACE, and provides uplink feedback to the BS upon successfully detecting a PSS. However due to lack of prior timing synchronization during IA phase, the ‘filter, sample and hold’ circuit in Section III-B cannot be used directly for the PACE. One alternative is to allow continuous transmission of the reference tone even during the PSS and SSS with the following suggested symbol structure:

[TABLE]

where $\mathcal{G}$ defines a guard band around the reference tone, to reduce the impact of the data sub-carriers on the PLL output. The amplitude and phase estimation can then be performed similar to Section III-B, by multiplying the received signal at each antenna with the PLL output and then filtering with a low pass filter with cut-off frequency $1/(D_{2}T_{\rm cs})$ . Due to the continuous availability of the reference tone, the filter outputs can be directly used to control the phase shifter at each antenna without the ‘sample and hold’ operation. Since $D={\rm O}(1)$ , the IA latency does not scale with $M_{\rm rx}$ and yet the PSS/SSS symbols can exploit the RX beamforming gain, thus improving cell discovery radius and/or reducing IA overhead.

VI Simulation Results

For the simulation results, we consider a single cell scenario with a $\lambda/2$ -spaced $32\times 8$ ( $M_{\rm tx}=256$ ) antenna BS and one representative UE with a $\lambda/2$ -spaced $16\times 4$ ( $M_{\rm rx}=64$ ) antenna array, having one down-conversion chain and using PACE aided beamforming. The BS has perfect aCSI and transmits one spatial OFDM data stream to this UE with $K=1024$ sub-carriers and the beamformer $\mathbf{t}$ aligned with the strongest channel MPC. The RX beamformer design phase is assumed to last $D=6$ symbols with $D_{2}=2$ , where the BS transmits reference symbols with power $E^{(\rm r)}=20E_{\rm cs}/K$ (to satisfy spectral mask regulations). The system parameters for the one PLL and weighted arraying cases are given in Table I. For comparison to existing schemes, we include the performance of RTAT, the continuous ACE based beamforming scheme in [52], and of statistical RX analog beamforming [13], where the beamformer is the largest eigen-vector of the RX spatial correlation matrix: $\mathbf{R}_{\rm rx}(\mathbf{t})=\frac{1}{K}\sum_{k\in\mathcal{K}}\boldsymbol{\hat{\mathcal{H}}}(f_{k})\mathbf{t}\mathbf{t}^{{\dagger}}{\boldsymbol{\hat{\mathcal{H}}}(f_{k})}^{{\dagger}}$ . For both these schemes we ignore the impact of phase noise and additionally, for statistical beamforming we consider two cases: (a) perfect knowledge of $\mathbf{R}_{\rm rx}(\mathbf{t})$ at the RX and (b) an estimate of $\mathbf{R}_{\rm rx}(\mathbf{t})$ obtained using sparse-ruler sampling [37], a reduced complexity digital CE technique. Note that PACE uses $6$ reference symbols per beamformer update phase, RTAT avoids reference symbols but requires continuous transmission of the reference and sparse-ruler sampling requires $21$ pilot symbols for $M_{\rm rx}=64$ .

We first consider a sparse multi-path channel having $L=3$ MPCs with delays $\hat{\tau}_{\ell}=\{0,20,40\}$ ns, angles of arrival $\psi^{\rm rx}_{\rm azi}=\{0,\pi/6,-\pi/6\}$ , $\psi^{\rm rx}_{\rm ele}=\{0.45\pi,\pi/2,\pi/2\}$ and effective amplitudes $\frac{\alpha_{\ell}\mathbf{a}_{\rm tx}(\ell)^{{\dagger}}\mathbf{t}}{\sqrt{\beta(0,0)}}=\{\sqrt{0.6},-\sqrt{0.3},\sqrt{0.1}\}$ , respectively, during the RX beamformer design phase and $\tau_{\ell}=\hat{\tau}_{\ell}+\{30,25,25\}$ ps for one snapshot of the data transmission phase. For this channel, the mean iSE of PACE aided beamforming, obtained using Monte-Carlo simulations with the non-linear PLL equations (15), (24), (25), is compared to the analytical approximation (33b) and the performance of other schemes in Fig. 6a. Since the RX beamformer $\mathbf{I}_{\rm PACE}$ in (22) is random, the one sigma interval of iSE is also depicted as a shaded region here. As is evident from the results, the beamforming gain with PACE aided beamforming is only $2$ dB lower than that of statistical beamforming, above a certain SNR threshold. Below this threshold, however, PACE experiences an exponential decay in performance due to the oscillator phase noise, as also predicted by Theorem IV.1. As is expected, this SNR threshold is lower for weighted carrier arraying than for one PLL. Furthermore, the derived analytical approximations are also accurate above this SNR threshold. PACE also outperforms RTAT at high SNR due to the judicious transmission of the reference, while the deceptively better performance of RTAT at low SNR is due to neglect of phase-noise. Note that these PACE results are obtained for an oscillator offset of $5$ MHz (see Table I). Better performance can be achieved if the PLL is optimized for more accurate local oscillators.

To study the impact of more realistic channels and number of MPCs, we next model the channel as a rich scattering stochastic channel with $L$ resolvable MPCs, each with $10$ unresolved sub-paths. Here the MPCs and sub-paths are generated identically to the clusters and rays, respectively, in the 3GPP TR38.900 Rel 14 channel model (UMi NLoS scenario) [68]. The only difference from [68] is that we use an intra-cluster delay spread of $1$ ns and an intra-cluster angle spread of $\pi/50$ (for all elevation, azimuth, arrival and departure), to ensure that the sub-paths of each MPC are unresolvable. The channel SNR at each RX antenna (including the TX beamforming gain) is fixed at [math] dB, and the channel variation between beamformer design phase and one snapshot of the data transmission phase is modeled by assuming that the RX moves a distance of $d=2$ cm in a random azimuth direction without changing its orientation. Note that this channel can also be represented by our system model by replacing $L$ in (2) with $10L$ . For this stochastic channel model, the mean iSE for PACE aided beamforming, averaged over channel realizations, is compared to RTAT and statistical beamforming in Fig. 6b. For computational tractability, we skip the non-linear PLL simulation and use the analytical expressions (23) and (32) to quantify performance of PACE. [Footnote 13: Note that (33b) is not applicable due to non-orthogonality of array response vectors.] These expressions are accurate at [math] dB SNR as observed from Fig. 6a. As observed from the results, the loss in beamforming gain for PACE aided beamforming increases with $L$ , and therefore PACE is mainly suitable for channels with $L\leq 10$ resolvable MPCs. It must be emphasized that such cases may frequently occur at mm-wave frequencies, where the number of resolvable MPCs/clusters with significant energy (within $20$ dB of the strongest) is on the order of $3$ to $10$ [22, 68].

Note that for the iSE results in this section, we did not include the CE overhead. While digital approaches like sparse ruler sampling [37, 16] require $21$ pilots (for $M_{\rm rx}=64$), PACE uses only $D=6$ pilots. The corresponding overhead reduction is significant when downlink CE with feedback is used for aCSI acquisition at the BS, such as in frequency division duplexing systems. (Even in time division duplexing systems, downlink CE with feedback may be used during the IA phase, causing a large IA latency. PACE can help reduce IA latency in such situations.) For example, with exhaustive beamscanning [20] at the TX and an aCSI coherence time of $10$ ms, the BS aCSI acquisition overhead reduces from $40\%$ for sparse ruler techniques to $11\%$ for PACE (see Section V for the protocol). The overhead reduction is expected to be higher if the additional time required for beam switching and settling [43, 44] is also taken into account. Thus, PACE aided beamforming shows potential in solving the CE overhead issue of hybrid massive MIMO systems, with minimal degradation in performance.
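
The quoted overhead figures can be cross-checked against the pilot counts with a short calculation. The sketch below assumes that the acquisition overhead scales in proportion to the number of pilots, which is a simplification made here for illustration; the exact figures depend on the Section V protocol.

```python
# Figures quoted in the text.
PILOTS_SPARSE_RULER = 21          # sparse ruler sampling, M_rx = 64
PILOTS_PACE = 6                   # PACE, D = 6
OVERHEAD_SPARSE_RULER_PCT = 40.0  # quoted BS aCSI acquisition overhead
OVERHEAD_PACE_PCT = 11.0          # quoted overhead with PACE

# Assumption (not from the paper): overhead is proportional to pilot count.
pilot_ratio = PILOTS_PACE / PILOTS_SPARSE_RULER
scaled_overhead_pct = OVERHEAD_SPARSE_RULER_PCT * pilot_ratio
pilot_reduction_pct = 100.0 * (1.0 - pilot_ratio)

print(f"pilot ratio PACE / sparse ruler: {pilot_ratio:.3f}")
print(f"pilot count reduction: {pilot_reduction_pct:.1f} %")
print(f"overhead predicted by proportional scaling: {scaled_overhead_pct:.1f} %")
print(f"overhead quoted for PACE: {OVERHEAD_PACE_PCT:.1f} %")
```

Under the proportionality assumption, scaling the $40\%$ figure by $6/21$ gives a value that rounds to the quoted $11\%$, so the two sets of numbers appear mutually consistent.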

VII Conclusions

This paper proposes the use of PACE for designing the RX beamformer in massive MIMO systems. This process involves transmission of a reference sinusoidal tone during each beamformer design phase, and estimation of its received amplitude and phase at each RX antenna using analog hardware. A one PLL based carrier recovery circuit is proposed to enable the PACE receiver, and its analysis suggests that the quality of the obtained channel estimates decays exponentially with the inverse of the SNR at the PLL input. To remedy this and also to obtain diversity against fading, a multiple PLL based weighted carrier arraying architecture is also proposed. The performance analysis suggests that PACE aided beamforming can be interpreted as using the channel estimates on one sub-carrier to perform beamforming on other sub-carriers, with an additional loss factor corresponding to the circuit phase noise. Simulation results suggest that PACE aided beamforming suffers only a small beamforming loss in comparison to conventional analog beamforming in sparse channels, at sufficiently high SNR. This loss, however, increases with the number of channel MPCs $L$, and hence PACE is mostly suitable for sparse channels with few MPCs. The CE overhead reduction with PACE is significant when downlink CE with feedback is required. Benefits of PACE aided beamforming during the IA phase are also discussed, although a more detailed analysis will be a subject for future work. Similarly, the performance of PACE at very low SNR and with system mismatches/imperfections also requires more attention.

Acknowledgement

The authors would like to thank Dr. W. C. Lindsey and Dr. H. Hashemi of University of Southern California for their helpful comments regarding phase-locked loops.

Bibliography

[1] T. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, pp. 3590–3600, November 2010.
[2] F. Boccardi, R. Heath, A. Lozano, T. Marzetta, and P. Popovski, “Five disruptive technology directions for 5G,” IEEE Communications Magazine, vol. 52, pp. 74–80, February 2014.
[3] B. Murmann, “ADC performance survey 1997-2018 (ISSCC & VLSI Symposium).” Available at: https://web.stanford.edu/~murmann/adcsurvey.html.
[4] R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, pp. 436–453, April 2016.
[5] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen, L. Li, and K. Haneda, “Hybrid beamforming for massive MIMO: A survey,” IEEE Communications Magazine, vol. 55, pp. 134–141, Sept 2017.
[6] X. Zhang, A. Molisch, and S.-Y. Kung, “Variable-phase-shift-based RF-baseband codesign for MIMO antenna selection,” IEEE Transactions on Signal Processing, vol. 53, pp. 4091–4103, Nov 2005.
[7] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Transactions on Wireless Communications, vol. 13, pp. 1499–1513, March 2014.
[8] Z. Xu, S. Han, Z. Pan, and C.-L. I, “Alternating beamforming methods for hybrid analog and digital MIMO transmission,” in IEEE International Conference on Communications (ICC), pp. 1595–1600, June 2015.
