Spatial Channel Covariance Estimation for Hybrid Architectures Based on Tensor Decompositions

Sungwoo Park; Anum Ali; Nuria González-Prelcic; Robert W. Heath Jr.

arXiv:1902.06297·eess.SP·February 19, 2019

TL;DR

This paper introduces a tensor decomposition-based method for estimating spatial channel covariance in hybrid architectures, improving accuracy especially at low SNR levels by leveraging low-rank tensor representations.

Contribution

It proposes a novel tensor decomposition approach for covariance estimation in hybrid MIMO systems, outperforming existing compressive sensing and angle-of-arrival methods.

Findings

01. Achieves higher estimation accuracy than prior compressive-sensing-based and conventional angle-of-arrival estimation approaches.

02. Shows its largest gains in the low-SNR region.

03. Derives the Cramér-Rao lower bound on the estimation accuracy.

Equations

$$\mathbf{Y}_{(n)}=\mathbf{A}\mathbf{X}_{(n)}.$$

$$\llbracket\mathcal{X};\mathbf{A}^{(1)},\ldots,\mathbf{A}^{(N)}\rrbracket=\mathcal{X}\times_{1}\mathbf{A}^{(1)}\times_{2}\mathbf{A}^{(2)}\cdots\times_{N}\mathbf{A}^{(N)}.$$

$$\llbracket\mathbf{A}^{(1)},\ldots,\mathbf{A}^{(N)}\rrbracket=\mathcal{I}\times_{1}\mathbf{A}^{(1)}\times_{2}\mathbf{A}^{(2)}\cdots\times_{N}\mathbf{A}^{(N)}.$$

$$\|\mathcal{X}\|=\sqrt{\sum_{i_{1}=1}^{I_{1}}\sum_{i_{2}=1}^{I_{2}}\cdots\sum_{i_{N}=1}^{I_{N}}|x_{i_{1}i_{2}\cdots i_{N}}|^{2}},$$

$$\mathcal{X}=\mathbf{x}^{(1)}\circ\mathbf{x}^{(2)}\circ\cdots\circ\mathbf{x}^{(N)},$$

$$\mathcal{X}=\sum_{r=1}^{R}\mathbf{x}_{r}^{(1)}\circ\mathbf{x}_{r}^{(2)}\circ\cdots\circ\mathbf{x}_{r}^{(N)},$$

$$\mathbf{a}(\phi_{\ell})=\begin{bmatrix}1&e^{\frac{j2\pi d_{\mathrm{a}}\sin(\phi_{\ell})}{\lambda}}&\cdots&e^{\frac{j2\pi d_{\mathrm{a}}(N_{\mathrm{ant}}-1)\sin(\phi_{\ell})}{\lambda}}\end{bmatrix}^{\mathsf{T}}.$$

$$\mathbf{h}_{t}[d]=\sum_{\ell=1}^{L_{\mathrm{ch}}}g_{t,\ell}\,p_{\mathrm{PS}}(dT_{s}-\tau_{\ell})\,\mathbf{a}(\phi_{\ell})\quad\text{for }d=0,\ldots,N_{\mathrm{CP}}-1.$$

$$\mathbf{h}_{t,k}=\sum_{\ell=1}^{L_{\mathrm{ch}}}g_{t,\ell}\,c_{k,\ell}\,\mathbf{a}(\phi_{\ell}),$$

$$\mathbf{r}_{t,k}=\mathbf{h}_{t,k}s_{t,k}+\mathbf{z}_{t,k}.$$

$$\mathbf{y}_{t,k}=s_{t,k}^{*}\mathbf{W}^{*}\mathbf{r}_{t,k}=\mathbf{x}_{t,k}+\mathbf{n}_{t,k},$$

$$\mathcal{H}=\llbracket\mathbf{A},\mathbf{C},\mathbf{G}\rrbracket=\mathcal{I}\times_{1}\mathbf{A}\times_{2}\mathbf{C}\times_{3}\mathbf{G}=\sum_{\ell=1}^{L_{\mathrm{ch}}}\mathbf{a}_{\ell}\circ\mathbf{c}_{\ell}\circ\mathbf{g}_{\ell}.$$

$$\mathcal{Y}=\mathcal{X}+\mathcal{N},$$

$$\mathcal{X}=\mathcal{H}\times_{1}\mathbf{W}^{*}\in\mathbb{C}^{M_{\mathrm{RF}}\times K_{\mathrm{sbcr}}\times T_{\mathrm{frm}}}.$$

$$\mathcal{X}=\llbracket\mathbf{B},\mathbf{C},\mathbf{G}\rrbracket.$$

$$\begin{aligned}\mathbf{R}_{\mathbf{h}}&=\frac{1}{K_{\mathrm{sbcr}}T_{\mathrm{frm}}}\sum_{k=1}^{K_{\mathrm{sbcr}}}\sum_{t=1}^{T_{\mathrm{frm}}}\mathcal{H}(:,k,t)\mathcal{H}(:,k,t)^{*}=\frac{1}{K_{\mathrm{sbcr}}T_{\mathrm{frm}}}\mathbf{H}_{(1)}\mathbf{H}_{(1)}^{*}\\&=\frac{1}{K_{\mathrm{sbcr}}T_{\mathrm{frm}}}\mathbf{A}(\mathbf{G}\odot\mathbf{C})^{\mathsf{T}}(\mathbf{G}\odot\mathbf{C})^{\mathsf{C}}\mathbf{A}^{*}=\frac{1}{K_{\mathrm{sbcr}}T_{\mathrm{frm}}}\mathbf{A}(\mathbf{G}^{*}\mathbf{G}\circledcirc\mathbf{C}^{*}\mathbf{C})\mathbf{A}^{*}.\end{aligned}$$

$$\mathcal{X}=\sum_{\ell=1}^{L_{\mathrm{ch}}}(\delta_{b,\ell}\mathbf{b}_{\ell})\circ(\delta_{c,\ell}\mathbf{c}_{\ell})\circ(\delta_{g,\ell}\mathbf{g}_{\ell}),$$

$$\mathcal{X}=\llbracket\mathbf{B}\mathbf{\Delta}_{\mathbf{B}},\mathbf{C}\mathbf{\Delta}_{\mathbf{C}},\mathbf{G}\mathbf{\Delta}_{\mathbf{G}}\rrbracket,$$

$$\mathcal{X}=\llbracket\mathbf{B}\mathbf{\Pi},\mathbf{C}\mathbf{\Pi},\mathbf{G}\mathbf{\Pi}\rrbracket,$$

$$\mathcal{X}=\llbracket\mathbf{B}\mathbf{\Pi}\mathbf{\Delta}_{\mathbf{B}},\mathbf{C}\mathbf{\Pi}\mathbf{\Delta}_{\mathbf{C}},\mathbf{G}\mathbf{\Pi}\mathbf{\Delta}_{\mathbf{G}}\rrbracket.$$

$$\{\hat{\mathbf{B}},\hat{\mathbf{C}},\hat{\mathbf{G}}\}=\mathop{\mathrm{arg\,min}}_{\mathring{\mathbf{B}},\mathring{\mathbf{C}},\mathring{\mathbf{G}}}\left\|\mathcal{Y}-\llbracket\mathring{\mathbf{B}},\mathring{\mathbf{C}},\mathring{\mathbf{G}}\rrbracket\right\|.$$

$$\begin{aligned}\left\|\mathcal{Y}-\llbracket\mathring{\mathbf{B}},\mathring{\mathbf{C}},\mathring{\mathbf{G}}\rrbracket\right\|&=\|\mathbf{Y}_{(1)}-\mathring{\mathbf{B}}(\mathring{\mathbf{G}}\odot\mathring{\mathbf{C}})^{\mathsf{T}}\|_{F}\\&=\|\mathbf{Y}_{(2)}-\mathring{\mathbf{C}}(\mathring{\mathbf{G}}\odot\mathring{\mathbf{B}})^{\mathsf{T}}\|_{F}\\&=\|\mathbf{Y}_{(3)}-\mathring{\mathbf{G}}(\mathring{\mathbf{C}}\odot\mathring{\mathbf{B}})^{\mathsf{T}}\|_{F},\end{aligned}$$

$$\min_{\mathring{\mathbf{B}}}\|\mathbf{Y}_{(1)}-\mathring{\mathbf{B}}(\mathring{\mathbf{G}}\odot\mathring{\mathbf{C}})^{\mathsf{T}}\|_{F}.$$

$$\mathring{\mathbf{B}}=\mathbf{Y}_{(1)}\left((\mathring{\mathbf{G}}\odot\mathring{\mathbf{C}})^{\mathsf{T}}\right)^{\dagger}=\mathbf{Y}_{(1)}\left((\mathring{\mathbf{G}}\odot\mathring{\mathbf{C}})(\mathring{\mathbf{G}}^{*}\mathring{\mathbf{G}}\circledcirc\mathring{\mathbf{C}}^{*}\mathring{\mathbf{C}})^{\dagger}\right)^{\mathsf{C}}.$$

$$\hat{\mathcal{X}}=\llbracket\hat{\mathbf{B}},\hat{\mathbf{C}},\hat{\mathbf{G}}\rrbracket.$$

$$\begin{aligned}\hat{\mathbf{B}}&=\mathbf{B}\mathbf{\Pi}\mathbf{\Delta}_{\mathbf{B}}+\mathbf{\Omega}_{\mathbf{B}},\\\hat{\mathbf{C}}&=\mathbf{C}\mathbf{\Pi}\mathbf{\Delta}_{\mathbf{C}}+\mathbf{\Omega}_{\mathbf{C}},\\\hat{\mathbf{G}}&=\mathbf{G}\mathbf{\Pi}\mathbf{\Delta}_{\mathbf{G}}+\mathbf{\Omega}_{\mathbf{G}},\end{aligned}$$

$$\hat{\phi}_{\ell}=\mathop{\mathrm{arg\,min}}_{\phi}\left(1-\frac{|\hat{\mathbf{b}}_{\ell}^{*}\mathbf{W}^{*}\mathbf{a}(\phi)|^{2}}{\|\hat{\mathbf{b}}_{\ell}\|^{2}\|\mathbf{W}^{*}\mathbf{a}(\phi)\|^{2}}\right),$$

$$\hat{z}_{\ell}=\mathop{\mathrm{arg\,min}}_{z}\frac{\mathbf{a}^{*}(z)\mathbf{W}\left(\|\hat{\mathbf{b}}_{\ell}\|^{2}\mathbf{I}-\hat{\mathbf{b}}_{\ell}\hat{\mathbf{b}}_{\ell}^{*}\right)\mathbf{W}^{*}\mathbf{a}(z)}{\|\hat{\mathbf{b}}_{\ell}\|^{2}\mathbf{a}^{*}(z)\mathbf{W}\mathbf{W}^{*}\mathbf{a}(z)}.$$

$$\mathbf{a}^{*}(z)\mathbf{Q}_{\ell}\mathbf{a}(z)=\sum_{m=-N_{\mathrm{ant}}+1}^{N_{\mathrm{ant}}-1}\left(\sum_{n_{1}-n_{2}=m}[\mathbf{Q}_{\ell}]_{n_{1},n_{2}}\right)z^{m}=0.$$

$$\hat{\delta}_{\mathbf{B},\ell}=\frac{\mathbf{a}^{*}(\hat{z}_{\ell})\mathbf{W}\hat{\mathbf{b}}_{\ell}}{\|\mathbf{W}^{*}\mathbf{a}(\hat{z}_{\ell})\|^{2}}.$$
Full text

Spatial Channel Covariance Estimation for Hybrid Architectures Based on Tensor Decompositions

Sungwoo Park, Anum Ali, Nuria González-Prelcic, and Robert W. Heath Jr.

S. Park, A. Ali, N. González-Prelcic, and R. W. Heath Jr. are with the Wireless Networking and Communication Group (WNCG), Department of Electrical and Computer Engineering, The University of Texas at Austin, TX, 78701 USA (e-mail: {swpark96,anumali,ngprelcic,rheath}@utexas.edu). This work is supported in part by the National Science Foundation under Grant No. 1514275, and by a gift from Huawei Technologies.

Abstract

Spatial channel covariance information can replace full instantaneous channel state information for the analog precoder design in hybrid analog/digital architectures. Obtaining an estimate of the spatial channel covariance, however, is challenging in the hybrid structure due to the use of fewer radio frequency (RF) chains than the number of antennas. In this paper, we propose a spatial channel covariance estimation method based on higher-order tensor decomposition for spatially sparse time-varying frequency-selective channels. The proposed method leverages the fact that the channel can be represented as a low-rank higher-order tensor. We also derive the Cramér-Rao lower bound on the estimation accuracy of the proposed method. Numerical results and theoretical analysis show that the proposed tensor-based approach achieves higher estimation accuracy in comparison with prior compressive-sensing-based approaches or conventional angle-of-arrival estimation approaches. Simulation results reveal that the proposed approach becomes more beneficial in the low signal-to-noise ratio (SNR) region.

I Introduction

Hybrid analog/digital precoding uses a smaller number of RF chains to reduce the number of power-consuming devices like analog-to-digital converters (ADCs) or digital-to-analog converters (DACs). Consequently, the hybrid approach can reduce power consumption and implementation complexity in millimeter wave multiple-input-multiple-output (MIMO) systems [1, 2, 3, 4] and massive MIMO systems [5, 6, 7]. The rate loss incurred by the hybrid architecture is insignificant for spatially sparse channels such as in millimeter wave systems or in suburban/rural areas in sub-6 GHz systems [1, 2, 3, 4, 7].

A main challenge in the hybrid architecture is to configure the analog and digital precoding stages. Many previous methods accomplish this task based on full channel state information (CSI) [1, 4, 8]. These approaches require frequent estimation of the channel, obtained for example via sparse recovery techniques. An alternative is to use only long-term statistical knowledge, such as that contained in spatial channel covariance matrices, for the analog precoder design [9, 10, 7, 11]. Once the analog precoder is determined based on the spatial channel covariance, the digital precoder, which has a much smaller dimension, is designed by using instantaneous full CSI of the low-dimensional effective channel, i.e., the propagation channel combined with the analog precoder. While the accurate estimation of the full CSI is difficult for time-varying frequency-selective channels, the long-term statistical CSI can be efficiently estimated. It was shown in [9, 10, 7, 11] that the hybrid precoding methods based on spatial channel covariance achieve spectral efficiency close to that of the hybrid precoding obtained from full CSI when the channels are spatially sparse.

Although the use of the spatial channel covariance matrix makes the hybrid precoding design simpler and more practical, the hybrid architecture makes it difficult to estimate the covariance matrix. Since there is no digital access to the outputs of every antenna, and only the signals combined in an analog way are available at baseband, it is difficult to estimate the spatial covariance of the high-dimensional channel. Different approaches have been suggested to solve the spatial channel covariance estimation problem in this setting. In [6], a least-squares-based covariance estimation method was proposed by using time-varying analog beamforming matrices. Since the method does not exploit the sparse channel property, it is not an efficient method for sparse channels, which are of interest in this paper. The sparse ruler array in [12, 13, 14] and the coprime sensor array [15] can omit measurements on some antenna elements by leveraging the fact that correlations between antenna elements are wide-sense stationary in the spatial domain. Although these so-called compressive covariance sensing (CCS) methods can reduce the number of RF chains, they constrain how the numbers of RF chains and antennas can be configured.

The CCS methods were initially developed by using only a subset of antennas. It is, however, possible to extend the work to a general hybrid architecture where the analog part is composed of phase shifters and thus is represented as a dense matrix [14]. In this dense sensing matrix case, the CCS methods become closely related to typical compressive sensing (CS) techniques. For example, in [11], the spatial channel estimation method was developed by adopting the time-varying analog combiners used in [6], which are dense matrices. Instead of the least-squares method, one of the well-known conventional vector-type CS techniques, orthogonal matching pursuit (OMP), was adopted to exploit the sparse channel property. It is worth noting that conventional vector-type CS techniques were typically developed for channel estimation [16, 17, 18] but can be extended to spatial channel covariance estimation as well. The vector-type CS techniques for covariance estimation, however, are outperformed by matrix-type CS techniques developed for so-called multiple measurement vector (MMV) problems [19, 20, 21] that enable joint sparse recovery. An advanced spatial channel covariance estimation method based on the MMV approach was proposed in [22] by applying time-varying sensing matrices and exploiting the Hermitian property of the covariance matrix. The performance of the CS-based methods, however, is acceptable only in the moderate or high SNR region.

Besides the CS-based work, conventional angle-of-arrival (AoA) estimation methods such as multiple signal classification (MUSIC) [23] and the estimation of signal parameters via rotational invariance technique (ESPRIT) [24] can also be applied to the spatial channel covariance estimation problem with some modification. In the conventional AoA estimation methods for fully-digital architectures, the spatial channel covariance is directly calculated from the received signal vectors, and the AoA is estimated from the obtained covariance matrix. A covariance estimation based on AoA estimation for the hybrid architecture requires the estimation process in the opposite direction. First, the covariance matrix of the low-dimensional baseband received signal is calculated. Second, the AoAs are estimated from the covariance of the baseband received signal vectors. Finally, the covariance of the actual channel is reconstructed from the estimated AoAs. This basic approach has been adopted for different scenarios with some variations [25, 26]. This approach, however, has a weak point: the estimation accuracy rapidly decreases as the number of channel paths increases toward the number of RF chains. In addition, the methods based on this approach do not work properly when the number of channel paths exceeds that of RF chains.

In this paper, we propose a spatial channel covariance estimation method for the hybrid architecture over uplink time-varying frequency-selective channels. We consider a time division duplex (TDD) system where the estimated covariance over uplink channels can be used for the downlink precoding design at a base station (BS). We represent the channel and the received baseband signal as higher-order tensors. Considering spatially sparse channels, we use the fact that these higher-order tensors have a low tensor rank and their canonical polyadic decomposition (CPD) forms are unique up to a common permutation and scaling of columns under some mild conditions [27, 28, 29]. We also analyze the theoretical performance by adopting the performance metric in [22, 30] that is associated with the dominant eigenvalues and their eigenspaces of the spatial channel covariance matrix. We will call this performance metric the relative precoding efficiency (RPE) in this paper. After showing that the RPE is closely related to the mean squared error (MSE) of the AoA estimation, we derive the Cramér-Rao lower bound (CRLB) for the AoA estimation and its associated bound for the performance metric. Using numerical results, we first show that the performance of the proposed method approaches the performance bound as SNR increases. We also show that the lower bound of the tensor-based method is lower than that of the MUSIC-based method, which provides insight into the benefits of using tensor-based methods. Simulations show that the proposed tensor-based method outperforms CS-based methods as well as MUSIC-based methods, and its gain becomes more significant in the low SNR regime.

The rest of the paper is organized as follows. Section II briefly introduces the basics of higher-order tensor algebra. Section III provides a system and channel model by using tensor representations. Section IV describes the proposed spatial channel covariance estimation method. The performance metric is analyzed in Section V, and the CRLB is derived in Section VI. In Section VII, the proposed tensor-based work is compared with prior work based on CS or MUSIC. Simulation results are presented in Section VIII, and conclusions are drawn in Section IX. This paper is the journal version of [31] with theoretical analysis added.

Notation: We use the following notation throughout this paper: $a$ is a scalar, ${\mathbf{a}}$ is a vector, ${\mathbf{A}}$ is a matrix, and $\mathcal{A}$ is a tensor. ${\mathbf{A}}^{\mathsf{T}}$ , ${\mathbf{A}}^{\mathsf{C}}$ , ${\mathbf{A}}^{*}$ , and ${\mathbf{A}}^{\dagger}$ are transpose, conjugate, conjugate transpose, and Moore-Penrose pseudoinverse. $[{\mathbf{A}}]_{i,:}$ and $[{\mathbf{A}}]_{:,j}$ are the $i$ -th row and the $j$ -th column of the matrix ${\mathbf{A}}$ . ${\mathbf{A}}\otimes{\mathbf{B}}$ , ${\mathbf{A}}\circledcirc{\mathbf{B}}$ , and ${\mathbf{A}}\odot{\mathbf{B}}$ denote the Kronecker product, the Hadamard product, and the column-wise Khatri-Rao product. ${\mathbf{a}}\circ{\mathbf{b}}$ denotes the outer product, which is also known as the tensor product. $\mathrm{Re}(\mathcal{A})$ and $\mathrm{Im}(\mathcal{A})$ denote the real part and the imaginary part of $\mathcal{A}$ . $\mathrm{diag}({\mathbf{A}})$ is a column vector whose elements are composed of the diagonal elements of ${\mathbf{A}}$ .

II Preliminaries: overview of tensor algebra and canonical polyadic decomposition

In this section, we review the basics of tensor algebra that will be used in this paper. Readers who are interested in more details about tensors can refer to [27, 28, 29] and the references therein. A tensor denotes a multi-dimensional (a.k.a. multi-way or multi-mode) array. The order of a tensor is defined as the number of dimensions of the tensor. A vector and a matrix are special cases of a tensor, i.e., a vector is a tensor of order one, and a matrix is a tensor of order two.

Given an $N$ -th order tensor $\mathcal{X}\in{\mathbb{C}}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ , let its $(i_{1},i_{2},...,i_{N})$ -th element be denoted by $x_{i_{1}i_{2}\cdots i_{N}}=\mathcal{X}(i_{1},i_{2},\cdots,i_{N})$ . The mode- $n$ fibers of $\mathcal{X}$ are defined as vector-valued sub-tensors obtained by fixing all but one index associated with mode- $n$ , i.e., $\mathcal{X}(i_{1},\cdots,i_{n-1},:,i_{n+1},\cdots,i_{N})$ . The number of mode- $n$ fibers in $\mathcal{X}$ is $\prod_{m=1,m\neq n}^{N}I_{m}$ .

The mode- $n$ matricization (a.k.a. unfolding) is a process that transforms a tensor into a matrix whose columns are composed of mode- $n$ fibers of the tensor. The mode- $n$ unfolding matrix of $\mathcal{X}\in{\mathbb{C}}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ is denoted by ${\mathbf{X}}_{(n)}\in{\mathbb{C}}^{I_{n}\times I_{1}I_{2}\cdots I_{n-1}I_{n+1}\cdots I_{N}}$ . The tensor $\mathcal{X}(i_{1},i_{2},\cdots,i_{N})$ maps to ${\mathbf{X}}_{(n)}(i_{n},j)$ such that $j=1+\sum_{k=1,k\neq n}^{N}\left(\left(i_{k}-1\right)\prod_{m=1,m\neq n}^{k-1}I_{m}\right)$ .

The mode-$n$ product of a tensor $\mathcal{X}\in{\mathbb{C}}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ and a matrix ${\mathbf{A}}\in{\mathbb{C}}^{J\times I_{n}}$ is denoted by $\mathcal{X}\times_{n}{\mathbf{A}}$. Let $\mathcal{Y}=\mathcal{X}\times_{n}{\mathbf{A}}$. Then, the elements of the tensor $\mathcal{Y}\in{\mathbb{C}}^{I_{1}\times I_{2}\times\cdots\times I_{n-1}\times J\times I_{n+1}\times\cdots\times I_{N}}$ are given by $y_{i_{1}i_{2}\cdots i_{n-1}ji_{n+1}\cdots i_{N}}=\sum_{i_{n}=1}^{I_{n}}x_{i_{1}i_{2}\cdots i_{n-1}i_{n}i_{n+1}\cdots i_{N}}a_{ji_{n}}$. The mode-$n$ product representation $\mathcal{Y}=\mathcal{X}\times_{n}{\mathbf{A}}$ can also be expressed by using the mode-$n$ matricization as

$${\mathbf{Y}}_{(n)}={\mathbf{A}}{\mathbf{X}}_{(n)}.\tag{1}$$
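As a rough numerical illustration of the mode-$n$ unfolding and the mode-$n$ product defined above, the following NumPy sketch checks that unfolding $\mathcal{X}\times_{n}{\mathbf{A}}$ along mode $n$ gives ${\mathbf{A}}{\mathbf{X}}_{(n)}$. The helper names `unfold` and `mode_n_product`, the tensor sizes, and the random test data are assumptions made for this example only. Indices are 0-based in the code, whereas the text uses 1-based indices.

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding; among the remaining modes, earlier ones vary fastest."""
    return np.reshape(np.moveaxis(X, n, 0), (X.shape[n], -1), order="F")

def mode_n_product(X, A, n):
    """Mode-n product X x_n A for a matrix A of shape (J, I_n)."""
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4, 5)) + 1j * rng.standard_normal((3, 4, 5))
A = rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))

Y = mode_n_product(X, A, 1)      # contracts the second mode (index 1) with A
identity_holds = np.allclose(unfold(Y, 1), A @ unfold(X, 1))
```

The `order="F"` reshape is what reproduces the column index $j$ given in the text; a default row-major reshape would permute the columns but leave the identity intact.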

Given a tensor $\mathcal{X}\in\mathbb{C}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ and matrices ${\mathbf{A}}^{(n)}\in\mathbb{C}^{J_{n}\times I_{n}}$ for $n=1,...,N$ , their full multilinear product is defined as

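$$\llbracket\mathcal{X};{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}\rrbracket=\mathcal{X}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}.\tag{2}$$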

For a special case where $I_{1}=\cdots=I_{N}=R$ and $\mathcal{X}$ is a diagonal tensor $\mathcal{I}\in\mathbb{C}^{R\times R\times\cdots\times R}$ that has zero off-diagonal elements and unit diagonal elements, there exists a simplified notation of the full multilinear product as

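$$\llbracket{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}\rrbracket=\mathcal{I}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}.\tag{3}$$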

The norm of a tensor is defined as

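$$\|\mathcal{X}\|=\sqrt{\sum_{i_{1}=1}^{I_{1}}\sum_{i_{2}=1}^{I_{2}}\cdots\sum_{i_{N}=1}^{I_{N}}|x_{i_{1}i_{2}\cdots i_{N}}|^{2}},\tag{4}$$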

which is analogous to the Frobenius norm in the matrix case.

Let $\mathcal{X}\in\mathbb{C}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ and $\mathcal{Y}\in\mathbb{C}^{J_{1}\times J_{2}\times\cdots\times J_{M}}$ . Then, the outer product (a.k.a. tensor product) of $\mathcal{X}$ and $\mathcal{Y}$ is denoted by $\mathcal{X}\circ\mathcal{Y}$ . Let $\mathcal{Z}=\mathcal{X}\circ\mathcal{Y}$ . Then, the elements of the tensor $\mathcal{Z}\in\mathbb{C}^{I_{1}\times\cdots\times I_{N}\times J_{1}\times\cdots\times J_{M}}$ are given by $z_{i_{1}\cdots i_{N}j_{1}\cdots j_{M}}=x_{i_{1}\cdots i_{N}}y_{j_{1}\cdots j_{M}},\forall i_{1},...,i_{N},j_{1},...,j_{M}$ .

A tensor $\mathcal{X}\in\mathbb{C}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ is called a rank-one tensor if it can be written as the outer product of vectors as

$$\mathcal{X}={\mathbf{x}}^{(1)}\circ{\mathbf{x}}^{(2)}\circ\cdots\circ{\mathbf{x}}^{(N)},\tag{5}$$

where ${\mathbf{x}}^{(n)}\in\mathbb{C}^{I_{n}\times 1},\forall n$ .

The canonical polyadic decomposition (CPD), which is also known as CANDECOMP/PARAFAC decomposition, factorizes a tensor into a sum of component rank-one tensors. The CPD of $\mathcal{X}\in\mathbb{C}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ has a form

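$$\mathcal{X}=\sum_{r=1}^{R}{\mathbf{x}}_{r}^{(1)}\circ{\mathbf{x}}_{r}^{(2)}\circ\cdots\circ{\mathbf{x}}_{r}^{(N)},\tag{6}$$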

where ${\mathbf{x}}_{r}^{(n)}\in\mathbb{C}^{I_{n}\times 1}$ for $r=1,...,R$ . The minimum possible value of the number of rank-one tensors that constitute $\mathcal{X}$ , which is $R$ in (6), is called the rank of $\mathcal{X}$ .
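To make the CPD form concrete, the sketch below assembles a third-order tensor as a sum of $R$ rank-one tensors, taking ${\mathbf{x}}_{r}^{(n)}$ to be column $r$ of the $n$-th factor matrix. The function name `cpd_to_tensor`, the sizes, and the random factors are hypothetical choices for illustration, not part of the paper.

```python
import numpy as np

def cpd_to_tensor(factors):
    """Sum of R rank-one tensors; column r of each factor matrix is x_r^(n)."""
    R = factors[0].shape[1]
    X = np.zeros([F.shape[0] for F in factors], dtype=complex)
    for r in range(R):
        term = factors[0][:, r]
        for F in factors[1:]:
            term = np.multiply.outer(term, F[:, r])   # outer product with the next mode
        X += term
    return X

rng = np.random.default_rng(0)
R = 2
factors = [rng.standard_normal((I, R)) + 1j * rng.standard_normal((I, R))
           for I in (4, 3, 5)]
X = cpd_to_tensor(factors)

# Any mode-1 unfolding of a rank-R tensor has matrix rank at most R
# (the column ordering does not affect the rank).
unfolding_rank = np.linalg.matrix_rank(X.reshape(4, -1))
```

For generic random factors the unfolding rank equals $R$, which is one quick sanity check that the tensor was built from exactly $R$ components.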

III Channel model and system model

Consider a TDD system where a base station with ${N_{\mathrm{ant}}}$ antennas and ${M_{\mathrm{RF}}}(\leq{N_{\mathrm{ant}}})$ RF chains communicates with a mobile station that has a single antenna.

III-A Channel model

We consider a spatially sparse channel that has ${L_{\mathrm{ch}}}$ paths between the BS and mobile user. Let $\tau_{\ell}$ and $\phi_{\ell}$ denote the path delay and AoA of the $\ell^{\mathrm{th}}$ path. Let $g_{t,\ell}$ denote the short-term fading complex path gain of the $\ell^{\mathrm{th}}$ path at frame $t$ . Let $p_{\mathrm{PS}}(\tau)$ denote the low pass filter including pulse shaping and analog filters. We assume a uniform linear array (ULA) with antenna element spacing $d_{\mathrm{a}}$ and signal wavelength $\lambda$ . It is possible to extend the proposed method to a uniform planar array (UPA). The array response vector associated with the $\ell^{\mathrm{th}}$ AoA $\phi_{\ell}$ is expressed as

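$${\mathbf{a}}(\phi_{\ell})=\begin{bmatrix}1&e^{\frac{j2\pi d_{\mathrm{a}}\sin(\phi_{\ell})}{\lambda}}&\cdots&e^{\frac{j2\pi d_{\mathrm{a}}({N_{\mathrm{ant}}}-1)\sin(\phi_{\ell})}{\lambda}}\end{bmatrix}^{\mathsf{T}}.\tag{7}$$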
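A minimal sketch of the ULA array response vector follows. The function name `ula_response` and the default half-wavelength spacing ($d_{\mathrm{a}}/\lambda=0.5$) are assumptions for this example; the text does not fix a particular spacing.

```python
import numpy as np

def ula_response(phi, n_ant, spacing_over_wavelength=0.5):
    """Array response a(phi) of an n_ant-element ULA; phi is in radians."""
    n = np.arange(n_ant)                      # antenna index 0, ..., n_ant - 1
    return np.exp(2j * np.pi * spacing_over_wavelength * n * np.sin(phi))

a = ula_response(np.deg2rad(30.0), n_ant=8)
```

With half-wavelength spacing and an AoA of 30 degrees, the phase advances by $\pi/2$ per element, so the second entry is $j$.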

Let $T_{s}$ and $N_{\mathrm{CP}}$ be the sampling duration and the cyclic prefix length. We assume that the large-scale fading parameters, $\tau_{\ell}$ ’s and $\phi_{\ell}$ ’s, are constant during the estimation process. By using the delay- $d$ channel model [32, 33, 34], the uplink channel at frame $t$ can be represented as

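$${\mathbf{h}}_{t}[d]=\sum_{\ell=1}^{{L_{\mathrm{ch}}}}g_{t,\ell}\,p_{\mathrm{PS}}(dT_{s}-\tau_{\ell})\,{\mathbf{a}}(\phi_{\ell})\quad\text{for }d=0,\ldots,N_{\mathrm{CP}}-1.\tag{8}$$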

By letting $c_{k,\ell}=\sum_{d=0}^{N_{\mathrm{CP}}-1}p_{\mathrm{PS}}(dT_{s}-\tau_{\ell})e^{-\frac{j2\pi(k-1)d}{{K_{\mathrm{sbcr}}}}}$ , the channel frequency response vector can be expressed as

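$${\mathbf{h}}_{t,k}=\sum_{\ell=1}^{{L_{\mathrm{ch}}}}g_{t,\ell}\,c_{k,\ell}\,{\mathbf{a}}(\phi_{\ell}),\tag{9}$$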

at frame $t$ and subcarrier $k$ .
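The relation between the delay-domain taps and the per-subcarrier response can be checked numerically. The sketch below assumes a sinc pulse for $p_{\mathrm{PS}}$, delays expressed in units of $T_{s}$, half-wavelength antenna spacing, and arbitrary example AoAs; all of these, and the helper name `subcarrier_gains`, are illustrative assumptions. The code index `k` runs from 0, matching $(k-1)$ in the definition of $c_{k,\ell}$.

```python
import numpy as np

def subcarrier_gains(p_taps, n_sub):
    """c[k, l] = sum_d p_taps[d, l] * exp(-j 2 pi k d / n_sub), k = 0..n_sub-1."""
    d = np.arange(p_taps.shape[0])
    k = np.arange(n_sub)[:, None]
    return np.exp(-2j * np.pi * k * d / n_sub) @ p_taps

rng = np.random.default_rng(0)
n_ant, n_cp, n_sub, n_paths = 4, 4, 16, 2
tau = np.array([0.3, 1.7])                         # assumed delays, in units of T_s
p_taps = np.sinc(np.arange(n_cp)[:, None] - tau)   # assumed sinc pulse p_PS(d T_s - tau_l)
A = np.exp(1j * np.pi * np.outer(np.arange(n_ant), np.sin([0.2, -0.9])))  # ULA, d_a = lambda/2
g = rng.standard_normal(n_paths) + 1j * rng.standard_normal(n_paths)      # path gains at one frame

C = subcarrier_gains(p_taps, n_sub)                # shape (n_sub, n_paths)
h_freq = (C * g) @ A.T                             # row k holds h_{t,k}
h_taps = (p_taps * g) @ A.T                        # row d holds h_t[d]
h_fft = np.fft.fft(h_taps, n=n_sub, axis=0)        # DFT over the delay index
```

Here `h_freq` and `h_fft` agree, which reflects that the frequency response vector is the DFT of the delay-domain taps over the subcarriers.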

III-B System model

Let $s_{t,k}$ be an uplink training symbol at frame $t$ and subcarrier $k$ with $|s_{t,k}|=1$ , and ${\mathbf{z}}_{t,k}\sim\mathcal{CN}(\mathbf{0},\sigma^{2}{\mathbf{I}})$ be a circularly symmetric Gaussian noise. The ${N_{\mathrm{ant}}}\times 1$ received signal vector at each frame and subcarrier can be represented as

$${\mathbf{r}}_{t,k}={\mathbf{h}}_{t,k}s_{t,k}+{\mathbf{z}}_{t,k}.\tag{10}$$

Let ${\mathbf{W}}_{\mathrm{RF}}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{M_{\mathrm{RF}}}}$ and ${\mathbf{W}}_{\mathrm{BB}}\in\mathbb{C}^{{M_{\mathrm{RF}}}\times{M_{\mathrm{RF}}}}$ be an analog combining matrix and a digital baseband processing matrix. Similar to a sensing matrix in prior CS-based channel estimation work [35, 36], we assume that the elements of ${\mathbf{W}}_{\mathrm{RF}}$ have random phases with a unit amplitude. Let ${\mathbf{W}}$ denote the hybrid combining matrix as ${\mathbf{W}}={\mathbf{W}}_{\mathrm{RF}}{\mathbf{W}}_{\mathrm{BB}}$ . After combining with the hybrid combiner and multiplying by $s_{t,k}^{*}$ , the ${M_{\mathrm{RF}}}\times 1$ baseband received signal vector becomes

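$${\mathbf{y}}_{t,k}=s_{t,k}^{*}{\mathbf{W}}^{*}{\mathbf{r}}_{t,k}={\mathbf{x}}_{t,k}+{\mathbf{n}}_{t,k},\tag{11}$$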

where ${\mathbf{x}}_{t,k}={\mathbf{W}}^{*}{\mathbf{h}}_{t,k}$ denotes the signal part of ${\mathbf{y}}_{t,k}$ , and ${\mathbf{n}}_{t,k}=s_{t,k}^{*}{\mathbf{W}}^{*}{\mathbf{z}}_{t,k}\sim\mathcal{CN}(\mathbf{0},\sigma^{2}{\mathbf{W}}^{*}{\mathbf{W}})$ denotes the noise part. From the viewpoint of the estimator at baseband, the effective noise ${\mathbf{n}}_{t,k}$ becomes colored for an arbitrary hybrid combiner ${\mathbf{W}}$ . It is possible to whiten the effective noise by using the baseband combiner ${\mathbf{W}}_{\mathrm{BB}}=\left({\mathbf{W}}_{\mathrm{RF}}^{*}{\mathbf{W}}_{\mathrm{RF}}\right)^{-\frac{1}{2}}$ . With this choice of ${\mathbf{W}}_{\mathrm{BB}}$ , the hybrid combiner ${\mathbf{W}}$ satisfies ${\mathbf{W}}^{*}{\mathbf{W}}={\mathbf{I}}$ for any ${\mathbf{W}}_{\mathrm{RF}}$ . We will use this unitary hybrid combiner ${\mathbf{W}}$ throughout this paper.
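The whitening choice of the baseband combiner can be sketched as follows, computing the inverse matrix square root through an eigendecomposition of ${\mathbf{W}}_{\mathrm{RF}}^{*}{\mathbf{W}}_{\mathrm{RF}}$. The array sizes and the random seed are assumptions for the example, and the code presumes ${\mathbf{W}}_{\mathrm{RF}}$ has full column rank so that all eigenvalues are positive.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ant, m_rf = 16, 4

# Analog combiner with unit-amplitude entries and random phases.
W_rf = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, (n_ant, m_rf)))

# Baseband combiner W_BB = (W_RF^* W_RF)^(-1/2) via the Hermitian eigendecomposition.
gram = W_rf.conj().T @ W_rf
evals, evecs = np.linalg.eigh(gram)
W_bb = evecs @ np.diag(evals ** -0.5) @ evecs.conj().T

W = W_rf @ W_bb                                   # hybrid combiner
noise_cov_shape = W.conj().T @ W                  # proportional to the effective noise covariance
```

Because `noise_cov_shape` is the identity up to numerical precision, the effective noise covariance $\sigma^{2}{\mathbf{W}}^{*}{\mathbf{W}}$ reduces to $\sigma^{2}{\mathbf{I}}$.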

III-C Tensor representation of channels and received signals

In this subsection, we show that the time-varying frequency-selective channel in (9) can be represented as a low-rank third-order tensor. Let ${\mathbf{a}}_{\ell}={\mathbf{a}}(\phi_{\ell})$ , ${\mathbf{c}}_{\ell}=\begin{bmatrix}c_{1,\ell}&\cdots&c_{{K_{\mathrm{sbcr}}},\ell}\end{bmatrix}^{\mathsf{T}}$ , and ${\mathbf{g}}_{\ell}=\begin{bmatrix}g_{1,\ell}&\cdots&g_{{T_{\mathrm{frm}}},\ell}\end{bmatrix}^{\mathsf{T}}$ for $\ell=1,...,{L_{\mathrm{ch}}}$ . With ${\mathbf{a}}_{\ell}$ , ${\mathbf{c}}_{\ell}$ , and ${\mathbf{g}}_{\ell}$ , let us define ${\mathbf{A}}$ , ${\mathbf{C}}$ , and ${\mathbf{G}}$ as ${\mathbf{A}}=\begin{bmatrix}{\mathbf{a}}_{1}&\cdots&{\mathbf{a}}_{L_{\mathrm{ch}}}\end{bmatrix}$ , ${\mathbf{C}}=\begin{bmatrix}{\mathbf{c}}_{1}&\cdots&{\mathbf{c}}_{L_{\mathrm{ch}}}\end{bmatrix}$ , and ${\mathbf{G}}=\begin{bmatrix}{\mathbf{g}}_{1}&\cdots&{\mathbf{g}}_{L_{\mathrm{ch}}}\end{bmatrix}$ .

The ${K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}$ channel frequency response vectors ${\mathbf{h}}_{t,k}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times 1}$ for frames $t=1,...,{T_{\mathrm{frm}}}$ and subcarriers $k=1,...,{K_{\mathrm{sbcr}}}$ in (9) can be regarded as ${T_{\mathrm{frm}}}{K_{\mathrm{sbcr}}}$ mode-1 fibers in a third-order tensor $\mathcal{H}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{K_{\mathrm{sbcr}}}\times{T_{\mathrm{frm}}}}$ as shown in Fig. 1. In CPD form, the rank-${L_{\mathrm{ch}}}$ third-order tensor is

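$$\mathcal{H}=\llbracket{\mathbf{A}},{\mathbf{C}},{\mathbf{G}}\rrbracket=\mathcal{I}\times_{1}{\mathbf{A}}\times_{2}{\mathbf{C}}\times_{3}{\mathbf{G}}=\sum_{\ell=1}^{{L_{\mathrm{ch}}}}{\mathbf{a}}_{\ell}\circ{\mathbf{c}}_{\ell}\circ{\mathbf{g}}_{\ell}.\tag{12}$$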

In this CPD form in (12), the channel tensor $\mathcal{H}$ is factorized into the three matrices, ${\mathbf{A}}$ , ${\mathbf{C}}$ and ${\mathbf{G}}$ , which are called factor matrices. The mode-1 factor matrix ${\mathbf{A}}$ is associated with antennas in the space domain, the mode-2 factor matrix ${\mathbf{C}}$ with subcarriers in the frequency domain, and the mode-3 factor matrix ${\mathbf{G}}$ with frames in the time domain.

From the channel frequency response tensor model, the received signal at baseband, ${\mathbf{y}}_{t,k},\forall t,k$ in (11) can also be represented as a third-order tensor as

$$\mathcal{Y}=\mathcal{X}+\mathcal{N},\tag{13}$$

where $\mathcal{N}$ is the noise tensor whose mode-1 fibers are IID Gaussian vectors with $\mathcal{CN}(\mathbf{0},\sigma^{2}{\mathbf{I}})$ . The signal tensor $\mathcal{X}$ is given by

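$$\mathcal{X}=\mathcal{H}\times_{1}{\mathbf{W}}^{*}\in\mathbb{C}^{{M_{\mathrm{RF}}}\times{K_{\mathrm{sbcr}}}\times{T_{\mathrm{frm}}}}.\tag{14}$$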

The tensor representation in (14) can also be expressed by using a mode-1 matricization as ${\mathbf{X}}_{(1)}={\mathbf{W}}^{*}{\mathbf{H}}_{(1)}$, where ${\mathbf{H}}_{(1)}={\mathbf{A}}\left({\mathbf{G}}\odot{\mathbf{C}}\right)^{\mathsf{T}}$ is the mode-1 matricization of $\mathcal{H}$ shown in Fig. 1. Let ${\mathbf{B}}=\begin{bmatrix}{\mathbf{b}}_{1}&\cdots&{\mathbf{b}}_{L_{\mathrm{ch}}}\end{bmatrix}={\mathbf{W}}^{*}{\mathbf{A}}\in\mathbb{C}^{{M_{\mathrm{RF}}}\times{L_{\mathrm{ch}}}}$, which can be regarded as the effective array response matrix from the viewpoint of the baseband estimator. The CPD of $\mathcal{X}$ is given by

$$\mathcal{X}=\llbracket{\mathbf{B}},{\mathbf{C}},{\mathbf{G}}\rrbracket.\tag{15}$$

Note that both the time-varying frequency-selective channel $\mathcal{H}$ and its associated baseband received signal part $\mathcal{X}$ are third-order tensors with rank ${L_{\mathrm{ch}}}$ .
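The tensor relations of this subsection can be verified on random data. In the sketch below, the factor matrices are random complex matrices rather than physically generated ones, the combiner is a stand-in with orthonormal columns obtained from a QR factorization, and the helper names `crandn` and `khatri_rao` are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ant, m_rf, K, T, L = 8, 3, 6, 5, 2

def crandn(*shape):
    """Random complex Gaussian array (illustrative stand-in for channel factors)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def khatri_rao(P, Q):
    """Column-wise Khatri-Rao product: column l is kron(P[:, l], Q[:, l])."""
    return (P[:, None, :] * Q[None, :, :]).reshape(-1, P.shape[1])

A, C, G = crandn(n_ant, L), crandn(K, L), crandn(T, L)
W = np.linalg.qr(crandn(n_ant, m_rf))[0]          # stand-in combiner with W^* W = I

H = np.einsum("nl,kl,tl->nkt", A, C, G)           # channel tensor [[A, C, G]]
X = np.einsum("mn,nkt->mkt", W.conj().T, H)       # mode-1 product with W^*
B = W.conj().T @ A                                # effective array response matrix
X_cpd = np.einsum("ml,kl,tl->mkt", B, C, G)       # [[B, C, G]]

H1 = H.reshape(n_ant, -1, order="F")              # mode-1 unfolding of H
X1 = X.reshape(m_rf, -1, order="F")               # mode-1 unfolding of X
```

Comparing `X` with `X_cpd`, `H1` with `A @ khatri_rao(G, C).T`, and `X1` with `W.conj().T @ H1` confirms the three identities stated in the text for this toy instance.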

IV Spatial channel covariance estimation based on tensor decomposition

The spatial channel covariance matrix can be estimated from the sample covariance of ${K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}$ mode-1 fibers in $\mathcal{H}$ . Since we assume that the elements in $\mathcal{H}$ have zero mean, the sample covariance of the mode-1 fibers becomes

[TABLE]

The goal of the spatial channel covariance estimation for the hybrid architecture is to calculate ${\mathbf{R}}_{{\mathbf{h}}}$ of $\mathcal{H}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{K_{\mathrm{sbcr}}}\times{T_{\mathrm{frm}}}}$ from the baseband received signal $\mathcal{Y}\in\mathbb{C}^{{M_{\mathrm{RF}}}\times{K_{\mathrm{sbcr}}}\times{T_{\mathrm{frm}}}}$ , which has smaller dimensions than $\mathcal{H}$ . In this section, we propose an estimation method that has three steps. In the first step, the measurement tensor $\mathcal{Y}$ is decomposed into three factor matrices: $\hat{{\mathbf{B}}}$ , $\hat{{\mathbf{C}}}$ , and $\hat{{\mathbf{G}}}$ in a CPD form. Note that the three factor matrices obtained in the first step are strongly related to the actual factor matrices ${\mathbf{B}}$ , ${\mathbf{C}}$ , and ${\mathbf{G}}$ , but are not identical. The relationship between these matrices is further explained in Section IV-A. In the second step, the estimate of the actual factor matrix ${\mathbf{A}}$ of the channel tensor $\mathcal{H}$ is obtained from $\hat{{\mathbf{B}}}$ , which is denoted by $\hat{{\mathbf{A}}}$ . The spatial channel covariance matrix is calculated from $\hat{{\mathbf{A}}}$ , $\hat{{\mathbf{C}}}$ , and $\hat{{\mathbf{G}}}$ in the last step. Each step is explained in detail in the following subsections.

IV-A First step: factorization of $\mathcal{Y}$ in a CPD form

If the factor matrices ${\mathbf{B}}$ , ${\mathbf{C}}$ , and ${\mathbf{G}}$ are given, we can exactly calculate the signal part $\mathcal{X}$ of the measurement tensor $\mathcal{Y}$ . The reverse does not hold in general; perfect reconstruction of the original ${\mathbf{B}}$ , ${\mathbf{C}}$ , and ${\mathbf{G}}$ from an arbitrary $\mathcal{X}$ is not guaranteed. There is, however, a special case where the factor matrices can be reconstructed from $\mathcal{X}$ . If the rank of a higher-order (more than second-order) tensor is low, its CPD is unique under some mild conditions [28]. The uniqueness of the CPD means that there exists only one possible combination of rank-one tensors that sum to the given tensor, up to two types of indeterminacy: scaling and permutation. The scaling indeterminacy means that the columns in each factor matrix can be scaled arbitrarily, i.e., the CPD form in (15) can be rewritten as

[TABLE]

as long as $\delta_{{\mathbf{b}},\ell}\delta_{{\mathbf{c}},\ell}\delta_{{\mathbf{g}},\ell}=1$ for $\ell=1,...,{L_{\mathrm{ch}}}$ . The CPD form in (17) can be expressed in a multilinear product format as

[TABLE]

where $\mathbf{\Delta}_{{\mathbf{B}}}$ , $\mathbf{\Delta}_{{\mathbf{C}}}$ , and $\mathbf{\Delta}_{{\mathbf{G}}}$ are any diagonal matrices satisfying $\mathbf{\Delta}_{{\mathbf{B}}}\mathbf{\Delta}_{{\mathbf{C}}}\mathbf{\Delta}_{{\mathbf{G}}}={\mathbf{I}}$ . The permutation indeterminacy means that the column vectors in each factor matrix can be reordered with a permutation matrix that is common to all the modes, i.e., the CPD form in (15) can also be represented as

[TABLE]

for any permutation matrix $\mathbf{\Pi}$ . From (18) and (19), the general form of the CPD of $\mathcal{X}$ becomes

[TABLE]

If we set the indeterminacy issue aside, we can exactly reconstruct its factor matrices by leveraging the uniqueness of the CPD. In this subsection, we focus on how to find a CPD solution. We will discuss how to deal with the indeterminacy in the following subsections.
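The two indeterminacies can be illustrated numerically. The sketch below, with arbitrary sizes and random factors chosen only for illustration, applies a common column permutation and per-mode diagonal scalings whose product is the identity, and checks that the resulting tensor is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, T, L = 4, 6, 5, 3  # illustrative sizes (assumption)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def cpd(B, C, G):
    """Sum of rank-one terms b_l o c_l o g_l."""
    return np.einsum("ml,kl,tl->mkt", B, C, G)

B, C, G = crandn(M, L), crandn(K, L), crandn(T, L)
X = cpd(B, C, G)

# Permutation matrix shared by all three modes.
Pi = np.eye(L)[:, rng.permutation(L)]

# Diagonal scalings constrained so that Delta_B Delta_C Delta_G = I.
d_b, d_c = crandn(L), crandn(L)
d_g = 1.0 / (d_b * d_c)

B2 = B @ Pi @ np.diag(d_b)
C2 = C @ Pi @ np.diag(d_c)
G2 = G @ Pi @ np.diag(d_g)

# The permuted and rescaled factors reproduce the same tensor.
print(np.allclose(X, cpd(B2, C2, G2)))
```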

Given a received signal tensor $\mathcal{Y}$ , the problem of finding its CPD form is formulated as

[TABLE]

Many algorithms are known for solving the CPD problem in (21). One of them is the alternating least squares (ALS) algorithm [27]. By rewriting the objective function in (21) in matrix form as

[TABLE]

the ALS algorithm first finds the mode-1 factor matrix ${\mathring{{\mathbf{B}}}}$ assuming that the mode-2 and mode-3 factor matrices, ${\mathring{{\mathbf{C}}}}$ and ${\mathring{{\mathbf{G}}}}$ are fixed. This subproblem is formulated as

[TABLE]

The solution to (23) can be found by using the least squares algorithm as

[TABLE]

Similar to (23), the mode-2 factor matrix ${\mathring{{\mathbf{C}}}}$ and the mode-3 factor matrix ${\mathring{{\mathbf{G}}}}$ can each be calculated by fixing the other two factor matrices. The ALS algorithm iterates the three steps until the objective function converges. Because each least squares update cannot increase the objective, the objective value is guaranteed to converge, although the converged solution may not be a global optimum.
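The alternating updates can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used for the reported results: the function name `cp_als`, the random initialization, the fixed iteration count, and the unfolding convention are all choices made here for the example.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product: column l equals kron(P[:, l], Q[:, l])."""
    return (P[:, None, :] * Q[None, :, :]).reshape(P.shape[0] * Q.shape[0], -1)

def unfold(T, mode):
    """Mode-n matricization with the lowest remaining index varying fastest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1, order="F")

def cp_als(Y, rank, n_iter=200, seed=0):
    """Rank-`rank` CPD of a third-order tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    B, C, G = (rng.standard_normal((n, rank)) + 1j * rng.standard_normal((n, rank))
               for n in Y.shape)
    Y1, Y2, Y3 = (unfold(Y, m) for m in range(3))
    errors = []
    for _ in range(n_iter):
        # Each update solves a linear least squares problem with two factors fixed.
        B = np.linalg.lstsq(khatri_rao(G, C), Y1.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(G, B), Y2.T, rcond=None)[0].T
        G = np.linalg.lstsq(khatri_rao(C, B), Y3.T, rcond=None)[0].T
        resid = Y1 - B @ khatri_rao(G, C).T
        errors.append(np.linalg.norm(resid) / np.linalg.norm(Y1))
    return B, C, G, errors

# Illustrative noiseless rank-3 tensor (sizes are assumptions).
rng = np.random.default_rng(2)
M, K, T, L = 6, 8, 7, 3
B0, C0, G0 = (rng.standard_normal((n, L)) + 1j * rng.standard_normal((n, L))
              for n in (M, K, T))
Y = np.einsum("ml,kl,tl->mkt", B0, C0, G0)

B_hat, C_hat, G_hat, errors = cp_als(Y, rank=L)
print(errors[0], errors[-1])
```

The recorded relative error is non-increasing across sweeps, which reflects the convergence property of the objective; the returned factors match the true ones only up to the scaling and permutation indeterminacy.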

The solution after convergence provides an estimate of the CPD form of $\mathcal{X}$ as

[TABLE]

Note that $\hat{\mathcal{X}}$ is an estimate of the actual $\mathcal{X}$ in (20). Due to the scaling and permutation indeterminacy, the estimated factor matrices $\hat{{\mathbf{B}}}$ , $\hat{{\mathbf{C}}}$ , and $\hat{{\mathbf{G}}}$ obtained in the first step are related to the actual factor matrices ${\mathbf{B}}$ , ${\mathbf{C}}$ , and ${\mathbf{G}}$ as

[TABLE]

where $\mathbf{\Delta}_{{\mathbf{B}}}$ , $\mathbf{\Delta}_{{\mathbf{C}}}$ , and $\mathbf{\Delta}_{{\mathbf{G}}}$ are complex-valued diagonal matrices that satisfy $\mathbf{\Delta}_{{\mathbf{B}}}\mathbf{\Delta}_{{\mathbf{C}}}\mathbf{\Delta}_{{\mathbf{G}}}={\mathbf{I}}$ , and $\mathbf{\Omega}_{{\mathbf{B}}}$ , $\mathbf{\Omega}_{{\mathbf{C}}}$ , and $\mathbf{\Omega}_{{\mathbf{G}}}$ denote the estimation errors caused by CPD.

IV-B Second step: estimation of ${\mathbf{A}}$ of the channel tensor $\mathcal{H}$

The goal of the second step is to estimate ${\mathbf{A}}\mathbf{\Pi}$ from $\hat{{\mathbf{B}}}$ that is obtained in the first step as described in Section IV-A. Let $\breve{{\mathbf{A}}}={\mathbf{A}}\mathbf{\Pi}$ . We will show in Section IV-C that we do not need to obtain $\mathbf{\Pi}$ explicitly to estimate the spatial channel covariance. Let ${\mathbf{a}}(\hat{\phi}_{\ell})$ denote the $\ell$ -th column in the estimate of $\breve{{\mathbf{A}}}$ . The problem of finding $\hat{\phi}_{\ell}$ that minimizes the angle between $\hat{{\mathbf{b}}}_{\ell}$ and ${\mathbf{W}}^{*}{\mathbf{a}}(\hat{\phi}_{\ell})$ is represented as

[TABLE]

and its solution can be found by one-dimensional search methods with respect to $\phi$ , which is a continuous variable. Since we assume ULA, the solution can be obtained more efficiently by using a polynomial equation similar to the Root-MUSIC algorithm [37]. By letting $z=e^{\frac{j2\pi d_{\mathrm{a}}\sin(\phi)}{\lambda}}$ , the array response vector ${\mathbf{a}}(\phi)$ can be denoted by ${\mathbf{a}}(z)=\begin{bmatrix}1&z&\cdots&z^{{N_{\mathrm{ant}}}-1}\end{bmatrix}^{T}$ . Then, the optimization problem in (27) is rewritten as

[TABLE]

Let ${\mathbf{Q}}_{\ell}={\mathbf{W}}\left(\|\hat{{\mathbf{b}}}_{\ell}\|^{2}{\mathbf{I}}-\hat{{\mathbf{b}}}_{\ell}\hat{{\mathbf{b}}}_{\ell}^{*}\right){\mathbf{W}}^{*}$ . The numerator in (28) is represented as a polynomial with respect to $z$ and becomes zero in the noiseless case, i.e.,

[TABLE]

Note that if $\omega$ is a root of (29), then $1/\omega^{*}$ is also its root, and there are $({N_{\mathrm{ant}}}-1)$ roots within the unit circle. Let $\omega_{1},...,\omega_{{N_{\mathrm{ant}}}-1}$ denote the $({N_{\mathrm{ant}}}-1)$ roots normalized by their amplitudes. Then, the solution to (28) can be obtained by searching over the discrete set $\{\omega_{1},...,\omega_{{N_{\mathrm{ant}}}-1}\}$ of $({N_{\mathrm{ant}}}-1)$ candidates, i.e., $\hat{z}_{{\ell}}=\arg\max_{z\in\{\omega_{1},...,\omega_{{N_{\mathrm{ant}}}-1}\}}\frac{|\hat{{\mathbf{b}}}_{\ell}^{*}{\mathbf{W}}^{*}{\mathbf{a}}(z)|^{2}}{\|{\mathbf{W}}^{*}{\mathbf{a}}(z)\|^{2}}$ . After obtaining $\hat{z}_{\ell}$ , the diagonal elements in $\mathbf{\Delta}_{{\mathbf{B}}}$ can be estimated as

[TABLE]

Let $\hat{\mathbf{\Delta}}_{{\mathbf{B}}}$ and $\hat{{\mathbf{A}}}$ denote the estimates of $\mathbf{\Delta}_{{\mathbf{B}}}$ and $\breve{{\mathbf{A}}}$ , respectively. Then, $\hat{\mathbf{\Delta}}_{{\mathbf{B}}}$ and $\hat{{\mathbf{A}}}$ can be represented as $\hat{\mathbf{\Delta}}_{{\mathbf{B}}}=\mathrm{diag}\left(\begin{bmatrix}\hat{\delta}_{{\mathbf{B}},1}&\cdots&\hat{\delta}_{{\mathbf{B}},{L_{\mathrm{ch}}}}\end{bmatrix}\right)$ and $\hat{{\mathbf{A}}}=\begin{bmatrix}{\mathbf{a}}(\hat{z}_{1})&\cdots&{\mathbf{a}}(\hat{z}_{L_{\mathrm{ch}}})\end{bmatrix}$ . Note that $\hat{{\mathbf{A}}}$ is the estimate of $\breve{{\mathbf{A}}}={\mathbf{A}}\mathbf{\Pi}$ , in which the permutation matrix $\mathbf{\Pi}$ is not known.
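The root-based search can be sketched as follows for a single column $\hat{{\mathbf{b}}}_{\ell}$ in the noiseless case. The sketch assumes half-wavelength spacing ($d_{\mathrm{a}}=\lambda/2$), a random stand-in for ${\mathbf{W}}$, and the hypothetical helper names `steering` and `estimate_z`. As a simplification that differs from the description above, it scores every root after projecting it onto the unit circle instead of first selecting the roots inside the unit circle.

```python
import numpy as np

def steering(z, n_ant):
    """ULA response a(z) = [1, z, ..., z^(n_ant-1)]^T."""
    return z ** np.arange(n_ant)

def estimate_z(b_hat, W):
    """Root the polynomial a(z)^* Q a(z) and pick the best unit-modulus candidate."""
    n_ant, m_rf = W.shape
    Q = W @ (np.linalg.norm(b_hat) ** 2 * np.eye(m_rf)
             - np.outer(b_hat, b_hat.conj())) @ W.conj().T
    # On |z| = 1, a(z)^* Q a(z) = sum_d c_d z^d, where c_d is the sum of the
    # d-th diagonal of Q. Multiplying by z^(n_ant-1) gives an ordinary polynomial;
    # coefficients are listed from the highest power to the lowest.
    coeffs = [np.trace(Q, offset=d) for d in range(n_ant - 1, -n_ant, -1)]
    roots = np.roots(coeffs)
    candidates = roots / np.abs(roots)  # normalize by amplitude

    def score(z):
        v = W.conj().T @ steering(z, n_ant)
        return np.abs(b_hat.conj() @ v) ** 2 / np.linalg.norm(v) ** 2

    return max(candidates, key=score)

rng = np.random.default_rng(4)
N_ant, M_rf = 8, 4  # illustrative sizes (assumption)
W = (rng.standard_normal((N_ant, M_rf))
     + 1j * rng.standard_normal((N_ant, M_rf))) / np.sqrt(2)

phi_true = 0.3                                   # radians
z_true = np.exp(1j * np.pi * np.sin(phi_true))   # d_a = lambda / 2 (assumption)
# Noiseless column of B_hat with an arbitrary complex scaling.
b_hat = (0.7 - 1.2j) * (W.conj().T @ steering(z_true, N_ant))

z_hat = estimate_z(b_hat, W)
phi_hat = np.arcsin(np.angle(z_hat) / np.pi)
print(phi_true, phi_hat)
```

Because the objective is invariant to the complex scaling of $\hat{{\mathbf{b}}}_{\ell}$, the angle is recovered without knowing the corresponding diagonal element of $\mathbf{\Delta}_{{\mathbf{B}}}$.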

IV-C Third step: estimation of the spatial channel covariance matrix from $\hat{{\mathbf{A}}}$ , $\hat{{\mathbf{C}}}$ , and $\hat{{\mathbf{G}}}$

While the estimate of ${\mathbf{A}}$ is given by $\hat{{\mathbf{A}}}$ with only permutation indeterminacy at the second step in Section IV-B, the estimated factor matrices $\hat{{\mathbf{G}}}$ and $\hat{{\mathbf{C}}}$ at the first step in Section IV-A still contain both scaling and permutation indeterminacy. Consequently, it is impossible to simply replace ${\mathbf{G}}$ and ${\mathbf{C}}$ by $\hat{{\mathbf{G}}}$ and $\hat{{\mathbf{C}}}$ in (16) without considering $\mathbf{\Delta}_{{\mathbf{C}}}$ , $\mathbf{\Delta}_{{\mathbf{G}}}$ , and $\mathbf{\Pi}$ .

Let $\tilde{{\mathbf{A}}}$ , $\tilde{{\mathbf{C}}}$ , and $\tilde{{\mathbf{G}}}$ denote the estimates of the actual ${\mathbf{A}}$ , ${\mathbf{C}}$ , and ${\mathbf{G}}$ without any indeterminacy, which are defined as $\tilde{{\mathbf{A}}}=\hat{{\mathbf{A}}}\mathbf{\Pi}^{*}$ , $\tilde{{\mathbf{C}}}=\hat{{\mathbf{C}}}\mathbf{\Delta}_{{\mathbf{C}}}^{-1}\mathbf{\Pi}^{*}$ , and $\tilde{{\mathbf{G}}}=\hat{{\mathbf{G}}}\mathbf{\Delta}_{{\mathbf{G}}}^{-1}\mathbf{\Pi}^{*}$ . Note that $\mathbf{\Pi}^{-1}=\mathbf{\Pi}^{*}$ for any permutation matrix $\mathbf{\Pi}$ . Then, the estimate of the sample spatial channel covariance matrix in (16) can be calculated from $\tilde{{\mathbf{A}}}$ , $\tilde{{\mathbf{C}}}$ , and $\tilde{{\mathbf{G}}}$ as

[TABLE]

where $(a)$ comes from the fact that $\mathbf{\Delta}_{{\mathbf{B}}}\mathbf{\Delta}_{{\mathbf{C}}}\mathbf{\Delta}_{{\mathbf{G}}}={\mathbf{I}}$ , and $\hat{\mathbf{\Delta}}_{{\mathbf{B}}}$ is the estimate of $\mathbf{\Delta}_{{\mathbf{B}}}$ .
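The cancellation of the unknown permutation and scalings can be checked numerically. The sketch below assumes error-free factors of the form $\hat{{\mathbf{A}}}={\mathbf{A}}\mathbf{\Pi}$, $\hat{{\mathbf{C}}}={\mathbf{C}}\mathbf{\Pi}\mathbf{\Delta}_{{\mathbf{C}}}$, and $\hat{{\mathbf{G}}}={\mathbf{G}}\mathbf{\Pi}\mathbf{\Delta}_{{\mathbf{G}}}$, which is consistent with the definitions of $\tilde{{\mathbf{C}}}$ and $\tilde{{\mathbf{G}}}$ above, and it uses the identity $({\mathbf{G}}\odot{\mathbf{C}})^{\mathsf{T}}({\mathbf{G}}\odot{\mathbf{C}})^{\mathsf{C}}=({\mathbf{G}}^{\mathsf{T}}{\mathbf{G}}^{\mathsf{C}})\circ({\mathbf{C}}^{\mathsf{T}}{\mathbf{C}}^{\mathsf{C}})$ with $\circ$ the elementwise product. The exact arrangement of terms may differ from the expression in (31); sizes and random values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N_ant, K, T, L = 8, 6, 5, 3  # illustrative sizes (assumption)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

A, C, G = crandn(N_ant, L), crandn(K, L), crandn(T, L)

# Sample covariance of the K*T mode-1 fibers of the channel tensor.
H = np.einsum("nl,kl,tl->nkt", A, C, G)
H1 = H.reshape(N_ant, -1, order="F")
R_sample = H1 @ H1.conj().T / (K * T)

# Factors as they would be available after the first two steps (no estimation error).
Pi = np.eye(L)[:, rng.permutation(L)]
d_c, d_g = crandn(L), crandn(L)
d_b = 1.0 / (d_c * d_g)  # Delta_B Delta_C Delta_G = I
A_hat = A @ Pi
C_hat = C @ Pi @ np.diag(d_c)
G_hat = G @ Pi @ np.diag(d_g)
D_b = np.diag(d_b)

# Gram matrix of the Khatri-Rao product via the elementwise-product identity.
gram = (G_hat.T @ G_hat.conj()) * (C_hat.T @ C_hat.conj())
R_est = A_hat @ D_b @ gram @ D_b.conj().T @ A_hat.conj().T / (K * T)

print(np.allclose(R_sample, R_est))
```

Neither $\mathbf{\Pi}$ nor $\mathbf{\Delta}_{{\mathbf{C}}}$ and $\mathbf{\Delta}_{{\mathbf{G}}}$ individually appear in the final expression; only $\mathbf{\Delta}_{{\mathbf{B}}}$ is needed.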

V Relative precoding efficiency for spatial channel covariance estimation

The MSE or normalized MSE (NMSE) is typically used as a performance metric for channel estimation methods. Other metrics, though, are more relevant for spatial channel covariance estimation. This is because the dominant eigenvalues and their eigenspaces are more useful for hybrid precoding than the individual elements of the covariance matrix. In this regard, we adopt the performance metric used in [30, 22], which we call relative precoding efficiency (RPE). Let ${\mathbf{R}}_{{\mathbf{h}}}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{N_{\mathrm{ant}}}}$ and $\tilde{{\mathbf{R}}}_{{\mathbf{h}}}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{N_{\mathrm{ant}}}}$ be the spatial channel covariance and its estimate. Let ${\mathbf{U}}$ and $\tilde{{\mathbf{U}}}$ denote the matrices composed of the dominant ${M_{\mathrm{RF}}}$ eigenvectors of ${\mathbf{R}}_{{\mathbf{h}}}$ and $\tilde{{\mathbf{R}}}_{{\mathbf{h}}}$ . The RPE is defined as

[TABLE]

This metric lies between zero and one, i.e., $0\leq\eta\leq 1$ , and higher $\eta$ indicates more accurate estimation. The RPE $\eta$ in (32) is closely related to the relative spectral efficiency of the hybrid beamforming based on $\tilde{{\mathbf{R}}}_{{\mathbf{h}}}$ compared to that of the hybrid beamforming based on ${\mathbf{R}}_{{\mathbf{h}}}$ . Consider a hybrid beamforming system where the analog part is composed of ${\mathbf{U}}$ or $\tilde{{\mathbf{U}}}$ with ${M_{\mathrm{RF}}}$ RF chains as in [9, 10]. For analytical tractability, we ignore the fact that phase shifters are typically used for the analog part. In the low-SNR region, the achievable rate ratio is approximated as

[TABLE]

where $(a)$ comes from the fact that $\ln(1+x)\approx x$ for $x\approx 0$ , and $(b)$ comes from $\mathrm{Tr}({\mathbf{A}}{\mathbf{B}})=\mathrm{Tr}({\mathbf{B}}{\mathbf{A}})$ and ${\mathbf{R}}_{{\mathbf{h}}}=\mathbb{E}[{\mathbf{h}}{\mathbf{h}}^{*}]$ . This result shows that the metric $\eta$ allows us to anticipate the relative loss in achievable rate caused by the estimation error in the low-SNR region.
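For concreteness, the metric can be computed as in the sketch below. It assumes that the RPE takes the trace-ratio form $\eta=\mathrm{Tr}(\tilde{{\mathbf{U}}}^{*}{\mathbf{R}}_{{\mathbf{h}}}\tilde{{\mathbf{U}}})/\mathrm{Tr}({\mathbf{U}}^{*}{\mathbf{R}}_{{\mathbf{h}}}{\mathbf{U}})$, which is consistent with the stated range $0\leq\eta\leq 1$ and with the low-SNR rate ratio discussed above; the helper names, the test covariance, and the perturbation level are hypothetical.

```python
import numpy as np

def dominant_eigvecs(R, m):
    """Eigenvectors of the Hermitian matrix R for its m largest eigenvalues."""
    w, V = np.linalg.eigh(R)
    return V[:, np.argsort(w)[::-1][:m]]

def rpe(R, R_est, m):
    """Assumed RPE form: energy captured by the estimated dominant subspace,
    relative to the energy captured by the true dominant subspace."""
    U, U_est = dominant_eigvecs(R, m), dominant_eigvecs(R_est, m)
    num = np.trace(U_est.conj().T @ R @ U_est).real
    den = np.trace(U.conj().T @ R @ U).real
    return num / den

rng = np.random.default_rng(5)
N_ant, M_rf, L = 16, 4, 3  # illustrative sizes (assumption)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# Low-rank covariance plus a Hermitian perturbation standing in for estimation error.
F = crandn(N_ant, L)
R = F @ F.conj().T
E = crandn(N_ant, N_ant)
R_est = R + 0.5 * (E + E.conj().T)

eta_perfect = rpe(R, R, M_rf)
eta = rpe(R, R_est, M_rf)
print(eta_perfect, eta)
```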

The RPE $\eta$ defined in (32) can be analyzed approximately in large antenna array regimes. Let ${\mathbf{A}}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{L_{\mathrm{ch}}}}$ be a matrix composed of array response vectors and ${\mathbf{R}}_{{\mathbf{g}}}=\mathbb{E}[{\mathbf{g}}{\mathbf{g}}^{*}]\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{L_{\mathrm{ch}}}}$ be the covariance of channel path gains. The spatial channel covariance matrix is represented as

[TABLE]

As ${N_{\mathrm{ant}}}$ becomes large, ${\mathbf{A}}^{*}{\mathbf{A}}\approx{N_{\mathrm{ant}}}{\mathbf{I}}$ , i.e., $\frac{1}{\sqrt{{N_{\mathrm{ant}}}}}{\mathbf{A}}$ becomes semi-unitary asymptotically [38]. If we assume that the path gains $g_{t,\ell}$ are IID complex random variables with zero mean and variance $1/{L_{\mathrm{ch}}}$ for analytical tractability, then (34) can be regarded as the approximate eigenvalue decomposition of ${\mathbf{R}}_{{\mathbf{h}}}$ , i.e., ${\mathbf{U}}\approx\frac{1}{\sqrt{{N_{\mathrm{ant}}}}}{\mathbf{A}}$ . Let $e_{\ell}$ be the AoA estimation error and $\tilde{\phi}_{\ell}=\phi_{\ell}+e_{\ell}$ be the estimated AoA for the $\ell$ -th path. Similar to ${\mathbf{U}}$ , we assume that $\tilde{{\mathbf{U}}}$ is well approximated by $\frac{1}{\sqrt{{N_{\mathrm{ant}}}}}\tilde{{\mathbf{A}}}$ as ${N_{\mathrm{ant}}}$ increases. Then, the RPE $\eta$ becomes

[TABLE]

where we assume that $e_{\ell}$ is small and $\frac{1}{{N_{\mathrm{ant}}}}{\mathbf{a}}^{*}(\phi_{\ell_{1}}){\mathbf{a}}(\phi_{\ell_{2}})\approx 0$ for $\ell_{1}\neq\ell_{2}$ .
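The asymptotic orthogonality assumption can be examined numerically for a ULA. The sketch below assumes half-wavelength spacing and three fixed AoAs chosen for illustration, and reports the largest normalized inner product between distinct array response vectors as ${N_{\mathrm{ant}}}$ grows.

```python
import numpy as np

def steering(phi, n_ant):
    """ULA response with half-wavelength spacing (assumption): z = exp(j*pi*sin(phi))."""
    return np.exp(1j * np.pi * np.sin(phi) * np.arange(n_ant))

def max_cross_correlation(phis, n_ant):
    """Largest |a(phi_i)^* a(phi_j)| / n_ant over pairs of distinct angles."""
    A = np.stack([steering(p, n_ant) for p in phis], axis=1)
    gram = np.abs(A.conj().T @ A) / n_ant
    return np.max(gram - np.diag(np.diag(gram)))

phis = np.deg2rad([-66.0, 13.0, 49.0])  # illustrative AoAs (assumption)
results = {n: max_cross_correlation(phis, n) for n in (16, 64, 1024)}
for n, value in results.items():
    print(n, value)
```

For fixed, distinct angles the normalized cross terms are bounded by a quantity proportional to $1/{N_{\mathrm{ant}}}$, which is why ${\mathbf{A}}^{*}{\mathbf{A}}\approx{N_{\mathrm{ant}}}{\mathbf{I}}$ becomes accurate for large arrays.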

Let $\kappa_{\ell}$ be defined as

[TABLE]

which approximates to $\kappa_{\ell}\approx\frac{\pi d_{\mathrm{a}}}{\lambda}\cos(\phi_{\ell})e_{\ell}$ for small $e_{\ell}$ . In the ULA case, $\eta$ in (35) is given by

[TABLE]

where $(a)$ comes from the second-order approximation of Maclaurin series for small $\kappa_{\ell}$ . Consequently, $1-\mathbb{E}[\eta]$ approximately becomes

[TABLE]

where the CRLB of the $\phi_{\ell}$ estimation $\mathrm{CRLB}(\phi_{\ell})$ will be derived in the following section.

VI Cramér-Rao Lower Bound for the AoA estimation

In this section, we derive the CRLB on the MSE of the AoA estimates. The derivation builds on the methods in [39] and [40]. In [39], all the factor matrices are unstructured, i.e., a factor matrix ${\mathbf{F}}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times M}$ is determined by ${N_{\mathrm{ant}}}M$ complex-valued variables and has no specific structure. In [40], all the factor matrices have a special structure and are determined by only ${L_{\mathrm{ch}}}$ real-valued variables. In this paper, we derive the CRLB for the case where the tensor combines two structured factor matrices, ${\mathbf{A}}$ and ${\mathbf{C}}$ , and one unstructured factor matrix ${\mathbf{G}}$ . We also simplify the complicated CRLB expression to a more compact form.

Focusing on the fact that the channel tensor is determined by ${L_{\mathrm{ch}}}$ AoAs, ${L_{\mathrm{ch}}}$ path delays, and ${L_{\mathrm{ch}}}{T_{\mathrm{frm}}}$ time-varying channel path gains, we define three parameter vectors as $\boldsymbol{\phi}=\begin{bmatrix}\phi_{1}&\cdots&\phi_{{L_{\mathrm{ch}}}}\end{bmatrix}^{\mathsf{T}}$ , $\boldsymbol{\tau}=\begin{bmatrix}\tau_{1}&\cdots&\tau_{{L_{\mathrm{ch}}}}\end{bmatrix}^{\mathsf{T}}$ , and ${\mathbf{g}}=\mathrm{vec}({\mathbf{G}})=\begin{bmatrix}g_{1,1}&\cdots&g_{T,{L_{\mathrm{ch}}}}\end{bmatrix}^{\mathsf{T}}$ . Let $\boldsymbol{\theta}$ denote a column vector that includes all the parameters such that $\boldsymbol{\theta}=\begin{bmatrix}\boldsymbol{\phi}^{\mathsf{T}}&\boldsymbol{\tau}^{\mathsf{T}}&{\mathbf{g}}^{\mathsf{T}}&{\mathbf{g}}^{*}\end{bmatrix}^{\mathsf{T}}$ . Note that ${\mathbf{g}}$ is a complex vector while $\boldsymbol{\phi}$ and $\boldsymbol{\tau}$ are real vectors.

Since the analog combining matrix combined with the baseband post-processing matrix is unitary, the elements in the noise tensor $\mathcal{N}$ become IID circularly symmetric Gaussian with $\mathcal{C}\mathcal{N}(0,\sigma^{2})$ . Consequently, the log-likelihood function of $\boldsymbol{\theta}$ is given by

[TABLE]

Then, the CRLB with respect to the parameter set $\boldsymbol{\theta}$ can be obtained as

[TABLE]

where $\boldsymbol{\Omega}(\boldsymbol{\theta})\in\mathbb{C}^{2{L_{\mathrm{ch}}}({T_{\mathrm{frm}}}+1)\times 2{L_{\mathrm{ch}}}({T_{\mathrm{frm}}}+1)}$ is the complex Fisher information matrix (FIM) defined as

[TABLE]

The FIM $\boldsymbol{\Omega}(\boldsymbol{\theta})$ in (41) is divided into submatrices as

[TABLE]

Each submatrix in (42) is calculated in the following subsections.

VI-A Calculation of $\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\phi}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{L_{\mathrm{ch}}}}$

The partial derivative of $f(\boldsymbol{\theta})$ with respect to $\phi_{\ell}$ is given by

[TABLE]

Let $\acute{{\mathbf{B}}}=\begin{bmatrix}\acute{{\mathbf{b}}}_{1}(\phi_{1})&\cdots&\acute{{\mathbf{b}}}_{{L_{\mathrm{ch}}}}(\phi_{{L_{\mathrm{ch}}}})\end{bmatrix}$ be defined as

[TABLE]

and let ${\mathbf{N}}_{(1)}={\mathbf{Y}}_{(1)}-{\mathbf{B}}({\mathbf{G}}\odot{\mathbf{C}})^{\mathsf{T}}$ be the mode-1 unfolding matrix of $\mathcal{N}$ .

By using (43) and (44), the partial derivative of $f(\boldsymbol{\theta})$ with respect to $\boldsymbol{\phi}$ can be represented as

[TABLE]

Let ${\mathbf{V}}_{{\mathbf{B}}}=\frac{1}{\sigma^{2}}({\mathbf{G}}\odot{\mathbf{C}})^{\mathsf{T}}{\mathbf{N}}_{(1)}^{*}\acute{{\mathbf{B}}}$ and ${\mathbf{d}}_{{\mathbf{B}}}=\mathrm{diag}({\mathbf{V}}_{{\mathbf{B}}})$ . Then, ${\mathbf{d}}_{{\mathbf{B}}}$ can be represented as

[TABLE]

by using the fact that $\mathrm{diag}({\mathbf{A}}^{\mathsf{T}}{\mathbf{B}}{\mathbf{C}})=({\mathbf{C}}\odot{\mathbf{A}})^{\mathsf{T}}\mathrm{vec}({\mathbf{B}})$ when ${\mathbf{A}}$ and ${\mathbf{C}}$ have the same number of columns.
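The identity can be verified numerically with random matrices, as in the short sketch below. Matrix sizes are arbitrary, $\mathrm{vec}(\cdot)$ is taken as column-major stacking, and `khatri_rao` is a helper defined here for the example.

```python
import numpy as np

def khatri_rao(P, Q):
    """Column-wise Kronecker product: column l equals kron(P[:, l], Q[:, l])."""
    return (P[:, None, :] * Q[None, :, :]).reshape(P.shape[0] * Q.shape[0], -1)

rng = np.random.default_rng(8)
p, q, L = 5, 7, 3  # illustrative sizes (assumption)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

A, B, C = crandn(p, L), crandn(p, q), crandn(q, L)

lhs = np.diag(A.T @ B @ C)                        # diag(A^T B C)
rhs = khatri_rao(C, A).T @ B.flatten(order="F")   # (C kr A)^T vec(B)
print(np.allclose(lhs, rhs))
```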

Since we consider IID zero mean circularly symmetric complex Gaussian noise, the covariance matrix and the pseudo-covariance matrix of $\mathrm{vec}({\mathbf{N}}_{(1)}^{*})$ in (46) become

[TABLE]

and

[TABLE]

From (47) and (48), the covariance matrix of the complex vector ${\mathbf{d}}_{{\mathbf{B}}}$ becomes

[TABLE]

and the pseudo-covariance matrix becomes $\tilde{{\mathbf{C}}}_{{\mathbf{d}}_{{\mathbf{B}}}}=\mathbb{E}\left[{\mathbf{d}}_{{\mathbf{B}}}{\mathbf{d}}_{{\mathbf{B}}}^{\mathsf{T}}\right]=\mathbf{0}$ . Consequently, the submatrix $\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\phi}}$ becomes

[TABLE]

VI-B Calculation of $\boldsymbol{\Omega}_{\boldsymbol{\tau}\boldsymbol{\tau}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{L_{\mathrm{ch}}}}$

Let $\acute{{\mathbf{c}}}_{\ell}(\tau_{\ell})\in\mathbb{C}^{{K_{\mathrm{sbcr}}}\times 1}$ be a vector whose $k$ -th element is defined as

[TABLE]

where $p^{{}^{\prime}}_{\mathrm{PS}}(x)$ is the first derivative of $p_{\mathrm{PS}}(x)$ . Then, the partial derivative of $f(\boldsymbol{\theta})$ with respect to $\boldsymbol{\tau}$ can be obtained as

[TABLE]

where $\acute{{\mathbf{C}}}=\begin{bmatrix}\acute{{\mathbf{c}}}_{1}(\tau_{1})&\cdots&\acute{{\mathbf{c}}}_{{L_{\mathrm{ch}}}}(\tau_{{L_{\mathrm{ch}}}})\end{bmatrix}$ . Let ${\mathbf{N}}_{(2)}={\mathbf{Y}}_{(2)}-{\mathbf{C}}({\mathbf{G}}\odot{\mathbf{B}})^{\mathsf{T}}$ be the mode-2 unfolding matrix of $\mathcal{N}$ , and ${\mathbf{V}}_{{\mathbf{C}}}=\frac{1}{\sigma^{2}}({\mathbf{G}}\odot{\mathbf{B}})^{\mathsf{T}}{\mathbf{N}}_{(2)}^{*}\acute{{\mathbf{C}}}$ . Let ${\mathbf{d}}_{{\mathbf{C}}}$ denote

[TABLE]

Similarly to Section VI-A, the submatrix $\boldsymbol{\Omega}_{\boldsymbol{\tau}\boldsymbol{\tau}}$ can be calculated as

[TABLE]

VI-C Calculation of $\boldsymbol{\Omega}_{{\mathbf{g}}{\mathbf{g}}}\in\mathbb{C}^{{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$ , $\boldsymbol{\Omega}_{{\mathbf{g}}{\mathbf{g}}^{\mathsf{C}}}\in\mathbb{C}^{{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$ , and $\boldsymbol{\Omega}_{{\mathbf{g}}^{\mathsf{C}}{\mathbf{g}}^{\mathsf{C}}}\in\mathbb{C}^{{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$

From the fact that $\mathbb{E}\left[\mathrm{vec}({\mathbf{N}}_{(3)}^{\mathsf{C}})\left(\mathrm{vec}({\mathbf{N}}_{(3)}^{\mathsf{C}})\right)^{*}\right]=\sigma^{2}{\mathbf{I}}_{{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}\times{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}}$ , the submatrix $\boldsymbol{\Omega}_{{\mathbf{g}}{\mathbf{g}}}$ becomes

[TABLE]

Since $\mathbb{E}\left[\mathrm{vec}({\mathbf{N}}_{(3)}^{\mathsf{C}})\left(\mathrm{vec}({\mathbf{N}}_{(3)}^{\mathsf{C}})\right)^{\mathsf{T}}\right]=\mathbf{0}_{{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}\times{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}}$ , the submatrix $\boldsymbol{\Omega}_{{\mathbf{g}}{\mathbf{g}}^{\mathsf{C}}}$ is given by

[TABLE]

The submatrix $\boldsymbol{\Omega}_{{\mathbf{g}}^{\mathsf{C}}{\mathbf{g}}^{\mathsf{C}}}$ can be obtained from $\boldsymbol{\Omega}_{{\mathbf{g}}{\mathbf{g}}}$ such that $\boldsymbol{\Omega}_{{\mathbf{g}}^{\mathsf{C}}{\mathbf{g}}^{\mathsf{C}}}=\boldsymbol{\Omega}_{{\mathbf{g}}{\mathbf{g}}}^{\mathsf{C}}$ .

VI-D Calculation of $\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\tau}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{L_{\mathrm{ch}}}}$

The submatrix $\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\tau}}$ is given by

[TABLE]

where ${\mathbf{C}}_{{\mathbf{d}}_{{\mathbf{B}}},{\mathbf{d}}_{{\mathbf{C}}}}=\mathbb{E}\left[{\mathbf{d}}_{{\mathbf{B}}}{\mathbf{d}}_{{\mathbf{C}}}^{*}\right]$ and $\tilde{{\mathbf{C}}}_{{\mathbf{d}}_{{\mathbf{B}}},{\mathbf{d}}_{{\mathbf{C}}}}=\mathbb{E}\left[{\mathbf{d}}_{{\mathbf{B}}}{\mathbf{d}}_{{\mathbf{C}}}^{\mathsf{T}}\right]$ .

To calculate ${\mathbf{C}}_{{\mathbf{d}}_{{\mathbf{B}}},{\mathbf{d}}_{{\mathbf{C}}}}$ , let us first start with calculating the cross-covariance matrix of $\mathrm{vec}({\mathbf{N}}_{(1)}^{*})$ and $\mathrm{vec}({\mathbf{N}}_{(2)}^{*})$ , which are associated with the mode-1 and mode-2 unfolding matrix of $\mathcal{N}$ . Let ${\mathbf{e}}_{i}\in\mathbb{C}^{{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}\times 1}$ be the $i$ -th unit coordinate vector and ${\mathbf{C}}_{{\mathbf{n}}_{(1)},{\mathbf{n}}_{(2)}}=\mathbb{E}\left[\mathrm{vec}({\mathbf{N}}_{(1)}^{*})\left(\mathrm{vec}({\mathbf{N}}_{(2)}^{*})\right)^{*}\right]$ . Using the fact that $[\mathcal{N}]_{m,k,t}$ is expressed in different ways as

[TABLE]

the cross-covariance matrix ${\mathbf{C}}_{{\mathbf{n}}_{(1)},{\mathbf{n}}_{(2)}}$ can be represented as

[TABLE]

Consequently, ${\mathbf{C}}_{{\mathbf{n}}_{(1)},{\mathbf{n}}_{(2)}}\in\mathbb{C}^{{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}\times{M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}}$ is a matrix that has only ${M_{\mathrm{RF}}}{K_{\mathrm{sbcr}}}{T_{\mathrm{frm}}}$ nonzero elements whose amplitudes are equal to $\sigma^{2}$ . From (59), the cross-covariance matrix of ${\mathbf{d}}_{{\mathbf{B}}}$ and ${\mathbf{d}}_{{\mathbf{C}}}$ can be expressed as

[TABLE]

Since $\tilde{{\mathbf{C}}}_{{\mathbf{n}}_{(1)},{\mathbf{n}}_{(2)}}=\mathbb{E}\left[\mathrm{vec}({\mathbf{N}}_{(1)}^{*})\left(\mathrm{vec}({\mathbf{N}}_{(2)}^{*})\right)^{\mathsf{T}}\right]=\mathbf{0}$ , the pseudo-cross-covariance matrix of ${\mathbf{d}}_{{\mathbf{B}}}$ and ${\mathbf{d}}_{{\mathbf{C}}}$ becomes

[TABLE]

From (60) and (61), the submatrix $\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\tau}}$ in (57) can be rewritten as

[TABLE]

VI-E Calculation of $\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$ and $\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}^{\mathsf{C}}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$

The submatrix $\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}}$ is expressed as

[TABLE]

where $\tilde{{\mathbf{C}}}_{{\mathbf{n}}_{(1)},{\mathbf{n}}^{\mathsf{C}}_{(3)}}=\mathbb{E}\left[\mathrm{vec}({\mathbf{N}}_{(1)})\left(\mathrm{vec}({\mathbf{N}}_{(3)})\right)^{\mathsf{T}}\right]=\mathbf{0}$ and

[TABLE]

By using (64), we can further simplify $\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}}$ in (63) as

[TABLE]

The submatrix $\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}^{\mathsf{C}}}$ is expressed as $\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}^{\mathsf{C}}}=\boldsymbol{\Omega}^{\mathsf{C}}_{\boldsymbol{\phi}{\mathbf{g}}}$ .

VI-F Calculation of $\boldsymbol{\Omega}_{\boldsymbol{\tau}{\mathbf{g}}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$ and $\boldsymbol{\Omega}_{\boldsymbol{\tau}{\mathbf{g}}^{\mathsf{C}}}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times{T_{\mathrm{frm}}}{L_{\mathrm{ch}}}}$

Similar to Section VI-E, we can obtain $\boldsymbol{\Omega}_{\boldsymbol{\tau}{\mathbf{g}}}$ and $\boldsymbol{\Omega}_{\boldsymbol{\tau}{\mathbf{g}}^{\mathsf{C}}}$ as

[TABLE]

and $\boldsymbol{\Omega}_{\boldsymbol{\tau}{\mathbf{g}}^{\mathsf{C}}}=\boldsymbol{\Omega}^{\mathsf{C}}_{\boldsymbol{\tau}{\mathbf{g}}}$ .

VI-G CRLB for the $\phi$ estimation

The results of the preceding subsections are summarized as

[TABLE]

Letting $\boldsymbol{\Omega}_{1}=\begin{bmatrix}\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\tau}}&\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}}&\boldsymbol{\Omega}_{\boldsymbol{\phi}{\mathbf{g}}}^{\mathsf{C}}\end{bmatrix}$ and

[TABLE]

the CRLB for the $\phi_{\ell}$ estimation can be expressed in a compact form as

[TABLE]

by using the Schur complement and the matrix inversion lemma.
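The role of the Schur complement can be illustrated with a generic Hermitian positive definite matrix standing in for the FIM. In the sketch below, `O_pp` plays the role of $\boldsymbol{\Omega}_{\boldsymbol{\phi}\boldsymbol{\phi}}$, `O_1` the role of $\boldsymbol{\Omega}_{1}$, and `O_2` the remaining lower-right block; the matrix itself is random and serves only to show that the leading block of the inverse equals the inverse of the Schur complement.

```python
import numpy as np

rng = np.random.default_rng(6)
L, n_other = 3, 7  # AoA parameters vs. all remaining parameters (illustrative sizes)
n = L + n_other

# Random Hermitian positive definite stand-in for the FIM (assumption).
Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
Omega = Z @ Z.conj().T + np.eye(n)

O_pp = Omega[:L, :L]
O_1 = Omega[:L, L:]
O_2 = Omega[L:, L:]

# Inverse of the Schur complement of O_2 in Omega.
schur = O_pp - O_1 @ np.linalg.solve(O_2, O_1.conj().T)
crlb_block = np.linalg.inv(schur)

# Reference: leading L x L block of the full inverse.
full_block = np.linalg.inv(Omega)[:L, :L]

crlb_phi = np.real(np.diag(crlb_block))
print(np.allclose(crlb_block, full_block), crlb_phi)
```

Working with the Schur complement avoids inverting the full matrix when only the bounds on the ${L_{\mathrm{ch}}}$ AoAs are needed.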

VII Comparison with CS-based or MUSIC-based methods

In this section, we explain two other approaches that estimate the spatial channel covariance or subspace for comparison: 1) CS-based methods and 2) MUSIC-based methods.

VII-A Prior work based on CS

The channel frequency response vector in (9) can be represented by using a matrix form as

[TABLE]

where $\mathring{{\mathbf{g}}}_{t}=[{\mathbf{G}}^{\mathsf{T}}]_{:,t}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times 1}$ and $\mathring{{\mathbf{c}}}_{k}=[{\mathbf{C}}^{\mathsf{T}}]_{:,k}\in\mathbb{C}^{{L_{\mathrm{ch}}}\times 1}$ . Let ${\mathbf{A}}_{\mathrm{D}}\in\mathbb{C}^{{N_{\mathrm{ant}}}\times{N_{\mathrm{grid}}}}$ be a dictionary matrix whose ${N_{\mathrm{grid}}}$ columns are composed of the array response vectors associated with a predefined set of AoAs. In the CS framework, the channel model in (70) is rewritten as

[TABLE]

where $\mathring{{\mathbf{g}}}_{\mathrm{D},t}\in\mathbb{C}^{{N_{\mathrm{grid}}}\times 1}$ and $\mathring{{\mathbf{c}}}_{\mathrm{D},k}\in\mathbb{C}^{{N_{\mathrm{grid}}}\times 1}$ are sparse column vectors with ${L_{\mathrm{ch}}}$ nonzero elements of $\mathring{{\mathbf{g}}}_{t}$ and $\mathring{{\mathbf{c}}}_{k}$ in the space domain. The positions of the ${L_{\mathrm{ch}}}$ nonzero elements indicate AoAs, and thus $\mathring{{\mathbf{g}}}_{\mathrm{D},t}$ and $\mathring{{\mathbf{c}}}_{\mathrm{D},k}$ share the same support for all $t$ and $k$ . To exploit the joint sparsity of $\mathring{{\mathbf{g}}}_{\mathrm{D},t}$ and $\mathring{{\mathbf{c}}}_{\mathrm{D},k}$ , we can use CS techniques known as multiple measurement vector (MMV) problems instead of conventional single measurement vector (SMV) problems [19, 20, 21]. While simultaneous orthogonal matching pursuit (SOMP) is known as an adequate algorithm for the general MMV problems, a more advanced CS algorithm was proposed for the spatial channel covariance estimation problem in [22]. We will compare the algorithm in [22] to our tensor-based method.

VII-B Prior work based on MUSIC

In conventional fully-digital architectures, the goal of the MUSIC algorithm is to estimate AoAs from the spatial channel covariance matrix. In other words, the covariance must be known prior to applying the MUSIC algorithm. Note that the spatial channel covariance can be estimated from the covariance of the received signal vectors in fully-digital architectures. Although the spatial channel covariance estimation is not straightforward in hybrid architectures, the MUSIC algorithm can be applied to the subspace estimation problem for hybrid architectures. It is worthwhile to note that the MUSIC-based approach can estimate only the subspace, not the covariance. Since the subspace is sufficient for the hybrid precoder design in some cases, as in SU-MIMO systems, we will compare our proposed work with the MUSIC-based method in terms of subspace estimation. The overall process for the MUSIC-based method is composed of three steps. First, the sample covariance of the baseband received signal vectors ${\mathbf{y}}_{t,k}$ in (11) is estimated for all $t$ and $k$ as

[TABLE]

where ${\mathbf{R}}_{{\mathbf{n}}}=\frac{1}{{T_{\mathrm{frm}}}{K_{\mathrm{sbcr}}}}\left(\sum_{t=1}^{T_{\mathrm{frm}}}\sum_{k=1}^{K_{\mathrm{sbcr}}}{\mathbf{n}}_{t,k}{\mathbf{n}}_{t,k}^{*}\right)$ . Let the SVD of ${\mathbf{R}}_{{\mathbf{y}}}$ be

[TABLE]

where ${\mathbf{U}}_{{\mathbf{x}}}\in\mathbb{C}^{{M_{\mathrm{RF}}}\times{L_{\mathrm{ch}}}}$ is the signal subspace and ${\mathbf{U}}_{{\mathbf{n}}}\in\mathbb{C}^{{M_{\mathrm{RF}}}\times({M_{\mathrm{RF}}}-{L_{\mathrm{ch}}})}$ is the subspace orthogonal to the signal subspace. Let ${\mathbf{b}}_{{\mathbf{W}}}(\phi)={\mathbf{W}}^{*}{\mathbf{a}}(\phi)$ . Since the RF chains can be regarded as the effective antennas from the viewpoint of the estimator at baseband, the vector ${\mathbf{b}}_{{\mathbf{W}}}(\phi)$ can be considered as the effective array response vector with a reduced size. The second step is to find the ${L_{\mathrm{ch}}}$ highest peaks of the function of $\phi$ defined as

[TABLE]

The final step is to reconstruct the subspace of the channel by using the $\phi_{\ell}$'s for $\ell=1,...,{L_{\mathrm{ch}}}$ that are obtained from the subspace of the baseband received signals. The subspace of the channel is given by the subspace of $\begin{bmatrix}{\mathbf{a}}(\phi_{1})&\cdots&{\mathbf{a}}(\phi_{{L_{\mathrm{ch}}}})\end{bmatrix}$ .
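The three steps can be sketched as follows for a noiseless toy case. The pseudo-spectrum used here, $\|{\mathbf{b}}_{{\mathbf{W}}}(\phi)\|^{2}/\|{\mathbf{U}}_{{\mathbf{n}}}^{*}{\mathbf{b}}_{{\mathbf{W}}}(\phi)\|^{2}$, is a commonly used normalized form and is an assumption that may differ from the function in (74). The combiner, the one-degree search grid, the on-grid AoAs, and the random path gains are likewise illustrative.

```python
import numpy as np

def steering(phi, n_ant):
    """ULA response with half-wavelength spacing (assumption)."""
    return np.exp(1j * np.pi * np.sin(phi) * np.arange(n_ant))

rng = np.random.default_rng(7)
N_ant, M_rf, L, n_snap = 16, 6, 2, 200  # illustrative sizes (assumption)

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

W = crandn(N_ant, M_rf)  # stand-in analog combiner
phi_true_deg = np.array([-20.0, 35.0])
A = np.stack([steering(p, N_ant) for p in np.deg2rad(phi_true_deg)], axis=1)

# Step 1: sample covariance of the (noiseless) baseband snapshots.
Y = W.conj().T @ A @ crandn(L, n_snap)
R_y = Y @ Y.conj().T / n_snap

# Subspace orthogonal to the signal subspace: eigenvectors of the smallest eigenvalues.
_, V = np.linalg.eigh(R_y)  # eigenvalues in ascending order
U_n = V[:, : M_rf - L]

# Step 2: evaluate the assumed pseudo-spectrum on a grid and keep the L highest peaks.
grid_deg = np.arange(-89.0, 90.0, 1.0)

def spectrum(phi):
    b_w = W.conj().T @ steering(phi, N_ant)  # effective array response
    return np.linalg.norm(b_w) ** 2 / np.linalg.norm(U_n.conj().T @ b_w) ** 2

P = np.array([spectrum(p) for p in np.deg2rad(grid_deg)])
is_peak = (P[1:-1] > P[:-2]) & (P[1:-1] > P[2:])
peak_idx = np.flatnonzero(is_peak) + 1
top = peak_idx[np.argsort(P[peak_idx])[-L:]]
phi_hat_deg = np.sort(grid_deg[top])

# Step 3: the channel subspace is spanned by the array responses at the estimated AoAs.
A_hat = np.stack([steering(p, N_ant) for p in np.deg2rad(phi_hat_deg)], axis=1)
print(phi_hat_deg)
```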

The CRLB for the $\phi_{\ell}$'s in the MUSIC-based method case is given by [41]

[TABLE]

where $\acute{{\mathbf{B}}}$ is defined in (44) and ${\mathbf{Z}}_{t,k}$ is defined as ${\mathbf{Z}}_{t,k}=\mathrm{diag}\left([{\mathbf{G}}^{\mathsf{T}}]_{:,t}\circledcirc[{\mathbf{C}}^{\mathsf{T}}]_{:,k}\right)$ . The bound of the RPE in the MUSIC-based method case can also be obtained from (38) as in the tensor-based method case.

VIII Simulation results

In this section, we numerically evaluate the CRLB analysis in Section VI. We also present simulation results to demonstrate the performance of the proposed spatial channel covariance estimation algorithms based on CPD of higher-order tensors.

VIII-A Analytical results on CRLB

In Fig. 2, we show the MSE of the estimation of $\boldsymbol{\phi}$ . We also compare the MSE results to the CRLB derived in Section VI for ${N_{\mathrm{ant}}}=64$ , ${M_{\mathrm{RF}}}=8$ , ${L_{\mathrm{ch}}}=6$ , ${T_{\mathrm{frm}}}=20$ , and ${K_{\mathrm{sbcr}}}=128$ . We assume that the path gains $g_{t,\ell}$ are generated from $\mathcal{CN}(0,1/{L_{\mathrm{ch}}})$ and $p_{\mathrm{PS}}(\tau)=\mathrm{sinc}(\tau/T_{s})$ . Since the CRLB depends on the deterministic values of the AoAs and path delays, we set the values to $\boldsymbol{\phi}=[-66,13,49,-7,81,62]$ in degrees and $\boldsymbol{\tau}/T_{s}=[0,4.34,7.13,17.05,21.08,25.73]$ for reproducibility. We can see that the proposed method achieves an MSE that is close to its theoretical lower bound in the moderate- and high-SNR regions.

Fig. 2(a) compares the proposed tensor-based method with the MUSIC-based method in terms of the MSE( $\phi$ ). In addition to numerical results, the analytical results indicate the superiority of the tensor-based method over the MUSIC-based method. The metric $1-\mathbb{E}[\eta]$ is plotted in Fig. 2(b) with the lower bound of its approximation derived in Section V. As shown in Section V, the RPE is closely related to MSE( $\phi$ ) and its CRLB.

Fig. 3 shows the relationship between the CRLB of MSE( $\phi$ ) and system design parameters such as ${T_{\mathrm{frm}}}$ , ${K_{\mathrm{sbcr}}}$ , and ${M_{\mathrm{RF}}}$ as well as the SNR. The CRLB( $\phi$ ) decreases linearly in dB with the SNR, as shown in Fig. 3(d), whereas the other parameters have a diminishing effect on the CRLB as their values increase. As shown in Fig. 3(c), the CRLB of the proposed method approaches that of the MUSIC-based method as ${M_{\mathrm{RF}}}$ increases.
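One way to see why the SNR dependence in Fig. 3(d) is a straight line in dB is the following sketch. It assumes that the unknown parameters $\boldsymbol{\theta}$ enter only through the mean $\boldsymbol{\mu}(\boldsymbol{\theta})$ of a circularly symmetric complex Gaussian observation with covariance $\sigma^{2}{\mathbf{I}}$, with $\phi_{\ell}$ taken as the $\ell$-th entry of $\boldsymbol{\theta}$ and the signal power held fixed. The symbols ${\mathbf{J}}$, ${\mathbf{D}}$, $\boldsymbol{\mu}$, and $c$ are introduced here for illustration and are not the notation of Section VI.

```latex
\mathbf{J}(\boldsymbol{\theta})
  = \frac{2}{\sigma^{2}}\,\mathrm{Re}\!\left\{\mathbf{D}^{\mathsf{H}}\mathbf{D}\right\},
\qquad
\mathbf{D} = \frac{\partial \boldsymbol{\mu}(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}^{\mathsf{T}}}
\;\Longrightarrow\;
\mathrm{CRLB}(\phi_{\ell}) = \left[\mathbf{J}^{-1}\right]_{\ell\ell} \propto \sigma^{2},
\qquad
10\log_{10}\mathrm{CRLB}(\phi_{\ell}) = c - \mathrm{SNR}_{\mathrm{dB}} .
```

Under this assumption, every 10 dB of additional SNR lowers the bound by 10 dB. The parameters ${T_{\mathrm{frm}}}$ , ${K_{\mathrm{sbcr}}}$ , and ${M_{\mathrm{RF}}}$ change ${\mathbf{D}}$ itself, so no such simple scaling needs to hold for them.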

VIII-B Performance evaluation of the spatial channel covariance estimation methods

In this subsection, we evaluate the performance of the spatial channel covariance estimation in terms of the RPE for ${N_{\mathrm{ant}}}=64$ and ${K_{\mathrm{sbcr}}}=128$ . Unlike in Section VIII-A, the AoAs $\phi_{\ell}$ are uniformly distributed in $[-180^{\circ},180^{\circ}]$ , and the normalized delays $\tau_{\ell}/T_{s}$ are uniformly distributed in $[0,N_{\mathrm{CP}}]$ , where the cyclic prefix length $N_{\mathrm{CP}}$ is set to ${K_{\mathrm{sbcr}}}/4$ . The path gains $g_{t,\ell}$ are IID complex Gaussian random variables distributed as $g_{t,\ell}\sim\mathcal{CN}(0,1/{L_{\mathrm{ch}}})$ , and $p_{\mathrm{PS}}(\tau)=\mathrm{sinc}(\tau/T_{s})$ .
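The random parameter draws described above can be sketched as follows for one Monte Carlo trial. The function name `draw_channel_parameters` and its argument names are hypothetical, and the sketch covers only the sampling of AoAs, delays, and path gains, not the construction of the channel itself.

```python
import numpy as np

def draw_channel_parameters(n_paths, n_frames, n_subcarriers, rng):
    """Draw AoAs (degrees), normalized delays, and per-frame path gains."""
    n_cp = n_subcarriers // 4                             # N_CP = K_sbcr / 4
    phi_deg = rng.uniform(-180.0, 180.0, size=n_paths)    # AoAs
    tau_norm = rng.uniform(0.0, n_cp, size=n_paths)       # tau / T_s in [0, N_CP]
    # CN(0, 1/L_ch): real and imaginary parts each have variance 1/(2 L_ch)
    scale = np.sqrt(1.0 / (2.0 * n_paths))
    gains = scale * (rng.standard_normal((n_frames, n_paths))
                     + 1j * rng.standard_normal((n_frames, n_paths)))
    return phi_deg, tau_norm, gains

rng = np.random.default_rng(0)
phi_deg, tau_norm, gains = draw_channel_parameters(
    n_paths=4, n_frames=20, n_subcarriers=128, rng=rng)
```

With the $1/{L_{\mathrm{ch}}}$ variance, the expected total power summed over the paths equals one regardless of ${L_{\mathrm{ch}}}$.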

Fig. 4 compares the proposed method with the two other methods explained in Section VII in terms of the RPE when ${M_{\mathrm{RF}}}=4$ . The two panels, Fig. 4(a) and Fig. 4(b), show the comparison at SNR values of 0 dB and -10 dB, respectively. In each panel, we compare the methods for ${L_{\mathrm{ch}}}=2$ , 3, and 4. Note that the MUSIC-based method does not work properly if ${L_{\mathrm{ch}}}\geq{M_{\mathrm{RF}}}$ because ${\mathbf{U}}_{{\mathbf{n}}}$ in (74) must have at least one column. Even when ${L_{\mathrm{ch}}}<{M_{\mathrm{RF}}}$ , the figure shows that the performance degradation of the MUSIC-based method is more severe than that of the other two methods as ${L_{\mathrm{ch}}}$ approaches ${M_{\mathrm{RF}}}$ . In addition, as the SNR decreases, the RPE of the MUSIC-based method drops much faster than that of the tensor-based method. The CS-based method outperforms the MUSIC-based method when ${L_{\mathrm{ch}}}$ is large relative to ${M_{\mathrm{RF}}}$ , but it performs poorly at very low SNR. The proposed method based on the higher-order tensor CPD shows the best performance in most cases, in particular when ${L_{\mathrm{ch}}}$ is not too small and the SNR is low. Fig. 5 shows the results when ${M_{\mathrm{RF}}}=8$ , which follow the same trend as in Fig. 4 where ${M_{\mathrm{RF}}}=4$ .
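The requirement ${L_{\mathrm{ch}}}<{M_{\mathrm{RF}}}$ can be made concrete with a small sketch. It reads ${\mathbf{U}}_{{\mathbf{n}}}$ as the usual MUSIC noise subspace, which is an interpretation since (74) is not reproduced here; the function `noise_subspace` and the toy covariance `R` are hypothetical.

```python
import numpy as np

def noise_subspace(R, n_paths):
    """Eigenvectors of the M x M covariance R for its (M - n_paths) smallest eigenvalues."""
    m = R.shape[0]
    if n_paths >= m:
        raise ValueError("MUSIC needs n_paths < M so that the noise subspace is non-empty")
    # np.linalg.eigh returns eigenvalues in ascending order
    _, vecs = np.linalg.eigh(R)
    return vecs[:, : m - n_paths]

rng = np.random.default_rng(0)
M, L = 4, 3                                    # M_RF = 4 RF chains, L_ch = 3 paths
B = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))
R = B @ B.conj().T + 0.1 * np.eye(M)           # signal part plus white noise
U_n = noise_subspace(R, L)
print(U_n.shape)                               # (4, 1): a single noise eigenvector
```

With ${L_{\mathrm{ch}}}={M_{\mathrm{RF}}}$ the signal part already fills the whole $M_{\mathrm{RF}}$-dimensional space, leaving no direction against which candidate angles can be tested.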

Fig. 6 shows the dependency of the RPE on the number of channel paths ${L_{\mathrm{ch}}}$ for each method. Fig. 6(a) reveals that the performance of the MUSIC-based method degrades rapidly once ${L_{\mathrm{ch}}}$ exceeds six. In Fig. 6(b), the number of frames ${T_{\mathrm{frm}}}$ is fixed at 20, and the RPEs are compared for different SNR values. For SNRs of 0 dB and 10 dB, the CS-based method has as high an RPE as the tensor-based method, but its performance degrades significantly as the SNR goes to -10 dB. For all ${L_{\mathrm{ch}}}$ values, the proposed tensor-based method maintains a reasonably high RPE.

So far, we have assumed that the channel has only ${L_{\mathrm{ch}}}$ channel paths as in (8). To evaluate performance for more realistic channels, we consider a clustered channel model [34] that has multiple clusters with multiple subrays as

[TABLE]

The channel has $L_{\mathrm{cluster}}$ clusters whose AoAs are uniformly distributed in $[-180^{\circ},180^{\circ}]$ , and each cluster has $L_{\mathrm{subray}}$ subrays whose AoA offsets are Laplacian distributed with angular spread $2^{\circ}$ . All subrays within a cluster are assumed to have the same delay. Although the rank of the channel tensor, $L_{\mathrm{cluster}}L_{\mathrm{subray}}$ , becomes high in general, we can approximate the channel tensor by a low-rank tensor for spatially sparse channels. Fig. 7 shows the RPE results when ${N_{\mathrm{ant}}}=64$ , ${M_{\mathrm{RF}}}=8$ , ${K_{\mathrm{sbcr}}}=128$ , $L_{\mathrm{subray}}=10$ , and $L_{\mathrm{cluster}}=6,7$ , or $8$ . Instead of using the actual number of channel paths ${L_{\mathrm{ch}}}=L_{\mathrm{cluster}}L_{\mathrm{subray}}$ , the CPD-based and CS-based methods use ${M_{\mathrm{RF}}}$ for their low-rank (or sparse) approximation, while the MUSIC-based method uses $L_{\mathrm{cluster}}$ because it inherently cannot use ${M_{\mathrm{RF}}}$ . The figure shows that, although the multiple subrays result in performance loss compared to the single-subray case, the proposed method still works properly in this case.
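The AoA sampling for this clustered model can be sketched as below. The sketch assumes that the "angular spread" denotes the standard deviation of the Laplacian offsets, so that the Laplacian scale is $b=\sigma/\sqrt{2}$; the text does not state which convention is used, and the function name `sample_clustered_aoas` is hypothetical.

```python
import numpy as np

def sample_clustered_aoas(n_cluster, n_subray, spread_deg=2.0, rng=None):
    """Return cluster-center AoAs and per-subray AoAs, both in degrees."""
    rng = np.random.default_rng() if rng is None else rng
    centers = rng.uniform(-180.0, 180.0, size=n_cluster)
    # Assumption: spread_deg is the standard deviation of the offsets, and a
    # Laplacian with scale b has standard deviation sqrt(2) * b.
    scale = spread_deg / np.sqrt(2.0)
    offsets = rng.laplace(0.0, scale, size=(n_cluster, n_subray))
    return centers, centers[:, None] + offsets

centers, aoas = sample_clustered_aoas(n_cluster=6, n_subray=10,
                                      rng=np.random.default_rng(0))
print(aoas.shape)  # (6, 10): 60 subrays concentrated around 6 directions
```

Because the 60 subray angles cluster tightly around six directions, the corresponding steering vectors are nearly collinear within each cluster, which is what makes a low-rank approximation of the channel tensor plausible.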

IX Conclusions

In this paper, we proposed a spatial channel covariance estimation method for the hybrid analog/digital architecture over time-varying frequency-selective channels. Leveraging the fact that a low-rank higher-order tensor can be uniquely decomposed into factor matrices in each domain, we formulated the estimation problem by using higher-order tensors and proposed a solution that achieves performance close to its theoretical bound. We also derived the CRLB of the proposed method and showed that it is lower than the CRLB of the MUSIC-based approach. Numerical results showed that the proposed method outperforms the MUSIC-based and CS-based methods. The results also showed that the proposed method has a more significant gain in the low SNR regime and that its performance degradation caused by an increase in the number of channel paths is less severe than that of prior work.

Bibliography


[1] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.
[2] W. Roh, J. Seol, J. Park, B. Lee, J. Lee, Y. Kim, J. Cho, K. Cheun, and F. Aryanfar, “Millimeter-wave beamforming as an enabling technology for 5G cellular communications: theoretical feasibility and prototype results,” IEEE Commun. Mag., vol. 52, no. 2, pp. 106–113, Feb. 2014.
[3] R. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Apr. 2016.
[4] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming design for large-scale antenna arrays,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 501–513, Apr. 2016.
[5] X. Zhang, A. Molisch, and S. Kung, “Variable-phase-shift-based RF-baseband codesign for MIMO antenna selection,” IEEE Trans. Signal Process., vol. 53, no. 11, pp. 4091–4103, Nov. 2005.
[6] V. Venkateswaran and A. J. van der Veen, “Analog beamforming in MIMO communications with phase shift networks and online channel estimation,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4131–4143, Aug. 2010.
[7] A. Adhikary, J. Nam, J. Ahn, and G. Caire, “Joint spatial division and multiplexing: The large-scale array regime,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6441–6463, Oct. 2013.
[8] J. González-Coma, J. Rodríguez-Fernández, N. González-Prelcic, L. Castedo, and R. Heath, “Channel estimation and hybrid precoding for frequency selective multiuser mmwave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 2, pp. 353–367, May 2018.
