Compressive Initial Access and Beamforming Training for Millimeter-Wave Cellular Systems

Han Yan; Danijela Cabric

arXiv:1901.11220·eess.SP·July 30, 2019·IEEE J. Sel. Top. Signal Process.

TL;DR

This paper introduces a novel initial access and beam training method for millimeter-wave cellular systems that significantly reduces latency and overhead compared to traditional directional approaches, using a quasi-omni sounding beam and joint training algorithm.

Contribution

It proposes a joint initial access and beam training algorithm utilizing quasi-omni sounding beams, reducing latency and overhead in mmWave systems without extra radio resources.

Findings

1. Up to two orders of magnitude reduction in access latency.
2. Effective in various propagation environments and 3D locations.
3. Outperforms hierarchical beam training in simulations.

Abstract

Initial access (IA) is a fundamental physical layer procedure in cellular systems in which the user equipment (UE) detects a nearby base station (BS) and acquires synchronization. Because millimeter-wave (mmW) IA must use antenna arrays, channel spatial information can also be inferred. The state-of-the-art directional IA (DIA) uses sector sounding beams with limited angular resolution, and thus requires additional dedicated radio resources, access latency and overhead for refined beam training. To remedy the access latency and overhead of DIA, this work proposes to use a quasi-omni pseudorandom sounding beam for IA, and develops a novel algorithm for joint initial access and fine resolution initial beam training without requiring extra radio resources. We provide the analysis of the proposed algorithm's miss detection rate under synchronization error, and…

Tables (4)

TABLE I: Nomenclature

Symbol	Explanations
$p$ , $P$	Index and total number of subcarriers
$m$ , $M$	Index and total number of SS bursts
$l, L$	Index and total number of multipaths
$N_{T}$ , $N_{R}$	Number of antennas at BS and UE
$T_{s}$	Sample duration of IA signal
$T_{B}, N_{B}$	Duration and sample number in each SS burst
$T_{F}$	Period of SS bursts
$T_{R}$ , $T_{r}$	Period and duration of CSI-RS
$N_{c}$ , $N_{cp}$	Max. excess delay taps and length of CP
$N_{train}, N_{U}$	Required CSI-RS and UE number
$Δ f$ , $ϵ_{F}$	CFO in [Hz] and normalized in [rad/samp.]
$ϵ_{T}$	Initial TO in UE (number of sample)
$𝐇 [d]$	MIMO channel at $d$ -th delay sample
$𝐚_{T} (θ)$ , $𝐚_{R} (ϕ)$	Spatial responses of BS and UE
$ϕ_{l}, θ_{l}$ , $g_{l}$ , $τ_{l}$	AoA/AoD/gain/delay of $l$ -th multipath
$α_{l}, β_{l}$	Real and imaginary parts of $g_{l}$
$𝐬, \tilde{𝐬}$ , $\tilde{s} [n]$	F/T domain PSS vector and sequence
$𝐯_{m}, 𝐰_{m}$	RF precoder/combiner of the $m$ -th burst
$z [n]$ , $𝐳_{m}$ , $σ_{n}^{2}$	AWGN sequence, vector, and power
Initial discovery (detection)
$P_{FA}^{⋆}$	Target FA prob. in initial discovery
$P_{MD,PT}$ , $P_{MD,NT}$	MD prob. w/ and w/o perfect timing
$γ_{PT}, η_{PT}$	Detection stat. and TH w/ perfect timing
$γ_{NT}, η_{NT}$	Detection stat. and TH w/ unknown timing
Initial beamforming training (estimation)
$𝝃$	Unknown parameters in BF training
$𝐲_{m}$	Received OFDM symbols at $m$ -th burst
$𝐝$ , $𝐭$ , $𝐫$	Vectors of candidate delay/AoD/AoA values
$G_{D}$ , $G_{T}$ , $G_{R}$	Parameter grid sizes in delay/AoD/AoA est.
$𝐐$ , $𝐅$	ICI matrix and DFT matrix

TABLE II: Digital Baseband Operations (complex multiplications)

Function Block	Equation	Operations
Initial Discovery
PSS FIR corr.	(3)	$P N_{B}$
Detection and time sync.	(4) or (5)	$N_{B}$
Initial BF training
Excess delay est.	(19) and (21)	$P G_{D} + P M$
AoA/AoD est.	(22)	$M G_{T} G_{R}$
CFO est.	(23)	$2 M G_{T} G_{R}$

TABLE III: Summary of simulation settings

Parameters	Values in Simulations
Frame Structure
SS Signal BW	$1 / T_{s} = 57.6$ MHz [7]
Carrier and PSS Length	$P = 128$ [7]
Max. Excess Delay and CP	$N_{c} = \{4, 32\}$ and $N_{cp} = \{8, 32\}$
SS Burst Duration	$N_{B} = 1024$ ( $T_{B} = 17.84 μ$ s) [7]
SS Burst Num.	$M = 64$ [7]; $M_{T} = 16$ , $M_{R} = 4$
SS Signal Period	$T_{F} = 20$ ms [7]
Initial Synchronization Offset
Freq. Offset at UE	Up to $\pm$ 5 ppm [44]
Timing Offset at UE	$ϵ_{T} = \{170, 960\}$ ( $Δ τ = 3, 17 μ$ s)
STO Search Window	$ϵ_{T,max} = 1024$
Algorithm Design
Target False Alarm	$P_{FA}^{⋆} = 0.01$
Dictionary Size	$G_{D} = 500$ , $G_{T} = 2 N_{T}$ , $G_{R} = 2 N_{R}$

TABLE IV: Elements of the Fisher information matrix

Symb.	Expressions	Symb.	Expressions
$Φ_{ϵ_{F}, ϵ_{F}}$	$\sum_{m = 1}^{M} (C_{d2q} g) {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$	$Φ_{ϵ_{F}, θ}$	$\sum_{m = 1}^{M} (C_{m} g) {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} ℜ {[𝐯_{m}^{H} \dot{𝐚_{T}} (θ)] [𝐯_{m}^{H} 𝐚_{T} (θ)]}$
$Φ_{ϵ_{F}, τ}$	$ℜ {\sum_{m = 1}^{M} g {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2} 𝐟^{H} (τ) 𝐅 {\dot{𝐐}}_{m}^{H} 𝐐_{m} 𝐅^{H} \dot{𝐟} (τ)}$	$Φ_{ϵ_{F}, α}$	$\sum_{m = 1}^{M} [C_{dq, m} ℜ (g)] {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$
$Φ_{ϵ_{F}, β}$	$\sum_{m = 1}^{M} [C_{dq, m} ℑ (g)] {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$	$Φ_{θ, θ}$	$\sum_{m = 1}^{M} (P {\| g \|}^{2}) {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} \dot{𝐚_{T}} (θ) \|}^{2}$
$Φ_{ϕ, ϕ}$	$\sum_{m = 1}^{M} (P {\| g \|}^{2}) {\| 𝐰_{m}^{H} {\dot{𝐚}}_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$	$Φ_{ϕ, θ}$	$ℜ {\sum_{m = 1}^{M} P {\| g \|}^{2} [𝐰_{m}^{H} 𝐚_{R} (ϕ)] [𝐰_{m}^{H} {\dot{𝐚}}_{R} (ϕ)] [𝐯_{m}^{H} {\dot{𝐚}}_{T} (θ)] [𝐯_{m}^{H} 𝐚_{T} (θ)]}$
$Φ_{ϕ, τ}$	$\sum_{m = 1}^{M} C_{df, m} {\| g \|}^{2} ℜ {[𝐰_{m}^{H} {\dot{𝐚}}_{R} (ϕ)] [𝐰_{m}^{H} 𝐚_{R} (ϕ)]} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$	$Φ_{ϕ, α}$	$ℜ {\sum_{m = 1}^{M} P g [𝐰_{m}^{H} {\dot{𝐚}}_{R} (ϕ)] [𝐰_{m}^{H} 𝐚_{R} (ϕ)] {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}}$
$Φ_{θ, α}$	$ℜ {\sum_{m = 1}^{M} P g {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} [𝐯_{m}^{H} {\dot{𝐚}}_{T} (θ)] [𝐯_{m}^{H} 𝐚_{T} (θ)]}$	$Φ_{ϕ, β}$	$ℜ {\sum_{m = 1}^{M} j g P [𝐰_{m}^{H} {\dot{𝐚}}_{R} (ϕ)] [𝐰_{m}^{H} 𝐚_{R} (ϕ)] {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}}$
$Φ_{θ, β}$	$ℜ {\sum_{m = 1}^{M} j P g {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} [𝐯_{m}^{H} {\dot{𝐚}}_{T} (θ)] [𝐯_{m}^{H} 𝐚_{T} (θ)]}$	$Φ_{τ, τ}$	$ℜ {\sum_{m = 1}^{M} {\| g \|}^{2} {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2} [{\dot{𝐟}}^{H} (τ) 𝐐_{m}^{H} 𝐐_{m} \dot{𝐟} (τ)]}$
$Φ_{τ, α}$	$ℜ {\sum_{m = 1}^{M} g {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2} [{\dot{𝐟}}^{H} (τ) 𝐐_{m}^{H} 𝐐_{m} 𝐟 (τ)]}$	$Φ_{τ, β}$	$ℜ {\sum_{m = 1}^{M} j g {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2} [{\dot{𝐟}}^{H} (τ) 𝐐_{m}^{H} 𝐐_{m} 𝐟 (τ)]}$
$Φ_{α, α}$	$\sum_{m = 1}^{M} P {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$	$Φ_{β, β}$	$- \sum_{m = 1}^{M} P {\| 𝐰_{m}^{H} 𝐚_{R} (ϕ) \|}^{2} {\| 𝐯_{m}^{H} 𝐚_{T} (θ) \|}^{2}$

Equations (68)

$$y[n]=\sum_{d=0}^{N_{\text{c}}-1}e^{j(\epsilon_{\text{F}}n+\psi[n])}\mathbf{w}^{\text{H}}[n]\mathbf{H}[d]\mathbf{v}[n-d-\epsilon_{\text{T}}]\,s[n-d-\epsilon_{\text{T}}]+\mathbf{w}^{\text{H}}[n]\mathbf{z}[n],\quad n\in[0,N_{\text{F}}-1].$$

$$s[n]=\begin{cases}s_{\text{zc}}[n-(m-1)N_{\text{B}}+P-N_{\text{CP}}], & n\in\mathcal{S}_{\text{CP},m}\\ s_{\text{zc}}[n-(m-1)N_{\text{B}}-N_{\text{CP}}], & n\in\mathcal{S}_{\text{PSS},m}\\ 0, & \text{otherwise},\end{cases}$$

$$\mathbf{H}[d]=\frac{1}{N_{\text{T}}N_{\text{R}}}\sum_{l=1}^{L}\sum_{r=1}^{R}g_{l,r}\,p_{\text{c}}(dT_{\text{s}}-\tau_{l,r})\,\mathbf{a}_{\text{R}}(\phi_{l,r})\,\mathbf{a}_{\text{T}}^{\text{H}}(\theta_{l,r}),$$

$$\begin{aligned}\mathcal{H}_{0}:\quad&y[n]=\mathbf{w}^{\text{H}}[n]\mathbf{z}[n],\\ \mathcal{H}_{1}:\quad&y[n]=\sum_{d=0}^{N_{\text{c}}-1}\Big(e^{j\epsilon_{\text{F}}n}\mathbf{w}^{\text{H}}[n]\mathbf{H}[d]\mathbf{v}[n-d-\epsilon_{\text{T}}]\,s[n-d-\epsilon_{\text{T}}]\Big)+\mathbf{w}^{\text{H}}[n]\mathbf{z}[n].\end{aligned}$$

$$\tilde{y}[n]=\frac{1}{P}\sum_{k=0}^{P-1}y[n+k]\,s_{\text{zc}}^{*}[k].$$

$$\gamma_{\text{PT}}\triangleq\frac{1}{M}\sum_{m=0}^{M-1}\sum_{k=0}^{N_{\text{c}}-1}\left|\tilde{y}[k+mN_{\text{B}}]\right|^{2}\underset{\mathcal{H}_{0}}{\overset{\mathcal{H}_{1}}{\gtrless}}\eta_{\text{PT}},$$

$$\gamma_{\text{NT}}\triangleq\max_{0\leq n<\epsilon_{\text{T,max}}}\frac{1}{M}\sum_{m=0}^{M-1}\sum_{k=0}^{N_{\text{c}}-1}\left|\tilde{y}[n+k+mN_{\text{B}}]\right|^{2}\underset{\mathcal{H}_{0}}{\overset{\mathcal{H}_{1}}{\gtrless}}\eta_{\text{NT}}$$

$$\hat{\epsilon}_{\text{T}}=\arg\max_{0\leq n<\epsilon_{\text{T,max}}}\frac{1}{M}\sum_{m=0}^{M-1}\sum_{k=0}^{N_{\text{c}}-1}\left|\tilde{y}[n+k+mN_{\text{B}}]\right|^{2}.$$

$$\eta_{\text{E}}^{\star}=\sigma_{\text{n}}^{2}\left[\frac{N_{\text{c}}}{P}+\frac{N_{\text{c}}}{MP^{2}}\,\xi_{\text{z}}(\epsilon_{\text{T,max}},P_{\text{FA}}^{\star})\right],$$

$$\xi_{\text{E}}=\begin{cases}Q^{-1}(P_{\text{FA}}^{\star}), & \text{E}=\text{PT}\\ Q^{-1}\!\left(\dfrac{1}{\epsilon_{\text{T,max}}}\right)-\dfrac{0.78\ln\left(-\ln(1-P_{\text{FA}}^{\star})\right)}{Q^{-1}\!\left(\frac{1}{\epsilon_{\text{T,max}}}\right)}, & \text{E}=\text{NT},\end{cases}$$

$$P_{\text{MD},\text{E}}=Q\left(\frac{\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})\,\text{SNR}-\frac{N_{\text{c}}}{MP^{2}}\,\xi_{\text{z}}(\epsilon_{\text{T,max}},P_{\text{FA}}^{\star})}{\frac{2\kappa^{2}(\epsilon_{\text{T}},\epsilon_{\text{F}})\,\text{SNR}^{2}}{M}+\frac{N_{\text{c}}}{P^{2}M}}\right),$$

$$\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})=\frac{2-\Re\left(e^{jK(\epsilon_{\text{T}})\epsilon_{\text{F}}}\right)-\Re\left(e^{j[P-K(\epsilon_{\text{T}})]\epsilon_{\text{F}}}\right)}{P^{2}\left[1-\Re\left(e^{j\epsilon_{\text{F}}}\right)\right]},$$

$$K(\epsilon_{\text{T}})=\begin{cases}N_{\text{B}}-\epsilon_{\text{T}}, & \text{if } N_{\text{B}}-P\leq\epsilon_{\text{T}}<N_{\text{B}}\\ 0, & \text{otherwise}.\end{cases}$$

$$\gamma_{\text{DIA}}\triangleq\max_{n}\left|\tilde{y}_{\text{DIA}}[n]\right|^{2}\underset{\mathcal{H}_{0}}{\overset{\mathcal{H}_{1}}{\gtrless}}\eta_{\text{DIA}}$$

$$\mathbf{y}=[\mathbf{y}_{1}^{\text{T}},\dots,\mathbf{y}_{m}^{\text{T}},\dots,\mathbf{y}_{M}^{\text{T}}]^{\text{T}},\qquad[\mathbf{y}_{m}]_{p}=y[\hat{\epsilon}_{\text{T}}+N_{\text{CP}}+(p-1)+(m-1)N_{\text{B}}],\quad p\leq P.$$

$$\mathbf{y}_{m}=\underbrace{\sum_{l=1}^{L}\tilde{g}_{m,l}\,\mathbf{Q}(\epsilon_{\text{F}})\mathbf{F}^{\text{H}}\left[\mathbf{f}(\tau_{l})\circ\mathbf{s}\right]}_{\mathbf{x}_{m}(\boldsymbol{\xi})}+\mathbf{z}_{m},$$

$$[\mathbf{f}(\tau_{l})]_{p}=\exp\left[-j2\pi(p-1)\tau_{l}/(PT_{\text{s}})\right].$$

$$\tilde{\mathbf{g}}_{l}=\tilde{\mathbf{Q}}(\epsilon_{\text{F}})\tilde{\mathbf{A}}^{\text{H}}\operatorname{vec}(\tilde{\mathbf{H}}_{l}),$$

$$\tilde{\mathbf{Q}}(\epsilon_{\text{F}})=\operatorname{diag}\left([1,e^{jN_{\text{B}}\epsilon_{\text{F}}},\dots,e^{jN_{\text{B}}(M-1)\epsilon_{\text{F}}}]^{\text{T}}\right).$$

$$[\mathbf{r}]_{k}=-\frac{\pi}{2}+(k-1)\Delta\phi,\qquad[\mathbf{t}]_{k}=-\frac{\pi}{2}+(k-1)\Delta\theta.$$

$$\hat{q}=\arg\max_{1\leq q\leq G_{\text{D}}}\langle\mathbf{p}_{q},\bar{\mathbf{y}}\rangle/\|\mathbf{p}_{q}\|^{2}\quad\text{and}\quad\hat{\tau}=[\mathbf{d}]_{\hat{q}},$$

$$\mathbf{p}_{q}\triangleq\mathbf{F}^{\text{H}}\left[\mathbf{f}([\mathbf{d}]_{q})\circ\mathbf{s}\right],$$

$$\hat{\mathbf{g}}=(\mathbf{p}_{\hat{q}}^{\text{H}}\otimes\mathbf{I}_{M})\,\mathbf{y},$$

$$\hat{k}=\arg\max_{1\leq k\leq G_{\text{R}}G_{\text{T}}}\langle\tilde{\mathbf{Q}}(\hat{\epsilon}_{\text{F},k})\tilde{\mathbf{a}}_{k},\hat{\mathbf{g}}\rangle/\|\tilde{\mathbf{a}}_{k}\|^{2},$$

$$\hat{\epsilon}_{\text{F},k}=\frac{1}{N_{\text{B}}}\angle\left(\frac{1}{M-1}\sum_{m=1}^{M-1}[\bar{\mathbf{y}}_{k}]_{m}^{*}[\bar{\mathbf{y}}_{k}]_{m+1}\right).$$

$$\hat{\phi}=[\mathbf{r}]_{\hat{k}_{\text{R}}},\qquad\hat{\theta}=[\mathbf{t}]_{\hat{k}_{\text{T}}},\qquad\hat{\epsilon}_{\text{F}}=\hat{\epsilon}_{\text{F},\hat{k}}.$$

$$\mathbf{e}^{(k)}=\mathbf{y}-\mathbf{x}(\hat{\boldsymbol{\xi}}^{(k)}),$$

$$\hat{\mathbf{g}}^{(k+1)}=(\nabla\mathbf{x}_{\mathbf{g}})^{\dagger}\mathbf{y},$$

$$\hat{x}^{(k+1)}=\hat{x}^{(k)}+\mu_{x}\Re\left[(\nabla\mathbf{x}_{x})^{\dagger}\mathbf{e}^{(k)}\right],\quad x\in\{\tau,\epsilon_{\text{F}},\theta,\phi\},$$

$$\operatorname{var}(\hat{\phi}_{1})\geq[\mathbf{J}^{-1}]_{2,2},\qquad\operatorname{var}(\hat{\theta}_{1})\geq[\mathbf{J}^{-1}]_{3,3}$$

Full text

Compressive Initial Access and Beamforming Training for Millimeter-Wave Cellular Systems

Han Yan and Danijela Cabric

Han Yan and Danijela Cabric are with the Electrical and Computer Engineering Department, University of California, Los Angeles, Los Angeles, CA 90095 (e-mail: [email protected]; [email protected]). Part of this work was presented at IEEE GlobalSIP 2016 [1]. This work was supported in part by NSF under grant 1718742.

Abstract

Initial access (IA) is a fundamental physical layer procedure in cellular systems in which the user equipment (UE) detects a nearby base station (BS) and acquires synchronization. Because millimeter-wave (mmW) IA must use antenna arrays, channel spatial information can also be inferred. The state-of-the-art directional IA (DIA) uses sector sounding beams with limited angular resolution, and thus requires additional dedicated radio resources, access latency and overhead for refined beam training. To remedy the access latency and overhead of DIA, this work proposes to use a quasi-omni pseudorandom sounding beam for IA, and develops a novel algorithm for joint initial access and fine resolution initial beam training without requiring extra radio resources. We analyze the miss detection rate of the proposed algorithm under synchronization error, and further derive the Cramér-Rao lower bound of angular estimation under frequency offset. Using the QuaDRiGa simulator with the mmMAGIC model at 28 GHz, the numerical results show that the proposed approach outperforms DIA with hierarchical beam training. The proposed algorithm offers up to two orders of magnitude of access latency saving compared to DIA when the same discovery, post-training SNR, and overhead performance are targeted. This conclusion holds in various propagation environments and 3D locations of a mmW pico-cell with up to 140 m radius.

I Introduction

Millimeter-wave (mmW) communication is a promising technology for future cellular networks, including 5G New Radio (5G-NR) [2]. Thanks to abundant spectrum, mmW networks are expected to support ultra-fast data rates. As shown in both theory and prototypes, mmW systems require beamforming (BF) with large antenna arrays at both the base station (BS) and the user equipment (UE) to combat severe propagation loss [3]. The significant differences in propagation characteristics and hardware architectures between the mmW band and the microwave band call for novel signal processing techniques [4] and physical layer procedures [5].

Initial access (IA) is the fundamental physical layer procedure that allows a UE to discover and synchronize with a nearby BS before further communication. However, IA for mmW networks brings new challenges and opportunities compared to IA for sub-6 GHz networks. In a mmW system, conventional omni-directional IA with a single antenna is not reliable, so IA needs to leverage transmitter and receiver antenna arrays to exploit BF gain [6, 7]. A key challenge in mmW IA is the design of sounding beams for reliable discovery. In addition, beam training is required to achieve the high BF gain enabled by large arrays and to establish the communication link. Beam training, however, introduces additional access latency and signaling overhead due to repeated channel probing.

I-A Related works

A number of works have investigated sounding beam designs and signal processing algorithms for mmW IA and beam training. Directional beams for IA and beam training are the most popular and the most extensively investigated in recent literature [6, 8, 9, 10, 11, 12, 13, 7, 14, 15]. Directional IA (DIA) was first studied in [6], where a Generalized Likelihood Ratio Test (GLRT) is proposed to solve the cell discovery problem under an unknown multiple-input multiple-output (MIMO) channel and unknown synchronization parameters. The authors concluded that a directional IA signal improves the discovery range compared to omni-directional IA. DIA is further investigated in [8], where overhead and access latency are analyzed. Works [9] and [10] study DIA and its access latency in large networks using stochastic geometry. The impact of the beam-width of sounding beams in DIA is studied in [13]. The comparison between omni-directional IA and DIA is also discussed in [11]. IA using out-of-band information, e.g., location or sub-6 GHz measurements, is discussed in [12, 7]. The aforementioned works mostly focus on the overhead and latency of cell discovery, while beam training is either not discussed or assumed to have coarse resolution [8]. DIA is commonly paired with directional beam training [14, 15], where hierarchical sounding beams are used in multiple stages to achieve fine angular resolution for each user individually. However, such user-specific hierarchical sounding beams introduce prohibitive latency when a BS is connected to a large number of UEs.

The alternative approaches for beam training are based on parametric channel estimation [16, 17, 18, 19, 20, 21, 22, 23]. Exploiting the sparse scattering nature of mmW channels, compressive sensing (CS) approaches have been considered to estimate channel parameters from channel observations obtained via various sounding beams. Works [16, 17] proposed CS-based narrowband BF training with pseudorandom sounding beamformers in the downlink, and [18] extended this approach to a wideband channel. Other related works include channel covariance estimation [19, 20, 21], which requires periodic channel observations, and UE-centric uplink training [22, 23]. It is worth noting that all of these recent works focus on channel estimation alone while assuming perfect cell discovery and synchronization. The 5G-NR frame structure that supports IA is rarely considered, and the feasibility of joint initial access and CS-based beam training has not been investigated.

There are also recent works that consider some practical aspects of IA. For example, frequency-offset-robust algorithms for narrowband mmW beam training are reported in [1, 24, 25]. Several hardware prototypes take the practical approach of using received signal strength (RSS) in CS-based beam training. Channel estimation without phase measurements is a challenging problem, which was solved via novel signal processing algorithms based on RSS matching pursuit [26], hash tables [27], and sparse phase retrieval [28]. Note that phase-free measurements were associated with a particular testbed, and this constraint does not necessarily apply to mmW systems in general. In summary, while IA and beam training algorithms have been extensively studied in the literature, there is a lack of understanding of the theoretical limits and of signal processing algorithms that jointly achieve cell discovery and accurate BF training using an asynchronous IA signal in a frequency-selective mmW channel.

I-B Contributions

In this work, we propose to use quasi-omni pseudorandom sounding beams and a novel signal processing algorithm to jointly achieve initial cell discovery, synchronization, and fine resolution beam training. More specifically, we provide answers to the following questions.

How to use pseudorandom sounding beams for IA? We propose an energy detection algorithm for initial discovery tailored to pseudorandom sounding beams. We derive the optimal detection threshold and analyze the miss detection probability and the impact of synchronization errors, i.e., carrier frequency offset (CFO) and timing offset (TO).

How to reuse the received IA signal for beam training? We propose a novel CS-based beam training algorithm that re-processes the frequency-asynchronous IA signals to provide a well-aligned beam pair. We derive the Cramér-Rao lower bound (CRLB) of asynchronous training in a line-of-sight (LOS) channel. We show that the proposed algorithm reaches the CRLB in LOS and remains effective in non-LOS (NLOS).

What are the benefits of compressive IA? We compare the proposed approach with DIA followed by hierarchical directional beam training. Key performance indicators for both approaches are compared numerically, including discovery rate, post-beam-training SNR, overhead and access latency. The simulation study, based on the 5G-NR frame structure and a measurement-endorsed 3D 28 GHz channel, shows that the proposed approach outperforms DIA for UEs across a wide range of locations in a small cell.

I-C Organizations and notations

The rest of the paper is organized as follows. We start with a brief introduction of 5G-NR frame structure, IA and beam training in Section II. In Section III, we present the system model and problem statement. Section IV includes the proposed algorithm for cell discovery and timing acquisition followed by associated performance analysis. In Section V we present the algorithm and analysis for initial beam training under CFO. The access latency, overhead, and complexity analysis is included in Section VI. The numerical results are presented in Section VII. Open research issues are summarized in Section VIII. Finally, Section IX concludes the paper.

Notations: Scalars, vectors, and matrices are denoted by non-bold, bold lower-case, and bold upper-case letters, respectively. The $(i,j)$ -th element of $\mathbf{A}$ is denoted by $[\mathbf{A}]_{i,j}$ . Conjugate, transpose, Hermitian transpose, and pseudoinverse are denoted by $(.)^{*}$ , $(.)^{\text{T}}$ , $(.)^{\text{H}}$ , and $(.)^{\dagger}$ respectively. The inner product is $\langle\mathbf{a},\mathbf{b}\rangle\triangleq\mathbf{a}^{\text{H}}\mathbf{b}$ . The $l_{2}$ -norm of $\mathbf{h}$ is denoted by $||\mathbf{h}||$ . $\operatorname{\mathrm{diag}}(\mathbf{a})$ aligns vector $\mathbf{a}$ into a diagonal matrix. Kronecker and Hadamard product are denoted as $\otimes$ and $\circ$ , respectively. $\Re(x)$ and $\Im(x)$ are the real and imaginary parts of $x$ , respectively. Set $\mathcal{S}=[a,b]$ contains all integers between $a$ and $b$ .

II Preliminaries: initial access and beam training

In this section, we introduce the mmW physical layer initial access procedure in 5G-NR cellular network. We briefly review the frame structure, synchronization sequences, and directional IA scheme as well as beam training. The reader is referred to work [7] for a more detailed survey.

Frame Structure: Fig. 1 shows the frame structure of 5G-NR. We focus on two functional blocks, namely the synchronization signal (SS) burst and the channel state information reference signal (CSI-RS). 5G-NR uses orthogonal frequency division multiplexing (OFDM), and the subcarrier spacing is either 120 or 240 kHz for the mmW band. The SS signal is transmitted by a BS with period $T_{\text{F}}$ , typically 20 ms. The SS consists of up to $M=64$ burst blocks. In each burst block of duration $T_{\text{B}}$ , a specific sounding beam pair is used by the BS and UE. The CSI-RS block with duration $T_{\text{r}}$ is dedicated to specific UE(s) for beam training and tracking. CSI-RS can use all frequency resources, i.e., up to $B_{\text{tot}}$ , and it has a periodicity of $T_{\text{R}}$ , an implementation-dependent value.

Synchronization Signal: Referring to Fig. 1, each SS burst has 4 OFDM symbols, i.e., primary synchronization signal (PSS), physical broadcast channel (PBCH), and secondary synchronization signal (SSS), followed by another PBCH. The PSS is used in cell detection and synchronization, and it is assigned to the middle $P=128$ subcarriers of the first OFDM symbol. The PSS in 4G-LTE is based on Zadoff-Chu (ZC) sequences due to their perfect cyclic-autocorrelation property and their Fourier duals [29], while in 5G-NR the PSS is replaced by Maximum Length Sequences (M-sequences) [30]. There are $N_{\text{PSS}}=3$ and 336 unique sequences of PSS and SSS, respectively, and these 1008 combinations define the cell identifier (ID) of the BS. The PBCH carries control information.
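The perfect cyclic-autocorrelation property of ZC sequences mentioned above can be checked numerically. The following is a minimal sketch in plain Python, not the sequence configuration of any standard: the length 63, the root 25, and the helper names `zadoff_chu` and `cyclic_autocorr` are illustrative assumptions.

```python
import cmath


def zadoff_chu(root, length):
    """Odd-length Zadoff-Chu sequence: x[n] = exp(-j*pi*root*n*(n+1)/length).

    The root is assumed to be coprime with the (odd) length.
    """
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def cyclic_autocorr(x, lag):
    """Normalized cyclic autocorrelation of x at the given lag."""
    size = len(x)
    return sum(x[(n + lag) % size] * x[n].conjugate() for n in range(size)) / size


# Illustrative parameters (assumed): length 63, root 25, which are coprime.
zc = zadoff_chu(root=25, length=63)
peak = abs(cyclic_autocorr(zc, 0))
worst_sidelobe = max(abs(cyclic_autocorr(zc, lag)) for lag in range(1, 63))
print(f"peak = {peak:.3f}, worst cyclic sidelobe = {worst_sidelobe:.2e}")
```

Every element has unit modulus, the zero-lag value is one, and all other cyclic lags vanish up to floating-point error. This is the property that makes such a sequence convenient for correlation-based timing acquisition.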

Beamformed Initial Access: The BS periodically transmits IA blocks, and these signals are processed by UEs that want to establish initial access, reconnect after beam misalignment, or search for additional BSs for potential handover. The sounding beams in SS bursts are intended to facilitate multi-antenna processing at the BS and UE when no a priori channel information is available. Referring to Fig. 1, the BS and UE in the DIA scheme use $M_{\text{T}}$ transmitter and $M_{\text{R}}$ receiver beams to cover the angular space at both ends. One Tx/Rx beam pair is used at a time, for a total of $M=M_{\text{T}}M_{\text{R}}$ SS bursts.

Beam Training: The purpose of beam training is to identify the best beam pairs between BS and UE. The sounding beams in DIA typically have a large beam-width and a flat response inside angular sectors [31]. Such a design covers the angular space of BS and UE within $M$ bursts, but yields only coarse estimates of the propagation directions [8]. Thus DIA relies on directional beam training to refine the angular resolution, where BS and UE steer narrow sounding beams within the sectors of interest during CSI-RS periods.

III System Model

This section introduces the system model, which adopts the 5G-NR frame structure, and the problem formulation. All important notations are summarized in Table I.

III-A Received signal model before timing acquisition

Consider a single-cell system with a BS equipped with $N_{\text{T}}$ antennas. The BS transmits a beamformed IA signal over a sparse multipath mmW channel to UEs. We focus on the IA and BF training procedure for a single UE. It is straightforward to extend it to multiple UEs since there is no UE-dependent processing. The UE uses an analog array architecture, i.e., a phased array, with $N_{\text{R}}$ antennas. We assume that a single stream of IA signal is transmitted by the BS regardless of its architecture.

We first consider the received signal model when a UE searches for a BS to initialize the connection. In this procedure, the UE follows a periodic SS burst structure and uses predefined receiver beamformers to capture the signal according to [7]. As illustrated in Fig. 2, when the signal is present, the received samples, sampled at $T_{\text{s}}$ , are denoted as

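$$y[n]=\sum_{d=0}^{N_{\text{c}}-1}e^{j(\epsilon_{\text{F}}n+\psi[n])}\mathbf{w}^{\text{H}}[n]\mathbf{H}[d]\mathbf{v}[n-d-\epsilon_{\text{T}}]\,s[n-d-\epsilon_{\text{T}}]+\mathbf{w}^{\text{H}}[n]\mathbf{z}[n],\quad n\in[0,N_{\text{F}}-1].\tag{1}$$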

In the above equation, $\epsilon_{\text{T}}$ is the unknown integer sample TO within the range $0\leq\epsilon_{\text{T}}\leq\epsilon_{\text{T,max}}\leq N_{\text{B}}$ , where $\epsilon_{\text{T,max}}$ is the largest offset known to the system and $N_{\text{B}}$ is the number of samples in one SS burst, i.e., $N_{\text{B}}=T_{\text{B}}/T_{\text{s}}$ . (Footnote 1: We assume coarse timing synchronization is available with 10 $\mu$ s level accuracy, which corresponds to current LTE-A. In practice it is achievable via a GPS clock or a non-standalone mmW network [7].) The phase measurement error $e^{j(\epsilon_{\text{F}}n+\psi[n])}$ comes from two sources: $\epsilon_{\text{F}}$ is the normalized initial CFO, i.e., $\epsilon_{\text{F}}=2\pi T_{\text{s}}\Delta f$ where $\Delta f$ is the absolute CFO in Hz between BS and UE, and $\psi[n]$ is the phase noise process in the UE receiver. $N_{\text{c}}$ is the maximum excess multipath delay in discrete time, based on which the cyclic prefix (CP) length $N_{\text{cp}}>N_{\text{c}}$ for OFDM symbols is designed. $s[n]$ is the time domain signal of the SS bursts. Referring to Fig. 2, we focus on the PSS and treat other symbols as zero [6], i.e.,

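$$s[n]=\begin{cases}s_{\text{zc}}[n-(m-1)N_{\text{B}}+P-N_{\text{CP}}], & n\in\mathcal{S}_{\text{CP},m}\\ s_{\text{zc}}[n-(m-1)N_{\text{B}}-N_{\text{CP}}], & n\in\mathcal{S}_{\text{PSS},m}\\ 0, & \text{otherwise},\end{cases}$$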

where $\mathcal{S}_{\text{CP},m}\triangleq[(m-1)N_{\text{B}},(m-1)N_{\text{B}}+N_{\text{CP}}-1]$ and $\mathcal{S}_{\text{PSS},m}\triangleq[(m-1)N_{\text{B}}+N_{\text{CP}},(m-1)N_{\text{B}}+N-1]$ are the sets of sample indices corresponding to the CP and PSS in the $m$ -th burst, respectively. The sequence $s_{\text{zc}}[n]$ , with $|s_{\text{zc}}[n]|=1$ for $n\in[0,P-1]$ , is the Fourier dual of a known PSS sequence, and $N=P+N_{\text{cp}}$ is the number of samples in the PSS including the CP. $\mathbf{z}[n]$ is the additive white Gaussian noise (AWGN) with $\mathbf{z}[n]\sim\mathcal{CN}(0,\sigma_{\text{n}}^{2}\mathbf{I}_{N_{\text{R}}})$ . Vectors $\mathbf{v}[n]$ and $\mathbf{w}[n]$ are the beamformers used by the BS and UE at time instant $n$ , respectively, and they are taken from predefined IA beam codebooks, i.e., $\mathbf{w}[n]\in\mathcal{W}\triangleq\{\mathbf{w}_{1},\cdots,\mathbf{w}_{M}\}$ and $\mathbf{v}[n]\in\mathcal{V}\triangleq\{\mathbf{v}_{1},\cdots,\mathbf{v}_{M}\}$ . The BS and UE each use one beamformer for an interval of $N_{\text{B}}$ samples and then switch to the next one in $\mathcal{V}$ and $\mathcal{W}$ , i.e., $\mathbf{w}[n]=\mathbf{w}_{m},\text{ if }\lfloor n/N_{\text{B}}\rfloor=m-1$ and $\mathbf{v}[n]=\mathbf{v}_{m},\text{ if }\lfloor n/N_{\text{B}}\rfloor=m-1.$ Beamformer switching is assumed not to introduce latency or phase offset in transmission and reception. In this work, we focus on the system where each element of $\mathbf{v}_{m}$ and $\mathbf{w}_{m}$ is randomly and independently chosen from the sets $\mathcal{S}_{\text{T}}=\left\{\pm 1/\sqrt{N_{\text{T}}},\pm j/\sqrt{N_{\text{T}}}\right\}$ and $\mathcal{S}_{\text{R}}=\left\{\pm 1/\sqrt{N_{\text{R}}},\pm j/\sqrt{N_{\text{R}}}\right\},$ respectively. Such sounding beams require only 4-level phase quantization when steered by a phased array and have a randomized quasi-omnidirectional beam pattern.
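The pseudorandom codebooks described above are simple to generate. The sketch below assumes every beamformer entry has magnitude $1/\sqrt{N}$ and one of four phases, so that each beamformer has unit norm. The array sizes, the seeds, and the function names are hypothetical and chosen only for illustration, and bursts are indexed from zero as is usual in Python.

```python
import math
import random


def random_beamformer(num_antennas, rng):
    """Draw each element i.i.d. from {+1, -1, +j, -j} / sqrt(num_antennas)."""
    scale = 1.0 / math.sqrt(num_antennas)
    alphabet = [1 + 0j, -1 + 0j, 1j, -1j]
    return [scale * rng.choice(alphabet) for _ in range(num_antennas)]


def make_codebook(num_antennas, num_bursts, seed):
    """One pseudorandom beamformer per SS burst, reproducible from the seed."""
    rng = random.Random(seed)
    return [random_beamformer(num_antennas, rng) for _ in range(num_bursts)]


def beam_for_sample(codebook, n, burst_len):
    """Beamformer active at sample n; it switches every burst_len samples."""
    return codebook[n // burst_len]


# Assumed sizes for illustration: N_T = 64, N_R = 16, M = 64, N_B = 1024.
V = make_codebook(num_antennas=64, num_bursts=64, seed=0)  # BS codebook
W = make_codebook(num_antennas=16, num_bursts=64, seed=1)  # UE codebook
norm_sq = sum(abs(x) ** 2 for x in V[0])
print(f"||v_1||^2 = {norm_sq:.3f}")
```

Because the BS and UE can regenerate the same codebooks from shared seeds, the receiver knows which sounding beam pair produced each burst, which is what a compressive estimator needs in order to build its dictionary.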

The discrete time MIMO channel at delay $d$ ( $d<N_{\text{c}}$ ) is denoted as $\mathbf{H}[d]\in\mathbb{C}^{N_{\text{R}}\times N_{\text{T}}}$ . Following the extended Saleh-Valenzuela (S-V) model in [4], we express $\mathbf{H}[d]$ as

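$$\mathbf{H}[d]=\frac{1}{N_{\text{T}}N_{\text{R}}}\sum_{l=1}^{L}\sum_{r=1}^{R}g_{l,r}\,p_{\text{c}}(dT_{\text{s}}-\tau_{l,r})\,\mathbf{a}_{\text{R}}(\phi_{l,r})\,\mathbf{a}_{\text{T}}^{\text{H}}(\theta_{l,r}),$$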

where $L$ and $R$ are the number of multipath clusters (typically small, $L\leq 4$ [32]) and sub-paths (rays), respectively. Scalars $g_{l,r}$ , $\tau_{l,r}$ , $\theta_{l,r}$ and $\phi_{l,r}$ are the complex gain, excess delay, angle of departure (AoD) and angle of arrival (AoA) of the $r$ -th sub-path within the $l$ -th cluster, respectively. Function $p_{\text{c}}(t)$ is the time domain response filter due to the limited temporal resolution $T_{\text{s}}$ . With half-wavelength antenna spacing, the angular response vectors at the BS and UE are denoted as $\mathbf{a}_{\text{T}}(\theta)\in\mathbb{C}^{N_{\text{T}}}$ and $\mathbf{a}_{\text{R}}(\phi)\in\mathbb{C}^{N_{\text{R}}}$ . Their $k$ -th elements are defined as $[\mathbf{a}_{\text{R}}(\phi)]_{k}=\text{exp}[j\pi(k-1)\sin(\phi)]$ and $[\mathbf{a}_{\text{T}}(\theta)]_{k}=\text{exp}[j\pi(k-1)\sin(\theta)]$ .
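To make the array responses concrete, the sketch below builds the half-wavelength ULA steering vectors defined above and a single-path, single-tap channel $g\,\mathbf{a}_{\text{R}}(\phi)\mathbf{a}_{\text{T}}^{\text{H}}(\theta)$. It is a simplification under stated assumptions: the delay filter $p_{\text{c}}$ and the normalization factor are omitted, and the array sizes, angles and function names are hypothetical.

```python
import cmath
import math


def steering_vector(angle_rad, num_antennas):
    """Half-wavelength ULA response; index k below plays the role of (k-1)."""
    return [cmath.exp(1j * math.pi * k * math.sin(angle_rad))
            for k in range(num_antennas)]


def inner(a, b):
    """Inner product <a, b> = a^H b, following the paper's notation."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def single_path_channel(gain, aoa, aod, n_r, n_t):
    """Rank-one matrix gain * a_R(aoa) * a_T(aod)^H (no delay filter, no scaling)."""
    a_r = steering_vector(aoa, n_r)
    a_t = steering_vector(aod, n_t)
    return [[gain * r * t.conjugate() for t in a_t] for r in a_r]


def apply_channel(H, v):
    """Matrix-vector product H v."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]


N_R, N_T = 4, 8                                        # assumed array sizes
aoa, aod = math.radians(20.0), math.radians(-35.0)     # assumed path angles
H = single_path_channel(1.0 + 0.0j, aoa, aod, N_R, N_T)

w = steering_vector(aoa, N_R)                          # UE steers at the AoA
v = steering_vector(aod, N_T)                          # BS steers at the AoD
matched_gain = abs(inner(w, apply_channel(H, v)))

v_off = steering_vector(math.radians(10.0), N_T)       # BS beam off by 45 degrees
mismatched_gain = abs(inner(w, apply_channel(H, v_off)))
print(f"matched = {matched_gain:.2f}, mismatched = {mismatched_gain:.2f}")
```

With unnormalized steering vectors the matched pair collects the full array gain $N_{\text{R}}N_{\text{T}}$, while a misaligned transmit beam collects far less. That gap is the reason accurate AoD and AoA estimates matter for the post-training SNR.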

Note that the above model aligns with measurement-endorsed mmMAGIC channel model [33] and is used for the system performance evaluation in Section VII. However, for the sake of tractable algorithm design and analysis, the following assumptions and definitions are made.

Assumption 1: The BS and UE have ULAs with omni-directional element patterns in a 2D environment. Intra-cluster AoA, AoD, and delay offsets are zero, i.e., $\sum_{r=1}^{R}g_{l,r}\triangleq g_{l},\phi_{l,r}=\phi_{l}$ , $\theta_{l,r}=\theta_{l}$ , $\tau_{l,r}=\tau_{l},\forall r$ . Index $r$ is omitted in the rest of the paper for clarity. The phase error process is solely from CFO, i.e., the phase noise process is $\psi[n]=0,\forall n$ in (1). The complex path gain $g_{l}$ is a deterministic complex value with $\sum_{l=1}^{L}|g_{l}|^{2}=\sigma^{2}_{\text{g}}$ .

Definition 1: The pre-BF signal to noise ratio (SNR) is defined as $\mathrm{SNR}\triangleq\sigma^{2}_{\text{g}}/\sigma_{\text{n}}^{2}$ .

III-B Problem formulations

We intend to address the following two problems, and we remark on their connection to existing works.

Problem 1 (Initial Discovery and Timing Acquisition): The UE needs to detect the SS burst from in-band received samples (1). This problem is a binary hypothesis testing with unknown channel $\mathbf{H}[d]$ and synchronization errors $\epsilon_{\text{T}}$ and $\epsilon_{\text{F}}$ .

[TABLE]

In addition, the TO $\epsilon_{\text{T}}$ is estimated at this stage.

Problem 2 (Initial BF Training): The BF training is triggered once the UE has detected IA signals. In this stage, the UE re-uses the asynchronous signal samples (1) to estimate the AoD and AoA of a path with significant power, say $\theta^{\star}$ and $\phi^{\star}$ , which are then used to form the steering vectors $\mathbf{v}^{\star}=\mathbf{a}_{\text{T}}(\theta^{\star})$ and $\mathbf{w}^{\star}=\mathbf{a}_{\text{R}}(\phi^{\star})$ in the data communication phase.

Remark 1: The above problems can be solved by DIA and directional beam training with the help of CSI-RS, while our solution relies on processing the IA block only. In addition, although Problem 2 overlaps with parametric channel estimation, approaches from this class are not directly comparable. In fact, [18, 22] estimate the entire wideband channel, which facilitates optimal MIMO processing, but the assumptions of perfect synchronization and capability of wideband probing, i.e., $B_{\text{tot}}$ , do not apply in our work. Our goal is to provide a well-aligned beam pair within the IA block, i.e., without requiring CSI-RS slots. Finally, cell ID recognition and PBCH decoding are important tasks but are not studied in this work.

IV Initial Discovery and Timing synchronization

This section presents the proposed initial discovery and timing synchronization algorithm, followed by its performance analysis.

IV-A Initial discovery and timing synchronization algorithm

The UE processes the received signal using the correlation filter with $s_{\text{zc}}[n]$ , and obtains the detection statistics:

[TABLE]

Intuitively, there are $M$ correlation peaks across $M$ SS bursts. The magnitude of the $m$ -th peak depends on the array gain of the $m$ -th sounding beamformer, CFO, and TO. Our proposed detector combines energy from all $M$ SS bursts and compares it with the threshold. In contrast to previous works [6, 10, 9] where the detection threshold is a fixed constant, we propose to use the optimal detection threshold based on the Neyman-Pearson criterion that meets the target false alarm (FA) rate $P^{\star}_{\text{FA}}$ .

To understand the impact of timing synchronization error, we first consider a genie-aided scenario where the UE has perfect timing (PT) information, i.e., $\epsilon_{\text{T}}=0$ . In this case, the proposed PSS detection scheme is an energy detector over all $M$ bursts. In addition, a sample window of length $N_{\text{c}}$ is used to collect energy from all multipaths. Specifically, the proposed hypothesis testing scheme is expressed as

[TABLE]

where the detection threshold $\eta_{\text{PT}}$ is used to reach false alarm rate constraint such that $\mathrm{Pr}(\gamma_{\text{PT}}>\eta_{\text{PT}}|\mathcal{H}_{0})=P^{\star}_{\text{FA}}$ .

In a practical scenario without initial timing information (NT), i.e., $\epsilon_{\text{T}}\neq 0$ , we propose to use the following detector

[TABLE]

that searches all possible instances within TO window $\epsilon_{\text{T}}\in[0,\epsilon_{\text{T,max}}]$ and uses the highest energy collected for the hypothesis test. The sample index corresponding to the highest energy in (5) is the estimate of TO, namely

[TABLE]
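The peak-search detector described above can be sketched as follows. This is only an illustration under stated assumptions: `corr` is taken to be the scalar correlator output per sample, and the statistic for a candidate offset is assumed to be the squared magnitude summed over an $N_{\text{c}}$ -sample window in each of the $M$ bursts. The exact statistic in (4) and (5) may differ, and all function and parameter names are hypothetical.

```python
def burst_energy(corr, offset, num_bursts, samples_per_burst, window):
    """Energy of the correlator output over M bursts for one candidate TO."""
    total = 0.0
    for m in range(num_bursts):
        start = offset + m * samples_per_burst
        total += sum(abs(corr[start + k]) ** 2 for k in range(window))
    return total

def detect_and_time(corr, num_bursts, samples_per_burst, window,
                    max_offset, threshold):
    """Search all offsets in [0, max_offset]; return (detected, best_offset)."""
    energies = [burst_energy(corr, e, num_bursts, samples_per_burst, window)
                for e in range(max_offset + 1)]
    best = max(range(len(energies)), key=energies.__getitem__)
    # The offset with the highest collected energy doubles as the TO estimate.
    return energies[best] > threshold, best
```

With perfect timing the search collapses to a single call of `burst_energy` at offset zero, which corresponds to the energy detector of the PT scenario.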

IV-B Performance of initial discovery and timing acquisition

In this subsection, we analyze the performance of the proposed discovery algorithm in terms of miss detection rate, and the impact of the initial synchronization errors $\epsilon_{\text{F}}$ and $\epsilon_{\text{T}}$ . Deriving the exact expression is challenging and tedious, if not impossible, and therefore we provide a tight closed-form approximation in the following proposition. To be concise, the subscripts of $\gamma$ and $\eta$ that indicate the timing information assumption are denoted by the binary variable $\text{E}\in\{\text{NT},\text{PT}\}$ .

Proposition 1

The optimal threshold of (5) that reaches the target FA rate $\mathrm{Pr}(\gamma_{\text{E}}\geq\eta^{\star}_{\text{E}}|\mathcal{H}_{0})=P^{\star}_{\text{FA}}$ is approximately (the approximation is tight when the TO search window size satisfies $\epsilon_{\text{T,max}}\geq 100$ )

[TABLE]

where $\xi_{\text{E}}(\epsilon_{\text{T,max}},P^{\star}_{\text{FA}})$ is the threshold adjustment factor dependent on synchronization computed as

[TABLE]

where $\mathrm{Q}(.)$ and $\mathrm{Q}^{-1}(.)$ are Q-function and inverse Q-function, respectively. The associated miss detection (MD) rate $P_{\text{MD},\text{E}}\triangleq\mathrm{Pr}\left(\gamma_{\text{E}}<\eta^{\star}_{\text{E}}|\mathcal{H}_{1}\right)$ using the optimal threshold $\eta^{\star}_{\text{E}}$ is

[TABLE]

where the SNR degradation factor $\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})$ is defined as

[TABLE]

where $K(\epsilon_{\text{T}})$ is the number of samples during PSS reception that UE switches beamformer due to TO.

[TABLE]

Proof:

See Appendix -A. ∎

Remark 2: $1-P_{\text{MD},\text{NT}}$ is a close approximation of the probability that the UE detects IA and correctly estimates $\epsilon_{\text{T}}$ .

We gain two main insights from the MD expression (9), corresponding to the threshold adjustment factor $\xi_{\text{E}}(\epsilon_{\text{T,max}},P^{\star}_{\text{FA}})$ and the SNR degradation factor $\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})$ . Firstly, the CFO affects MD performance by effectively reducing SNR via the term $\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})$ . Under a maximum CFO at the UE of $\pm 5$ ppm and the typical frame parameters $P,M,N_{\text{c}}$ specified in Section VII, the SNR degradation is bounded by 4 dB, i.e., $10\log_{10}[\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})]\geq-4\text{ dB},\forall\epsilon_{\text{T}}$ . Secondly, the TO has an impact on both factors. As seen in (10), the SNR in the detection problem degrades when severe TO exists. In fact, $K(\epsilon_{\text{T}})$ in $\kappa(\epsilon_{\text{T}},\epsilon_{\text{F}})$ models the phenomenon that the receiver sounding beam switches during the reception of PSS, i.e., $K(\epsilon_{\text{T}})\neq 0$ . In addition, the presence of TO forces the system to use the peak detection scheme (5), where the system searches for the peak location over a sample window of length $\epsilon_{\text{T,max}}$ , i.e., the worst case in (5). Under $\mathcal{H}_{0}$ , the algorithm picks the strongest noise realization over $\epsilon_{\text{T,max}}$ samples, and thus the system needs to use a higher threshold than in the PT scenario, as seen in (7) and (8). Note that such degradation does not depend on the value of $\epsilon_{\text{T}}$ , and the degradation in (9) is not critical with practical maximum TO uncertainty $\epsilon_{\text{T,max}}\leq N_{\text{B}}$ . In summary, synchronization offset does not severely affect the discovery performance of the proposed scheme.
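The threshold penalty of the peak search can be illustrated numerically. The sketch below is a Monte Carlo toy rather than the closed form of Proposition 1: it draws noise-only energy statistics, takes the maximum over a search window, and compares the empirical thresholds that meet the same false alarm rate. The window size, number of summed terms, trial count, and the independence of the statistics across window positions are all simplifying assumptions.

```python
import random

def noise_energy(rng, num_terms):
    """Sum of num_terms squared magnitudes of unit-variance complex Gaussian noise."""
    sigma = 0.5 ** 0.5  # per real/imaginary component
    return sum(rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2
               for _ in range(num_terms))

def empirical_threshold(samples, false_alarm):
    """Threshold exceeded by roughly a `false_alarm` fraction of the samples."""
    ordered = sorted(samples)
    return ordered[int((1.0 - false_alarm) * len(ordered))]

rng = random.Random(1)          # fixed seed so the run is repeatable
trials, terms, window = 1000, 8, 20

# Perfect timing: one noise-only statistic per trial.
single = [noise_energy(rng, terms) for _ in range(trials)]
# No timing: the detector keeps the largest statistic in the search window.
peak = [max(noise_energy(rng, terms) for _ in range(window))
        for _ in range(trials)]

eta_pt = empirical_threshold(single, 0.05)
eta_nt = empirical_threshold(peak, 0.05)
```

In this toy setting `eta_nt` exceeds `eta_pt`, which is the qualitative effect captured by the threshold adjustment factor.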

IV-C Benchmark approach: directional initial discovery

For completeness, we briefly introduce the benchmark approach using directional sounding beam in initial discovery [6]. The system model of DIA is similar to Section III, except that sounding beamformers $\mathcal{W}$ and $\mathcal{V}$ are codebooks that steer directional sector beams, e.g., [14, 34]. Adapting the approach in [6] for the wideband channel and known PSS in SS burst, the cell discovery in DIA uses the following detector

[TABLE]

where $\gamma_{\text{DIA}}$ and $\eta_{\text{DIA}}$ are the detection statistic and threshold in DIA. Sequence $\tilde{y}_{\text{DIA}}[n]$ is the correlation output in (3) that corresponds to directional sounding beams. Referring to Fig. 1, the UE detects the burst with maximum power and denotes its index as $m_{\text{DIA}}^{\star}$ , which is used in directional beam training.

V Compressive Initial beam training

This section presents the proposed initial access based BF training. We start with signal rearrangement based on information obtained from successful cell discovery and timing acquisition. Then, we introduce the CS problem formulation followed by the proposed algorithm. Finally, we analyze the CRLB of AoA/AoD estimation in LOS.

V-A Signal rearrangement after timing acquisition

Further processing requires correct detection and CP removal, and therefore we make the following assumption.

Assumption 2: In beam training, the received IA signal (1) is correctly detected and TO $\epsilon_{\text{T}}$ is correctly estimated.

The UE first removes CPs of $P$ PSS samples from $y[n]$ corresponding to $M$ bursts and rearranges them into vector

[TABLE]

For notational convenience, in the rest of this subsection, we restate the received time domain signal after CP removal at the $m$ -th SS burst $\mathbf{y}_{m}\in\mathbb{C}^{P}$ according to the model in Section III,

[TABLE]

In the above equation, the deterministic vector $\mathbf{x}_{m}(\boldsymbol{\xi})\in\mathbb{C}^{P}$ is the observation model of the unknown parameters $\boldsymbol{\xi}\triangleq[\epsilon_{\text{F}},\cdots,\theta_{l},\phi_{l},\tau_{l},\alpha_{l},\beta_{l},\cdots]^{\text{T}}$ , where $\alpha_{l}=\Re(g_{l})$ and $\beta_{l}=\Im(g_{l})$ . $\mathbf{z}_{m}\in\mathbb{C}^{P}$ is the vectorized random noise. We also define $\mathbf{x}(\boldsymbol{\xi})=[\mathbf{x}^{\text{T}}_{1}(\boldsymbol{\xi}),\cdots,\mathbf{x}^{\text{T}}_{M}(\boldsymbol{\xi})]^{\text{T}}.$ Specifically, in (14) vector $\mathbf{s}\in\mathbb{C}^{P}$ contains the PSS symbols assigned to $P$ subcarriers. Vector $\mathbf{f}(\tau_{l})\in\mathbb{C}^{P}$ is the frequency response corresponding to the excess delay $\tau_{l}$ of a multipath, i.e., the contribution of $\tau_{l}$ on the $p$ -th subcarrier is

[TABLE]

Matrix $\mathbf{F}\in\mathbb{C}^{P\times P}$ is the discrete Fourier transform (DFT) matrix. (In the absence of CFO, multiplying $\mathbf{y}_{m}$ by the DFT matrix $\mathbf{F}$ gives the frequency domain symbols $\sum_{l=1}^{L}\tilde{g}_{m,l}(\mathbf{f}(\tau_{l})\circ\mathbf{s})+\mathbf{z}_{m}$ .) The effective channel gain is defined as $\tilde{g}_{m,l}=e^{j\epsilon_{\text{F}}N_{\text{B}}(m-1)}g_{l}\mathbf{w}_{m}^{\text{H}}\mathbf{a}_{\text{R}}(\phi_{l})\mathbf{a}^{\text{H}}_{\text{T}}(\theta_{l})\mathbf{v}_{m}$ , and it includes the contribution of phase rotation across SS bursts due to CFO and the IA beamformers $\mathbf{v}_{m}$ and $\mathbf{w}_{m}$ . Matrix $\mathbf{Q}(\epsilon_{\text{F}})=\mathrm{diag}\left(\left[1,e^{j\epsilon_{\text{F}}},\cdots,e^{j(P-1)\epsilon_{\text{F}}}\right]^{\text{T}}\right)$ contains the phase rotations within an OFDM symbol.
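The effective gain $\tilde{g}_{m,l}$ defined above can be evaluated directly from its definition. The following minimal sketch does so for a single path with half-wavelength ULAs; the helper names and argument order are hypothetical, and the CFO is expressed in radians per sample as in the definition.

```python
import cmath
import math

def steering_vector(num_antennas, angle_rad):
    """ULA response: k-th entry exp(j*pi*(k-1)*sin(angle)), k = 1..N."""
    return [cmath.exp(1j * math.pi * k * math.sin(angle_rad))
            for k in range(num_antennas)]

def inner(x, y):
    """Hermitian inner product x^H y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def effective_gain(m, cfo, samples_per_burst, path_gain, w, v, aoa, aod):
    """g~_{m,l} = exp(j*cfo*N_B*(m-1)) * g_l * w^H a_R(aoa) * a_T^H(aod) v."""
    rx = inner(w, steering_vector(len(w), aoa))   # w_m^H a_R(phi_l)
    tx = inner(steering_vector(len(v), aod), v)   # a_T^H(theta_l) v_m
    rotation = cmath.exp(1j * cfo * samples_per_burst * (m - 1))
    return rotation * path_gain * rx * tx
```

For unit-norm beams matched to the path, the magnitude equals $|g_{l}|\sqrt{N_{\text{R}}N_{\text{T}}}$ , and consecutive bursts differ only by the phase step $\epsilon_{\text{F}}N_{\text{B}}$ .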

V-B Baseline CS formulation

Directly estimating $\boldsymbol{\xi}$ from (14) via maximum likelihood (ML) requires multi-dimensional search with prohibitive complexity. In the following subsections, we re-formulate Problem 2 to facilitate sequential parameter estimation. With straightforward extension of the derivation in [4, Sec. V], the vector $[\tilde{\mathbf{g}}_{l}]_{m}=\tilde{g}_{m,l}$ in (14) can be re-formulated as

[TABLE]

where $\tilde{\mathbf{A}}\in\mathbb{C}^{G_{\text{T}}G_{\text{R}}\times M}$ is defined by the Hermitian conjugate of its $m$ -th column as $([\tilde{\mathbf{A}}]_{m})^{\text{H}}=(\mathbf{v}_{m}^{\text{T}}\otimes\mathbf{w}_{m}^{\text{H}})(\mathbf{A}^{*}_{\text{T}}\otimes\mathbf{A}_{\text{R}})$ . Note that the above equation is different from [4, Sec. V] which requires $M^{2}$ sounding beam pairs. The matrix $\tilde{\mathbf{Q}}(\epsilon_{\text{F}})\in\mathbb{C}^{M\times M}$ contains the phase rotation in each SS burst due to CFO.

[TABLE]
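The Kronecker structure of $\tilde{\mathbf{A}}$ follows from a one-line identity. As a sketch of the reasoning (a standard application of the mixed-product property of the Kronecker product, written here for a single angle pair $(\theta,\phi)$ rather than quoted from the paper):

```latex
\mathbf{w}_{m}^{\text{H}}\mathbf{a}_{\text{R}}(\phi)\,\mathbf{a}_{\text{T}}^{\text{H}}(\theta)\mathbf{v}_{m}
 = \big(\mathbf{v}_{m}^{\text{T}}\mathbf{a}_{\text{T}}^{*}(\theta)\big)\big(\mathbf{w}_{m}^{\text{H}}\mathbf{a}_{\text{R}}(\phi)\big)
 = \big(\mathbf{v}_{m}^{\text{T}}\otimes\mathbf{w}_{m}^{\text{H}}\big)\big(\mathbf{a}_{\text{T}}^{*}(\theta)\otimes\mathbf{a}_{\text{R}}(\phi)\big)
```

The first equality uses the fact that $\mathbf{a}_{\text{T}}^{\text{H}}(\theta)\mathbf{v}_{m}$ is a scalar and therefore equals its own transpose. The second uses $(\mathbf{A}\otimes\mathbf{B})(\mathbf{C}\otimes\mathbf{D})=(\mathbf{A}\mathbf{C})\otimes(\mathbf{B}\mathbf{D})$ . Stacking the right-hand factor over all grid angles gives $\mathbf{A}^{*}_{\text{T}}\otimes\mathbf{A}_{\text{R}}$ , so each burst contributes a single row built from one beam pair.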

In fact, matrices $\mathbf{A}_{\text{T}}\in\mathbb{C}^{N_{\text{T}}\times G_{\text{T}}}$ and $\mathbf{A}_{\text{R}}\in\mathbb{C}^{N_{\text{R}}\times G_{\text{R}}}$ are the dictionaries of angular responses with AoDs and AoAs from grids with $G_{\text{T}}$ and $G_{\text{R}}$ uniform steps from $-\pi/2$ to $\pi/2$ , respectively. In other words, the $k$ -th columns in $\mathbf{A}_{\text{T}}$ and $\mathbf{A}_{\text{R}}$ are $[\mathbf{A}_{\text{T}}]_{k}=\mathbf{a}_{\text{T}}([\mathbf{t}]_{k})$ and $[\mathbf{A}_{\text{R}}]_{k}=\mathbf{a}_{\text{R}}([\mathbf{r}]_{k})$ , respectively, where $\mathbf{t}$ and $\mathbf{r}$ are the vectors that contain the angle candidates.

[TABLE]

Also note that the steps $\Delta\theta$ and $\Delta\phi$ depend on the desired resolution. In this work, $G_{\text{T}}$ and $G_{\text{R}}$ are the numbers of grid steps, namely $\Delta\theta=2\pi/G_{\text{T}}$ and $\Delta\phi=2\pi/G_{\text{R}}$ . Matrix $\tilde{\mathbf{H}}_{l}\in\mathbb{C}^{G_{\text{R}}\times G_{\text{T}}}$ contains the complex path gain of the $l$ -th path, i.e., it has $1$ non-zero element whose location depends on the AoA and AoD of the $l$ -th cluster in the angular grids.

Remark 3: Assuming noisy observation of $\tilde{\mathbf{g}}_{l}$ and zero CFO, (16) reduces to the baseline problem in [4, Sec. V]. However, (14) implies that the former assumption is non-trivial unless $\mathbf{s}=\mathbf{1},\tau_{l}=0,\forall l$ , e.g., [17]. Moreover, algorithm designed with latter assumption is sensitive to CFO [1]. Finally, the AoA/AoD estimators are commonly confined in $\mathbf{r}$ and $\mathbf{t}$ [17]. We address these challenges in the following three subsections.

V-C Effective gain estimation

To address the challenge discussed in Remark 3, we propose the following approach. We treat $\mathbf{Q}(\epsilon_{\text{F}})$ in (14) as identity matrix and estimate delay of dominant path and gain, say $\tau_{l}$ and $\tilde{g}_{m,l}$ , by ML approach. Actually, the proposed algorithm uses sparse impulse support $[\mathbf{d}]_{q}=q\Delta\tau$ to construct a dictionary, where $\Delta\tau=N_{\text{c}}T_{\text{s}}/G_{\text{D}}$ is the step-size of delay candidates. Based on the knowledge of the model (15) and PSS signal $\mathbf{s}$ , the delay estimation is implemented as

[TABLE]

where $\bar{\mathbf{y}}=\sum_{m=1}^{M}\mathbf{y}_{m}/M$ is the received PSS samples averaged over $M$ SS bursts. The vector $\mathbf{p}_{q}$ contains PSS samples when the true delay of dominant path is $[\mathbf{d}]_{q}$ , i.e.,

[TABLE]

where $\mathbf{f}([\mathbf{d}]_{q})$ is obtained by plugging $[\mathbf{d}]_{q}$ into (15). The estimated delay tap $\hat{\tau}$ enables estimating the effective gain of a significant path by

[TABLE]

where $\mathbf{I}_{M}$ is the $M\times M$ identity matrix.

V-D Joint AoA and AoD estimation robust to CFO

The second step uses a modified matching pursuit to solve the CS problem (16) from $\hat{\mathbf{g}}$ while incorporating the existence of CFO in $\tilde{\mathbf{Q}}$ . In the conventional matching pursuit step, say the $k$ -th, the anticipated effective channel response corresponding to an AoA and AoD pair, i.e., $[\tilde{\mathbf{A}}]_{k}$ , is used to evaluate the inner product with $\hat{\mathbf{g}}$ [16]. The proposed heuristic treats AoA and AoD as known in the $k$ -th step, and uses the ML estimator of CFO $\hat{\epsilon}_{\text{F},k}$ , which is available in closed form. The modified matching pursuit is expressed as

[TABLE]

where $\tilde{\mathbf{a}}_{k}\triangleq[\tilde{\mathbf{A}}]_{k}$ from (16). The matrix $\tilde{\mathbf{Q}}(\hat{\epsilon}_{\text{F},k})$ has the same structure as (17). The input $\hat{\epsilon}_{\text{F},k}$ is the ML CFO estimate obtained when the AoA/AoD are treated as the ones corresponding to $[\tilde{\mathbf{A}}]_{k}$ . Specifically, the CFO estimator relies on the estimator in [35] by treating $\bar{\mathbf{y}}_{k}=\tilde{\mathbf{a}}^{*}_{k}\circ\hat{\mathbf{g}}$ as a tone with frequency $\epsilon_{\text{F}}$ .

[TABLE]

Operation $\angle(x)=\tan^{-1}[\Im(x)/\Re(x)]$ evaluates angle based on complex samples. To get estimates of the AoA, AoD, and CFO, index $\hat{k}$ is used to select candidates from grids (18) after the following adjustment $\hat{k}_{\text{R}}=\lfloor(\hat{k}-1)/G_{\text{T}}\rfloor+1$ and $\hat{k}_{\text{T}}=\hat{k}-(\hat{k}_{\text{R}}-1)G_{\text{T}}$ ,

[TABLE]
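The idea of a CFO-aware matching step can be sketched as follows. For each candidate response, the element-wise product with the gain estimates is treated as a tone, its per-burst phase step is estimated, the ramp is removed, and the compensated correlation is used as the score. This is an illustration only: the lag-1 autocorrelation phase used here is a simple stand-in for the estimator of [35], the score is not normalized by the candidate norm, and every name is hypothetical.

```python
import cmath

def cfo_compensated_score(candidate, gains):
    """Return (score, phase_step) for one candidate response vector."""
    tone = [c.conjugate() * g for c, g in zip(candidate, gains)]
    # Lag-1 autocorrelation phase: stand-in for the closed-form ML estimator.
    lag1 = sum(tone[m + 1] * tone[m].conjugate() for m in range(len(tone) - 1))
    step = cmath.phase(lag1)
    # Remove the estimated linear phase ramp, then correlate coherently.
    score = abs(sum(t * cmath.exp(-1j * step * m) for m, t in enumerate(tone)))
    return score, step

def best_candidate(dictionary, gains):
    """Pick the 1-based index of the best candidate and its phase step."""
    results = [cfo_compensated_score(c, gains) for c in dictionary]
    k = max(range(len(results)), key=lambda i: results[i][0])
    return k + 1, results[k][1]
```

The returned phase step is the rotation per burst, which is unambiguous only while it stays within $(-\pi,\pi]$ . Candidates that differ from each other by a pure linear phase ramp cannot be separated by this score.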

V-E Off-grid refinement

The aforementioned heuristics provide estimates of delay, AoA, and AoD that are restricted to the grid, i.e., elements of $\mathbf{d},\mathbf{r}$ and $\mathbf{t}$ . Grid refinement is a technique to provide off-grid estimation accuracy. There are several approaches considered in the literature, including multi-resolution refinement [36] and the Newtonized gradient refinement [37]. In this work, we propose to use a first-order descent approach. As initialization of the refinement, the estimate from the previous steps is saved into $\hat{\boldsymbol{\xi}}^{(k)}$ for $k=1$ . In the $k$ -th iteration, the following error vector is evaluated

[TABLE]

where $\mathbf{y}$ is the received signal after rearrangement as (13), $\mathbf{x}(\hat{\boldsymbol{\xi}}^{(k)})$ is obtained by plugging in estimated parameters into parametric model (14). In other words, $\mathbf{e}^{(k)}$ is the error vector between observed signal sequence and received signal model using current estimates, which is then used to update parameters. The complex gain in iteration $k$ is computed as

[TABLE]

where $\boldsymbol{\nabla}\mathbf{x}_{g}=(\partial\mathbf{x}(\boldsymbol{\xi})/\partial g)|_{\boldsymbol{\xi}=\hat{\boldsymbol{\xi}}^{(k)}}$ is the partial derivative of $\mathbf{x}(\boldsymbol{\xi})$ over parameter $g$ in (14) evaluated at $\hat{\boldsymbol{\xi}}^{(k)}$ . The refinement steps for delay, CFO, AoA, and AoD move the estimates from the previous iteration along their gradients. For concise notation, in the following equation and paragraph we use $x$ to denote the parameter to be refined, i.e., $x\in\{\tau,\epsilon_{\text{F}},\theta,\phi\}$ . The refinement steps are

[TABLE]

where $\mu_{x}$ is the step-size and vector $\boldsymbol{\nabla}\mathbf{x}_{x}=(\partial\mathbf{x}(\boldsymbol{\xi})/\partial x)|_{\boldsymbol{\xi}=\hat{\boldsymbol{\xi}}^{(k)}}$ is the partial derivative of $\mathbf{x}(\boldsymbol{\xi})$ in (14) over the parameter of interest. The above approach runs iteratively by plugging the updated parameters into $\mathbf{x}(\hat{\boldsymbol{\xi}}^{(k+1)})$ for the next iteration until the error $\|\mathbf{e}^{(k)}\|^{2}$ converges or falls below the threshold $\epsilon_{0}$ .
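A first-order refinement of this kind can be sketched on a toy problem. The example below refines a single angle for a noiseless steering-vector observation, using the update $x\leftarrow x+\mu\,\Re(\boldsymbol{\nabla}\mathbf{x}^{\text{H}}\mathbf{e})$ , which is gradient descent on $\tfrac{1}{2}\|\mathbf{e}\|^{2}$ . The single-parameter model, step size, and stopping tolerance are assumptions chosen for illustration; the update rule in the paper covers more parameters and may be scaled differently.

```python
import cmath
import math

def model(phi, n):
    """Toy observation model: ULA response with half-wavelength spacing."""
    return [cmath.exp(1j * math.pi * k * math.sin(phi)) for k in range(n)]

def model_grad(phi, n):
    """Entry-wise derivative of the model with respect to phi."""
    return [1j * math.pi * k * math.cos(phi) * x
            for k, x in enumerate(model(phi, n))]

def refine(y, phi0, step=5e-4, tol=1e-12, max_iter=100):
    """First-order descent on the squared error, started from a grid estimate."""
    phi = phi0
    for _ in range(max_iter):
        err = [a - b for a, b in zip(y, model(phi, len(y)))]
        if sum(abs(e) ** 2 for e in err) < tol:
            break
        grad = model_grad(phi, len(y))
        # Move along Re(grad^H err), the negative gradient of 0.5*||err||^2.
        phi += step * sum((g.conjugate() * e).real for g, e in zip(grad, err))
    return phi

# Hypothetical case: true angle 0.33 rad, on-grid starting point 0.30 rad.
phi_hat = refine(model(0.33, 8), 0.30)
```

The starting point must lie within the main lobe around the true angle, which is why the on-grid estimate from the previous steps serves as initialization.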

It is worth noting that the proposed approach can be extended to support multi-path training which has been covered by a variety of works in CS-based approaches [16, 18, 20, 21, 22, 24, 25]. However, the main motivation of this work is to showcase and analyze pseudorandom sounding beams in the initial access and initial beam training. Thus the only metric directly comparable to its counterparts [6, 8, 9, 10, 11, 12, 13, 7], namely single path training, is evaluated.

The entire compressive initial beam training algorithm is summarized in Algorithm 1.

V-F Performance bound of initial BF training in LOS

In this subsection, we provide a lower bound on the AoA/AoD estimation variance in a pure LOS scenario, namely the CRLB for jointly estimating $\boldsymbol{\xi}=[\epsilon_{\text{F}},\theta_{1},\phi_{1},\tau_{1},\alpha_{1},\beta_{1}]^{\text{T}}$ . Based on (14), the likelihood function is $\mathrm{Pr}(\mathbf{y};\boldsymbol{\xi})=(\pi\sigma_{\text{n}}^{2})^{-MP}\text{exp}\left(-\|\mathbf{y}-\mathbf{x}(\boldsymbol{\xi})\|^{2}/\sigma_{\text{n}}^{2}\right)$ . The log-likelihood function is $L(\mathbf{y};\boldsymbol{\xi})\triangleq\ln[\mathrm{Pr}(\mathbf{y};\boldsymbol{\xi})]$ . The lower bound of estimation variance is given in the following proposition.

Proposition 2

The CRLB of AoA/AoD estimation in the compressive initial BF training stage in LOS environment is

[TABLE]

where $\mathbf{J}\triangleq\partial^{2}L(\mathbf{y};\boldsymbol{\xi})/\partial\boldsymbol{\xi}^{2}$ is the Fisher Information Matrix (FIM) whose expressions are listed in Appendix -B.

Proof:

See Appendix -B. ∎
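For orientation, the general form of such a bound can be written compactly. The expression below is the standard Fisher information for a deterministic signal $\mathbf{x}(\boldsymbol{\xi})$ observed in circularly symmetric white complex Gaussian noise of variance $\sigma_{\text{n}}^{2}$ , with the FIM taken as the negative expected Hessian of the log-likelihood. It is a textbook result stated as a sketch and is not a reproduction of the entries listed in Appendix -B.

```latex
[\mathbf{J}]_{i,j} = \frac{2}{\sigma_{\text{n}}^{2}}\,
\Re\left\{\left(\frac{\partial\mathbf{x}(\boldsymbol{\xi})}{\partial\xi_{i}}\right)^{\text{H}}
\frac{\partial\mathbf{x}(\boldsymbol{\xi})}{\partial\xi_{j}}\right\},
\qquad
\mathrm{var}(\hat{\xi}_{i}) \geq \left[\mathbf{J}^{-1}\right]_{i,i}
```

The partial derivatives are the same vectors $\boldsymbol{\nabla}\mathbf{x}_{x}$ that appear in the off-grid refinement, so both the bound and the refinement rest on the sensitivity of the model (14) to each parameter.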

V-G Benchmark approach: hierarchical directional BF training

The directional beams in SS burst allow BS and UE to coarsely estimate the propagation directions [38]. Although approach in [38] is not tailored for wideband channel with synchronization offset, it relies on RSS measurement within burst and therefore it is robust to the model mismatch. Using the SS burst index that corresponds to the maximum received power, the system uses the knowledge of directional sounding beams to infer channel propagation angles. Specifically, as illustrated in Fig. 1, the estimated $\theta^{\star}$ and $\phi^{\star}$ are the centers of the $\hat{m}_{\text{T}}$ -th and $\hat{m}_{\text{R}}$ -th sounding beams in BS and UE [38], respectively. Note that the estimated angle sector indices $\hat{m}_{\text{T}}$ and $\hat{m}_{\text{R}}$ are computed from the SS burst index $m_{\text{DIA}}^{\star}$ in (12), i.e., $\hat{m}_{\text{R}}=\lfloor(m_{\text{DIA}}^{\star}-1)/M_{\text{T}}\rfloor+1$ , and $\hat{m}_{\text{T}}=m_{\text{DIA}}^{\star}-(\hat{m}_{\text{R}}-1)M_{\text{T}}$ . The large width of a sector beam results in poor angular resolution in DIA. In order to improve the resolution, hierarchical directional beam training scans narrower beams within the sector of interest. Such procedure occurs during CSI-RS bursts which are scheduled for individual UEs.
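The index mapping above is easy to check in code. The sketch below converts the 1-based strongest-burst index into receive and transmit sector indices exactly as written in the text. The helper that returns a sector center additionally assumes equal-width sectors covering $[-\pi/2,\pi/2]$ , which is an illustrative assumption and not a property of the LS-Sec or FSM-Sec codebooks.

```python
import math

def sector_indices(m_dia, num_tx_sectors):
    """Map the 1-based burst index to 1-based (m_R, m_T) sector indices."""
    m_r = (m_dia - 1) // num_tx_sectors + 1
    m_t = m_dia - (m_r - 1) * num_tx_sectors
    return m_r, m_t

def sector_center(index, num_sectors):
    """Center angle of a sector, assuming equal-width sectors on [-pi/2, pi/2]."""
    width = math.pi / num_sectors
    return -math.pi / 2 + (index - 0.5) * width
```

With 8 transmit sectors, burst 9 maps to the second receive sector and the first transmit sector, i.e., the transmit sectors cycle fastest.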

VI Access latency, overhead and DSP complexity

In this section, we present a model for analyzing three system performance indicators, namely access latency, overhead, and computational complexity. Note that this unified model applies to both directional scheme and the proposed approach.

Based on [7], we propose to use the latency model for both SS burst and CSI-RS as shown in Fig. 3. In both IA schemes, the failure of cell discovery introduces penalty of $T_{\text{F}}$ for a new IA block. When cell discovery occurs, the additional latency is required for scheduled CSI-RS according to the required number $N_{\text{train}}$ . Thus the access latency is

[TABLE]

where the first term includes latency for cell discovery. In the second term, $\tilde{T}_{\text{R}}$ is the average time for the UE to get the scheduled CSI-RS for beam training and it is expressed as

[TABLE]

In the above equation, $N_{\text{U}}$ denotes the number of UEs in the network. They share the available CSI-RS in a time division manner to combat channel dynamics. Due to the limited number of CSI-RS $K_{\text{R}}=\lfloor(T_{\text{F}}-MT_{\text{B}})/T_{\text{R}}\rfloor$ within one IA period, more than one frame duration is required to schedule a large number of UEs $N_{\text{U}}$ . Therefore, in (29) $K_{\text{F}}=\lfloor(N_{\text{U}}-1)/K_{\text{R}}\rfloor$ is the number of frames required to assign all CSI-RS to UEs and $K_{\text{res}}=N_{\text{U}}-K_{\text{F}}K_{\text{R}}$ is the residual delay in the last frame. As shown in the next section, DIA and directional BF training typically require larger $N_{\text{train}}$ than the proposed approach.
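The scheduling quantities defined above can be computed with integer arithmetic. The following sketch evaluates $K_{\text{R}}$ , $K_{\text{F}}$ and the residual, taking the residual as $N_{\text{U}}-K_{\text{F}}K_{\text{R}}$ . Durations are passed as integers in microseconds to avoid floating-point floor issues, and the numbers in the usage note are hypothetical rather than the values of Table III.

```python
def csi_rs_schedule(frame_us, burst_us, num_bursts, csi_rs_period_us, num_users):
    """Return (K_R, K_F, K_res) for the CSI-RS scheduling model."""
    # CSI-RS opportunities left in one IA period after the M SS bursts.
    k_r = (frame_us - num_bursts * burst_us) // csi_rs_period_us
    # Full frames needed before the last UE is served.
    k_f = (num_users - 1) // k_r
    # Position of the last UE inside its frame.
    k_res = num_users - k_f * k_r
    return k_r, k_f, k_res
```

For example, a 20 ms frame with 64 bursts of 18 µs and one CSI-RS every 5 ms leaves 3 CSI-RS per frame, so the tenth UE waits 3 full frames and is served by the first CSI-RS of the next one.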

Following [7], the overhead (OH) ratio is modeled by counting the time-frequency resource in IA and CSI-RS

[TABLE]

where $B_{\text{IA}}=1/T_{\text{s}}$ is the bandwidth in IA and the channel usage is $MT_{\text{s}}$ every period $T_{\text{F}}$ . We focus on varying CSI-RS density $K_{\text{R}}$ . Note that with reduced $K_{\text{R}}$ (increased $T_{\text{R}}$ ), the OH reduces with a cost of additional latency.

The required baseband operations of the proposed approach are summarized in Table II, where only the complex multiplications are taken into account. As shown in later sections, the off-grid refinement in the proposed algorithm provides CRLB-reaching accuracy but is not necessary to reach adequate beam alignment. As a consequence, we do not include its complexity here. It is worth noting that the above analysis assumes an off-line pre-computation of all dictionaries required for matching pursuit, i.e., $\mathbf{p}_{q}$ in (19) and $\tilde{\mathbf{a}}_{k}$ in (22). In addition, the directional IA requires the first two steps in Table II.

VII Results

This section presents the numerical comparison between the proposed approach and DIA with directional beam training.

VII-A Simulation settings

The simulations follow the 5G-NR frame structure. We first evaluate performance in the simplified 2D S-V channel model. The maximum excess delay is set as $N_{\text{c}}=4$ samples. As for the benchmark DIA, we use two approaches to design directional sector beams, i.e., the least-squares based sector beamforming (LS-Sec) codebook [14] and the frequency sampling method based sector beamforming (FSM-Sec) codebook [34, C23.4]. Examples of beam patterns are shown in Fig. 4. (We use an optimistic benchmark system where sector beams are synthesized by arrays with ideal phase and magnitude control.) In each of the Monte Carlo simulations, we generate an independent random realization of the pseudorandom sounding beam codebook and channel parameters, unless otherwise mentioned.

We also evaluate the efficacy of the proposed approach in a realistic 3D mmW environment. In Section VII-C we simulate the system with the QuaDRiGa simulator [39] based on the mmMAGIC model [33] in a $28$ GHz urban-micro (UMi) scenario. We remove Assumptions 1 and 2 from Sections III-A and V-A. Uniform planar arrays (UPA) with $N_{\text{T}}=16\times 4$ and $N_{\text{R}}=4\times 4$ are used at the BS and UE, respectively, to exploit the higher sparsity in the elevation plane. The proposed algorithm is extended to UPA and the 3D mmW channel model accordingly. In the simulations, the transmit power is set to $P_{\text{out}}=46$ dBm. The large scale channel model includes pathloss and shadowing. The AWGN at the receiver with 4 dB noise figure is added with power of $-170+10\log_{10}(\text{BW})$ dBm, where $\text{BW}=1/T_{\text{s}}$ and $\text{BW}=B_{\text{tot}}=400$ MHz [40] for the IA and data stages, respectively. Moreover, the UE phase noise, $\psi[n]$ in (1), is modeled as a Wiener process [41] that corresponds to an oscillator with phase noise spectrum $-114$ dBc/Hz at $1$ MHz offset [42]. The frame structure remains as described previously, and other detailed simulation settings in QuaDRiGa can be found in the supplementary material [43]. The benchmark approaches are also extended for UPA and the 3D channel, i.e., FSM-Sec beams are extended in both the azimuth and elevation planes. During each one of the $N_{\text{train}}$ CSI-RS, the BS and UE use 16 sounding beam pairs which bisect the previously scanned azimuth and elevation angular regions. We use post-training SNR as the performance indicator, which is evaluated by dividing the channel gain $P_{\text{out}}\sum_{d=0}^{N_{\text{c}}}|(\mathbf{w}^{\star})^{\text{H}}\mathbf{H}[d]\mathbf{v}^{\star}|^{2}$ by the noise power in $B_{\text{tot}}$ .
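The noise power and post-training SNR described above can be evaluated as in the sketch below. It uses the noise power expression from the text and sums the beamformed channel gain over the delay taps. The function names and the nested-list representation of $\mathbf{H}[d]$ (one list per receive antenna) are assumptions for this example.

```python
import math

def noise_power_dbm(bandwidth_hz):
    """Noise power as given in the text: -170 + 10*log10(BW) dBm."""
    return -170.0 + 10.0 * math.log10(bandwidth_hz)

def post_training_snr_db(tx_power_dbm, taps, w, v, bandwidth_hz):
    """SNR after beam training; `taps` is a list of channel matrices H[d]."""
    gain = 0.0
    for h in taps:
        # H[d] v, one entry per receive antenna.
        hv = [sum(h_rc * v_c for h_rc, v_c in zip(row, v)) for row in h]
        # |w^H H[d] v|^2 accumulated over delay taps.
        gain += abs(sum(w_r.conjugate() * x for w_r, x in zip(w, hv))) ** 2
    return tx_power_dbm + 10.0 * math.log10(gain) - noise_power_dbm(bandwidth_hz)
```

For the 400 MHz data bandwidth the noise power evaluates to about $-84$ dBm, so a beamformed channel gain of $-100$ dB at 46 dBm transmit power gives a post-training SNR of roughly 30 dB.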

Unless otherwise mentioned, the simulation parameters are summarized in Table III.

VII-B Performance in simplified S-V channel model

The miss detection rate of the proposed approach for initial discovery is shown in Fig. 5, and it is verified against the theoretical expression (9). (When $\epsilon_{\text{T}}\neq 0$ , the miss detection rate in simulation is evaluated by a generalized definition, $1-\mathrm{Pr}(\gamma_{\text{NT}}>\eta_{\text{NT}},\hat{\epsilon}_{\text{T}}=\epsilon_{\text{T}}|\mathcal{H}_{1})$ .) We have the following findings. Firstly, the lack of perfect timing synchronization introduces around 3 dB sensitivity loss, as shown by the gap between the blue circled curve and the red solid curve. However, this issue is unavoidable in practical systems. Secondly, less than 3 dB sensitivity loss occurs when $\pm$ 5 ppm CFO is present in addition to STO, as shown by the light blue dashed and green dashed-and-dotted curves. Finally, the practical STO ( $\leq 10\mu$ s) is noncritical, as shown by the red solid and blue dashed curves. But when STO is large enough to cause a mismatch between the transmitter and receiver burst beamforming windows, e.g., 17 $\mu$ s STO which corresponds to large $K(\epsilon_{\text{T}})$ in (11), severe sensitivity loss is introduced, as shown by the grey dotted curves. In summary, these simulations verify the findings from Section IV that practical initial synchronization error introduces up to a few dB of sensitivity loss compared to the perfect synchronization scenario.

The comparison between the proposed approach and the benchmark DIA based discovery approaches is also presented in Fig. 5. Although one may doubt the efficacy of the proposed approach since there is no significant angular gain for any beam pattern, as illustrated in Fig. 4, the results show that there is only a couple of dB difference between the proposed approach and the benchmark. Moreover, this gap is smaller than the performance fluctuation of DIA with different codebooks. The rationale behind this result is that the proposed scheme collects signal energy spread over all $M$ SS bursts, which in fact gives an energy measurement equivalent to that of the directional approach, where energy collection occurs only when a sector beam aligns with the true propagation direction.

The beam training performance of the proposed BF training algorithm in LOS is presented in Fig. 6. The performance metrics are the root mean square errors (RMSE) defined by $\text{RMSE}_{\text{AoA}}=\sqrt{\mathbb{E}|\hat{\phi}_{1}-\phi_{1}|^{2}}$ and $\text{RMSE}_{\text{AoD}}=\sqrt{\mathbb{E}|\hat{\theta}_{1}-\theta_{1}|^{2}}$ . The simulations are conducted with Assumption 2. The same pseudorandom setting is used in both the simulation and the theoretical CRLB evaluation. The refinement steps are forced to terminate within 100 iterations. We have the following findings. Firstly, when the off-grid refinement is used, the proposed algorithm reaches the CRLB in the high SNR regime. Secondly, the coarse estimation in high SNR has a compromised performance compared to the CRLB. However, coarse estimation (without refinement) has adequate accuracy for beam steering since its RMSE is an order of magnitude lower than the $3$ dB beam-width in steering, i.e., $0.29\pi/N_{\text{T}}$ and $0.29\pi/N_{\text{R}}$ . Finally, Fig. 5 and 6 reveal that in the SNR region between $-15$ dB and $-7.5$ dB reliable detection occurs but beam training performance is poor. Admittedly, this implies a compromised experience for UEs at the cell edge, which is worth further investigation.

VII-C Performance in QuaDRiGa channel simulator

Fig. 7 (a) illustrates the network setting implemented in QuaDRiGa. We simulate the performance of typical UEs distributed in two planes, at different distances from the pico cell mmW BS. We present the following findings based on Fig. 7 (b), which shows the cumulative distribution function (CDF) of post-training beam steering SNR. Firstly, the proposed approach provides comparable performance to DIA with $N_{\text{train}}=2$ CSI-RS. In fact, in LOS, both approaches closely achieve beam steering towards the true LOS path. Although the SNR seems excessively high in LOS, this implies that the transmit power can be reduced to save power. Secondly, DIA with fewer than $N_{\text{train}}=2$ CSI-RS has compromised SNR performance. This drawback is intuitive because a wide sounding sector beam fails to extract precise angle information. The SNR improvement from using higher $N_{\text{train}}$ is more significant in LOS. Thirdly, although the proposed approach is tailored for sparse channels and the presence of phase measurement error due to CFO, it is robust in NLOS scenarios where channel sparsity is compromised and practical phase noise occurs. Admittedly, the algorithm has a certain chance of failing completely when NLOS UEs are distributed in the second plane. However, in these cases the counterparts based on DIA and CSI-RS training cannot do a much better job either. In fact, they have a lower probability of reaching post-training beam steering SNR above [math] dB compared to the proposed approach.

The overhead and initial access latency savings of the proposed approach are significant, since it does not require CSI-RS, as shown in Fig. 7 (c). As explained in Section VI, for DIA based approaches, when the number of UEs in the network increases, the latency increases dramatically due to CSI-RS scheduling. Increasing the density of CSI-RS effectively reduces latency, but it results in increased overhead. The proposed approach relies on advanced signal processing to digitally conduct beam training and avoids requesting CSI-RS after initial access. In summary, a saving of up to two orders of magnitude in initial access latency is achieved compared to DIA.

VII-D Baseband processing requirements

Using the simulation parameters in Table III to evaluate the required operations in Table II, the baseband resources required by the proposed method are of the same order of magnitude as those of DIA, i.e., $(PN_{\text{B}}+PG_{\text{d}}+3MG_{\text{T}}G_{\text{R}})/(PN_{\text{B}})\approx 7.2$. There are two reasons for this finding. Firstly, the exhaustive PSS correlation filter (3) is extremely computationally demanding in IA. This filter is required by IA regardless of the sounding beam design (in fact, $N_{\text{PSS}}$ filters are required for cell ID identification). Secondly, the proposed approach sequentially estimates parameters and avoids a multi-dimensional grid search.
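To make the ratio above concrete, the following sketch evaluates $(PN_{\text{B}}+PG_{\text{d}}+3MG_{\text{T}}G_{\text{R}})/(PN_{\text{B}})$ for a given parameter set. The function name and the numerical values are hypothetical: they are not the Table III settings, so the result is not expected to reproduce the value 7.2 reported in the text. The split into a "PSS filter" term and "extra" terms is an interpretation of the ratio, with the denominator taken as the cost shared with DIA.

```python
def complexity_ratio(P, N_B, G_d, M, G_T, G_R):
    """Evaluate (P*N_B + P*G_d + 3*M*G_T*G_R) / (P*N_B) from the text."""
    pss_filter_ops = P * N_B                     # term shared with DIA
    extra_ops = P * G_d + 3 * M * G_T * G_R      # additional terms in the numerator
    return (pss_filter_ops + extra_ops) / pss_filter_ops

# Hypothetical values chosen only for illustration (NOT the Table III values).
ratio = complexity_ratio(P=128, N_B=1024, G_d=64, M=64, G_T=64, G_R=32)
```

For these illustrative values the ratio stays in the single digits, which is the sense in which the two methods are of the same order of magnitude.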

VIII Discussion on Open issues

In this section, we discuss relevant issues in practical implementation of compressive IA and beam training.

Required a priori knowledge: Firstly, this work assumes that coarse timing is available. It would also be important to study the case when timing is completely unknown, i.e., there is no a priori information about the range of $\epsilon_{\text{T}}$ in (1), which could cause SS burst index misalignment. Secondly, the compressive approach requires precise information about the sounding beam pattern $\tilde{\mathbf{a}}_{k}$ in (22). As a result, the array geometry and sounding codebooks of both BS and UE need to be known a priori. This raises new challenges in communication protocol design to effectively incorporate this information. It also requires an increase in baseband operations if all dictionaries need to be computed on-the-fly. Further, mmW testbed experiments in [45] showed that the measured beam patterns commonly mismatch the patterns predicted by the codebook and array geometry model. Future research should address these impairments.

Channel sparsity: The efficacy of the compressive approach is affected by the sparsity level in AoAs, AoDs, and multipath delays. Sparsity is endorsed by various mmW channel measurement campaigns, and urban NLOS, which is known for its unfavorable sparsity, is tested in this work. However, severely rich scattering situations are modeled from the standards perspective, e.g., there are up to $L=20$ multipath clusters in the 3GPP specified mmW channel [46]. It is important for systems that utilize CS-based approaches to flexibly handle situations where channel sparsity disappears.

Array architecture: This work focuses on the scenario where the UE uses a single RF-chain to process a single stream of IA signals. This allows other RF-chains, if available at the BS or UE, to operate in the band of data communication during IA. Since [6] shows that the hybrid analog/digital array and the fully digital array are advantageous for DIA, it would be interesting to investigate the benefits of the compressive IA and beam training algorithm when it is adapted to utilize multiple RF-chains.

MIMO Multiplexing: The proposed beam training is compatible with multiuser multiplexing for hybrid array architecture. In fact, multiplexing designs [47, 23] rely on each RF-chain and corresponding analog beamformer to provide adequate post-BF SNR, and use the digital baseband processing to handle multi-beam interference. However, as mentioned in Remark 1, the comparison with channel estimation based approaches, i.e., estimation of the entire wideband channel or its covariance during CSI-RS for optimal MIMO processing, is rarely investigated.

Phase coherency: To date, there is no coherent CS-based beam training prototype reported in the mmW band. The only notable prototype [48] operates at 8 GHz with two phased arrays synchronized by a cabled reference clock. In addition to CFO, as emphasized in this work, phase noise can also severely degrade coherency among channel observations. The detrimental impact of phase noise becomes more severe with increased carrier frequency. Proper phase noise compensation, as well as non-coherent CS-based beam training [26, 28, 27], which is naturally immune to phase error, are worth investigation.

IX Conclusions

In this work, a quasi-omni pseudorandom sounding beam is proposed for mmW initial access, synchronization, and beam training. We design an associated signal processing algorithm based on the proposed sounding beam structure that is compatible with the 5G-NR frame format. We provide theoretical analysis of cell discovery rate and beam training performance, and evaluate them via simulations using mmW hardware and urban channel models from the literature that are supported by measurements. The results show that the proposed approach provides comparable performance to the state-of-the-art directional cell search for initial discovery, but provides significantly more accurate angle estimation during initial beam training. This advantage holds true across different propagation conditions (LOS/NLOS) and UE-BS distances in the 28 GHz band. By saving the additional radio resources (CSI-RS) for beam refinement, the proposed approach reduces access latency by up to two orders of magnitude compared to directional initial access when the same signaling overhead and post-training beam steering SNR are targeted.

All numerical results are reproducible with scripts in [43].

Appendix A Initial discovery performance

The noise after correlation $\tilde{z}[n]=\frac{1}{P}\sum_{k=0}^{P-1}(\mathbf{w}^{\text{H}}[n+k]\mathbf{z}[n+k])s_{\text{zc}}^{*}[k]$ is $\mathcal{CN}(0,\sigma_{\text{n}}^{2}/P)$. Thus $|\tilde{z}[n]|^{2}$ is chi-square distributed with 2 degrees of freedom, mean $\sigma_{\text{n}}^{2}/P$, and variance $\sigma_{\text{n}}^{4}/P^{2}$. We denote the detection statistics in the PT and NT scenarios under $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ as $\gamma_{\text{PT},0}$, $\gamma_{\text{PT},1}$, $\gamma_{\text{NT},0}$, $\gamma_{\text{NT},1}$, respectively, and find their distributions.

$\gamma_{\text{PT},0}$ is the sum of the squared magnitudes of $N_{\text{c}}M$ realizations of $\tilde{z}[n]$ divided by $M$, thus the central limit theorem (CLT) applies. The distribution of $\gamma_{\text{PT},0}$ is $\mathcal{N}(\mu_{\text{PT},0},\sigma^{2}_{\text{PT},0})$, where $\mu_{\text{PT},0}=N_{\text{c}}\sigma_{\text{n}}^{2}/P$ and $\sigma_{\text{PT},0}=\sqrt{N_{\text{c}}\sigma_{\text{n}}^{4}/(P^{2}M)}$, respectively. As a result, the optimal detection threshold that reaches the target false alarm rate $P^{\star}_{\text{FA}}$ is given by (7). Similarly, the detection statistic under $\mathcal{H}_{0}$ with TO is denoted as $\gamma_{\text{NT},0}$. It is the maximum of $\epsilon_{\text{T,max}}$ realizations of $\gamma_{\text{PT},0}$. With large $\epsilon_{\text{T,max}}$, $\gamma_{\text{NT},0}$ follows an extreme value distribution, namely the Gumbel distribution, where the mean and standard deviation are $\mu_{\text{NT},0}=\mu_{\text{PT},0}+\sigma_{\text{PT},0}\mathrm{Q}^{-1}\left(1/\epsilon_{\text{T,max}}\right)$ and $\sigma_{\text{NT},0}=\sigma_{\text{PT},0}/\mathrm{Q}^{-1}\left(1/\epsilon_{\text{T,max}}\right)$, respectively. Using its inverse cumulative distribution function, the optimal detection threshold is $\eta^{\star}_{\text{NT}}=\mu_{\text{NT},0}-(\sqrt{6}/\pi)\sigma_{\text{NT},0}\ln\left(-\ln\left(1-P^{\star}_{\text{FA}}\right)\right)$. It gives (7) using the expressions of $\mu_{\text{NT},0}$, $\sigma_{\text{NT},0}$ and $\sqrt{6}/\pi\approx 0.78$.
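The threshold expressions in this paragraph can be evaluated numerically as in the sketch below. It uses the null-hypothesis mean and standard deviation given above, the Gumbel-based NT threshold with the $\sqrt{6}/\pi$ factor, and, as an assumption, the standard Gaussian-tail form $\mu_{\text{PT},0}+\sigma_{\text{PT},0}\mathrm{Q}^{-1}(P^{\star}_{\text{FA}})$ for the PT threshold, since (7) itself is not reproduced in this appendix. All function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist

def q_inv(p):
    """Inverse Gaussian Q-function: x such that P(N(0,1) > x) = p."""
    return NormalDist().inv_cdf(1.0 - p)

def null_stats(n_c, sigma_n2, P, M):
    """Mean and standard deviation of gamma_PT,0 as stated in the text."""
    mu = n_c * sigma_n2 / P
    sigma = math.sqrt(n_c * sigma_n2 ** 2 / (P ** 2 * M))
    return mu, sigma

def threshold_pt(mu, sigma, p_fa):
    """Gaussian-tail threshold (assumed form of the PT case)."""
    return mu + sigma * q_inv(p_fa)

def threshold_nt(mu, sigma, p_fa, eps_t_max):
    """Gumbel-based threshold for the maximum over eps_t_max timing candidates."""
    q = q_inv(1.0 / eps_t_max)
    mu_nt = mu + sigma * q
    sigma_nt = sigma / q
    return mu_nt - (math.sqrt(6) / math.pi) * sigma_nt * math.log(-math.log(1.0 - p_fa))

# Hypothetical parameters for illustration only.
mu0, sigma0 = null_stats(n_c=4, sigma_n2=1.0, P=128, M=64)
eta_pt = threshold_pt(mu0, sigma0, p_fa=0.01)
eta_nt = threshold_nt(mu0, sigma0, p_fa=0.01, eps_t_max=1000)
```

For these values the NT threshold comes out higher than the PT threshold, which matches the intuition that taking a maximum over many timing candidates inflates the noise-only statistic.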

Detection statistic $\gamma_{\text{PT},1}$ is the sum of noise energy and signal energy, i.e., $\gamma_{\text{PT},1}=\gamma_{\text{PT},0}+(\sum_{m=1}^{M}\sum_{l=0}^{L}|\tilde{g}_{m,l}\sum_{n=1}^{P}|s_{\text{zc}}[n]|^{2}e^{j\epsilon_{\text{F}}n}|^{2})/(PMN_{\text{T}}N_{\text{R}}),$ where $\tilde{g}_{m,l}$ is defined in Section V-A. Using the fact $|s_{\text{zc}}[n]|=1$ , definition $\kappa(0,\epsilon_{\text{F}})\triangleq|\sum_{n=1}^{P}e^{j\epsilon_{\text{F}}n}|^{2}$ in (10), and approximation that different multipaths are resolvable, i.e., $p_{\text{c}}(dT_{\text{s}}-\tau_{l})=1,d\in\mathcal{S}_{\text{d}}$ where $\mathcal{S}_{\text{d}}$ has $L$ integers in range $[0,N_{\text{c}}-1]$ , the above equation becomes $\gamma_{\text{PT},1}=\kappa(0,\epsilon_{\text{F}})\sum_{m=1}^{M}\zeta_{m}/M+\gamma_{\text{PT},0}$ where $\zeta_{m}=\sum_{l=0}^{L}|g_{l}\mathbf{w}^{\text{H}}_{m}\mathbf{a}_{\text{R}}(\phi_{l})\mathbf{a}^{\text{H}}_{\text{T}}(\theta_{l})\mathbf{v}_{m}|^{2}/(N_{\text{T}}N_{\text{R}})$ . Using the fact that $\zeta_{m}$ are mutually independent due to independent $\mathbf{v}_{m}$ and $\mathbf{w}_{m}$ , the mean and variance of $\zeta_{m}$ are $\mathbb{E}(\zeta_{m})=\sum_{l=1}^{L}\left|g_{l}\right|^{2}\mathbb{E}\left|\mathbf{w}^{\text{H}}_{m}\mathbf{a}_{\text{R}}(\phi_{l})\right|^{2}\mathbb{E}\left|\mathbf{a}^{\text{H}}_{\text{T}}(\theta_{l})\mathbf{v}_{m}\right|^{2}/(N_{\text{T}}N_{\text{R}})=\sigma_{\text{g}}^{2}$ and $\mathrm{var}(\zeta_{m})=(N_{\text{T}}N_{\text{R}})^{-2}\sum_{l=1}^{L}\left|g_{l}\right|^{4}\mathbb{E}\left|\mathbf{w}^{\text{H}}_{m}\mathbf{a}_{\text{R}}(\phi_{l})\right|^{4}\mathbb{E}\left|\mathbf{a}^{\text{H}}_{\text{T}}(\theta_{l})\mathbf{v}_{m}\right|^{4}$ $-\sigma_{\text{g}}^{4}=\sigma_{\text{g}}^{4}\left(2-\frac{1}{N_{\text{T}}}\right)\left(2-\frac{1}{N_{\text{R}}}\right)-\sigma_{\text{g}}^{4}\approx 3\sigma_{\text{g}}^{4}$ , respectively. 
The above approximation holds true for typical antenna array sizes $N_{\text{R}}$ and $N_{\text{T}}$ in mmW. Therefore, according to the CLT, $\gamma_{\text{PT},1}\sim\mathcal{N}(\kappa(0,\epsilon_{\text{F}})\sigma_{\text{g}}^{2}+\mu_{\text{PT},0},3\kappa^{2}(0,\epsilon_{\text{F}})\sigma_{\text{g}}^{4}/M+\sigma^{2}_{\text{PT},0})$. This gives the miss detection probability $P_{\text{MD,PT}}=\mathrm{Q}[(\mathbb{E}(\gamma_{\text{PT},1})-\eta^{\star}_{\text{PT}})/\sqrt{\mathrm{var}(\gamma_{\text{PT},1})}]$, which equals (9).
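As a small numerical companion to the Gaussian approximation above, the sketch below evaluates $P_{\text{MD,PT}}$ from the stated mean and variance of $\gamma_{\text{PT},1}$. The value of $\kappa(0,\epsilon_{\text{F}})$ is passed in as a plain number because its normalization is defined in (10), which is outside this appendix; the function names and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist

def q_func(x):
    """Gaussian Q-function: P(N(0,1) > x)."""
    return 1.0 - NormalDist().cdf(x)

def miss_detection_pt(kappa, sigma_g2, mu0, sigma0, M, eta):
    """P_MD = Q[(E(gamma_PT,1) - eta) / sqrt(var(gamma_PT,1))] as in the text."""
    mean = kappa * sigma_g2 + mu0
    var = 3.0 * kappa ** 2 * sigma_g2 ** 2 / M + sigma0 ** 2
    return q_func((mean - eta) / math.sqrt(var))

# Hypothetical noise-only statistics and threshold (illustrative only).
mu0, sigma0, M, eta = 0.03125, 0.001953125, 64, 0.0358

p_md_weak = miss_detection_pt(kappa=1.0, sigma_g2=0.002, mu0=mu0, sigma0=sigma0, M=M, eta=eta)
p_md_strong = miss_detection_pt(kappa=1.0, sigma_g2=0.02, mu0=mu0, sigma0=sigma0, M=M, eta=eta)
```

In this toy setting a tenfold increase in channel gain power moves the miss detection probability from close to 1 to well below 1%, consistent with the sharp transition expected from a Gaussian tail.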

In the NT scenario, we make the following approximations: 1) the detection statistic $\gamma_{\text{NT},1}$ corresponds to the correlation peaks for the correct timing $\epsilon_{\text{T}}$; 2) the abrupt beamformer changes during the $m$-th PSS reception, when present, result in an independent realization of the sounding beam $\tilde{\mathbf{w}}_{m}$. Although the former is not valid at low SNR, the MD rate with a typical threshold in such an SNR regime already approaches 1. Therefore, the impact of this loose approximation is negligible. Based on these assumptions, we evaluate the distribution of $\gamma_{\text{NT},1}$ as $\gamma_{\text{NT},1}=\gamma_{\text{PT},0}+\frac{1}{PMLN_{\text{T}}N_{\text{R}}}(\sum_{m=1}^{M}\sum_{l=0}^{L}|\tilde{g}^{(1)}_{m,l}\sum_{n_{1}=1}^{K-1}|s_{\text{zc}}[n_{1}]|^{2}e^{j\epsilon_{\text{F}}n_{1}}+\tilde{g}^{(2)}_{m,l}\sum_{n_{2}=K}^{P}|s_{\text{zc}}[n_{2}]|^{2}e^{j\epsilon_{\text{F}}n_{2}}|^{2})$ where $\tilde{g}^{(1)}_{m,l}=g_{l}\mathbf{w}^{\text{H}}_{m}\mathbf{a}_{\text{R}}(\phi_{l})\mathbf{a}^{\text{H}}_{\text{T}}(\theta_{l})\mathbf{v}_{m}$ and $\tilde{g}^{(2)}_{m,l}=g_{l}\tilde{\mathbf{w}}^{\text{H}}_{m}\mathbf{a}_{\text{R}}(\phi_{l})\mathbf{a}^{\text{H}}_{\text{T}}(\theta_{l})\mathbf{v}_{m}$ are the post-BF channel gains due to the partially overlapped burst windows at the BS and UE. In other words, $K$ follows (11) and marks the abrupt change in BF, while $n_{1}\in[1,K-1]$ and $n_{2}\in[K,P]$ index the sample windows before and after it. The independent $\mathbf{w}_{m}$ and $\tilde{\mathbf{w}}_{m}$ lead to uncorrelated $\tilde{g}^{(1)}_{m,l}$ and $\tilde{g}^{(2)}_{m,l}$.
For notational convenience in finding the statistics of $\gamma_{\text{NT},1}$, we define $\zeta_{m,l}$ as $\zeta_{m,l}\triangleq(|\tilde{g}^{(1)}_{m,l}\frac{1-e^{jK\epsilon_{\text{F}}}}{1-e^{j\epsilon_{\text{F}}}}+\tilde{g}^{(2)}_{m,l}\frac{1-e^{j(P-K)\epsilon_{\text{F}}}}{1-e^{j\epsilon_{\text{F}}}}|^{2})/(N_{\text{T}}N_{\text{R}})$ in $\gamma_{\text{NT},1}$ after simplification using the fact that $|s_{\text{zc}}[n]|^{2}=1,\forall n\in\mathcal{S}$ as well as $\sum_{n=1}^{K}e^{j\epsilon_{\text{F}}n}=(1-e^{jK\epsilon_{\text{F}}})/(1-e^{j\epsilon_{\text{F}}})$ (up to a unit-modulus phase factor). The mean and variance of $\zeta_{m,l}$ are $\mathbb{E}\left(\zeta_{m,l}\right)=\kappa(\epsilon_{\text{F}},\epsilon_{\text{T}})\sigma_{\text{g}}^{2}$, and $\mathrm{var}\left(\zeta_{m,l}\right)\approx 3\sigma^{4}_{\text{g}}\kappa^{2}(\epsilon_{\text{F}},\epsilon_{\text{T}})$ after plugging in the definition of $\kappa(\epsilon_{\text{F}},\epsilon_{\text{T}})$ from (10). Using the CLT and the statistics of $\zeta_{m,l}$, $\gamma_{\text{NT},1}\sim\mathcal{N}(\mu_{\text{PT},0}+\kappa(\epsilon_{\text{F}},\epsilon_{\text{T}})\sigma^{2}_{\text{g}},\sigma^{2}_{\text{PT},0}+3\sigma^{4}_{\text{g}}\kappa^{2}(\epsilon_{\text{F}},\epsilon_{\text{T}})/M)$. The MD rate $P_{\text{MD,NT}}=\mathrm{Q}[(\mathbb{E}(\gamma_{\text{NT},1})-\eta^{\star}_{\text{NT}})/\sqrt{\mathrm{var}(\gamma_{\text{NT},1})}]$ reduces to (9).
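The geometric-sum simplification used in this paragraph can be checked numerically. The sketch below compares the magnitude of the partial sum of CFO phasors with the magnitude of the closed form $(1-e^{jK\epsilon_{\text{F}}})/(1-e^{j\epsilon_{\text{F}}})$; when the sum starts at $n=1$ the two differ only by the phase factor $e^{j\epsilon_{\text{F}}}$, so magnitudes are compared. The normalization by $K^{2}$ in `coherent_loss` is an illustrative choice and is not claimed to be the definition of $\kappa$ in (10); the values of `eps` and `K` are hypothetical.

```python
import cmath

def partial_sum(eps, K):
    """Direct evaluation of sum_{n=1}^{K} exp(j*eps*n)."""
    return sum(cmath.exp(1j * eps * n) for n in range(1, K + 1))

def closed_form(eps, K):
    """Closed form (1 - exp(j*K*eps)) / (1 - exp(j*eps)), valid for eps != 0."""
    return (1 - cmath.exp(1j * K * eps)) / (1 - cmath.exp(1j * eps))

eps, K = 0.01, 50  # hypothetical per-sample CFO phase (radians) and window length

lhs = abs(partial_sum(eps, K))
rhs = abs(closed_form(eps, K))

# Fraction of the ideal coherent gain K^2 retained under CFO (illustrative metric).
coherent_loss = (lhs / K) ** 2
```

A value of `coherent_loss` slightly below 1 reflects the partial decoherence that CFO introduces over the correlation window, which is the effect the $\kappa$ term captures in the analysis.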

Appendix B CRLB of joint estimation problem

The FIM has the following form

[TABLE]

where $\Phi_{x,y}$ denotes $\Phi_{x,y}=\partial^{2}L(\mathbf{y};\boldsymbol{\xi})/\partial x\partial y=(\partial L(\mathbf{x}(\boldsymbol{\xi}))/\partial x)^{\text{H}}(\partial L(\mathbf{x}(\boldsymbol{\xi}))/\partial y)$. The exact expressions of each element in the FIM are summarized in Table IV, where for notational convenience the following matrices are defined. The derivative of the CFO matrix is a diagonal matrix whose $p$-th diagonal element is $[\dot{\mathbf{Q}}_{m}]_{p,p}=j[(m-1)N_{\text{B}}+(p-1)]e^{j\epsilon_{\text{F}}[(m-1)N_{\text{B}}+(p-1)]}$. The vector $\dot{\mathbf{f}}=\partial\mathbf{f}(\tau)/\partial\tau$ has $p$-th element $[\dot{\mathbf{f}}]_{p}=j2\pi(p-1)T_{\text{s}}e^{j2\pi(p-1)\epsilon_{\text{F}}T_{\text{s}}}$. Other expressions in Table IV include $\mathbf{f}^{\text{H}}(\tau)\mathbf{F}^{\text{H}}\mathbf{Q}_{m}^{\text{H}}\mathbf{Q}_{m}\mathbf{F}\mathbf{f}(\tau)=P,\forall m$, $C_{\text{df}}=\sum_{p=0}^{P-1}2\pi pT_{\text{s}}=(P-2)(P-1)\pi T_{\text{s}}$, $C_{\text{dq},m}\triangleq{\mathbf{f}}^{\text{H}}(\tau)\mathbf{F}^{\text{H}}\dot{\mathbf{Q}}_{m}^{\text{H}}\mathbf{Q}_{m}\mathbf{F}\mathbf{f}(\tau)=(m-1)T_{\text{B}}+\frac{(P-2)(P-1)T_{\text{s}}}{2}$, and $C_{\text{d2q},m}=\sum_{p=0}^{P-1}\left[(m-1)T_{\text{B}}+pT_{\text{s}}\right]^{2}$.

Bibliography


[1] H. Yan and D. Cabric, “Compressive sensing based initial beamforming training for massive MIMO millimeter-wave systems,” in 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Dec. 2016, pp. 620–624.
[2] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[3] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, “Millimeter wave mobile communications for 5G cellular: It will work!” IEEE Access, vol. 1, pp. 335–349, May 2013.
[4] R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Apr. 2016.
[5] 3GPP, “TR 38.802 study on new radio access technology physical layer aspects,” 2017.
[6] C. N. Barati, S. A. Hosseini, S. Rangan, P. Liu, T. Korakis, S. S. Panwar, and T. S. Rappaport, “Directional cell discovery in millimeter wave cellular networks,” IEEE Trans. Wireless Commun., vol. 14, no. 12, pp. 6664–6678, Dec. 2015.
[7] M. Giordani, M. Polese, A. Roy, D. Castor, and M. Zorzi, “A tutorial on beam management for 3GPP NR at mmWave frequencies,” CoRR, vol. abs/1804.01908, 2018. [Online]. Available: http://arxiv.org/abs/1804.01908
[8] Y. Li, J. Luo, M. H. C. Garcia, R. Böhnke, R. A. Stirling-Gallacher, W. Xu, and G. Caire, “On the beamformed broadcasting for millimeter wave cell discovery: Performance analysis and design insight,” vol. 17, no. 11, pp. 7620–7634, Nov. 2018.