Deep Unfolding Hybrid Beamforming Designs for THz Massive MIMO Systems

Nhan Thanh Nguyen; Mengyuan Ma; Nir Shlezinger; Yonina C. Eldar; A. L. Swindlehurst; Markku Juntti

arXiv:2302.12041·cs.IT·February 24, 2023


TL;DR

This paper introduces a deep unfolding framework with neural networks for hybrid beamforming in THz massive MIMO systems, achieving high spectral efficiency with low complexity and fast computation.

Contribution

It develops a novel deep unfolding approach using ManNet and subManNet for efficient hybrid beamforming design in THz MIMO systems, outperforming traditional methods.

Findings

1. Outperforms conventional model-based and deep unfolded methods.

2. Achieves over 1000 times faster computation than the Riemannian manifold scheme.

3. Reduces complexity by more than a factor of six.

Abstract

Hybrid beamforming (HBF) is a key enabler for wideband terahertz (THz) massive multiple-input multiple-output (mMIMO) communications systems. A core challenge with designing HBF systems stems from the fact their application often involves a non-convex, highly complex optimization of large dimensions. In this paper, we propose HBF schemes that leverage data to enable efficient designs for both the fully-connected HBF (FC-HBF) and dynamic sub-connected HBF (SC-HBF) architectures. We develop a deep unfolding framework based on factorizing the optimal fully digital beamformer into analog and digital terms and formulating two corresponding equivalent least squares (LS) problems. Then, the digital beamformer is obtained via a closed-form LS solution, while the analog beamformer is obtained via ManNet, a lightweight sparsely-connected deep neural network based on unfolding projected gradient…

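Tables

Table I: Computational complexity of ManNet/subManNet-based FC-HBF/SC-HBF compared with MO-AltMin, AO, OMP, and SDR-AltMin.

Structure	Schemes	Overall complexity
FC-HBF	ManNet	$\mathcal{I}_{\text{net}}\mathcal{O}(N_{\text{t}}K+N_{\text{t}}+K)$
FC-HBF	MO-AltMin	$\mathcal{I}_{\text{MO}}^{\text{out}}\mathcal{O}(N_{\text{t}}K)+\mathcal{I}_{\text{MO}}^{\text{out}}\mathcal{I}_{\text{MO}}^{\text{in}}\mathcal{O}(N_{\text{t}}+K)$
FC-HBF	AO	$\mathcal{O}(N_{\text{t}}K)+\mathcal{I}_{\text{AO}}\mathcal{O}(N_{\text{t}}^{2})$
FC-HBF	OMP	$\mathcal{O}(N_{\text{t}}K+N_{\text{t}})$
SC-HBF	ManNet	$\mathcal{I}_{\text{net}}\mathcal{O}(N_{\text{t}}K+N_{\text{t}}+K)+2|\tilde{\mathcal{K}}|\mathcal{O}(N_{\text{t}})$
SC-HBF	subManNet	$\mathcal{I}_{\text{net}}\mathcal{O}(N_{\text{t}}K+N_{\text{t}}+K)$
SC-HBF	SDR-AltMin	$\mathcal{I}_{\text{SDR}}\mathcal{O}(N_{\text{t}}K+K)$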

Equations

${\mathbf{y}}[k]$

$${\mathbf{H}}[k]=\xi\sum_{p=1}^{P}\alpha_{p}e^{-j2\pi\tau_{p}f_{k}}{\mathbf{a}}_{\text{r}}(\theta^{\text{r}}_{p},\phi^{\text{r}}_{p},f_{k}){\mathbf{a}}_{\text{t}}(\theta^{\text{t}}_{p},\phi^{\text{t}}_{p},f_{k})^{H}.$$

$${\mathbf{a}}_{\text{t}}(\theta^{\text{t}}_{p},\phi^{\text{t}}_{p},f_{k})=\frac{1}{\sqrt{N_{\text{t}}}}\Big[1,\dots,e^{j\pi\frac{f_{k}}{f_{\text{c}}}(i_{\text{h}}\sin(\phi^{\text{t}}_{p})\sin(\theta^{\text{t}}_{p})+i_{\text{v}}\cos(\theta^{\text{t}}_{p}))},\dots,e^{j\pi\frac{f_{k}}{f_{\text{c}}}((N_{\text{t}}^{\text{h}}-1)\sin(\phi^{\text{t}}_{p})\sin(\theta^{\text{t}}_{p})+(N_{\text{t}}^{\text{v}}-1)\cos(\theta^{\text{t}}_{p}))}\Big]^{T},$$

$${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{full}}\triangleq\left\{{\mathbf{F}}_{\text{RF}}:[{\mathbf{F}}_{\text{RF}}]_{m,n}=e^{j\zeta_{m,n}},\ \forall m,n\right\},$$

$${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{sub}}\triangleq\Big\{{\mathbf{F}}_{\text{RF}}:[{\mathbf{F}}_{\text{RF}}]_{m,n}\in\left\{0,e^{j\zeta_{m,n}}\right\},\ \sum_{m=1}^{N_{\text{t}}}\left|[{\mathbf{F}}_{\text{RF}}]_{m,n}\right|=M,\ \sum_{n=1}^{N_{\text{RF}}}\left|[{\mathbf{F}}_{\text{RF}}]_{m,n}\right|=1,\ \forall m,n\Big\},$$

$$R=\cdots\times{\mathbf{F}}_{\text{BB}}[k]^{H}{\mathbf{F}}_{\text{RF}}^{H}{\mathbf{H}}[k]^{H}{\mathbf{V}}[k]\Big).$$

$$\underset{{\mathbf{F}}_{\text{RF}},\{{\mathbf{F}}_{\text{BB}}[k]\}_{k=1}^{K}}{\textrm{minimize}}\ \cdots\quad\textrm{subject to}\quad{\mathbf{F}}_{\text{RF}}\in\mathcal{A},\quad\left\lVert{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=N_{\text{s}},\ \forall k,$$

$$\underset{{\mathbf{F}}_{\text{RF}}}{\textrm{minimize}}\ \cdots\quad\textrm{subject to}\quad{\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{full}},$$

$\tilde{{\mathbf{x}}}$, $\tilde{{\mathbf{z}}}[k]$, $\tilde{{\mathbf{B}}}[k]$

$$\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=\sum_{k=1}^{K}\left\lVert\tilde{{\mathbf{z}}}[k]-\tilde{{\mathbf{B}}}[k]\tilde{{\mathbf{x}}}\right\rVert^{2}.$$

${\mathbf{x}}$, ${\mathbf{z}}[k]$, ${\mathbf{B}}[k]$

$$\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=\sum_{k=1}^{K}\left\lVert{\mathbf{z}}[k]-{\mathbf{B}}[k]{\mathbf{x}}\right\rVert^{2}.$$

$$\mathcal{V}:{\mathbf{F}}_{\text{RF}}\to{\mathbf{x}}\ \text{ and }\ \mathcal{V}^{-1}:{\mathbf{x}}\to{\mathbf{F}}_{\text{RF}}$$

$${\mathbf{x}}^{\star}=\underset{{\mathbf{x}}:\,\mathcal{V}^{-1}({\mathbf{x}})\in\mathcal{A}_{\text{full}}}{\arg\min}\ \sum_{k=1}^{K}\left\lVert{\mathbf{z}}[k]-{\mathbf{B}}[k]{\mathbf{x}}\right\rVert^{2}.$$

$${\mathbf{x}}_{\ell}=\mathcal{T}_{\ell}\Big({\mathbf{x}}_{\ell-1}-\sum_{k=1}^{K}\big(\delta_{\ell}{\mathbf{B}}[k]^{T}{\mathbf{z}}[k]-\delta_{\ell}{\mathbf{B}}[k]^{T}{\mathbf{B}}[k]{\mathbf{x}}_{\ell-1}\big)\Big)=\mathcal{T}_{\ell}\Big({\mathbf{x}}_{\ell-1}-\delta_{\ell}\bar{{\mathbf{z}}}+\delta_{\ell}\sum_{k=1}^{K}\bar{{\mathbf{B}}}[k]{\mathbf{x}}_{\ell-1}\Big),$$

$${\mathbf{u}}_{\ell-1}\triangleq-\bar{{\mathbf{z}}}+\sum_{k=1}^{K}\bar{{\mathbf{B}}}[k]{\mathbf{x}}_{\ell-1},$$

$${\mathbf{x}}_{\ell}=\mathcal{T}_{\ell}\left({\mathbf{x}}_{\ell-1}+\delta_{\ell}{\mathbf{u}}_{\ell-1}\right).$$

$$\psi_{t}(x)=-1+\frac{1}{|t|}\left(\sigma(x+t)-\sigma(x-t)\right),$$

$$\mathcal{L}\left(\{{\mathbf{w}}_{\ell,1},{\mathbf{w}}_{\ell,2}\}_{\ell=1}^{L}\right)=\sum_{\ell=1}^{L}\log(\ell)\Big(\sum_{k=1}^{K}\left\lVert{\mathbf{z}}[k]-{\mathbf{B}}[k]{\mathbf{x}}_{\ell}\right\rVert^{2}\Big),$$

$${\mathbf{F}}_{\text{BB}}[k]^{(b,i)}=\big({\mathbf{F}}_{\text{RF}}^{(b,i)}\big)^{\dagger}{\mathbf{F}}_{\text{opt}}[k]^{(b)},\ \forall k,b,i,$$

$${\mathbf{F}}_{\text{BB}}[k]^{(i)}=\big({\mathbf{F}}_{\text{RF}}^{(i)}\big)^{\dagger}{\mathbf{F}}_{\text{opt}}[k],\ \forall k,i.$$

$$(\mathcal{P}_{\text{BB}}):\ \underset{\{{\mathbf{F}}_{\text{BB}}[k]\}}{\textrm{maximize}}\ \cdots\quad\textrm{subject to}\quad\mathrm{trace}\left({\mathbf{Q}}{\mathbf{F}}_{\text{BB}}[k]{\mathbf{F}}_{\text{BB}}[k]^{H}\right)=N_{\text{s}},\ \forall k,$$

$$R_{\text{BB}}=\frac{1}{K}\sum_{k=1}^{K}\log_{2}\det\Big({\mathbf{I}}_{N_{\text{s}}}+\frac{\rho}{\sigma_{\text{n}}^{2}N_{\text{s}}}\tilde{{\mathbf{H}}}{\mathbf{F}}_{\text{BB}}[k]{\mathbf{F}}_{\text{BB}}[k]^{H}\tilde{{\mathbf{H}}}^{H}\Big),$$


Code & Models

Repositories

WillysMa/Deep_unfolding_Hybrid_beamforming
pytorch


Taxonomy

Topics: Microwave Engineering and Waveguides · Millimeter-Wave Propagation and Modeling · Antenna Design and Optimization

Full text

Deep Unfolding Hybrid Beamforming Designs for THz Massive MIMO Systems

Nhan Thanh Nguyen, Mengyuan Ma, Ortal Lavi, Nir Shlezinger, Yonina C. Eldar, A. L. Swindlehurst, and Markku Juntti

This research was supported by Academy of Finland under 6Genesis Flagship (grant 318927), EERA Project (grant 332362), Infotech Program funded by University of Oulu Graduate School, and U.S. National Science Foundation grant CCF-2225575. The authors wish to acknowledge CSC – IT Center for Science, Finland, for computational resources. A short version of this paper has been submitted to the IEEE Int. Conf. Acoust., Speech, Signal Processing, 2023.

N. T. Nguyen, M. Ma, and M. Juntti are with Centre for Wireless Communications, University of Oulu, P.O.Box 4500, FI-90014, Finland (e-mail: {nhan.nguyen, mengyuan.ma, markku.juntti}@oulu.fi). N. Shlezinger and Ortal Lavi are with School of ECE, Ben-Gurion University of the Negev, Beer-Sheva, Israel (email: {nirshl, agivo}@bgu.ac.il). Y. C. Eldar is with Faculty of Math and CS, Weizmann Institute of Science, Rehovot, Israel (email: [email protected]). A. L. Swindlehurst is with Department of EECS, University of California, Irvine, CA, US (email: [email protected]).

Abstract

Hybrid beamforming (HBF) is a key enabler for wideband terahertz (THz) massive multiple-input multiple-output (mMIMO) communications systems. A core challenge in designing HBF systems stems from the fact that their application often involves a non-convex, highly complex optimization of large dimensions. In this paper, we propose HBF schemes that leverage data to enable efficient designs for both the fully-connected HBF (FC-HBF) and dynamic sub-connected HBF (SC-HBF) architectures. We develop a deep unfolding framework based on factorizing the optimal fully digital beamformer into analog and digital terms and formulating two corresponding equivalent least squares (LS) problems. Then, the digital beamformer is obtained via a closed-form LS solution, while the analog beamformer is obtained via ManNet, a lightweight sparsely-connected deep neural network based on unfolding projected gradient descent. Incorporating ManNet into the developed deep unfolding framework leads to the ManNet-based FC-HBF scheme. We show that the proposed ManNet can also be applied to SC-HBF designs after determining the connections between the radio frequency chains and antennas. We further develop a simplified version of ManNet, referred to as subManNet, that directly produces the sparse analog precoder for SC-HBF architectures. Both networks are trained with an unsupervised training procedure. Numerical results verify that the proposed ManNet/subManNet-based HBF approaches outperform the conventional model-based and deep unfolded counterparts with very low complexity and a fast run time. For example, in a simulation with $128$ transmit antennas, the proposed scheme attains a slightly higher spectral efficiency than the Riemannian manifold scheme, while running over $1000$ times faster and reducing the complexity by more than a factor of six.

Index Terms:

THz communications, hybrid beamforming, massive MIMO, deep learning, AI, deep unfolding.

I Introduction

Future sixth-generation (6G) wireless networks are expected to realize Tbps single-user data rates to support emerging ultra-high-speed applications, such as mobile holograms, immersive virtual reality, and digital twins [1]. To realize such rapid growth in data traffic and applications, wideband terahertz (THz) massive multiple-input multiple-output (mMIMO) systems have emerged as key enablers for achieving substantial improvements in the system spectral and energy efficiency (SE/EE) [2]. In THz mMIMO transceivers, hybrid beamforming (HBF) can provide a cost- and energy-efficient solution that yields significant multiplexing gains with a limited number of power-hungry radio frequency (RF) chains [3, 4].

As HBF delegates some of the beamforming operations to the analog domain, its design largely depends on the considered hardware and its associated constraints [5]. A candidate implementation of HBF systems realizes analog beamforming via tunable complex gains and phase shifters [6], which can be efficiently designed using quantized vector modulators [7]. While these architectures are highly flexible, they are expected to be very costly when implemented at high frequencies. Another candidate HBF architecture is based on metasurface antennas [8], whose implementation for mMIMO at high frequencies is still an area of active research. Consequently, the most common mMIMO HBF architecture considered to date realizes analog beamforming using adjustable phase shifters [9]. However, optimizing a phase-shifter-based HBF is challenging due to the need for constant modulus constraints on the analog beamforming coefficients and the strong coupling between the analog and digital beamformers. Thus, efficient HBF methods overcoming these challenges have attracted much interest in the literature, with proposed approaches ranging from conventional model-based optimizations to purely data-driven deep learning (DL).

I-A Related Works

HBF designs and optimization usually require cumbersome algorithms such as Riemannian manifold minimization (MO-AltMin) [10] and alternating optimization (AO) [11]. In MO-AltMin, the alternating analog and digital beamformer designs form a nested loop procedure, wherein the former is solved by Riemannian manifold optimization, and the latter is obtained via a least squares (LS) problem. With $N_{\text{t}}$ antennas and $N_{\text{RF}}$ RF chains, AO solves for each of $N_{\text{t}}N_{\text{RF}}$ analog beamforming coefficients in an alternating manner until convergence. Although MO-AltMin and AO offer satisfactory performance, both require nested loops with high complexity and slow convergence, especially for large mMIMO systems. A low-complexity alternative for HBF designs is the orthogonal matching pursuit (OMP) approach [12]. It requires only $N_{\text{RF}}$ iterations to select $N_{\text{RF}}$ analog precoding vectors from a codebook consisting of array response vectors. However, the performance of OMP is usually significantly inferior to the optimum.

While MO-AltMin works for both narrowband and wideband scenarios, the original AO and OMP approaches only apply to narrowband systems. Lee et al. [13] further optimized OMP for orthogonal frequency-division multiplexing (OFDM)-based MIMO systems. In [14], a variant of AO was proposed for wideband MIMO-OFDM systems. It was shown that an analog combiner designed only for the center frequency and optimal frequency-dependent digital combiners can achieve near-optimal performance as long as the bandwidth is narrow or the array’s dimensions are small enough so that the array response remains approximately frequency-non-selective. In wideband THz systems, where the array response becomes frequency-selective and suffers from the so-called beam squint effect [14], the resulting loss can be mitigated by employing true-time-delay (TTD) lines in the analog beamforming architecture [4, 15, 16]. However, the deployment of TTDs requires additional hardware complexity and power consumption. Yuan et al. [17] proposed a wideband HBF scheme with two digital beamformers, in which an additional digital beamformer is introduced to compensate for the performance loss caused by the constant-amplitude hardware constraints and channel non-uniformity across the subcarriers. Li et al. [18] considered an HBF architecture with dynamic antenna subarrays and low-resolution phase shifters and addressed the HBF design with classical block coordinate descent.

Recently, the application of DL to wireless communications problems has attracted significant attention [19, 20, 21, 22], with one of the considered problems being HBF design [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33]. Two typical DL techniques are often applied: purely data-driven DL and hybrid model-based DL [34]. The former relies mainly on the learning capability of deep neural networks (DNNs) [23, 24, 25], convolutional neural networks (CNNs) [26, 27, 28, 29, 35, 36, 37, 38], or deep reinforcement learning [39, 40] to generate hybrid beamformers. For example, [38] designed an mMIMO HBF with a group-of-subarray structure in the low-THz band via both model-based AO and data-driven CNNs. It was shown that while the former can achieve better performance, the latter operates approximately $500$ times faster. Yet, such a purely data-driven DL approach has major limitations due to its resource constraints, high complexity, and black-box nature [20, 41, 42, 43, 44].

Model-based DL encompasses a family of hybrid methodologies for combining domain knowledge with data to realize efficient inference mappings [45]. A leading hybrid methodology is deep unfolding, which leverages DL techniques to improve model-based iterative optimizers in terms of convergence, robustness, and performance [46]. In the context of HBF design, Balevi et al. [30] used deep generative unfolding models to obtain near-optimal hybrid beamformers with reduced feedback and complexity. Luo et al. [47] and Shi et al. [32] proposed deep unfolding HBF solutions based on unfolding AO and iterative gradient descent, respectively.

Most of the aforementioned works focused on HBF design in conventional narrowband systems. In wideband MIMO-OFDM systems, the analog beamformer is typically frequency flat, i.e., a common analog beamforming matrix must serve the entire frequency band. This imposes extra difficulties on the HBF design, and the approaches proposed for narrowband systems are not readily applicable. The work [31] proposed a low-complexity HBF design by unfolding the projected gradient ascent (PGA) optimization with a fixed number of iterations and learning the hyperparameters of the iterative optimizer from the data. Chen et al. [33] proposed a DNN architecture that unfolds the weighted minimum mean square error manifold optimization using fully-connected DNNs to learn the step size in each iteration, leading to faster convergence and improved performance. However, high complexity is still required to update the gradient and the solutions in each iteration. Kang et al. [48] introduced a deep unfolding hybrid beamforming design induced by a stochastic successive convex approximation algorithm. This scheme achieves good HBF performance; however, its highly-parameterized DNN network architecture is complicated, and the use of black-box DNNs results in high complexity. In [49], a DNN model referred to as a multi-generator generative adversarial network (MGGAN) was introduced for HBF design with rank-deficient channels. Similar to [48], the MGGAN architecture is highly complex.

I-B Contributions

In this paper, we propose efficient deep unfolding approaches for the designs of both fully-connected HBF (FC-HBF) and dynamic sub-connected HBF (SC-HBF) architectures. The proposed deep unfolding frameworks are based on unrolling iterations of the MO-AltMin algorithm of [10], and they are thus referred to as ManNet-based HBF. The main idea is to first transform the challenging SE maximization problem into an approximate matrix factorization problem, in which both the analog and digital precoders admit LS formulations. In each iteration, the analog beamformers are produced by a DNN, while the digital beamformers are obtained via closed-form LS solutions. Furthermore, the employed DNN has a low-complexity sparsely-connected structure based on unfolding the projected gradient descent (PGD) algorithm. In this sense, the proposed ManNet-based HBF designs are a two-fold deep unfolding procedure. We summarize our main contributions as follows:

• We propose an unfolding framework for the design of FC-HBF architectures based on unfolding MO-AltMin. Unlike most existing DL-aided FC-HBF designs, the unfolding framework is developed by investigating the matrix factorization problem for HBF design rather than the original SE maximization. Thereby the complicated log-det objective function is transformed into a simpler norm-squared form in which the digital and (vectorized) analog precoders are alternately solved via LS. This significantly simplifies the design and reduces the overall complexity compared to the unfolding methods in [31, 48].

• Based on the unfolded framework, we develop a lightweight DNN architecture called ManNet to estimate the analog beamformer based on PGD. ManNet is a sparsely connected DNN with an explainable architecture and low-complexity operations. Specifically, it can output reliable analog precoding coefficients with only a few layers, each requiring only element-wise multiplications between the input and weight vectors. We also propose an efficient unsupervised training procedure for ManNet. The training strategy offers fast convergence with limited training data and no training labels.

• We then focus on dynamic SC-HBF design. The trained ManNet can be readily applied here. Specifically, we propose a low-complexity scheme to establish the dynamic connections between the RF chains and antennas, and the sparse analog precoding matrix is obtained by matching the channel gains with the output of ManNet. To further reduce the complexity of the SC-HBF design, we develop a simplified version of ManNet, referred to as subManNet, to directly output the sparse analog precoder for SC-HBF. The proposed schemes can also be applied to the fixed SC-HBF architecture.

• We present simulation results demonstrating that the ManNet-based FC-HBF scheme attains better performance in much less time and with much lower computational complexity than the conventional MO-AltMin [10] and AO [11] approaches. In particular, the proposed ManNet and subManNet-aided SC-HBF schemes achieve performance similar to that of FC-HBF, and much better than semidefinite relaxation-based alternating minimization (SDR-AltMin) [10].

I-C Paper Organization and Notation

The rest of the paper is organized as follows. Section II presents the signal and channel models, and the considered design problems. Sections III and IV detail the proposed FC-HBF and SC-HBF designs, respectively. Numerical results are given in Section V, while Section VI concludes the paper.

Throughout the paper, scalars, vectors, and matrices are denoted by lower-case, boldface lower-case, and boldface upper-case letters, respectively, while $[{\mathbf{A}}]_{i,j}$ represents the $(i,j)$-th entry of matrix ${\mathbf{A}}$. We denote by $(\cdot)^{T}$ and $(\cdot)^{H}$ the transpose and the conjugate transpose of a matrix or vector, respectively, and ${\mathbf{A}}^{\dagger}$ is the pseudo-inverse of a matrix ${\mathbf{A}}$. The matrix $\mathrm{diag}\{{\mathbf{a}}_{1},\ldots,{\mathbf{a}}_{N}\}$ is block diagonal with diagonal columns ${\mathbf{a}}_{1},\ldots,{\mathbf{a}}_{N}$. Furthermore, $\left|\cdot\right|$ denotes either the absolute value of a scalar or the cardinality of a set, and $\odot$ represents the Hadamard product. $\mathcal{(C)N}(\mu,\sigma^{2})$ denotes a (complex) normal distribution with mean $\mu$ and variance $\sigma^{2}$, while $\mathcal{U}[a,b]$ denotes a uniform distribution over the range $[a,b]$.

II Signal Model and Problem Formulation

II-A Signal Model

We consider the downlink of a point-to-point wideband mMIMO-OFDM system, where the base station (BS) and the mobile station (MS) are equipped with $N_{\text{t}}$ and $N_{\text{r}}$ antennas, respectively. Let ${\mathbf{s}}[k]\in{\mathbb{C}}^{N_{\text{s}}\times 1}$ denote the $N_{\text{s}}$ -dimensional transmit vector from the BS to the MS on the $k$ -th subcarrier, with $\mathbb{E}\left\{{\mathbf{s}}[k]{\mathbf{s}}[k]^{H}\right\}={\mathbf{I}}_{N_{\text{s}}}$ , $k=1,2,\ldots,K$ , where $K$ is the number of subcarriers. The BS employs a frequency-flat analog precoder ${\mathbf{F}}_{\text{RF}}\in\mathbb{C}^{N_{\text{t}}\times N_{\text{RF}}}$ and a frequency-dependent digital baseband precoder ${\mathbf{F}}_{\text{BB}}[k]\in\mathbb{C}^{N_{\text{RF}}\times N_{\text{s}}}$ , where $N_{\text{RF}}$ is the number of RF chains at the BS, $N_{\text{s}}\leq N_{\text{RF}}\leq N_{\text{t}}$ , and the normalized transmit power constraint at the BS is given as $\left\lVert{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=N_{\text{s}},\forall k$ . To focus on the design of hybrid precoders, we assume that $N_{\text{r}}$ is relatively small so that a fully digital combiner ${\mathbf{V}}[k]\in{\mathbb{C}}^{N_{\text{r}}\times N_{\text{s}}}$ is employed at the MS receiver for the $k$ -th subcarrier. The post-processed signal at the MS is expressed as

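$${\mathbf{y}}[k]=\sqrt{\rho}\,{\mathbf{V}}[k]^{H}{\mathbf{H}}[k]{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]{\mathbf{s}}[k]+{\mathbf{V}}[k]^{H}{\mathbf{n}}[k], \tag{1}$$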

where $\rho$ denotes the average received power, ${\mathbf{n}}[k]\sim\mathcal{CN}(\mathbf{0},\sigma^{2}_{\text{n}}{\mathbf{I}}_{N_{\text{r}}})$ is additive white Gaussian noise (AWGN) at the MS, and ${\mathbf{H}}[k]$ is the channel matrix at the $k$-th subcarrier.

We adopt the extended Saleh-Valenzuela channel model and express ${\mathbf{H}}[k]$ as [10]

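$${\mathbf{H}}[k]=\xi\sum_{p=1}^{P}\alpha_{p}e^{-j2\pi\tau_{p}f_{k}}{\mathbf{a}}_{\text{r}}(\theta^{\text{r}}_{p},\phi^{\text{r}}_{p},f_{k}){\mathbf{a}}_{\text{t}}(\theta^{\text{t}}_{p},\phi^{\text{t}}_{p},f_{k})^{H}. \tag{2}$$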

In (2), $\xi=\sqrt{\frac{N_{\rm r}N_{\rm t}}{P}}$ and $f_{k}=f_{\text{c}}+\frac{\text{BW}(2k-1-K)}{2K}$, where BW and $f_{\text{c}}$ represent the system bandwidth and center frequency; $P$ is the number of propagation paths; $\alpha_{p}$ and $\tau_{p}$ are the complex gain and time-of-arrival (ToA) of the $p$-th path; $\phi^{\text{t}}_{p}$ ($\theta^{\text{t}}_{p}$) and $\phi^{\text{r}}_{p}$ ($\theta^{\text{r}}_{p}$) represent the azimuth (elevation) angles of departure (AoDs) and arrival (AoAs) of the $p$-th path; ${\mathbf{a}}_{\text{t}}\in\mathbb{C}^{N_{\text{t}}\times 1}$ and ${\mathbf{a}}_{\text{r}}\in\mathbb{C}^{N_{\text{r}}\times 1}$ denote the transmit and receive array response vectors, respectively. We assume that the BS is equipped with a uniform planar array (UPA) of size $N_{\text{t}}^{\text{h}}\times N_{\text{t}}^{\text{v}}$, where $N_{\text{t}}^{\text{h}}$ and $N_{\text{t}}^{\text{v}}$ are the numbers of antennas in the horizontal and vertical dimensions, and $N_{\text{t}}^{\text{h}}N_{\text{t}}^{\text{v}}=N_{\text{t}}$. We assume half-wavelength antenna spacing at the BS, and thus, ${\mathbf{a}}_{\text{t}}(\theta^{\text{t}}_{p},\phi^{\text{t}}_{p},f_{k})$ is given as [10]

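$${\mathbf{a}}_{\text{t}}(\theta^{\text{t}}_{p},\phi^{\text{t}}_{p},f_{k})=\frac{1}{\sqrt{N_{\text{t}}}}\Big[1,\dots,e^{j\pi\frac{f_{k}}{f_{\text{c}}}(i_{\text{h}}\sin(\phi^{\text{t}}_{p})\sin(\theta^{\text{t}}_{p})+i_{\text{v}}\cos(\theta^{\text{t}}_{p}))},\dots,e^{j\pi\frac{f_{k}}{f_{\text{c}}}((N_{\text{t}}^{\text{h}}-1)\sin(\phi^{\text{t}}_{p})\sin(\theta^{\text{t}}_{p})+(N_{\text{t}}^{\text{v}}-1)\cos(\theta^{\text{t}}_{p}))}\Big]^{T}, \tag{3}$$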

where $i_{\text{h}}\in[0,N_{\text{t}}^{\text{h}})$ and $i_{\text{v}}\in[0,N_{\text{t}}^{\text{v}})$ denote the antenna indices on the horizontal and vertical dimensions, respectively. The array response vector ${\mathbf{a}}_{\text{r}}(\theta^{\text{r}}_{p},\phi^{\text{r}}_{p},f_{k})$ at the MS is modeled similarly.
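As a rough illustration of the channel model in (2), the following NumPy sketch builds the UPA array response and the per-subcarrier channel matrices. The function names `upa_response` and `wideband_channel` are hypothetical, and several choices are assumptions made only for this example: the element ordering inside the array response, a UPA at the MS as well, $\mathcal{CN}(0,1)$ path gains, delays drawn uniformly from $[0, 20]$ ns, and angles drawn uniformly from $[0, 2\pi)$.

```python
import numpy as np

def upa_response(theta, phi, f_k, f_c, n_h, n_v):
    """Array response of an n_h x n_v half-wavelength UPA at frequency f_k.

    theta is the elevation angle and phi the azimuth angle. The element
    ordering (horizontal index outer, vertical index inner) is an assumption.
    """
    i_h = np.repeat(np.arange(n_h), n_v)  # horizontal index of each element
    i_v = np.tile(np.arange(n_v), n_h)    # vertical index of each element
    phase = np.pi * (f_k / f_c) * (
        i_h * np.sin(phi) * np.sin(theta) + i_v * np.cos(theta)
    )
    return np.exp(1j * phase) / np.sqrt(n_h * n_v)

def wideband_channel(K, f_c, bw, P, nt_hv, nr_hv, rng):
    """Return H with shape (K, N_r, N_t), one channel matrix per subcarrier."""
    n_t, n_r = nt_hv[0] * nt_hv[1], nr_hv[0] * nr_hv[1]
    # Assumed path statistics (not specified in the text above).
    alpha = (rng.standard_normal(P) + 1j * rng.standard_normal(P)) / np.sqrt(2)
    tau = rng.uniform(0.0, 20e-9, P)
    ang = rng.uniform(0.0, 2 * np.pi, (4, P))  # theta_t, phi_t, theta_r, phi_r
    H = np.zeros((K, n_r, n_t), dtype=complex)
    for k in range(1, K + 1):
        f_k = f_c + bw * (2 * k - 1 - K) / (2 * K)  # subcarrier frequency
        for p in range(P):
            a_t = upa_response(ang[0, p], ang[1, p], f_k, f_c, *nt_hv)
            a_r = upa_response(ang[2, p], ang[3, p], f_k, f_c, *nr_hv)
            gain = alpha[p] * np.exp(-2j * np.pi * tau[p] * f_k)
            H[k - 1] += gain * np.outer(a_r, a_t.conj())
    return np.sqrt(n_r * n_t / P) * H  # xi scaling
```

Because the phase term scales with $f_{k}/f_{\text{c}}$, the array response differs across subcarriers, which is the beam squint effect mentioned in the related work.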

II-B FC-HBF and SC-HBF Architectures

We consider both FC-HBF and SC-HBF phase-shifter-based architectures. In the former, each RF chain is connected to all $N_{\text{t}}$ antennas, requiring a total of $N_{\text{RF}}N_{\text{t}}$ phase shifters. In this case, the analog precoder is constrained as

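$${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{full}}\triangleq\left\{{\mathbf{F}}_{\text{RF}}:[{\mathbf{F}}_{\text{RF}}]_{m,n}=e^{j\zeta_{m,n}},\ \forall m,n\right\}, \tag{4}$$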

where $\zeta_{m,n}$ represents the effect of the phase shifter between the $n$ -th RF chain and the $m$ -th antenna.

In the SC-HBF architecture, each RF chain only connects to a subset of $M\triangleq\frac{N_{\text{t}}}{N_{\text{RF}}}$ antennas to reduce the hardware complexity and power consumption (assuming that $\frac{N_{\text{t}}}{N_{\text{RF}}}$ is an integer for simplicity). Such an analog network requires only $N_{\text{t}}$ phase shifters in total, which is a factor of $N_{\text{RF}}$ lower than FC-HBF. We assume a dynamic sub-connected architecture in which RF chains are connected to non-overlapping subsets of antennas. In this case, the sub-connected analog precoder is constrained as

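$${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{sub}}\triangleq\Big\{{\mathbf{F}}_{\text{RF}}:[{\mathbf{F}}_{\text{RF}}]_{m,n}\in\left\{0,e^{j\zeta_{m,n}}\right\},\ \sum_{m=1}^{N_{\text{t}}}\left|[{\mathbf{F}}_{\text{RF}}]_{m,n}\right|=M,\ \sum_{n=1}^{N_{\text{RF}}}\left|[{\mathbf{F}}_{\text{RF}}]_{m,n}\right|=1,\ \forall m,n\Big\}, \tag{5}$$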

i.e., the $(m,n)$ -th entry of ${\mathbf{F}}_{\text{RF}}$ can be either a non-zero (unit-modulus) coefficient, when the $n$ -th RF chain is connected to the $m$ -th antenna, or zero otherwise. Furthermore, in each row and column of ${\mathbf{F}}_{\text{RF}}$ , there are only a single and $M$ nonzero elements, respectively. Note that the conventional fixed SC-HBF architecture is a special case of the dynamic one, i.e., when the $n$ -th RF chain is connected to $M$ adjacent antennas indexed from $(n-1)M+1$ to $nM$ . In this case, we have ${\mathbf{F}}_{\text{RF}}=\text{blkdiag}\left\{\bar{\mathbf{f}}_{1},\ldots,\bar{\mathbf{f}}_{n},\ldots,\bar{\mathbf{f}}_{N_{\text{RF}}}\right\}$ , where $\bar{\mathbf{f}}_{n}=\left[f_{1,n},\ldots,f_{M,n}\right]^{T}$ , as considered in [10].
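The two feasible sets can be made concrete with a small membership check. The sketch below is illustrative only: `in_A_full`, `in_A_sub`, and `fixed_sub_connected` are hypothetical helper names, and the numerical tolerance is an arbitrary choice.

```python
import numpy as np

def in_A_full(F, tol=1e-9):
    """True if every entry of F has unit modulus (fully-connected set)."""
    return bool(np.all(np.abs(np.abs(F) - 1.0) < tol))

def in_A_sub(F, tol=1e-9):
    """True if F is a valid (dynamic) sub-connected analog precoder.

    Entries are 0 or unit modulus, each column has M = N_t / N_RF nonzeros,
    and each row has exactly one nonzero.
    """
    n_t, n_rf = F.shape
    if n_t % n_rf != 0:
        return False
    mag = np.abs(F)
    entries_ok = np.all((mag < tol) | (np.abs(mag - 1.0) < tol))
    cols_ok = np.allclose(mag.sum(axis=0), n_t // n_rf)
    rows_ok = np.allclose(mag.sum(axis=1), 1.0)
    return bool(entries_ok and cols_ok and rows_ok)

def fixed_sub_connected(phases, n_rf):
    """Block-diagonal F_RF: RF chain n drives antennas (n-1)M+1, ..., nM."""
    n_t = phases.size
    M = n_t // n_rf
    F = np.zeros((n_t, n_rf), dtype=complex)
    for n in range(n_rf):
        F[n * M:(n + 1) * M, n] = np.exp(1j * phases[n * M:(n + 1) * M])
    return F
```

Permuting the rows of a fixed sub-connected precoder keeps it inside $\mathcal{A}_{\text{sub}}$, which reflects the statement that the fixed architecture is a special case of the dynamic one.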

Compared to the fixed SC-HBF architecture, the dynamic approach additionally requires $N_{\text{t}}$ switches in the analog precoding network to dynamically configure the connections between the RF chains and the antennas. However, the switches do not significantly impact the total power consumption of the system. The power consumption of a typical switch is 6 times less than that of a phase shifter and $40$ times less than a digital-to-analog converter (DAC) [50, 9]. Furthermore, low-power, low-cost, and high-speed tunable switches can be used [9, 51, 52] in dynamic SC-HBF structures.

II-C Problem Formulation

Based on (1), the average per-subcarrier achievable SE for Gaussian symbols is given by [10]

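$$R=\frac{1}{K}\sum_{k=1}^{K}\log_{2}\det\Big({\mathbf{I}}_{N_{\text{s}}}+\frac{\rho}{\sigma_{\text{n}}^{2}N_{\text{s}}}{\mathbf{V}}[k]^{\dagger}{\mathbf{H}}[k]{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]{\mathbf{F}}_{\text{BB}}[k]^{H}{\mathbf{F}}_{\text{RF}}^{H}{\mathbf{H}}[k]^{H}{\mathbf{V}}[k]\Big). \tag{6}$$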

We aim at designing the precoders and combiners $\{{\mathbf{F}}_{\text{RF}},{\mathbf{F}}_{\text{BB}}[k],{\mathbf{V}}[k]\}$ to maximize $R$ , which is challenging due to the strong coupling among the variables. However, given $\{{\mathbf{F}}_{\text{RF}},{\mathbf{F}}_{\text{BB}}[k]\}$ , the optimal solution for ${\mathbf{V}}[k]$ is the matrix whose columns are the $N_{\text{s}}$ left singular vectors corresponding to the $N_{\text{s}}$ largest singular values of ${\mathbf{H}}[k]{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]$ [53]. Therefore, we focus on the designs of the hybrid precoders $\{{\mathbf{F}}_{\text{RF}},{\mathbf{F}}_{\text{BB}}[k]\}$ in the sequel.

The SE maximizing hybrid precoding design can be approximately achieved via the following optimization [12, 10]:

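$$\underset{{\mathbf{F}}_{\text{RF}},\{{\mathbf{F}}_{\text{BB}}[k]\}_{k=1}^{K}}{\textrm{minimize}}\quad\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F} \tag{7a}$$

$$\textrm{subject to}\quad{\mathbf{F}}_{\text{RF}}\in\mathcal{A}, \tag{7b}$$

$$\left\lVert{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=N_{\text{s}},\ \forall k, \tag{7c}$$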

where ${\mathbf{F}}_{\text{opt}}[k]\in\mathbb{C}^{N_{\text{t}}\times N_{\text{s}}}$ is the unconstrained optimal digital precoder at the $k$ -th subcarrier, whose columns are the $N_{\text{s}}$ right singular vectors corresponding to the $N_{\text{s}}$ largest singular values of ${\mathbf{H}}[k]$ and scaled with water-filling power factors. In (7b), the feasible set $\mathcal{A}$ of the analog precoder can be either $\mathcal{A}_{\text{full}}$ or $\mathcal{A}_{\text{sub}}$ , defined in (4) and (5), respectively, depending on the HBF architecture. This constraint enforces the unit modulus of the analog precoding coefficients and the configuration of the sub-connected analog network. The per-subcarrier transmit power is constrained in (7c).
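The construction of ${\mathbf{F}}_{\text{opt}}[k]$ described above (dominant right singular vectors scaled by water-filling powers) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the effective SNR passed as `snr` (for instance $\rho/(\sigma_{\text{n}}^{2}N_{\text{s}})$) is an assumption, and the total power is normalized to $N_{\text{s}}$ to match the transmit power constraint.

```python
import numpy as np

def water_filling(gains, total_power):
    """Powers p_i >= 0 maximizing sum(log2(1 + gains_i * p_i)).

    `gains` must be positive and sorted in descending order;
    the returned powers sum to `total_power`.
    """
    gains = np.asarray(gains, dtype=float)
    # Find the largest number m of active streams with a valid water level.
    for m in range(gains.size, 0, -1):
        mu = (total_power + np.sum(1.0 / gains[:m])) / m  # water level
        if mu > 1.0 / gains[m - 1]:
            break
    p = np.zeros_like(gains)
    p[:m] = mu - 1.0 / gains[:m]
    return p

def optimal_digital_precoder(H, n_s, snr):
    """Unconstrained precoder for one subcarrier, shape (N_t, N_s)."""
    # NumPy returns singular values in descending order.
    _, s, Vh = np.linalg.svd(H, full_matrices=False)
    V = Vh.conj().T[:, :n_s]                      # dominant right singular vectors
    p = water_filling(snr * s[:n_s] ** 2, n_s)    # per-stream powers, sum = N_s
    return V * np.sqrt(p)                         # scale each column
```

Since the columns of `V` are orthonormal, the squared Frobenius norm of the result equals the sum of the allocated powers, i.e., $N_{\text{s}}$.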

Problem (7) is a non-convex matrix factorization problem, and joint optimization of ${\mathbf{F}}_{\text{RF}}$ and $\{{\mathbf{F}}_{\text{BB}}[k]\}_{k=1}^{K}$ is complicated due to constraint (7b). MO-AltMin [10] and OMP [12] are two conventional model-based algorithms for tackling (7). As discussed earlier, MO-AltMin is highly complex and converges slowly when the system dimensions are large. In contrast, OMP maintains low complexity, but it has unsatisfactory performance. We overcome these deficiencies by proposing an efficient deep unfolding approach next.

III Proposed FC-HBF Design

We first focus on the design of FC-HBF, i.e., the design in (7) with ${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{full}}$ . To this end, we propose a deep unfolding approach referred to as ManNet-based FC-HBF. Its main idea is to unfold the MO-AltMin algorithm, estimating the solution to ${\mathbf{F}}_{\text{RF}}$ using ManNet, an unfolding DNN designed based on PGD optimization.

III-A Proposed ManNet-Based FC-HBF Approach

III-A1 Main Idea

In the proposed approach, we apply the iterative alternating minimization method of [10]. Specifically, in each iteration, we first optimize ${\mathbf{F}}_{\text{RF}}$ with ${\mathbf{F}}_{\text{BB}}[k]$ given and constraint (7c) omitted. Then we design ${\mathbf{F}}_{\text{BB}}[k]$ to meet the constraint given the optimized ${\mathbf{F}}_{\text{RF}}$ . Thus, we first consider the following problem:

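$$\underset{{\mathbf{F}}_{\text{RF}}}{\textrm{minimize}}\quad\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2} \tag{8a}$$

$$\textrm{subject to}\quad{\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{full}}, \tag{8b}$$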

where the quadratic form of the objective function is introduced without affecting the solution. Let us denote

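$$\tilde{{\mathbf{x}}}\triangleq\mathrm{vec}({\mathbf{F}}_{\text{RF}}), \tag{9}$$

$$\tilde{{\mathbf{z}}}[k]\triangleq\mathrm{vec}({\mathbf{F}}_{\text{opt}}[k]), \tag{10}$$

$$\tilde{{\mathbf{B}}}[k]\triangleq{\mathbf{F}}_{\text{BB}}[k]^{T}\otimes{\mathbf{I}}_{N_{\text{t}}}. \tag{11}$$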

Then, the objective function in (8) can be re-expressed as

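$$\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=\sum_{k=1}^{K}\left\lVert\tilde{{\mathbf{z}}}[k]-\tilde{{\mathbf{B}}}[k]\tilde{{\mathbf{x}}}\right\rVert^{2}. \tag{12}$$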

Furthermore, by denoting

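$${\mathbf{x}}\triangleq\begin{bmatrix}\mathfrak{R}(\tilde{{\mathbf{x}}})\\ \mathfrak{I}(\tilde{{\mathbf{x}}})\end{bmatrix}, \tag{13}$$

$${\mathbf{z}}[k]\triangleq\begin{bmatrix}\mathfrak{R}(\tilde{{\mathbf{z}}}[k])\\ \mathfrak{I}(\tilde{{\mathbf{z}}}[k])\end{bmatrix}, \tag{14}$$

$${\mathbf{B}}[k]\triangleq\begin{bmatrix}\mathfrak{R}(\tilde{{\mathbf{B}}}[k])&-\mathfrak{I}(\tilde{{\mathbf{B}}}[k])\\ \mathfrak{I}(\tilde{{\mathbf{B}}}[k])&\mathfrak{R}(\tilde{{\mathbf{B}}}[k])\end{bmatrix}, \tag{15}$$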

with $\mathfrak{R}{\left(\cdot\right)}$ and $\mathfrak{I}{\left(\cdot\right)}$ representing the real and imaginary parts of a complex vector/matrix, respectively, we can write

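$$\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert_{F}^{2}=\sum_{k=1}^{K}\left\lVert{\mathbf{z}}[k]-{\mathbf{B}}[k]{\mathbf{x}}\right\rVert^{2}. \tag{16}$$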

Define the transformation

[TABLE]

which transforms the complex-valued matrix ${\mathbf{F}}_{\text{RF}}$ into the real-valued vector ${\mathbf{x}}$ and vice versa, respectively. With the newly introduced variables, the optimal solution to problem (8) admits the LS form

[TABLE]
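The real-valued reformulation above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes that $\mathcal{V}$ stacks the real part of the column-wise vectorization on top of the imaginary part, and that ${\mathbf{B}}[k]$ is the real-valued expansion of ${\mathbf{F}}_{\text{BB}}[k]^{T}\otimes{\mathbf{I}}_{N_{\text{t}}}$. The function names `to_real_vector`, `to_complex_matrix`, and `real_ls_terms` are hypothetical.

```python
import numpy as np

def to_real_vector(M):
    """Assumed form of V: stack Re(vec(M)) on top of Im(vec(M))."""
    m = M.reshape(-1, order="F")              # column-major vec(.)
    return np.concatenate([m.real, m.imag])

def to_complex_matrix(x, shape):
    """Assumed inverse of V: rebuild the complex matrix from the real vector."""
    half = x.size // 2
    return (x[:half] + 1j * x[half:]).reshape(shape, order="F")

def real_ls_terms(F_opt_k, F_bb_k, n_t):
    """Real-valued (z, B) such that ||F_opt - F_rf F_bb||_F^2 = ||z - B x||^2."""
    A = np.kron(F_bb_k.T, np.eye(n_t))        # vec(F_rf F_bb) = A vec(F_rf)
    B = np.block([[A.real, -A.imag], [A.imag, A.real]])
    return to_real_vector(F_opt_k), B

rng = np.random.default_rng(0)
n_t, n_rf, n_s = 8, 2, 2
F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_t, n_rf)))   # unit-modulus entries
F_bb = rng.standard_normal((n_rf, n_s)) + 1j * rng.standard_normal((n_rf, n_s))
F_opt = rng.standard_normal((n_t, n_s)) + 1j * rng.standard_normal((n_t, n_s))

x = to_real_vector(F_rf)
z, B = real_ls_terms(F_opt, F_bb, n_t)
complex_residual = np.linalg.norm(F_opt - F_rf @ F_bb) ** 2
real_residual = np.linalg.norm(z - B @ x) ** 2
```

Under these assumptions the two residuals coincide, and each row and column of `B` contains at most $2N_{\text{RF}}$ and $2N_{\text{s}}$ nonzero entries, respectively, which is the sparsity exploited in the complexity analysis.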

Based on (18), a deep unfolding DNN of $L$ layers is designed to mimic the PGD algorithm to approximate ${\mathbf{x}}^{\star}$ . Specifically, let ${\mathbf{x}}_{\ell}$ be the output of the $\ell$ -th layer of the DNN. From (18), ${\mathbf{x}}_{\ell}$ can be produced as [54]

[TABLE]

where $\delta_{\ell}$ denotes a step size, $\mathcal{T}_{\ell}(\cdot)$ represents a nonlinear projection operator, and in the last equality we denote $\bar{{\mathbf{z}}}\triangleq\sum_{k=1}^{K}{\mathbf{B}}[k]^{T}{\mathbf{z}}[k]$ and $\bar{{\mathbf{B}}}[k]\triangleq{\mathbf{B}}[k]^{T}{\mathbf{B}}[k],\forall k$ . The relationship in (19) motivates a DNN model to learn ${\mathbf{x}}^{\star}$ wherein the output of a given layer (i.e., ${\mathbf{x}}_{\ell}$ in the $\ell$ -th layer) results from a nonlinear projection applied to the output of the previous layer (i.e., ${\mathbf{x}}_{\ell-1}$ in the $(\ell-1)$ -th layer) and other given information, including $\bar{{\mathbf{z}}}$ and $\{\bar{{\mathbf{B}}}[k]\}$ which is short for $\{\bar{{\mathbf{B}}}[k]\}_{k=1}^{K}$ . The nonlinear projection is performed with trainable parameters, i.e., the weights of the DNN. Applied over multiple layers, the DNN can be structured and trained such that its final output, i.e., ${\mathbf{x}}_{L}$ , will be a good estimate of ${\mathbf{x}}^{\star}$ . In the following, we develop such an efficient DNN architecture referred to as ManNet.
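For reference, the classical (non-learned) PGD recursion that ManNet unfolds can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the projection is taken to be an element-wise normalization onto the unit circle, the step size is fixed to the reciprocal of the largest eigenvalue of $\sum_{k}\bar{{\mathbf{B}}}[k]$, and the data are random placeholders. The names `project_unit_modulus` and `pgd_analog` are hypothetical.

```python
import numpy as np

def project_unit_modulus(x):
    """Project each (real, imaginary) pair of x onto the unit circle."""
    half = x.size // 2
    c = x[:half] + 1j * x[half:]
    c = c / np.maximum(np.abs(c), 1e-12)      # guard against division by zero
    return np.concatenate([c.real, c.imag])

def pgd_analog(z_list, B_list, x0, n_iter=50):
    """Plain PGD on sum_k ||z[k] - B[k] x||^2 with a unit-modulus projection."""
    z_bar = sum(B.T @ z for B, z in zip(B_list, z_list))
    B_bar = sum(B.T @ B for B in B_list)      # summed over subcarriers
    step = 1.0 / np.linalg.norm(B_bar, 2)     # assumed fixed step size
    x = x0
    for _ in range(n_iter):
        x = project_unit_modulus(x + step * (z_bar - B_bar @ x))
    return x

rng = np.random.default_rng(0)
n, K = 6, 4                                   # 6 complex coefficients, 4 subcarriers
B_list = [rng.standard_normal((10, 2 * n)) for _ in range(K)]
z_list = [rng.standard_normal(10) for _ in range(K)]
x0 = project_unit_modulus(rng.standard_normal(2 * n))
x_hat = pgd_analog(z_list, B_list, x0)
```

In the unfolded network, the fixed step size and the hard projection are replaced by trainable element-wise weights and a bounded activation, and the number of iterations becomes the number of layers $L$.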

III-A2 ManNet Architecture

Denote

[TABLE]

and rewrite (19) as

[TABLE]

We propose ManNet as a network of $L$ layers defined by (21) with the objective of learning ${\mathbf{x}}^{\star}$ . It takes ${\mathbf{x}}_{\ell-1}$ and ${\mathbf{u}}_{\ell-1}$ as the inputs of the $\ell$ -th layer, and outputs ${\mathbf{x}}_{\ell}$ as the sum of the outputs of two sub-networks based on the two input vectors ${\mathbf{x}}_{\ell-1}$ and ${\mathbf{u}}_{\ell-1}$ in (21). Importantly, the $n$ -th element of ${\mathbf{x}}_{\ell}$ only depends on the $n$ -th elements of ${\mathbf{x}}_{\ell-1}$ and ${\mathbf{u}}_{\ell-1}$ . Thus, only the nodes (or neurons) at the same vertical level between the layers are connected, making ManNet a sparsely connected DNN. Furthermore, we define the activation function

[TABLE]

where $\sigma(\cdot)$ is the rectified linear unit (ReLU) activation function, and $t$ is a hyperparameter. This guarantees that the amplitudes of the elements of ${\mathbf{x}}_{\ell}$ are in the range $[-1,1],\forall t$ , i.e., $\left|x_{i}\right|\leq 1,i=1,\ldots,2N_{\text{t}}N_{\text{RF}}$ (see footnote 1). As a result, its corresponding complex-valued matrix representation, denoted as ${\mathbf{F}}_{\text{RF}}^{(\ell)}=\mathcal{V}^{-1}({\mathbf{x}}_{\ell})$ , has elements satisfying $|[{\mathbf{F}}_{\text{RF}}^{(\ell)}]_{m,n}|\leq\sqrt{2},\ \forall m,n,\ell$ . As this does not immediately ensure ${\mathbf{F}}_{\text{RF}}^{(\ell)}\in\mathcal{A}_{\text{full}}$ as constrained in (8b), the final output of the DNN ( ${\mathbf{x}}_{L}$ ) is normalized to produce a solution ${\mathbf{F}}_{\text{RF}}=\mathcal{V}^{-1}({\mathbf{x}}_{L})$ satisfying (8b).

Footnote 1: The activation function $\mathrm{tanh}(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$ can also output values in $[-1,1]$ , as seen in Fig. 1. However, its slope is fixed, causing a fixed mapping when applying the activation function. We have found via simulation that by proper fine-tuning of $t$ , $\psi_{t}(x)$ provides better performance than $\mathrm{tanh}(x)$ .

Let $\{{\mathbf{w}}_{\ell,1},{\mathbf{w}}_{\ell,2}\}_{\ell=1}^{L}$ be the weight vectors of the two sub-networks associated with inputs ${\mathbf{x}}_{\ell-1}$ and ${\mathbf{u}}_{\ell-1}$ in the $\ell$ -th layer of ManNet. A detailed network architecture illustrating the operation of each layer of ManNet is shown in Fig. 2(b).
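The layer structure described above can be sketched as a forward pass. This is a hedged illustration, not the trained ManNet: the closed form of $\psi_{t}$ used here (a ReLU-based clipped ramp with slope $1/|t|$), the definition ${\mathbf{u}}_{\ell-1}=\bar{{\mathbf{z}}}-\sum_{k}\bar{{\mathbf{B}}}[k]{\mathbf{x}}_{\ell-1}$, and the random weights are all assumptions, and `psi` and `mannet_forward` are hypothetical names.

```python
import numpy as np

def psi(x, t):
    """Assumed activation: ReLU-based ramp clipped to [-1, 1] with slope 1/|t|."""
    return -1.0 + (np.maximum(x + t, 0.0) - np.maximum(x - t, 0.0)) / abs(t)

def mannet_forward(x0, z_bar, B_bar, w1, w2, t=0.5):
    """Run L sparsely connected layers, then normalize to unit modulus."""
    x = x0
    for w1_l, w2_l in zip(w1, w2):            # one (w1, w2) pair per layer
        u = z_bar - B_bar @ x                 # assumed gradient-like input u_{l-1}
        x = psi(w1_l * x + w2_l * u, t)       # element-wise weights only
    half = x.size // 2
    c = x[:half] + 1j * x[half:]
    return c / np.maximum(np.abs(c), 1e-12)   # final unit-modulus normalization

rng = np.random.default_rng(0)
n, L = 6, 4                                   # 6 complex coefficients, 4 layers
M = rng.standard_normal((20, 2 * n))
B_bar, z_bar = M.T @ M, M.T @ rng.standard_normal(20)
w1 = 1.0 + rng.normal(0.0, 0.1, (L, 2 * n))   # untrained, illustrative weights
w2 = rng.normal(0.0, 0.1, (L, 2 * n))
f_rf = mannet_forward(rng.uniform(-1, 1, 2 * n), z_bar, B_bar, w1, w2)
```

Because each weight vector multiplies its input element by element, a layer costs only a few vector operations beyond the sparse product that forms `u`, which is what makes the network lightweight.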

III-A3 Training ManNet

We employ an unsupervised training approach for ManNet with the loss function

[TABLE]

which sums the total weighted objective values of all $L$ layers. The DNN is trained to optimize the parameter set $\{{\mathbf{w}}_{\ell,1},{\mathbf{w}}_{\ell,2}\}_{\ell=1}^{L}$ such that $\mathcal{L}\left(\{{\mathbf{w}}_{\ell,1},{\mathbf{w}}_{\ell,2}\}_{\ell=1}^{L}\right)$ is minimized, which also directly minimizes the objective function in (18) at the network output ${\mathbf{x}}_{\ell}={\mathbf{x}}_{L}$ . We note here that, otherwise, if supervised training were used, it would require the implementation of a conventional high-complexity HBF scheme to obtain the training labels, i.e., the analog precoding coefficients. This would dramatically increase the training complexity. Because optimal solutions to obtain the labels are unavailable, employing sub-optimal solutions for supervised training may limit the performance of ManNet.
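A layer-wise unsupervised loss of this kind can be evaluated as in the sketch below. The $\log(\ell+1)$ layer weights are an illustrative assumption (the exact weights are those of (23)), and the function name `layerwise_loss` is hypothetical. The sketch only evaluates the loss; actual training would compute it in an automatic-differentiation framework so that gradients reach the layer weights.

```python
import numpy as np

def layerwise_loss(x_layers, z_list, B_list):
    """Weighted sum over layers of sum_k ||z[k] - B[k] x_l||^2.

    The log(l + 1) layer weights are an illustrative assumption.
    """
    loss = 0.0
    for l, x_l in enumerate(x_layers, start=1):
        objective = sum(np.sum((z - B @ x_l) ** 2) for B, z in zip(B_list, z_list))
        loss += np.log(l + 1) * objective
    return loss

rng = np.random.default_rng(0)
x_true = rng.uniform(-1, 1, 8)
B_list = [rng.standard_normal((12, 8)) for _ in range(3)]
z_list = [B @ x_true for B in B_list]         # labels are never needed, only z and B

loss_at_solution = layerwise_loss([x_true] * 4, z_list, B_list)
loss_perturbed = layerwise_loss([x_true + 0.1] * 4, z_list, B_list)
```

No precomputed analog precoders appear anywhere in the loss, which is the sense in which the training is unsupervised.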

In Algorithm 1, we summarize the ManNet training process using a training data set $\mathcal{D}$ . To initialize the training, the weight vectors are first randomly generated from the distribution $\mathcal{N}(0,0.01)$ , and an initial learning rate is set. Then, ManNet is trained over $\mathcal{E}$ epochs, each using $\mathcal{B}$ batches $\{\mathcal{H}^{(b)}\}_{b=1}^{\mathcal{B}}$ , where $\mathcal{H}^{(b)}=\left\{\{{\mathbf{H}}[k]\}_{1},\ldots,\{{\mathbf{H}}[k]\}_{\left|\mathcal{H}^{(b)}\right|}\right\}$ , and $\left|\mathcal{H}^{(b)}\right|$ denotes the training batch size. For the $b$ -th batch, we randomly generate ${\mathbf{F}}_{\text{RF}}^{(b,0)}$ , and $\{{\mathbf{F}}_{\text{BB}}[k]\}^{(b,0)}$ is obtained via the LS solution

[TABLE]

where ${\mathbf{F}}_{\text{opt}}[k]^{(b)}$ is the optimal fully digital precoder for the channels at the $k$ -th subcarrier in $\mathcal{H}^{(b)}$ , and $\mathbf{X}^{(b,i)}$ denotes the data $\mathbf{X}$ in the $b$ -th batch of the $i$ -th iteration. From step 6, the iterative process of optimizing the ManNet weights is performed. Specifically, in the $i$ -th iteration, for given ${\mathbf{F}}_{\text{RF}}^{(b,i)}$ and $\{{\mathbf{F}}_{\text{BB}}[k]\}^{(b,i)}$ , the real-valued ${\mathbf{x}}^{(b,i)}$ , $\{{\mathbf{z}}[k]^{(b,i)}\}$ , and $\{{\mathbf{B}}[k]^{(b,i)}\}$ are constructed based on (9)–(15) in step 7, allowing computation of $\bar{{\mathbf{z}}}^{(b,i)}$ and $\{\bar{{\mathbf{B}}}[k]^{(b,i)}\}$ in steps 8 and 9, respectively. Steps 10–16 update $\hat{{\mathbf{x}}}_{\ell}^{(b,i)}$ and the loss value, which is then used in an optimizer to update the weights in step 18. It is seen that the training for each data batch is an iterative process over $\mathcal{I}_{\text{net}}^{\text{train}}$ iterations. After each iteration, ${\mathbf{F}}_{\text{RF}}^{(b,i)}$ and $\{{\mathbf{F}}_{\text{BB}}[k]\}^{(b,i)}$ are updated and utilized for the next set of training iterations until $\mathcal{I}_{\text{net}}^{\text{train}}$ iterations are completed. This iterative approach is efficient in reducing the amount of training data and accelerating the convergence, as we empirically show in Section V.

III-B Overall ManNet-Based FC-HBF Algorithm

Once the offline training process is completed, ManNet with the trained weight vectors is readily applied to online FC-HBF design. We refer to this approach as ManNet-based FC-HBF, and it is summarized in Algorithm 2. Specifically, we generate the initial analog precoder and compute the digital one in step 1. From step 2, the unfolding HBF design is performed over $\mathcal{I}_{\text{net}}$ iterations. In step 3, ${\mathbf{x}}$ , $\{{\mathbf{z}}[k]\}$ , and $\{{\mathbf{B}}[k]\}$ are obtained; they are then used to compute $\bar{{\mathbf{z}}}$ and $\{\bar{{\mathbf{B}}}[k]\}$ in steps 4 and 5, respectively. After that, ManNet iteratively executes steps 6–10 to construct the outputs of its layers. Note that only element-wise multiplications between the weight and input vectors are required, as seen in step 8 and Fig. 2. The final output of ManNet, i.e., ${\mathbf{x}}_{L}$ , is reconstructed as the feasible solution to ${\mathbf{F}}_{\text{RF}}$ in step 11, and the ${\mathbf{F}}_{\text{BB}}[k]$ are updated via LS, i.e.,

[TABLE]
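A minimal sketch of such an LS update for one subcarrier is given below. It assumes the standard pseudo-inverse solution followed by a power normalization to $\lVert{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\rVert_{\mathcal{F}}^{2}=N_{\text{s}}$; the exact normalization is the one stated in (25) and (7c), and `ls_digital_precoder` is a hypothetical name.

```python
import numpy as np

def ls_digital_precoder(F_rf, F_opt_k, n_streams):
    """LS fit of F_opt[k] by F_rf @ F_bb, followed by an assumed power scaling."""
    F_bb = np.linalg.pinv(F_rf) @ F_opt_k     # least-squares solution
    # Assumed normalization: ||F_rf F_bb||_F^2 = N_s.
    return F_bb * np.sqrt(n_streams) / np.linalg.norm(F_rf @ F_bb)

rng = np.random.default_rng(0)
n_t, n_rf, n_s = 16, 2, 2
F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_t, n_rf)))   # unit-modulus entries
F_opt = rng.standard_normal((n_t, n_s)) + 1j * rng.standard_normal((n_t, n_s))
F_bb = ls_digital_precoder(F_rf, F_opt, n_s)
```

The pseudo-inverse only involves an $N_{\text{RF}}\times N_{\text{RF}}$ inversion, so this step stays cheap even when $N_{\text{t}}$ is large.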

The solutions for ${\mathbf{F}}_{\text{RF}}$ and ${\mathbf{F}}_{\text{BB}}[k]$ are then utilized for the next iteration until $\mathcal{I}_{\text{net}}$ iterations are completed. Finally, with ${\mathbf{F}}_{\text{RF}}$ obtained, the optimal digital precoder directly maximizing the SE in (6) is obtained by solving the problem

[TABLE]

where

[TABLE]

$\tilde{{\mathbf{H}}}\triangleq{\mathbf{H}}{\mathbf{F}}_{\text{RF}}$ , and ${\mathbf{Q}}\triangleq{\mathbf{F}}_{\text{RF}}^{H}{\mathbf{F}}_{\text{RF}}$ . This problem has a well-known water-filling solution:

[TABLE]

where the columns of $\tilde{{\mathbf{U}}}$ are taken from the right singular vectors corresponding to the $N_{\text{s}}$ largest singular values of $\tilde{{\mathbf{H}}}{\mathbf{Q}}^{-\frac{1}{2}}$ , and $\tilde{\boldsymbol{\Gamma}}$ is a diagonal matrix whose elements are defined by the power allocated to the $N_{\text{s}}$ data streams [11]. In Algorithm 2, the final solution to $\{{\mathbf{F}}_{\text{BB}}[k]\}$ is obtained based on (27) in the last iteration, as shown in step 12. We illustrate the entire proposed deep unfolding framework of the ManNet-based FC-HBF design in Fig. 2(a).
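The water-filling step can be sketched for a single subcarrier as follows. This is an illustration under assumptions, not the paper's code: the stream gains are scaled by an assumed factor `snr / n_streams`, the total power is set to $N_{\text{s}}$, and `water_filling` and `digital_precoder` are hypothetical helper names.

```python
import numpy as np

def water_filling(gains, total_power):
    """Powers maximizing sum log2(1 + gains_i p_i) subject to sum p_i = total_power."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]           # strongest stream first
    g = gains[order]
    powers = np.zeros_like(g)
    for n in range(g.size, 0, -1):            # try the n strongest streams
        level = (total_power + np.sum(1.0 / g[:n])) / n
        p = level - 1.0 / g[:n]
        if p[-1] > 0:                         # weakest active stream still gets power
            powers[:n] = p
            break
    out = np.zeros_like(g)
    out[order] = powers
    return out

def digital_precoder(H, F_rf, n_streams, snr=1.0):
    """F_bb = Q^{-1/2} U Gamma for one subcarrier (gain scaling is an assumption)."""
    Q = F_rf.conj().T @ F_rf
    w, V = np.linalg.eigh(Q)
    Q_inv_sqrt = (V * w ** -0.5) @ V.conj().T
    _, s, Vh = np.linalg.svd(H @ F_rf @ Q_inv_sqrt, full_matrices=False)
    U = Vh.conj().T[:, :n_streams]            # right singular vectors
    p = water_filling(snr / n_streams * s[:n_streams] ** 2, total_power=n_streams)
    return Q_inv_sqrt @ (U * np.sqrt(p))

rng = np.random.default_rng(0)
n_t, n_r, n_rf, n_s = 16, 2, 2, 2
H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_t, n_rf)))
F_bb = digital_precoder(H, F_rf, n_s)
```

Multiplying by ${\mathbf{Q}}^{-\frac{1}{2}}$ whitens the analog precoder, so the transmit power $\lVert{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}\rVert_{\mathcal{F}}^{2}$ equals the sum of the allocated stream powers.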

We note that the modular architecture of our unfolded network allows the numbers of iterations in the training and online application phases of ManNet, i.e., $\mathcal{I}_{\text{net}}^{\text{train}}$ and $\mathcal{I}_{\text{net}}$ in Algorithms 1 and 2, respectively, to differ. In particular, we observed that during training, where the goal is to set the weights of ManNet, reliable learning can be achieved with just a few iterations, e.g., $\mathcal{I}_{\text{net}}^{\text{train}}=3$ , which are also enough for fast convergence. During inference, when the goal is to set the hybrid precoders, the choice of $\mathcal{I}_{\text{net}}$ balances the performance-complexity tradeoff: while the performance of the ManNet-based FC-HBF scheme improves with $\mathcal{I}_{\text{net}}$ , its computational complexity increases linearly with $\mathcal{I}_{\text{net}}$ , as will be shown next.

III-C Complexity Analysis

We herein analyze the computational complexity of the proposed ManNet-based FC-HBF scheme in Algorithm 2. It is observed from (11) and (15) that ${\mathbf{B}}[k]$ is a sparse matrix, in which only $2N_{\text{RF}}$ and $2N_{\text{s}}$ (out of $2N_{\text{t}}N_{\text{RF}}$ and $2N_{\text{t}}N_{\text{s}}$ ) elements in each row and column, respectively, are nonzero real-valued numbers. Thus, the complexity for computing $\bar{{\mathbf{z}}}$ and $\{\bar{{\mathbf{B}}}[k]\}$ in steps 4 and 5 is only $\mathcal{O}(KN_{\text{s}}N_{\text{RF}})$ and $\mathcal{O}(KN_{\text{RF}}^{2}N_{\text{s}})$ , respectively. Furthermore, $\bar{{\mathbf{B}}}[k]$ has only $2N_{\text{RF}}$ nonzero elements in each row and column, and hence step 7 requires a complexity of $\mathcal{O}(N_{\text{t}}+2KN_{\text{s}}N_{\text{RF}})$ . The weighting in step 8 performs only element-wise vector multiplication/addition, which has a complexity of $3\mathcal{O}(N_{\text{t}}N_{\text{RF}})$ . In step 12, obtaining $\{{\mathbf{F}}_{\text{BB}}[k]\}$ with (25) has a complexity of $\mathcal{O}(N_{\text{t}}KN_{\text{RF}}^{2})$ , while the complexity of (27) is $2\mathcal{O}(N_{\text{t}}KN_{\text{RF}})$ . As a result, the total complexity of Algorithm 2 can be approximated as

[TABLE]

Compared to MO-AltMin [10], AO [11, 41], and OMP [12], the proposed ManNet-based FC-HBF scheme has low complexity. These approaches require complexities of

[TABLE]

respectively, where $\mathcal{I}_{\text{MO}}^{\text{in}}$ , $\mathcal{I}_{\text{MO}}^{\text{out}}$ , and $\mathcal{I}_{\text{AO}}$ denote the number of inner and outer iterations for MO-AltMin and the number of iterations for AO, respectively. The number of iterations for the analog precoding designs in these schemes is $\mathcal{I}_{\text{MO}}^{\text{out}}\mathcal{I}_{\text{MO}}^{\text{in}}$ and $N_{\text{t}}N_{\text{RF}}\mathcal{I}_{\text{AO}}$ respectively, while that of the proposed ManNet-based design is only $\mathcal{I}_{\text{net}}L$ . In general, both $\mathcal{I}_{\text{net}}$ and $L$ are of the same order as $N_{\text{RF}}$ , and thus, $\mathcal{I}_{\text{net}}L\ll N_{\text{t}}N_{\text{RF}}\mathcal{I}_{\text{AO}}$ and $\mathcal{I}_{\text{net}}L\ll\mathcal{I}_{\text{MO}}^{\text{in}}\mathcal{I}_{\text{MO}}^{\text{out}}$ . For example, in a simulation with $N_{\text{t}}=128$ , $N_{\text{r}}=N_{\text{RF}}=N_{\text{s}}=2$ , and $K=128$ , we found that $\mathcal{I}_{\text{net}}=10$ and $L=7$ are sufficient for ManNet-based FC-HBF to achieve satisfactory performance, whereas AO and MO-AltMin require up to $N_{\text{t}}N_{\text{RF}}\mathcal{I}_{\text{AO}}=250$ and $\mathcal{I}_{\text{MO}}^{\text{out}}\mathcal{I}_{\text{MO}}^{\text{in}}=500$ iterations to converge, respectively (this will be shown in Section V, Fig. 4). Therefore, the proposed algorithm performs much faster than MO-AltMin, and its computational complexity is considerably lower than MO-AltMin and AO, and comparable with that of OMP.

IV Proposed SC-HBF Designs

Next, we present the deep unfolding based dynamic SC-HBF design. As the fixed SC-HBF architecture is a special case of the dynamic one, below we present the general solution to the latter. We first consider the following problem:

[TABLE]

Compared to the FC-HBF design in (8), problem (29) inherits the nonconvexity due to the unit-modulus constraint of the nonzero analog precoding coefficients. Furthermore, unlike the cases of FC-HBF and fixed SC-HBF, the connections between the RF chains and antennas are also design variables in this problem. The joint optimization of the RF chain-antenna connections, ${\mathbf{F}}_{\text{RF}}$ , and ${\mathbf{F}}_{\text{BB}}[k]$ is challenging. Herein we propose efficient algorithms to solve (29) with the main idea being to decouple the design variables.

IV-A ManNet-based Heuristic SC-HBF Design

Let ${\mathbf{C}}\in\mathbb{N}^{N_{\text{t}}\times N_{\text{RF}}}$ denote the mapping matrix defining the connections between the $N_{\text{RF}}$ RF chains and $N_{\text{t}}$ antennas such that

[TABLE]

With the introduction of variable ${\mathbf{C}}$ , the dynamic SC-HBF optimization can be rewritten as

[TABLE]

Note that in this problem, the sub-connected structure constraint on the analog precoder, i.e., (29b), has been relaxed, as seen in (33b). This efficiently decouples the designs of the RF chain/antenna connections and the analog precoder. Because ${\mathbf{C}}$ is a matrix of binary entries, its optimal solution could be found by exhaustive search over all possibilities, but with a prohibitive complexity (exponential in $N_{\text{t}}N_{\text{RF}}$ ). To avoid this, we investigate the achievable SE of the analog precoders given as $R_{\text{RF}}=\frac{1}{K}\sum_{k=1}^{K}R_{\text{RF},k}$ , where

[TABLE]

It is observed that for a given ${\mathbf{H}}[k]$ , to achieve the highest signal-to-noise ratio (SNR), ${\mathbf{C}}$ should be designed to match the nonzero entries in ${\mathbf{F}}_{\text{RF}}$ with the “best” coefficients of ${\mathbf{H}}[k]$ , i.e., those with the largest absolute values. Based on this observation, we propose Algorithm 3 to determine ${\mathbf{C}}$ for any ${\mathbf{H}}[k]$ . Furthermore, because of the relaxation in (33b), ManNet can be used to produce $\tilde{{\mathbf{F}}}_{\text{RF}}\in\mathcal{A}_{\text{full}}$ . Then, for each ${\mathbf{H}}[\tilde{k}]$ , with $\tilde{k}\in\tilde{\mathcal{K}}\subseteq\{1,2,\ldots,K\}$ , ${\mathbf{C}}$ is determined using Algorithm 3, and the ${\mathbf{F}}_{\text{BB}}[k]$ are found using (27). The final solutions for ${\mathbf{F}}_{\text{RF}}$ and $\{{\mathbf{F}}_{\text{BB}}[k]\}$ are those that provide the best performance, i.e., the largest SE. This heuristic ManNet-based SC-HBF approach is summarized in Algorithm 4.

We note that although the proposed ManNet-based SC-HBF scheme can avoid an exhaustive search for ${\mathbf{C}}$ for each channel ${\mathbf{H}}[k]$ , it still requires $|\tilde{\mathcal{K}}|$ iterations to obtain ${\mathbf{F}}_{\text{RF}}^{(\tilde{k})}$ and $\{{\mathbf{F}}_{\text{BB}}[k]\}^{(\tilde{k})},\ (\tilde{k}\in\tilde{\mathcal{K}})$ . We will show later that such an iterative process yields very satisfactory performance for SC-HBF, at the expense of increased complexity and run time.

IV-B Low-Complexity subManNet-based SC-HBF

Here we propose a computationally efficient SC-HBF design to avoid the iterative procedure as well as the extra complexity to produce $\tilde{{\mathbf{F}}}_{\text{RF}}\in\mathcal{A}_{\text{full}}$ , as done in Algorithm 4. This can be achieved if a good channel is chosen in advance to design ${\mathbf{C}}$ , and if the employed DNN only generates the nonzero coefficients of ${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{sub}}$ . These assumptions motivate a subcarrier selection scheme and the design of subManNet, a simplified version of ManNet proposed below.

IV-B1 Subcarrier Selection

First, we observe from (34) that the transmissions via different subcarriers have different contributions to the total achievable SE. Specifically, let $R_{\text{RF},k^{\star}}$ be the maximum SE of all the subcarriers, i.e., $R_{\text{RF},k^{\star}}=\max\{R_{\text{RF},1},\ldots,R_{\text{RF},K}\}$ . Then, $R_{\text{RF},k^{\star}}$ has the most significant contribution to $R_{\text{RF}}$ . On the other hand, for any given ${\mathbf{F}}_{\text{RF}}\in\mathcal{A}_{\text{sub}}$ , the ${\mathbf{F}}_{\text{BB}}[k]$ can be optimally found using the closed-form solution in (27). These observations motivate us to design ${\mathbf{C}}$ to maximize $R_{\text{RF},k^{\star}}=\log_{2}\text{det}({\mathbf{I}}_{N_{\text{r}}}+\frac{\rho}{\sigma^{2}_{\text{n}}N_{\text{s}}}{\mathbf{H}}[k^{\star}]({\mathbf{C}}\odot{\mathbf{F}}_{\text{RF}})({\mathbf{C}}\odot{\mathbf{F}}_{\text{RF}})^{H}{\mathbf{H}}[k^{\star}]^{H})$ . Here, because of the unit-modulus constraints on the nonzero elements of ${\mathbf{F}}_{\text{RF}}$ , subcarrier $k^{\star}$ is chosen such that the channel ${\mathbf{H}}[k^{\star}]$ has the largest Frobenius norm among all the channels. Thus, ${\mathbf{C}}$ is determined based on ${\mathbf{H}}[k^{\star}]$ using Algorithm 3.
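The subcarrier selection rule amounts to a single norm computation per subcarrier. A minimal sketch, assuming the channels are stored as an array of shape $(K, N_{\text{r}}, N_{\text{t}})$ and using the hypothetical function name `select_subcarrier` (indices are 0-based in the code):

```python
import numpy as np

def select_subcarrier(H):
    """Return the index k* of the channel with the largest Frobenius norm.

    H is assumed to have shape (K, N_r, N_t).
    """
    return int(np.argmax(np.linalg.norm(H, axis=(1, 2))))

K, n_r, n_t = 4, 2, 8
scales = np.array([1.0, 2.0, 5.0, 3.0])       # toy per-subcarrier channel strengths
H = scales[:, None, None] * np.ones((K, n_r, n_t), dtype=complex)
k_star = select_subcarrier(H)
```

In this toy example the third subcarrier is the strongest, so `k_star` is 2.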

IV-B2 subManNet-based SC-HBF

Once ${\mathbf{C}}$ is determined, let

[TABLE]

where $\mathcal{V}$ is defined in (17). By similar transformations as in (9)–(16), we can rewrite the objective of problem (29) as

[TABLE]

Problem (29) is then transformed to

[TABLE]

This motivates us to specialize ManNet for SC-HBF design.

Specifically, we propose subManNet to learn and output ${\mathbf{x}}^{\star}$ in (37). In subManNet, the activation function is set to

[TABLE]

where $\psi_{t}(\cdot)$ is defined in (22) and ${\mathbf{u}}_{\ell-1}$ is modified as

[TABLE]

As a result, the $n$ -th nodes in both the sub-networks associated with input vectors ${\mathbf{x}}_{\ell-1}$ and $\tilde{{\mathbf{u}}}_{\ell-1}$ do not require any computations if $c_{n}=0$ . In other words, subManNet produces the output based on the predetermined RF chain/antenna connections specified in ${\mathbf{C}}$ . The offline training and online application of subManNet can be performed similarly to ManNet, except for the aforementioned modifications. We omit the detailed training process here but summarize the proposed subManNet-based SC-HBF design in Algorithm 5. Its first step is to design the mapping matrix ${\mathbf{C}}$ for the best channel ${\mathbf{H}}[k^{\star}]$ , and the remaining process is similar to Algorithm 2, except for the pre-processing of ${\mathbf{u}}_{\ell-1}$ . We outline the structure of subManNet in Fig. 2(c).
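The effect of the mask on one layer can be sketched as below. The sketch makes several assumptions: the activation in (38) is taken to be $\psi_{t}$ applied only where $c_{n}=1$, $\tilde{{\mathbf{u}}}_{\ell-1}$ is taken to be the masked ${\mathbf{u}}_{\ell-1}$, the closed form of $\psi_{t}$ is the ReLU-based ramp, and the fixed block-diagonal mapping is used for ${\mathbf{C}}$ purely for illustration. The names `psi` and `submannet_layer` are hypothetical.

```python
import numpy as np

def psi(x, t):
    """Assumed activation: ReLU-based ramp clipped to [-1, 1] with slope 1/|t|."""
    return -1.0 + (np.maximum(x + t, 0.0) - np.maximum(x - t, 0.0)) / abs(t)

def submannet_layer(x_prev, u_prev, w1, w2, c, t=0.5):
    """One masked layer: entries with c_n = 0 are skipped and stay zero."""
    x_next = np.zeros_like(x_prev)
    active = c > 0
    u_tilde = u_prev[active]                  # only the connected entries of u
    x_next[active] = psi(w1[active] * x_prev[active] + w2[active] * u_tilde, t)
    return x_next

n_t, n_rf = 8, 2
C = np.kron(np.eye(n_rf), np.ones((n_t // n_rf, 1)))   # fixed sub-connected mapping
c = np.tile(C.reshape(-1, order="F"), 2)               # mask for real and imaginary halves

rng = np.random.default_rng(0)
x_prev = c * rng.uniform(-1, 1, c.size)
u_prev = rng.standard_normal(c.size)
w1 = 1.0 + rng.normal(0.0, 0.1, c.size)       # untrained, illustrative weights
w2 = rng.normal(0.0, 0.1, c.size)
x_next = submannet_layer(x_prev, u_prev, w1, w2, c)
```

Only $2N_{\text{t}}$ of the $2N_{\text{t}}N_{\text{RF}}$ entries are active here, which mirrors the factor-of-$N_{\text{RF}}$ saving relative to ManNet.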

IV-C Complexity Analysis

In Algorithm 4, each iteration is performed with a complexity of $\mathcal{O}(2N_{\text{t}}N_{\text{r}}N_{\text{RF}})$ . This is mainly to solve $\{{\mathbf{F}}_{\text{BB}}[k]\}^{(\tilde{k})}$ with (27), while steps 3 and 4 require very few computations. Thus, we approximate the total complexity of Algorithm 4 as

[TABLE]

On the other hand, subManNet offers a complexity reduction by a factor of $N_{\text{RF}}$ compared to ManNet. This is consistent with the requirement of $N_{\text{RF}}$ times fewer phase shifters in the sub-connected architecture. Thus, the overall complexity of the subManNet-based SC-HBF scheme in Algorithm 5 is

[TABLE]

based on the complexity analysis of the ManNet-based FC-HBF scheme in Section III-C. In particular, subManNet inherits the fast convergence and low complexity of ManNet, i.e., it only requires small $\mathcal{I}_{\text{net}}$ and $L$ to achieve good performance. SDR-AltMin [10] requires complexities of $\mathcal{O}(KN_{\text{t}}N_{\text{s}})$ and $\mathcal{O}(KN_{\text{s}}^{3}N_{\text{RF}}^{3})$ to obtain the analog and digital precoders, respectively, in each iteration. Thus, its total complexity is $\mathcal{C}_{\textrm{SDR-AltMin}}=\mathcal{I}_{\text{SDR}}\mathcal{O}(N_{\text{t}}KN_{\text{s}}+KN_{\text{s}}^{3}N_{\text{RF}}^{3})$ , where $\mathcal{I}_{\text{SDR}}$ is the number of iterations for alternating updates of ${\mathbf{F}}_{\text{RF}}$ and ${\mathbf{F}}_{\text{BB}}[k]$ . Our simulations will show that the proposed design also performs better and much faster than SDR-AltMin.

Based on the fact that $N_{\text{t}},K\gg N_{\text{RF}},N_{\text{s}},N_{\text{r}},L$ , we present the approximate complexities of the discussed approaches in Table I to facilitate complexity comparisons. It can be seen that the proposed deep unfolding schemes, OMP, and SDR-AltMin have comparable complexities, which are all much lower than those of AO and MO-AltMin. In particular, the complexity of AO increases exponentially with $N_{\text{t}}$ .

V Simulation Results

In this section, we provide numerical results to demonstrate the performance of the proposed deep unfolding solutions for FC-HBF and SC-HBF designs. We first detail the simulation setup and benchmarks, after which we discuss the results in terms of SE and complexity.

V-A Simulation Setup and Training of DNNs

We assume scenarios with $N_{\text{t}}=\{16,32,64,128\}$ , $K=128$ , and $N_{\text{r}}=N_{\text{RF}}=N_{\text{s}}=2$ . The channel realizations are generated based on (2) with $P=4$ , $\phi^{\text{t}}_{p},\phi^{\text{r}}_{p}\sim\mathcal{U}[0^{\circ},360^{\circ})$ , $\theta^{\text{t}}_{p},\theta^{\text{r}}_{p}\sim\mathcal{U}[-90^{\circ},90^{\circ}]$ , $\alpha_{p}\sim\mathcal{CN}(0,1)$ [10], and $\tau_{p}\sim\mathcal{U}[0,\tau_{\max}]$ , where $\tau_{\max}=QT_{\text{s}}$ with $T_{\text{s}}$ being the sampling period and $Q$ being the cyclic prefix length, which is set to $\frac{K}{4}$ similar to IEEE 802.11ad [55, 56]. The center frequency and bandwidth are set to $f_{\text{c}}=300$ GHz and BW $=30$ GHz, respectively. ManNet and subManNet are implemented using Python with the PyTorch library and a Tesla V100-SXM2 processor. For the training phase, a learning rate of $0.0001$ is used with the Adam optimizer, and we set $L=\{4,5,6,7\}$ and $|\mathcal{D}|=\{400,500,600,700\}$ for $N_{\text{t}}=\{16,32,64,128\}$ , respectively. The SNR is defined as SNR $=\rho/\sigma_{\text{n}}^{2}$ . The results are averaged over $100$ iterations.
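The random channel parameters listed above can be drawn as in the following sketch. It only samples the per-path parameters and does not build ${\mathbf{H}}[k]$ , since that requires the array response model in (2). The relation $T_{\text{s}}=1/\text{BW}$ is an assumption made here for illustration, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
P, K = 4, 128                                  # paths and subcarriers
f_c, bandwidth = 300e9, 30e9                   # center frequency and bandwidth in Hz
T_s = 1.0 / bandwidth                          # assumed sampling period
Q = K // 4                                     # cyclic prefix length
tau_max = Q * T_s

phi_t, phi_r = rng.uniform(0.0, 360.0, (2, P))         # angles in degrees
theta_t, theta_r = rng.uniform(-90.0, 90.0, (2, P))    # angles in degrees
alpha = (rng.standard_normal(P) + 1j * rng.standard_normal(P)) / np.sqrt(2)  # CN(0, 1)
tau = rng.uniform(0.0, tau_max, P)             # path delays in seconds
```

With these settings the maximum delay is $32/(30\times 10^{9})\approx 1.07$ ns under the assumed sampling period.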

We first show the loss obtained in (23) during training ManNet and subManNet with $N_{\text{t}}=64$ and $N_{\text{r}}=N_{\text{RF}}=N_{\text{s}}=2$ in Fig. 3. Both networks are trained using Algorithm 1, but the latter employs the modified activation function (38) and input vector (39), as discussed earlier in Section IV-B. We consider $\mathcal{I}_{\text{net}}^{\text{train}}=\{1,3\}$ , corresponding to the non-iterative and iterative training approaches, respectively. It is seen for both the DNNs that the loss decreases and essentially converges, but at different speeds and to different values. Specifically, it is clear that with $\mathcal{I}_{\text{net}}^{\text{train}}=3$ , the DNNs converge rapidly after about $800$ batches. In contrast, when the non-iterative training is applied, they converge more slowly, and convergence has not been reached even after $1500$ batches. Because the objective $\sum_{k=1}^{K}\left\lVert{\mathbf{F}}_{\text{opt}}[k]-{\mathbf{F}}_{\text{RF}}{\mathbf{F}}_{\text{BB}}[k]\right\rVert^{2}_{\mathcal{F}}$ attained by FC-HBF is smaller than that of SC-HBF, it is reasonable that the converged loss of ManNet is smaller than that of subManNet. As the loss function (23) also measures the objective in (8) and (29), the convergence of the training loss reflects the abilities of ManNet and subManNet to solve problems (8) and (29), respectively. Note that in Fig. 3, the training loss is with respect to the total number of batches over all training epochs. Equivalently, the training losses for the iterative and non-iterative schemes have converged within $30$ and $50$ epochs, respectively.

V-B Performance of Proposed Deep Unfolding HBF Schemes

Here, we investigate the performance of the proposed deep unfolding FC-HBF and SC-HBF designs based on ManNet and subManNet in their online applications, i.e., in Algorithms 2–5. We train the DNNs over $\mathcal{I}_{\text{net}}^{\text{train}}=3$ iterations. For comparisons of FC-HBF designs with ManNet in Algorithm 2, we consider optimal fully digital beamforming (DBF), MO-AltMin [10], OMP [12, 13], and AO [14]. The dynamic SC-HBF designs with ManNet in Algorithm 4 and with subManNet in Algorithm 5 are compared with the SDR-AltMin scheme [10].

In Fig. 4, we compare the convergence of the considered schemes with $N_{\text{t}}=128$ , $N_{\text{r}}=N_{\text{RF}}=N_{\text{s}}=2$ , $K=128$ , SNR $=\{10,20\}$ dB, $L=7$ , and $\mathcal{I}_{\text{net}}=10$ . We note that OMP and optimal DBF are not iterative, so their performance is constant over the number of iterations. Among the iterative schemes, MO-AltMin converges the slowest, and it has not strictly converged after $500$ iterations. AO converges faster than MO-AltMin, but still requires about $250$ iterations and converges to unsatisfactory performance. In contrast, the performance of the proposed ManNet-based FC-HBF and subManNet-based SC-HBF methods improves rapidly and reaches satisfactory values after only tens of iterations. Particularly, among the sub-optimal schemes, ManNet-HBF achieves the highest SE. It is observed that the SE of ManNet-HBF increases step-by-step, over $\mathcal{I}_{\text{net}}=10$ steps, and reaches its maximum after $\mathcal{I}_{\text{net}}L=70$ iterations. This is because $L=7$ layers is the number of inner iterations used to perform steps 6–10 in Algorithm 2, and in these layers the performance does not improve. However, because the weights of the DNNs are applied once to generate the output, the maximum SE of ManNet and subManNet is reached after only $\mathcal{I}_{\text{net}}$ iterations. This figure clearly shows the advantages of the proposed scheme in accelerating HBF transceiver design and optimization.

In Figs. 5 and 6, we compare the SE performance attained by the proposed deep unfolding schemes, including ManNet- and subManNet-based FC-HBF and SC-HBF in Algorithms 2–5, with that of the optimal DBF, MO-AltMin, AO, OMP, SDR-AltMin, and the unfolded PGA approach [31] with $5$ iterations. Furthermore, we also show the performance of the dynamic SC-HBF based on ManNet without a heuristic search for ${\mathbf{C}}$ (referred to as “Dynamic SC-HBF, ManNet, with ${\mathbf{H}}[k^{\star}]$ ” in the figures). In addition, we present the results for the fixed SC-HBF scheme based on ManNet (referred to as “Fixed SC-HBF, ManNet”), i.e., in which ${\mathbf{C}}$ is fixed to ${\mathbf{C}}=\mathrm{blkdiag}\{\mathbf{1}_{M},\ldots,\mathbf{1}_{M}\}$ , where $\mathbf{1}_{M}$ denotes a column vector of $M$ ones.

In Fig. 5, we set $N_{\text{t}}=128,N_{\text{r}}=N_{\text{RF}}=N_{\text{s}}=2$ , and $K=128$ . The convergence tolerance is set to $10^{-3}$ for the iterative MO-AltMin, AO, and SDR-AltMin approaches, and $\mathcal{I}_{\text{net}}=\{1,10\}$ is set for the ManNet-based FC-HBF scheme. Note that for $\mathcal{I}_{\text{net}}=1$ , ${\mathbf{F}}_{\text{RF}}$ is obtained directly using ManNet without an iterative update, and the ${\mathbf{F}}_{\text{BB}}[k]$ are solved directly using (27). For the heuristic ManNet-based SC-HBF scheme in Algorithm 4, we use $\tilde{\mathcal{K}}=\{1,3,5,\ldots,K-1\}$ . From Fig. 5, the following observations are made:

• In Fig. 5(a), FC-HBF based on ManNet with $\mathcal{I}_{\text{net}}=10$ performs better than MO-AltMin and AO. Even with only $\mathcal{I}_{\text{net}}=1$ iteration, it performs much better than five unfolded PGA iterations and OMP. At SNR $=10$ dB, the proposed ManNet-based FC-HBF scheme with $\mathcal{I}_{\text{net}}=10$ achieves $90.95\%$ of the optimal performance, while MO-AltMin, AO, unfolded PGA, and OMP reach only $90.38\%$ , $86.82\%$ , $83.16\%$ and $81.82\%$ of the optimum, respectively.

• The heuristic dynamic SC-HBF design based on ManNet (i.e., Algorithm 4) provides superior performance, as seen in Fig. 5(b). The other deep unfolding SC-HBF schemes perform slightly worse than the heuristic one, but they all outperform SDR-AltMin for SNR $\geq-5$ dB. At SNR $=10$ dB, the proposed deep unfolding SC-HBF schemes achieve $90$–$93\%$ of the FC-HBF performance based on MO-AltMin, while SDR-AltMin achieves only $70\%$ .

• SC-HBF designs based on ManNet perform better than those based on subManNet. This is reasonable since the fully-connected analog precoder produced by ManNet is more reliable than the sub-connected version, as observed from Fig. 3. The dynamic ManNet-based SC-HBF algorithm performs just slightly better than the fixed version. We note here that larger gains can be attained with smaller $N_{\text{t}}$ , as will be shown next.

In Fig. 6, we plot the SE performance of the considered schemes for $N_{\text{t}}=\{16,32,64,128\}$ , $N_{\text{r}}=N_{\text{RF}}=N_{\text{s}}=2$ , $K=128$ , SNR = $10$ dB, and $\mathcal{I}_{\text{net}}=10$ . It is observed that OMP only performs well for small $N_{\text{t}}$ and has significant performance loss as $N_{\text{t}}$ increases. Among the sub-optimal FC-HBF schemes, the proposed ManNet-HBF approach achieves the best performance, which is slightly better than MO-AltMin and far better than AO and OMP for all considered $N_{\text{t}}$ . Comparing the SC-HBF schemes, the heuristic ManNet-based SC-HBF design has the best performance. The subManNet-based SC-HBF algorithm performs very close to the heuristic one for $N_{\text{t}}\leq 64$ . Furthermore, it is seen that compared to fixed SC-HBF, the gains achieved from dynamic SC-HBF are more significant for small and moderate $N_{\text{t}}$ . This is reasonable because as $N_{\text{t}}$ increases, all the sub-arrays become large and the beamforming gain is guaranteed even without the optimized connections between RF chains and antennas.

V-C Computational and Time Complexity Comparison

In Figs. 7 and 8, we compare the computational complexities and execution times of the considered schemes, respectively, with the same simulation parameters as those for Fig. 6. The complexities are counted as the total number of additions and multiplications performed in the considered algorithms. The proposed deep unfolding schemes have low complexities thanks to ManNet and subManNet’s small numbers of iterations and layers and the simple operations in each layer. In particular, their complexities are just as low as that of OMP and slightly higher than that of SDR-AltMin, but they offer much better performance, as discussed earlier in Section V-B. Among the proposed deep unfolding schemes, as expected, subManNet-based SC-HBF has the lowest complexity, and the heuristic ManNet-based SC-HBF approach requires the highest complexity due to the iterations required for the search. Compared to these algorithms, the complexities of MO-AltMin and AO are much higher, and that of AO increases exponentially with $N_{\text{t}}$ , whereas the complexity of the proposed algorithms is almost linear with $N_{\text{t}}$ . This agrees with the analysis in Section IV-C.

Finally, we show the run time of the considered schemes in Fig. 8. We omit the results for SDR-AltMin because they are very large (up to $822$ s for $N_{\text{t}}=128$), which would make it difficult to see the differences among the other schemes. SDR-AltMin employs CVX to solve for ${\mathbf{F}}_{\text{BB}}[k]$ in each iteration and is thus extremely slow. Among the other methods, MO-AltMin is the slowest and is much slower than AO, OMP, and the proposed deep unfolding approaches, especially for large $N_{\text{t}}$. This is because of its slow convergence (see Fig. 4) and its nested iterations involving a line search. In contrast, the proposed deep unfolding algorithms execute very rapidly. With $N_{\text{t}}=128$, MO-AltMin requires more than $10$ s to execute, while the time required by the non-heuristic ManNet- and subManNet-aided HBF schemes is only around $0.01$ s. The heuristic ManNet-based dynamic SC-HBF approach outlined in Algorithm 4 requires a longer run time than the ManNet- and subManNet-aided SC-HBF schemes. Furthermore, despite its slow convergence, AO executes relatively fast because only arithmetic operations and element-wise normalization are performed in each iteration.
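The element-wise normalization mentioned for AO can be sketched as follows. This assumes the usual unit-modulus constraint on the analog beamformer entries, $|[{\mathbf{F}}_{\text{RF}}]_{ij}|=1$. The function name and the handling of zero entries are illustrative choices, not taken from the paper.

```python
def project_unit_modulus(matrix):
    """Map each nonzero complex entry x to x / |x|; zero entries map to 1 (arbitrary phase)."""
    return [[x / abs(x) if x != 0 else 1 + 0j for x in row] for row in matrix]

# Small hypothetical 2 x 2 example with entries of different magnitudes.
f = [[3 + 4j, -2j], [0.5 + 0j, 0j]]
f_proj = project_unit_modulus(f)
for row in f_proj:
    print([f"{abs(x):.1f}" for x in row])
```

Each entry costs one magnitude evaluation and one division, so a step of this kind stays cheap per iteration even when many iterations are needed.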

VI Conclusion

Nonconvexity and high-dimensional variables pose significant challenges to HBF design, and the available solutions have usually required cumbersome iterative procedures. We have overcome these difficulties by proposing efficient deep unfolding frameworks for FC-HBF and SC-HBF designs based on unfolding MO-AltMin and PGD. In these schemes, the low-complexity ManNet and subManNet approaches produce fully-connected and sub-connected analog precoders with only several layers and sparse connections in each, which explains the computational and time efficiency of the proposed algorithms. Our extensive simulation results demonstrate that, compared to state-of-the-art HBF algorithms, the proposed deep unfolding solutions achieve superior performance with lightweight implementation, low complexity, and fast execution. In future work, deep unfolding models for joint HBF design and channel estimation will be considered.
