Bayesian Optimal Data Detector for mmWave OFDM System with Low-Resolution ADC

Hanqing Wang; Chao-Kai Wen; Shi Jin

arXiv:1704.03591·cs.IT·September 6, 2017

TL;DR

This paper presents a Bayesian optimal data detection algorithm for mmWave OFDM systems with low-resolution ADCs, addressing non-linearity challenges and achieving near-optimal performance with efficient power allocation and channel estimation.

Contribution

It introduces a Bayesian optimal data detector tailored for low-resolution ADCs in mmWave OFDM, along with a power allocation scheme and a high-quality channel estimation method.

Findings

01. Detector performance approaches the infinite-resolution ADC limit.

02. Proposed power allocation minimizes average symbol error rate.

03. Channel estimation method reduces pilot overhead.

Equations

$$z\sim\mathcal{CN}(z;\mu,\nu)=\frac{1}{\pi\nu}e^{-\frac{|z-\mu|^{2}}{\nu}}.$$

$$\mathrm{D}z=\phi(z)\,\mathrm{d}z\quad\text{with}\quad\phi(z)=\frac{1}{\sqrt{2\pi}}e^{-\frac{z^{2}}{2}}.$$

$$\mathbf{y}=\mathbf{G}\mathbf{F}^{H}\mathbf{P}^{\frac{1}{2}}\mathbf{s}+\mathbf{n},\tag{1}$$

$$\mathbf{G}=\mathbf{F}^{H}\mathrm{diag}(\mathbf{h})\mathbf{F},\tag{2}$$

$$\mathbf{y}=\mathbf{F}^{H}\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s}+\mathbf{n}.\tag{3}$$

$$q_{j}=\mathcal{Q}_{c}(y_{j})=\mathcal{Q}(y_{j}^{R})+\mathrm{j}\,\mathcal{Q}(y_{j}^{I}).\tag{4}$$

$$\mathbf{q}=\mathcal{Q}_{c}\left(\mathbf{F}^{H}\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s}+\mathbf{n}\right).\tag{5}$$

$$\hat{s}_{j}=\underset{s\in\mathcal{S}}{\arg\min}\left|\frac{\tilde{q}_{j}}{h_{j}^{\prime}}-s\right|^{2},\quad\text{for }j=1,2,\dots,N.\tag{6}$$

$$h_{j}=\mathbf{w}^{\rm RX}\mathbf{H}_{j}\mathbf{w}^{\rm TX},\tag{7}$$

$$\mathbf{x}=\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s},\quad\mathbf{z}=\mathbf{F}^{H}\mathbf{x}\tag{8}$$

$$\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})=\prod_{j=1}^{N}\mathrm{P}_{\rm out}(q_{j}\mid z_{j}).\tag{9}$$

$$\mathrm{P}_{\rm out}(q_{j}\mid z_{j})=\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})\,\mathrm{P}(q_{j}^{I}\mid z_{j}^{I}),\tag{10}$$

$$\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})=\Phi\left(\frac{\sqrt{2}\,(z_{j}^{R}-l(q_{j}^{R}))}{\sigma}\right)-\Phi\left(\frac{\sqrt{2}\,(z_{j}^{R}-u(q_{j}^{R}))}{\sigma}\right)\tag{11}$$

$$\mathrm{P}(\mathbf{s}\mid\mathbf{q};\mathbf{h}^{\prime})=\frac{\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})\,\mathrm{P}(\mathbf{s})}{\mathrm{P}(\mathbf{q};\mathbf{h}^{\prime})},\tag{12}$$

$$\mathrm{P}(\mathbf{q};\mathbf{h}^{\prime})=\int_{\mathbf{s}}\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})\,\mathrm{P}(\mathbf{s})\,\mathrm{d}\mathbf{s}.\tag{13}$$

$$\mathrm{P}(\mathbf{s})=\prod_{j=1}^{N}\mathrm{P}(s_{j}),\tag{14}$$

$$\mathrm{P}(s_{j}\mid\mathbf{q};\mathbf{h}^{\prime})=\int_{\mathbf{s}\setminus s_{j}}\mathrm{P}(\mathbf{s}\mid\mathbf{q};\mathbf{h}^{\prime})\,\mathrm{d}\mathbf{s}.\tag{15}$$

$$\bar{s}_{j}=\mathrm{E}\left[s_{j}\mid\mathbf{q};\mathbf{h}^{\prime}\right]=\int s_{j}\,\mathrm{P}(s_{j}\mid\mathbf{q};\mathbf{h}^{\prime})\,\mathrm{d}s_{j}.\tag{16}$$

$$\hat{s}_{j}=\underset{s_{j}\in\mathcal{S}}{\arg\max}\;\mathrm{P}(s_{j}\mid\mathbf{q};\mathbf{h}^{\prime}).\tag{17}$$

$$z_{j,A}^{\rm post}=\mathrm{E}\left[z_{j}^{R}\mid q_{j}^{R}\right]+\mathrm{j}\,\mathrm{E}\left[z_{j}^{I}\mid q_{j}^{I}\right],\tag{18a}$$

$$v_{j,A}^{\rm post}=\mathrm{var}\left[z_{j}^{R}\mid q_{j}^{R}\right]+\mathrm{var}\left[z_{j}^{I}\mid q_{j}^{I}\right],\tag{18b}$$

$$v_{A}^{\rm post}=\frac{1}{N}\sum_{j=1}^{N}v_{j,A}^{\rm post},\tag{19a}$$

$$v_{B}^{\rm pri}=v_{A}^{\rm ext}=\left(\frac{1}{v_{A}^{\rm post}}-\frac{1}{v_{A}^{\rm pri}}\right)^{-1},\tag{19b}$$

$$\mathbf{x}_{B}^{\rm pri}=\mathbf{x}_{A}^{\rm ext}=v_{A}^{\rm ext}\left(\frac{\mathbf{F}\mathbf{z}_{A}^{\rm post}}{v_{A}^{\rm post}}-\frac{\mathbf{F}\mathbf{z}_{A}^{\rm pri}}{v_{A}^{\rm pri}}\right),\tag{19c}$$

$$s_{j,B}^{\rm post}=\mathrm{E}\left[s_{j}\mid h_{j}^{\prime},x_{j,B}^{\rm pri}\right],\tag{20a}$$

$$v_{j,B}^{\rm post}=\mathrm{var}\left[s_{j}\mid h_{j}^{\prime},x_{j,B}^{\rm pri}\right],\tag{20b}$$

$$x_{j,B}^{\rm post}=h_{j}^{\prime}s_{j,B}^{\rm post},\tag{21a}$$

$$v_{B}^{\rm post}=\frac{1}{N}\sum_{j=1}^{N}|h_{j}^{\prime}|^{2}v_{j,B}^{\rm post},\tag{21b}$$

$$v_{A}^{\rm pri}=v_{B}^{\rm ext}=\left(\frac{1}{v_{B}^{\rm post}}-\frac{1}{v_{B}^{\rm pri}}\right)^{-1},\tag{21c}$$

$$\mathbf{z}_{A}^{\rm pri}=\mathbf{z}_{B}^{\rm ext}=v_{B}^{\rm ext}\left(\frac{\mathbf{F}^{H}\mathbf{x}_{B}^{\rm post}}{v_{B}^{\rm post}}-\frac{\mathbf{F}^{H}\mathbf{x}_{B}^{\rm pri}}{v_{B}^{\rm pri}}\right).\tag{21d}$$

$$\mathbf{q}=\mathcal{Q}_{c}(\mathbf{z}+\mathbf{n}),\quad\text{and}\quad\mathbf{z}=\mathbf{z}_{A}^{\rm pri}+\boldsymbol{\omega}_{A},\tag{22}$$

$$\mathrm{P}(z_{j}^{R}\mid q_{j}^{R})=\frac{\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})\,\mathrm{P}(z_{j}^{R})}{\int_{-\infty}^{\infty}\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})\,\mathrm{P}(z_{j}^{R})\,\mathrm{d}z_{j}^{R}},$$

$$\mathrm{E}\left[z_{j}^{R}\mid q_{j}^{R}\right]=z_{j,A}^{{\rm pri},R}+\frac{v_{A}^{\rm pri}}{\sqrt{2(v_{A}^{\rm pri}+\sigma^{2})}}\left(\frac{\phi(\eta_{1})-\phi(\eta_{2})}{\Phi(\eta_{1})-\Phi(\eta_{2})}\right),$$

$$\mathrm{var}\left[z_{j}^{R}\mid q_{j}^{R}\right]=\frac{v_{A}^{\rm pri}}{2}-\frac{(v_{A}^{\rm pri})^{2}}{2(v_{A}^{\rm pri}+\sigma^{2})}\times\left[\left(\frac{\phi(\eta_{1})-\phi(\eta_{2})}{\Phi(\eta_{1})-\Phi(\eta_{2})}\right)^{2}+\frac{\eta_{1}\phi(\eta_{1})-\eta_{2}\phi(\eta_{2})}{\Phi(\eta_{1})-\Phi(\eta_{2})}\right],$$

Full text

Bayesian Optimal Data Detector for mmWave OFDM System with Low-Resolution ADC

Hanqing Wang, Chao-Kai Wen, and Shi Jin

H. Wang and S. Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, P. R. China. C.-K. Wen is with the Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan. Part of this work has been presented at IEEE ICCS 2016 in Shenzhen [1].

Abstract

Orthogonal frequency division multiplexing (OFDM) has been widely used in communication systems operating in the millimeter wave (mmWave) band, such as IEEE 802.15.3c and IEEE 802.11ad, to combat frequency-selective fading and achieve multi-Gbps transmissions. For mmWave systems with ultra-high sampling rate requirements, the use of low-resolution analog-to-digital converters (ADCs) (e.g., 1–3 bits) ensures an acceptable level of power consumption and system cost. However, orthogonality among subchannels in the OFDM system cannot be maintained because of the severe nonlinearity caused by low-resolution ADCs, which makes the design of the data detector challenging. In this study, we develop an efficient algorithm for optimal data detection in the mmWave OFDM system with low-resolution ADCs. The analytical performance of the proposed detector is derived and verified to achieve the fundamental limit of the Bayesian optimal design. On the basis of the derived analytical expression, we further propose a power allocation (PA) scheme that seeks to minimize the average symbol error rate. In addition to the optimal data detector, we also develop a feasible channel estimation method, which can provide high-quality channel state information without significant pilot overhead. Simulation results confirm the accuracy of our analysis and illustrate that the performance of the proposed detector in conjunction with the proposed PA scheme is close to the optimal performance of the OFDM system with infinite-resolution ADCs.

Index Terms:

Low-resolution ADC, mmWave, OFDM, data detection, channel estimation, power allocation, Bayesian inference, replica method.

I Introduction

Millimeter wave (mmWave) communications utilize the spectrum range of 30 GHz to 300 GHz, where a large bandwidth is available, to achieve ultra-high data rates [2]. Large-scale applications operating in the mmWave band are emerging, such as wireless local and personal area network systems [3, 4], 5G cellular systems [5], vehicular communications [6], and wearables [7], because of this potential to support high rates and the severe shortage of spectrum resources in the sub-6 GHz bands.

Despite the potential advantage of high data rates, mmWave communications demand very high sampling frequencies on analog-to-digital converters (ADCs), where received analog signals are converted into digital signals for subsequent signal processing. Unfortunately, the power consumption of an ADC unit increases quadratically with the sampling frequency and exponentially with the number of quantization bits at a sampling rate above 100 MSps [8, 9]. Applying high-speed (e.g., several GSps) and high-precision (e.g., above 6 bits) ADCs at the mmWave receiver would result in prohibitively high power consumption and system costs, particularly in mobile devices. This issue is among the key bottlenecks in realizing mmWave systems. A potential direction to pursue is the use of very-low-resolution ADCs (e.g., 1–3 bits, whereas current wireless communication systems typically use 8–12 bit ADCs at their receivers) combined with advanced signal processing techniques to mitigate the loss in overall system performance [10]. Several aspects of this direction have been investigated in the literature, including capacity analysis and capacity-achieving strategies for the single-input single-output (SISO) channel [11, 12, 13] and the multiple-input multiple-output (MIMO) channel [14, 15, 16], data detection for the MIMO system under frequency-flat channels [17, 18, 19, 20, 21] and frequency-selective channels [22, 23, 24, 25], and channel estimation [21, 26, 25, 27, 28].

Meanwhile, the signal transmitted over the mmWave channel, where the bandwidth is much wider than the coherence bandwidth, generally suffers from severe frequency-selective fading, which gives rise to serious inter-symbol interference (ISI). By adding a cyclic prefix (CP) for converting linear convolution into circular convolution and using the discrete Fourier transform (DFT), orthogonal frequency division multiplexing (OFDM) technology decomposes the ISI channel into a set of orthogonal subchannels with a bandwidth smaller than the coherence bandwidth [29].

Consequently, the OFDM technology has been widely used in various wideband wireless communication systems to combat ISI caused by the frequency-selective fading. In the mmWave range, standard systems, such as IEEE 802.11ad [3] and IEEE 802.15.3c [4], operate in the 60 GHz band and use the OFDM technique to achieve data rates of up to multiple Gbps.

In this study, we focus on OFDM systems with low-resolution ADCs at the receiver. We refer to such systems as quantized OFDM (Q-OFDM) systems. The coarse quantization in the OFDM system causes strong nonlinear distortion on the received signals, such that the orthogonality among subchannels cannot be maintained in the Q-OFDM system and severe inter-carrier interference (ICI) occurs. These issues render the design of data detection algorithms challenging because the simple one-tap equalizer used in conventional OFDM receivers can no longer perform well. A traditional heuristic approximates the effect of hardware imperfections by using a linear model [30]. These imperfections include phase drifts, distortion noise, and amplified thermal noise. The additive quantization noise model (AQNM), which assumes that quantization noise is additive and independent, is a representative model of this method. This linear approximation facilitates the analysis of spectral efficiency and energy efficiency for systems with low-resolution ADCs, especially for massive MIMO systems [31, 32, 33]. AQNM therefore yields additional insights into system design, such as the optimal number of base station (BS) antennas [32] as well as the optimal pilot length [33] and ADC resolution [34]. However, AQNM cannot provide a satisfactory approximation in the Q-OFDM system because this model completely ignores the ICI effect caused by the coarse ADC. Data detection based on the AQNM leads to significant performance loss, which will be confirmed by simulation results.

Although various studies on data detection problems, such as [17, 18, 19, 20, 21], have considered the exact quantization model, they are all dedicated to data detection for general MIMO channels rather than for Q-OFDM channels. From the statistical inference perspective, very little difference exists between the Q-OFDM channel and the quantized MIMO channel in terms of data detection: both involve inferring a random vector observed through a linear transformation followed by a nonlinear measurement channel. However, the linear transformation matrix in the OFDM channel is orthogonal, whereas that in the quantized MIMO channel is independent and identically distributed (i.i.d.) random.

Furthermore, data detection algorithms proposed for the wideband channel [23, 24, 25] are also sub-optimal for the Q-OFDM system. The fast adaptive shrinkage/thresholding algorithm used in [24] assumes that the transmitted symbols are drawn from a complex Gaussian distribution, which is not optimal for the detection of modulated signals. In [25], an efficient data detection algorithm based on the generalized approximate message passing (GAMP) algorithm [35] was proposed. GAMP is the most representative (and state-of-the-art) approach for the estimation of a random vector observed through a linear transformation followed by a componentwise, nonlinear measurement channel. However, GAMP has been proven optimal for i.i.d. waveforms only and not for the orthogonal waveform of our interest. Moreover, the performance analysis of the GAMP-based detector is not available for the orthogonal waveform. Therefore, performing time-consuming Monte-Carlo simulations to evaluate the GAMP-based detector for the Q-OFDM system is inevitable. Recent works in [36, 37, 38] revealed that the optimal inference for i.i.d. transform matrices yields worse performance for sparse signal recovery problems with orthogonal transform matrices. Therefore, the detection performance under the Q-OFDM channel may be underestimated when employing the existing algorithms.

Thus far, the solution on how to achieve the best data detection performance for the Q-OFDM system is generally unknown. This study takes the first step toward this direction. Specifically, we propose an optimal, computationally tractable data detector based on the Turbo iteration principle proposed in [38] and derive its corresponding state evolution (SE) equations. The uniqueness of this work is summarized as follows:

• Optimality. The SE equations of the proposed detector match those of the Bayesian optimal detector derived via the replica theory. This indicates that the proposed detector can attain the optimal detection performance. Importantly, in contrast with direct computation of the Bayesian optimal solution, the proposed detector is computationally tractable. The symbol error rate (SER) of the proposed detector provides a lower bound for various detectors for the Q-OFDM system, which can serve as a benchmark for algorithm design and a foundation for evaluating the feasibility of utilizing low-resolution ADCs in practical systems. We demonstrate through simulations that the proposed detector achieves better performance than the most representative GAMP-based detectors without any increase in computational complexity.

• Theoretical Analysability. The SE analysis of the proposed algorithm is available. With SE analysis, performance metrics, such as the average SER, can be analytically determined without using time-consuming Monte Carlo simulations. Notably, the SE analysis demonstrates a decoupling principle, that is, the input-output relationship of the proposed detector on each subchannel can be decoupled into a bank of equivalent additive white Gaussian noise (AWGN) channels. The decoupling principle enables the development of a power allocation (PA) algorithm to minimize the average SER across these equivalent AWGN channels. The simulations show that this PA scheme improves the SER performance significantly compared with the equal subchannel PA (ESPA).

• Flexibility. The principle underlying the proposed detector provides a unified framework for solving a variety of detection and estimation problems. Under this unified framework, we also develop a feasible method for channel estimation to apply the proposed Q-OFDM detector to a practical scenario without perfect channel state information (CSI). The simulation results show that precise CSI can be acquired through the proposed scheme.

Notations. This paper uses lowercase and uppercase boldface letters to represent vectors and matrices, respectively. For a vector $\mathbf{a}$, the operator $\mathrm{diag}(\mathbf{a})$ denotes the diagonal matrix whose diagonal elements are the entries of $\mathbf{a}$. Moreover, the real and imaginary parts of a complex scalar $a$ are represented by $a^{R}$ and $a^{I}$, respectively. The distribution of a proper complex Gaussian random variable $z$ with mean $\mu$ and variance $\nu$ is expressed as

$$z\sim\mathcal{CN}(z;\mu,\nu)=\frac{1}{\pi\nu}e^{-\frac{|z-\mu|^{2}}{\nu}}.$$

Similarly, $\mathcal{N}(z;\mu,\nu)$ denotes the probability density function (PDF) of a real Gaussian random variable $z$ with mean $\mu$ and variance $\nu$ . We let $\mathrm{D}z$ denote the real Gaussian integration measure

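$$\mathrm{D}z=\phi(z)\,\mathrm{d}z\quad\text{with}\quad\phi(z)=\frac{1}{\sqrt{2\pi}}e^{-\frac{z^{2}}{2}}.$$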

The cumulative Gaussian distribution function is defined as $\Phi(z)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z}e^{-\frac{t^{2}}{2}}\mathrm{d}t$ , and the Q-function is defined as $Q(z)=1-\Phi(z)$ .

II System Model

We consider the OFDM system with $N$ orthogonal subchannels. We let $\mathbf{s}=[s_{1},s_{2},\cdots,s_{N}]^{T}\in\mathcal{S}^{N}$ denote the input block to be transmitted over the subchannels, where $\mathcal{S}$ denotes the set of constellation points of the chosen modulation method, such as quadrature phase shift keying (QPSK) or quadrature amplitude modulation (QAM). We allocate power $p_{j}$ to the $j$-th subchannel while keeping the total power of the entire OFDM symbol constant to optimize some performance metrics, such as SER. Specifically, the symbol in the $j$-th subchannel is multiplied by the scalar coefficient $\sqrt{p_{j}}$ and $\sum_{j=1}^{N}p_{j}=N\bar{P}$, where $\bar{P}$ is the average power per subchannel available at the transmitter. Then, we define a new diagonal matrix $\mathbf{P}=\mathrm{diag}(p_{1},p_{2},\cdots,p_{N})$. The frequency-domain block $\mathbf{P}^{\frac{1}{2}}\mathbf{s}$ is transformed to the time domain by the $N$-point inverse DFT written as $\mathbf{F}^{H}\mathbf{P}^{\frac{1}{2}}\mathbf{s}$, where $\mathbf{F}$ denotes the normalized DFT matrix whose $(m,n)$-th entry is $\frac{1}{\sqrt{N}}e^{-2\pi\mathrm{j}(n-1)(m-1)/N}$.

The transmitted signal is filtered by a multipath channel, which can be represented by a tapped delay line model with $L$ taps. We let $g_{i}$ denote the discrete-time impulse response of the $i$-th tap. The last $L_{\rm cp}$ $(L_{\rm cp}\geq L)$ time-domain samples are prepended as a CP to each OFDM symbol before it is transmitted over the channel, to avoid the ISI caused by the multipath channel. At the receiver, the analog signal is discretized after down-converting the received signal into the analog baseband. After CP removal, the (unquantized) received block of the OFDM symbol can be written as

$$\mathbf{y}=\mathbf{G}\mathbf{F}^{H}\mathbf{P}^{\frac{1}{2}}\mathbf{s}+\mathbf{n},\tag{1}$$

where $\mathbf{G}\in\mathcal{C}^{N\times N}$ is the circulant matrix with $\mathbf{g}=[g_{1},\,g_{2},\cdots,g_{N}]^{T}$ being its first column and $g_{j}=0$ for $(L+1)\leq j\leq N$ , and $\mathbf{n}$ is the AWGN vector with zero mean and covariance matrix $\sigma^{2}\mathbf{I}$ . The circulant matrix $\mathbf{G}$ can be decomposed as

$$\mathbf{G}=\mathbf{F}^{H}\mathrm{diag}(\mathbf{h})\mathbf{F},\tag{2}$$

where $\mathbf{h}$ denotes the frequency-domain channel representation obtained by applying the DFT to the first column of $\mathbf{G}$, that is, $\mathbf{h}=\mathbf{F}\mathbf{g}$. Substituting (2) into (1), we can rewrite (1) as

$$\mathbf{y}=\mathbf{F}^{H}\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s}+\mathbf{n},\tag{3}$$

where $\mathbf{h}^{\prime}$ denotes the channel vector comprised of the diagonal entries of the matrix $\mathrm{diag}(\mathbf{h})\mathbf{P}^{\frac{1}{2}}$ , that is, $\mathbf{h}^{\prime}=[\sqrt{p_{1}}h_{1},\sqrt{p_{2}}h_{2},\cdots,\sqrt{p_{N}}h_{N}]^{T}$ .
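The step from (1) to (3) can be checked numerically. The following is a minimal sketch, not the authors' code: the block length `N = 8`, the 3-tap channel values, and the equal power allocation are illustrative assumptions. It builds the circulant matrix from the taps and confirms that both forms of the noiseless received block agree. One detail the sketch makes explicit is scaling: with the normalized $\mathbf{F}$ defined above, the diagonalization holds numerically when $\mathbf{h}$ is taken as the unnormalized $N$-point DFT of $\mathbf{g}$, that is, $\sqrt{N}\,\mathbf{F}\mathbf{g}$.

```python
import numpy as np

N = 8
rng = np.random.default_rng(0)

# Normalized DFT matrix F with entries exp(-2*pi*j*m*n/N)/sqrt(N) (0-based m, n).
idx = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / N) / np.sqrt(N)

# Hypothetical 3-tap channel, zero-padded to length N (first column of G).
g = np.zeros(N, dtype=complex)
g[:3] = [1.0, 0.5, 0.25j]

# Circulant matrix: column k is g cyclically shifted down by k positions.
G = np.column_stack([np.roll(g, k) for k in range(N)])

# Frequency-domain channel: eigenvalues of G, i.e. the unnormalized DFT of g.
h = np.sqrt(N) * (F @ g)

# Equal power allocation (assumption) and random QPSK symbols.
p = np.ones(N)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
s = rng.choice(qpsk, size=N)
h_prime = np.sqrt(p) * h

lhs = G @ F.conj().T @ (np.sqrt(p) * s)   # noiseless part of (1)
rhs = F.conj().T @ (h_prime * s)          # noiseless part of (3)
print(np.allclose(lhs, rhs))
```

In practice the same vectors can be produced without forming any matrix, using `np.fft.fft(g, N)` for the channel and `np.fft.ifft(h_prime * s, norm="ortho")` for the time-domain block.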

Each element $y_{j}$ of the received signal $\mathbf{y}$ is quantized using a complex-valued quantizer $\mathcal{Q}_{c}(\cdot)$ , which consists of two real-valued quantizers $\mathcal{Q}(\cdot)$ that quantize the real and imaginary parts of $y_{j}$ separately and independently, that is,

$$q_{j}=\mathcal{Q}_{c}(y_{j})=\mathcal{Q}(y_{j}^{R})+\mathrm{j}\,\mathcal{Q}(y_{j}^{I}).\tag{4}$$

We consider $\mathcal{Q}(\cdot)$ as a $B$-bit quantizer, which maps the real-valued input $y^{R}_{j}$ or $y^{I}_{j}$ to one of $2^{B}$ discrete values. Specifically, the output is assigned the value $c_{b}$, that is, the $b$-th discrete value, when the quantizer input lies within the interval $(r_{b-1},r_{b}]$, where $-\infty=r_{0}<r_{1}<\cdots<r_{2^{B}-1}<r_{2^{B}}=\infty$ are the thresholds. We take $c_{b}$ as the centroid of the interval $(r_{b-1},r_{b}]$. The quantized received signal can be denoted as

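$$\mathbf{q}=\mathcal{Q}_{c}\left(\mathbf{F}^{H}\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s}+\mathbf{n}\right).\tag{5}$$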
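The quantizer described above can be sketched as follows. This is an illustration under stated assumptions rather than the quantizer used in the paper: the thresholds are taken to be uniformly spaced with a hypothetical step size `step`, and each output level is the midpoint of its bin (for the two outermost, unbounded bins the midpoint of the corresponding bounded cell is used as a stand-in for the centroid). The helper names `quantize_real` and `quantize_complex` are hypothetical.

```python
import numpy as np

def quantize_real(y, bits, step):
    """Uniform B-bit quantizer: returns the output level and the (lower, upper] bin bounds."""
    half = 2 ** (bits - 1)
    y = np.asarray(y, dtype=float)
    # Bin index in {0, ..., 2^B - 1}; ceil() makes each bin half-open as (lower, upper].
    k = np.clip(np.ceil(y / step) - 1 + half, 0, 2 * half - 1)
    lower = (k - half) * step
    upper = (k + 1 - half) * step
    level = 0.5 * (lower + upper)                        # representative value c_b
    lower = np.where(k == 0, -np.inf, lower)             # r_0 = -infinity
    upper = np.where(k == 2 * half - 1, np.inf, upper)   # r_{2^B} = +infinity
    return level, lower, upper

def quantize_complex(y, bits, step):
    """Q_c(y) = Q(Re y) + j Q(Im y), applied elementwise as in (4)."""
    re, _, _ = quantize_real(np.real(y), bits, step)
    im, _, _ = quantize_real(np.imag(y), bits, step)
    return re + 1j * im

levels, lo, hi = quantize_real(np.array([0.3, -2.0, 5.0, 1.0]), bits=2, step=1.0)
print(levels, lo, hi)
```

Returning the bin bounds alongside the level is a design choice of the sketch: a Bayesian detector needs $l(q)$ and $u(q)$ rather than the output value itself.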

Data detection aims to recover the transmitted symbol $\mathbf{s}$ from the quantized signal $\mathbf{q}$ given by the linear mixing model (5) with linear transformation matrix $\mathbf{F}^{H}$ . The conventional OFDM receiver performs DFT directly on the quantized signal $\mathbf{q}$ and yields $\tilde{\mathbf{q}}=\mathbf{F}\mathbf{q}$ . The decision rule follows the one-tap equalizer given by

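$$\hat{s}_{j}=\underset{s\in\mathcal{S}}{\arg\min}\left|\frac{\tilde{q}_{j}}{h_{j}^{\prime}}-s\right|^{2},\quad\text{for }j=1,2,\dots,N.\tag{6}$$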

For infinite-precision quantization, that is, $\mathbf{q}=\mathbf{y}$, the DFT operation on $\mathbf{q}$ turns the signal at each subchannel into an AWGN observation of the product of the transmitted symbol and its corresponding frequency-domain channel response. Therefore, (6) is the optimal decision rule under the maximum likelihood (ML) criterion. However, this conventional OFDM detector, which employs a one-tap equalizer, is no longer optimal for the low-resolution quantization case, in which the orthogonality among subchannels is not preserved.
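The conventional receiver in (6) can be sketched in a few lines. The example below is illustrative only: the 3-tap channel, the block length, and the 1-bit output scaling are assumptions, and `one_tap_detect` is a hypothetical helper name. It applies the one-tap rule to a noiseless unquantized block, where it recovers every symbol, and to a 1-bit quantized version of the same block, where the number of errors can be inspected.

```python
import numpy as np

def one_tap_detect(q, h_prime, constellation):
    """Conventional receiver: q~ = F q, then a per-subchannel nearest-point decision as in (6)."""
    q_tilde = np.fft.fft(q, norm="ortho")
    eq = q_tilde / h_prime
    dist = np.abs(eq[:, None] - constellation[None, :]) ** 2
    return constellation[np.argmin(dist, axis=1)]

rng = np.random.default_rng(1)
N = 64
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
s = rng.choice(qpsk, size=N)
h_prime = np.fft.fft(np.array([1.0, 0.5, 0.25]), N)   # hypothetical 3-tap channel
z = np.fft.ifft(h_prime * s, norm="ortho")            # z = F^H diag(h') s, no noise

s_hat_inf = one_tap_detect(z, h_prime, qpsk)          # infinite-resolution case
q_1bit = (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)   # 1-bit quantizer
s_hat_1bit = one_tap_detect(q_1bit, h_prime, qpsk)
errors_1bit = int(np.sum(s_hat_1bit != s))
print("errors with 1-bit ADC:", errors_1bit)
```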

Remark 1

Beamforming techniques operating in the RF domain are required at the transmitter and receiver to overcome the high propagation loss in the mmWave band. Notably, (3) is a concise equivalent representation of the input-output relationship of the mmWave OFDM system using analog transmitter and receiver beamforming with one transmitted and one received data stream, as depicted in [2, Fig. 2]. Specifically, in this system, each element of $\mathbf{h}$ is expressed as [39]

$$h_{j}=\mathbf{w}^{\rm RX}\mathbf{H}_{j}\mathbf{w}^{\rm TX},\tag{7}$$

where $\mathbf{w}^{\rm RX}\in\mathcal{C}^{1\times N_{R}}$ and $\mathbf{w}^{\rm TX}\in\mathcal{C}^{N_{T}\times 1}$ are beamforming vectors at the receiver and transmitter, respectively; $\mathbf{H}_{j}\in\mathcal{C}^{N_{R}\times N_{T}}$ represents the channel response matrix at the $j$ -th subchannel; and $N_{T}$ and $N_{R}$ are the number of transmit and receive antennas, respectively. $\blacksquare$

Remark 2

As a key technology for next-generation mobile communications, mmWave communications combined with large-scale antenna arrays will certainly be exploited in multi-stream and multi-user scenarios. Although designed for the single-stream problem, the proposed algorithm can also be employed by the receiver of the uplink transmission of cellular systems (i.e., the BS). For uplink transmission, analog beamforming is implemented for the spatial division of different users. In addition, narrow beams are steered by the analog beamforming to form highly directional spatial links between different users and the BS. Therefore, following proper user selection, the entire uplink transmission can be approximately decomposed into several parallel single-stream communications, and the proposed detector can be employed for the optimal detection at the BS side of each individual spatial link. The proposed detector is advantageous considering that digital beamforming can be further applied to multiuser interference mitigation. However, the topic is beyond the scope of this paper and thus left for future work. $\blacksquare$

III Optimal Data Detection

In this section, we explain the theoretical foundation for Bayesian inference and introduce the data detection algorithm. To elucidate the concept, we first assume that perfect channel state information at the receiver (CSIR) $\mathbf{h}^{\prime}$ is available. The performance analysis, the PA scheme, and the channel estimation method will be introduced in Sections IV and V.

III-A Theoretical Foundation

Before proceeding, we define two auxiliary vectors

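$$\mathbf{x}=\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s},\quad\mathbf{z}=\mathbf{F}^{H}\mathbf{x}\tag{8}$$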

to facilitate the subsequent discussion. We also specify the likelihood function, which plays a key role in Bayesian inference. With the perfect CSIR $\mathbf{h}^{\prime}$, the likelihood function is the distribution of the quantized signal $\mathbf{q}$ conditioned on the transmitted vector $\mathbf{s}$. From (5), it is given by

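$$\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})=\prod_{j=1}^{N}\mathrm{P}_{\rm out}(q_{j}\mid z_{j}).\tag{9}$$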

The factorization of $\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})$ is derived from the fact that from (5), the value of $q_{i}$ given $z_{i}$ depends only on $n_{i}$ , and the elements of AWGN vector $\mathbf{n}$ are statistically independent. According to the property of the complex-valued quantizer (4), we derive that

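$$\mathrm{P}_{\rm out}(q_{j}\mid z_{j})=\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})\,\mathrm{P}(q_{j}^{I}\mid z_{j}^{I}),\tag{10}$$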

where $\mathrm{P}(q^{R}_{j}\mid z^{R}_{j})$ denotes the probability of observing the real part quantized output $q^{R}_{j}$ given the real part of noiseless unquantized received signal $z^{R}_{j}$ . Specifically,

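$$\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})=\Phi\left(\frac{\sqrt{2}\,(z_{j}^{R}-l(q_{j}^{R}))}{\sigma}\right)-\Phi\left(\frac{\sqrt{2}\,(z_{j}^{R}-u(q_{j}^{R}))}{\sigma}\right),\tag{11}$$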

where $l(q^{R}_{j})$ and $u(q^{R}_{j})$ denote the corresponding lower and upper bounds of the quantizer output value $q^{R}_{j}$ . For example, when $q^{R}_{j}=c_{b}$ , $l(q^{R}_{j})=r_{b-1}$ and $u(q^{R}_{j})=r_{b}$ . The corresponding probability for the imaginary part $\mathrm{P}(q^{I}_{j}\mid z^{I}_{j})$ can be given analogously.
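The per-dimension likelihood (11) is the probability that $z^{R}_{j}$ plus the real part of the noise, which has variance $\sigma^{2}/2$, falls into the bin $(l,u]$. A minimal sketch follows, using only the standard library; the 2-bit thresholds, the test point, and the noise level are illustrative assumptions, and `bin_likelihood` is a hypothetical helper name. Summing over all bins should give one.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bin_likelihood(z_re, lower, upper, sigma2):
    """P(q^R | z^R) as in (11): probability that z_re + noise lands in (lower, upper].

    The real part of the complex noise has variance sigma2 / 2.
    """
    sigma = math.sqrt(sigma2)
    a = math.sqrt(2.0) * (z_re - lower) / sigma
    b = math.sqrt(2.0) * (z_re - upper) / sigma
    return Phi(a) - Phi(b)

# Thresholds of a hypothetical 2-bit uniform quantizer with unit step.
thresholds = [-math.inf, -1.0, 0.0, 1.0, math.inf]
probs = [bin_likelihood(0.3, lo, hi, sigma2=0.5)
         for lo, hi in zip(thresholds[:-1], thresholds[1:])]
print(probs, sum(probs))
```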

According to Bayes' rule, the posterior probability can be obtained by

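$$\mathrm{P}(\mathbf{s}\mid\mathbf{q};\mathbf{h}^{\prime})=\frac{\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})\,\mathrm{P}(\mathbf{s})}{\mathrm{P}(\mathbf{q};\mathbf{h}^{\prime})},\tag{12}$$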

where $\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})$ is the likelihood function defined in (9), $\mathrm{P}(\mathbf{s})$ is the prior distribution, and $\mathrm{P}(\mathbf{q};\mathbf{h}^{\prime})$ is the marginal distribution computed by

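$$\mathrm{P}(\mathbf{q};\mathbf{h}^{\prime})=\int_{\mathbf{s}}\mathrm{P}(\mathbf{q}\mid\mathbf{s};\mathbf{h}^{\prime})\,\mathrm{P}(\mathbf{s})\,\mathrm{d}\mathbf{s}.\tag{13}$$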

In this paper, we consider that the elements of $\mathbf{s}$ are i.i.d., therefore

$$\mathrm{P}(\mathbf{s})=\prod_{j=1}^{N}\mathrm{P}(s_{j}),\tag{14}$$

and $s_{j}$ ’s are drawn from a set of constellation points with equal probabilities, thus $\mathrm{P}(s_{j})=1/|\mathcal{S}|$ for $s_{j}\in\mathcal{S}$ .

Using the posterior probability (12), the marginal posterior probability can be obtained via

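$$\mathrm{P}(s_{j}\mid\mathbf{q};\mathbf{h}^{\prime})=\int_{\mathbf{s}\setminus s_{j}}\mathrm{P}(\mathbf{s}\mid\mathbf{q};\mathbf{h}^{\prime})\,\mathrm{d}\mathbf{s}.\tag{15}$$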

The posterior mean achieves the minimum mean-square error (MMSE), and its $j$ -th element can be expressed as:

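$$\bar{s}_{j}=\mathrm{E}\left[s_{j}\mid\mathbf{q};\mathbf{h}^{\prime}\right]=\int s_{j}\,\mathrm{P}(s_{j}\mid\mathbf{q};\mathbf{h}^{\prime})\,\mathrm{d}s_{j}.\tag{16}$$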

Moreover, the widely used maximum a posteriori (MAP) decision rule is given by

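$$\hat{s}_{j}=\underset{s_{j}\in\mathcal{S}}{\arg\max}\;\mathrm{P}(s_{j}\mid\mathbf{q};\mathbf{h}^{\prime}).\tag{17}$$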

The Bayesian MMSE estimation (16) and MAP inference (17) are computationally intractable in this case because the calculation of the marginal posterior probability in (15) involves a high-dimensional integral. We resort to a recently developed approximation technique called the generalized Turbo (GTurbo) principle [38] to calculate the posterior mean (16) iteratively. We demonstrate the adoption of the GTurbo principle for data detection in the subsequent subsection.
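To make the cost of the direct approach concrete, the sketch below evaluates the marginal posterior (15) exactly by enumerating all $|\mathcal{S}|^{N}$ candidate blocks for a 1-bit quantizer. It is an illustration under assumptions (a flat channel, `N = 4`, QPSK, and a hypothetical helper `exact_marginals`), not a practical detector: the loop already visits 256 candidates for four subchannels and grows exponentially with $N$. For a 1-bit quantizer with threshold zero, (11) reduces to $\Phi(\pm\sqrt{2}z/\sigma)$, which is what the inner loop uses.

```python
import itertools
import math
import numpy as np

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exact_marginals(q_sign_re, q_sign_im, h_prime, constellation, sigma2):
    """Exact P(s_j | q; h') for a 1-bit quantizer by enumerating all |S|^N candidates."""
    N, M = len(h_prime), len(constellation)
    marg = np.zeros((N, M))
    scale = math.sqrt(2.0 / sigma2)
    for combo in itertools.product(range(M), repeat=N):
        s = constellation[list(combo)]
        z = np.fft.ifft(h_prime * s, norm="ortho")      # z = F^H diag(h') s
        like = 1.0
        for zr, zi, qr, qi in zip(z.real, z.imag, q_sign_re, q_sign_im):
            like *= Phi(scale * qr * zr) * Phi(scale * qi * zi)
        for j, m in enumerate(combo):
            marg[j, m] += like          # the uniform prior cancels after normalization
    return marg / marg.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
N, sigma2 = 4, 0.1
h_prime = np.ones(N, dtype=complex)                     # flat channel (assumption)
s_true = rng.choice(qpsk, size=N)
noise = math.sqrt(sigma2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = np.fft.ifft(h_prime * s_true, norm="ortho") + noise
marg = exact_marginals(np.sign(y.real), np.sign(y.imag), h_prime, qpsk, sigma2)
s_map = qpsk[np.argmax(marg, axis=1)]                   # MAP rule (17)
print(marg.round(3))
```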

Remark 3

The posterior probability (12) together with the likelihood (9) and the prior (14) can be represented as a graphical model [40] with the elements of $\mathbf{s}$ and $\mathbf{q}$ being its variable nodes and factor nodes, respectively. Belief propagation (BP) is a typical technique for calculating marginal distributions and can often provide good approximations of the marginals on sparse graphical models. However, (5) corresponds to a dense graphical model where each factor node interacts with all variable nodes because of the linear transformation $\mathbf{F}^{H}$. GAMP [35] is an approximate version of BP that emerged recently and demonstrates good performance in dense graphical models. A closely related work [25] investigates the same data detection problem as in this study using GAMP. However, GAMP was proven to yield the optimal solutions to (16) and (17) only if the entries of the linear transformation matrix of the linear mixing model (5) are independent. The superiority of the proposed algorithm based on the GTurbo principle over the existing algorithms will be shown through simulation results. $\blacksquare$

III-B GTurbo-based Algorithm

The GTurbo-based data detection algorithm for the Q-OFDM system is presented in Algorithm 1, and the corresponding block diagram is illustrated in Fig. 1. This algorithm comprises two modules: Module A produces a direct coarse estimate of $\mathbf{x}$ from the relationship ${{\mathbf{x}}={\mathbf{F}}{\mathbf{z}}}$ in (8) without considering the prior $\mathrm{P}(\mathbf{s})$, whereas Module B refines the estimate by considering the prior $\mathrm{P}(\mathbf{s})$. The two modules are executed iteratively until convergence.

We provide a number of detailed explanations for Algorithm 1. In Module A, $\mathbf{z}_{A}^{{\rm post}}$ can be viewed as the Bayesian MMSE estimation of $\mathbf{z}$ from the relationship

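$$\mathbf{q}=\mathcal{Q}_{c}(\mathbf{z}+\mathbf{n}),\quad\text{and}\quad\mathbf{z}=\mathbf{z}_{A}^{\rm pri}+\boldsymbol{\omega}_{A},\tag{22}$$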

where $\boldsymbol{\omega}_{A}\sim\mathcal{CN}(\mathbf{0},v_{A}^{{\rm pri}}\bf{I})$ . Specifically, (18a) and (18b) compute the posteriori mean and variance of $z_{j}$ respectively, given its corresponding quantized observation $q_{j}$ , where $\mathrm{E}\left[z^{R}_{j}\mid q^{R}_{j}\right]$ and $\mathrm{var}\left[z^{R}_{j}\mid q^{R}_{j}\right]$ denote the expectation and variance of $z^{R}_{j}$ with respect to (w.r.t.) the posterior probability

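$$\mathrm{P}(z_{j}^{R}\mid q_{j}^{R})=\frac{\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})\,\mathrm{P}(z_{j}^{R})}{\int_{-\infty}^{\infty}\mathrm{P}(q_{j}^{R}\mid z_{j}^{R})\,\mathrm{P}(z_{j}^{R})\,\mathrm{d}z_{j}^{R}},$$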

where $\mathrm{P}(q^{R}_{j}\mid z^{R}_{j})$ is given by (11), and $\mathrm{P}(z^{R}_{j})=\mathcal{N}(z^{R}_{j};z_{j,A}^{{\rm pri},R},\frac{1}{2}v_{A}^{{\rm pri}})$ for the given $v_{A}^{{\rm pri}}$ and $\mathbf{z}_{A}^{\rm pri}$ under the assumption (22). Following the derivation of [21, Appendix A], the explicit expressions of the posteriori mean and variance of $z^{R}_{j}$ given $q^{R}_{j}$ can be obtained by

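$$\mathrm{E}\left[z_{j}^{R}\mid q_{j}^{R}\right]=z_{j,A}^{{\rm pri},R}+\frac{v_{A}^{\rm pri}}{\sqrt{2(v_{A}^{\rm pri}+\sigma^{2})}}\left(\frac{\phi(\eta_{1})-\phi(\eta_{2})}{\Phi(\eta_{1})-\Phi(\eta_{2})}\right),$$

$$\mathrm{var}\left[z_{j}^{R}\mid q_{j}^{R}\right]=\frac{v_{A}^{\rm pri}}{2}-\frac{(v_{A}^{\rm pri})^{2}}{2(v_{A}^{\rm pri}+\sigma^{2})}\times\left[\left(\frac{\phi(\eta_{1})-\phi(\eta_{2})}{\Phi(\eta_{1})-\Phi(\eta_{2})}\right)^{2}+\frac{\eta_{1}\phi(\eta_{1})-\eta_{2}\phi(\eta_{2})}{\Phi(\eta_{1})-\Phi(\eta_{2})}\right],$$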

where

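$$\eta_{1}=\frac{z_{j,A}^{{\rm pri},R}-l(q_{j}^{R})}{\sqrt{(v_{A}^{\rm pri}+\sigma^{2})/2}},\quad\eta_{2}=\frac{z_{j,A}^{{\rm pri},R}-u(q_{j}^{R})}{\sqrt{(v_{A}^{\rm pri}+\sigma^{2})/2}}.\tag{24}$$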

Furthermore, $\mathrm{E}\left[z^{I}_{j}\mid q^{I}_{j}\right]$ and $\mathrm{var}\left[z^{I}_{j}\mid q^{I}_{j}\right]$ can be computed analogously by replacing $z_{j,A}^{{\rm pri},R}$ with $z_{j,A}^{{\rm pri},I}$ in the computation for $\eta_{1}$ and $\eta_{2}$ in (24).
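The closed-form posterior moments can be checked against brute-force numerical integration of the posterior, that is, the Gaussian prior $\mathcal{N}(z^{R}_{j};z_{j,A}^{{\rm pri},R},\frac{1}{2}v_{A}^{{\rm pri}})$ multiplied by the likelihood (11). The sketch below is a hedged illustration: it assumes $\eta_{1}=(z_{j,A}^{{\rm pri},R}-l)/\sqrt{(v_{A}^{\rm pri}+\sigma^{2})/2}$ and $\eta_{2}$ defined likewise with $u$, uses a square root in the prefactor of the mean, and the helper names and test values are hypothetical.

```python
import math

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def posterior_moments(z_pri_re, v_pri, sigma2, lower, upper):
    """Closed-form mean and variance of z^R given that z^R + n^R lies in (lower, upper]."""
    s = math.sqrt((v_pri + sigma2) / 2.0)
    eta1 = (z_pri_re - lower) / s
    eta2 = (z_pri_re - upper) / s
    denom = max(Phi(eta1) - Phi(eta2), 1e-12)       # guard against underflow
    ratio = (phi(eta1) - phi(eta2)) / denom
    # eta * phi(eta) tends to 0 as |eta| grows, so infinite bounds contribute nothing.
    t1 = eta1 * phi(eta1) if math.isfinite(eta1) else 0.0
    t2 = eta2 * phi(eta2) if math.isfinite(eta2) else 0.0
    mean = z_pri_re + v_pri / math.sqrt(2.0 * (v_pri + sigma2)) * ratio
    var = v_pri / 2.0 - v_pri ** 2 / (2.0 * (v_pri + sigma2)) * (ratio ** 2 + (t1 - t2) / denom)
    return mean, var

def numeric_moments(z_pri_re, v_pri, sigma2, lower, upper, n=20001, span=10.0):
    """Same moments by summing prior * likelihood (11) on a fine grid."""
    sd = math.sqrt(v_pri / 2.0)
    c = math.sqrt(2.0 / sigma2)
    w_sum = m1 = m2 = 0.0
    for i in range(n):
        z = z_pri_re + sd * span * (2.0 * i / (n - 1) - 1.0)
        w = phi((z - z_pri_re) / sd) * (Phi(c * (z - lower)) - Phi(c * (z - upper)))
        w_sum += w
        m1 += w * z
        m2 += w * z * z
    mean = m1 / w_sum
    return mean, m2 / w_sum - mean ** 2

print(posterior_moments(0.2, 1.0, 0.3, 0.0, 0.5))
print(numeric_moments(0.2, 1.0, 0.3, 0.0, 0.5))
```

The floor on the denominator is a numerical safeguard added for the sketch; it only matters when the prior mean lies many standard deviations outside the observed bin.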

From (8), we derive ${{\mathbf{x}}={\mathbf{F}}{\mathbf{z}}}$ . Therefore, the posteriori mean and variance of ${\mathbf{x}}$ can be computed by ${\mathbf{F}}\mathbf{z}_{A}^{\rm post}$ and ${\mathbf{F}}\mathrm{diag}(v_{1,A}^{\rm post},v_{2,A}^{\rm post},\cdots,v_{N,A}^{\rm post}){\mathbf{F}}^{H}$ , respectively. To reduce the computational complexity, we replace $\mathrm{diag}(v_{1,A}^{\rm post},v_{2,A}^{\rm post},\cdots,v_{N,A}^{\rm post})$ with $(\frac{1}{N}\sum_{j=1}^{N}v_{j,A}^{\rm post})\mathrm{\mathbf{I}}$ as in (19a). Subsequently, the extrinsic mean and variance of $\mathbf{x}$ are computed by (19b) and (19c) similar to the concise formulas in [41, (14) and (15)], which are then used as the inputs $v_{B}^{{\rm pri}}$ and $\mathbf{x}_{B}^{{\rm pri}}$ of Module B. Therefore, Module A produces an estimate of $(\mathbf{x},\mathbf{z})$ in which $\mathbf{x}$ is estimated through the linear relation (8) without considering prior $\mathrm{P}(\mathbf{s})$ , whereas $\mathbf{z}$ is the Bayesian MMSE estimation by considering the likelihood $\mathrm{P}(\mathbf{q}\mid\mathbf{z})$ .
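One way to read the extrinsic update in (19b) and (19c) is as the removal of the prior message from the posterior. The following short derivation is a sketch under the simplifying assumption that the posterior and the prior of each $x_{j}$ are both treated as complex Gaussians with scalar variances, with $\mathbf{x}_{A}^{\rm post}=\mathbf{F}\mathbf{z}_{A}^{\rm post}$ and $\mathbf{x}_{A}^{\rm pri}=\mathbf{F}\mathbf{z}_{A}^{\rm pri}$. When Gaussian densities are multiplied, precisions add and precision-weighted means add, so solving for the extrinsic term gives the stated formulas.

```latex
\mathcal{CN}\!\left(x_{j};x_{j,A}^{\rm post},v_{A}^{\rm post}\right)
  \;\propto\;
\mathcal{CN}\!\left(x_{j};x_{j,A}^{\rm pri},v_{A}^{\rm pri}\right)\,
\mathcal{CN}\!\left(x_{j};x_{j,A}^{\rm ext},v_{A}^{\rm ext}\right)

\frac{1}{v_{A}^{\rm post}}=\frac{1}{v_{A}^{\rm pri}}+\frac{1}{v_{A}^{\rm ext}}
  \;\Longrightarrow\;
v_{A}^{\rm ext}=\left(\frac{1}{v_{A}^{\rm post}}-\frac{1}{v_{A}^{\rm pri}}\right)^{-1}

\frac{x_{j,A}^{\rm post}}{v_{A}^{\rm post}}
  =\frac{x_{j,A}^{\rm pri}}{v_{A}^{\rm pri}}+\frac{x_{j,A}^{\rm ext}}{v_{A}^{\rm ext}}
  \;\Longrightarrow\;
x_{j,A}^{\rm ext}=v_{A}^{\rm ext}\left(\frac{x_{j,A}^{\rm post}}{v_{A}^{\rm post}}
  -\frac{x_{j,A}^{\rm pri}}{v_{A}^{\rm pri}}\right)
```

The same reasoning, with the roles of the two modules exchanged and $\mathbf{F}^{H}$ in place of $\mathbf{F}$, gives the extrinsic quantities passed from Module B back to Module A.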

Subsequently, we turn to the MMSE estimation of $\mathbf{s}$ processed in Module B. Initially, $\mathbf{x}_{B}^{{\rm pri}}$ is assumed as an AWGN observation of $\mathbf{x}=\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s}$ , that is,

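$$\mathbf{x}_{B}^{{\rm pri}}=\mathrm{diag}(\mathbf{h}^{\prime})\mathbf{s}+\boldsymbol{\omega}_{B},\qquad(25)$$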

where $\boldsymbol{\omega}_{B}\sim\mathcal{CN}(\mathbf{0},v_{B}^{{\rm pri}}\bf{I})$ . Using the aforementioned assumption and the given frequency-domain channel response $\mathbf{h}^{\prime}$ , we compute the posteriori mean and variance of $\mathbf{s}$ in (20a) and (20b) taken w.r.t. the posterior probability distribution

[TABLE]

Consequently, the explicit expressions of $s_{j,B}^{\rm post}$ and $v_{j,B}^{\rm post}$ can be derived as

[TABLE]
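To make the Module B computation concrete, the following sketch evaluates the posterior mean and variance of one symbol under the scalar AWGN model $x_{j,B}^{\rm pri}=h^{\prime}_{j}s_{j}+\omega$ with $\omega\sim\mathcal{CN}(0,v_{B}^{\rm pri})$ and an equiprobable constellation. The QPSK normalization and all names are assumptions made for illustration; the code is a direct Bayes-rule evaluation rather than a transcription of (27a) and (27b).

```python
import math

# Unit-energy QPSK constellation (assumed normalization).
QPSK = [complex(a, b) / math.sqrt(2.0) for a in (1, -1) for b in (1, -1)]

def symbol_posterior(x_pri, h, v_pri, constellation=QPSK):
    """Posterior mean and variance of s given x_pri = h * s + w, w ~ CN(0, v_pri)."""
    exponents = [-abs(x_pri - h * s) ** 2 / v_pri for s in constellation]
    shift = max(exponents)                      # subtract the max for numerical stability
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * s for p, s in zip(probs, constellation))
    var = sum(p * abs(s - mean) ** 2 for p, s in zip(probs, constellation))
    return mean, var
```

At high SNR the posterior collapses onto the transmitted point, so the mean approaches that symbol and the variance approaches zero; for an uninformative observation the mean is zero and the variance equals the symbol energy.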

Similar to those in (19), $\mathbf{z}_{B}^{{\rm post}}$ is estimated directly based on the relationship ${{\mathbf{z}}={\mathbf{F}}^{H}{\mathbf{x}}}$ . Then, the extrinsic mean and variance of $\mathbf{z}$ are evaluated in (21c) and (21d), respectively. Therefore, Module B produces an estimate of $(\mathbf{s},\mathbf{z})$ in which $\mathbf{s}$ is the Bayesian MMSE estimation by considering prior $\mathrm{P}(\mathbf{s})$ , whereas $\mathbf{z}$ is estimated through the linear relation (8) without considering the likelihood $\mathrm{P}(\mathbf{q}\mid\mathbf{z})$ .

Algorithm 1 aims to calculate the marginal posterior probability in an iterative manner. After the convergence of the iteration, we obtain the estimated marginal posterior probability $\mathrm{P}(s_{j}\mid x_{j,B}^{{\rm pri}};h^{\prime}_{j})=\mathcal{CN}(s_{j};s_{j,B}^{\rm post},v_{j,B}^{\rm post})$ . Thus, the posterior mean in (16) is obtained as $s_{j,B}^{\rm post}$ , and the MAP inference in (17) is equivalent to finding the $s\in\mathcal{S}$ with the shortest distance to $s_{j,B}^{\rm post}$ , that is,

$$\hat{s}_{j}=\mathop{\arg\min}_{s\in\mathcal{S}}\left|s-s_{j,B}^{\rm post}\right|.\qquad(28)$$

IV State Evolution and Power Allocation

The asymptotic performance of the proposed algorithm can be characterized by the recursion of a set of SE equations [38]. We derive these equations in the large-system limit where $N\to\infty$ in Section IV-A. Subsequently, we show the decoupling principle and develop a subchannel power allocation scheme to minimize the SER in Section IV-B. Finally in Section IV-C, we analyze the complexity of the proposed algorithms.

IV-A State Evolution

From the explanations introduced in Section III-B, we observe that the performance of the detector is determined by $v_{B}^{{\rm pri}}$ , which can be viewed as the average noise power of the equivalent AWGN channels in (25). In addition, $v_{A}^{{\rm pri}}$ and $v_{B}^{{\rm pri}}$ are mutually dependent in a recursive manner as shown in (19b) and (21c), respectively. Therefore, we define the following two states to characterize the performance of the detector:

[TABLE]

In addition, we define the MMSE of $s$ given its AWGN observation $r=s+\omega$ as

[TABLE]

where $\omega\sim\mathcal{CN}(0,\eta^{-1})$ , the outer expectation is taken w.r.t. the distribution $\mathrm{P}(s)$ , whereas the inner expectation is taken w.r.t. the marginal distribution $\int\mathrm{P}(r|s)\mathrm{P}(s)\mathrm{d}s$ . For example, if $s$ is drawn from the equiprobable QPSK constellation, then $\mathrm{mmse}(\eta)$ can be derived as [21]

[TABLE]
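For readers who want to evaluate the QPSK MMSE numerically, the sketch below assumes the commonly used closed form $\mathrm{mmse}(\eta)=1-\int{\rm D}z\,\tanh(\eta+\sqrt{\eta}z)$ for unit-energy QPSK, where ${\rm D}z$ is the standard Gaussian measure. The quadrature rule, grid size, and function name are illustrative choices, not part of the paper.

```python
import math

def mmse_qpsk(eta, num_points=4001, z_max=10.0):
    """Evaluate 1 - integral Dz tanh(eta + sqrt(eta) * z) with the trapezoidal rule
    on [-z_max, z_max]; the closed form itself is an assumption stated in the text."""
    step = 2.0 * z_max / (num_points - 1)
    acc = 0.0
    for k in range(num_points):
        z = -z_max + k * step
        weight = 0.5 if k in (0, num_points - 1) else 1.0   # trapezoid end points
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        acc += weight * density * math.tanh(eta + math.sqrt(eta) * z)
    return 1.0 - acc * step
```

The function returns the symbol energy at $\eta=0$, decreases monotonically in $\eta$, and stays below the linear-MMSE bound $1/(1+\eta)$ that holds for any unit-variance input.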

By evaluating the two states in the large-system limit, Proposition 1 can be derived. The calculation details are provided in Appendix A.

Proposition 1

In the large-system limit, the SE of Algorithm 1 can be characterized by

[TABLE]

where $t$ denotes the iteration index, the initialization $\upsilon^{0}=v_{x}\triangleq\frac{1}{N}\sum_{j=1}^{N}|h_{j}|^{2}p_{j}$ , and

[TABLE]

$\blacksquare$

Remark 4

In the OFDM system with infinite-precision quantization, parallel data are transmitted over $N$ mutually orthogonal subchannels. The signal-to-noise ratio (SNR) of the $j$ -th subchannel is $\frac{p_{j}|h_{j}|^{2}}{\sigma^{2}}$ . However, the orthogonality among subchannels in the Q-OFDM system cannot be maintained. Proposition 1 in conjunction with (25) reveals that, in the large-system limit, the input-output relationship of the Q-OFDM system employing Algorithm 1 can still be decoupled into a bank of equivalent AWGN channels corresponding to $N$ subchannels given by

[TABLE]

for $j=1,\cdots,N$ , where $w_{j}\sim{\cal CN}(0,1)$ . We refer to this characteristic as the decoupling principle. The SNR of the equivalent AWGN channel is $p_{j}|h_{j}|^{2}\eta^{t}$ . $\blacksquare$

As $B\to\infty$ , (5) is reduced to the OFDM system with infinite-precision quantization. Let $r_{b-1}=r$ and $r_{b}=r_{b-1}+{\rm d}r$ . As $B\to\infty$ , we obtain ${\rm d}r\to 0$ , which results in $\Phi\left(\frac{z-r_{b-1}}{u}\right)-\Phi\left(\frac{z-r_{b}}{u}\right)\to-\frac{\rm d}{{\rm d}r}\Phi\left(\frac{z-r}{u}\right){\rm d}r$ and $\phi\left(\frac{z-r_{b-1}}{u}\right)-\phi\left(\frac{z-r_{b}}{u}\right)\to-\frac{\rm d}{{\rm d}r}\phi\left(\frac{z-r}{u}\right){\rm d}r$ . By substituting these relationships into (31a) and applying the facts that $\frac{\rm d}{{\rm d}r}\Phi\left(\frac{z-r}{u}\right)=-\frac{1}{u}\phi\left(\frac{z-r}{u}\right)$ and $\frac{\rm d}{{\rm d}r}\phi\left(\frac{z-r}{u}\right)=\left(\frac{z-r}{u^{2}}\right)\phi\left(\frac{z-r}{u}\right)$ , we can obtain

[TABLE]

Substituting (33) into (31b), we obtain $\eta^{t}=1/\sigma^{2}$ for any iteration index $t$ . The resulting SNR is perfectly consistent with that in the infinite-precision OFDM system. Consequently, the parameter $1/\eta^{t}$ can serve as an equivalent noise power of the Q-OFDM system, and $\eta^{t}\leq 1/\sigma^{2}$ .

With the decoupling principle, we can easily predict several fundamental performance metrics, such as MSE, SER, and mutual information, of the Q-OFDM system without performing time-consuming Monte Carlo simulations. For example, we determine that $\mathrm{mmse}(|h^{\prime}_{j}|^{2}\eta^{t})$ predicts the per-component MSE of $\mathbf{s}$ at the $t$ -th iteration. If the data symbol is drawn from the $M$ -QAM constellation, then the SER at the $t$ -th iteration can be obtained analytically by [29]

[TABLE]

where $g_{M}=\frac{3}{M-1}$ . Clearly, the decoupling principle and the SE equations are useful for performance optimization. For example, the decoupling principle facilitates the allocation of power among $N$ subchannels to optimize some performance metrics, which will be discussed in the subsequent subsection.
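The SER prediction can be sketched as follows, assuming the standard closed form for square $M$-QAM over an AWGN channel, $1-\bigl(1-2(1-1/\sqrt{M})Q(\sqrt{g_{M}\,\mathrm{SNR}})\bigr)^{2}$, averaged over the decoupled subchannels with SNR $p_{j}|h_{j}|^{2}\eta^{t}$. Whether this matches (34) term by term has not been verified here, and the function names are hypothetical.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ser_mqam(snr, M):
    """Symbol error rate of square M-QAM at the given symbol SNR (standard formula)."""
    g_M = 3.0 / (M - 1)
    p_axis = 2.0 * (1.0 - 1.0 / math.sqrt(M)) * q_func(math.sqrt(g_M * snr))
    return 1.0 - (1.0 - p_axis) ** 2

def average_ser(powers, gains, eta, M):
    """Average SER over subchannels; gains[j] = |h_j|^2 and SNR_j = p_j * |h_j|^2 * eta."""
    snrs = [p * g * eta for p, g in zip(powers, gains)]
    return sum(ser_mqam(s, M) for s in snrs) / len(snrs)
```

At zero SNR the QPSK value is 0.75, the error rate of a uniformly random guess among four symbols.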

Remark 5

The argument from statistical mechanics (see, e.g., [42, 43]) shows that the performance metrics of the Bayesian MMSE estimator, such as the MSE of $\mathbf{s}$ , correspond to the saddle points of the average free entropy, which is defined as

[TABLE]

where the expectation is taken w.r.t. the marginal likelihood in (13). For a review of the statistical mechanics methods applied to high-dimensional inference, please refer to [44]. The calculation of $\mathcal{F}$ and its saddle points is given in Appendix B. The saddle points of $\mathcal{F}$ expressed in (56a)–(56d) in Appendix B are identical to those of the SE equations (31a)–(31c), by substituting $\frac{1}{\chi_{s}}=\nu$ and $\tilde{q}_{s}=\eta$ into (56a)–(56d). This result indicates that Algorithm 1 yields the same estimate as the Bayesian MMSE estimator obtained by direct integration in (16). $\blacksquare$

IV-B Power Allocation

In a frequency-selective fading channel, the channel gains vary widely among different subchannels. Under the low-precision quantization scenario, data sent from the weaker subchannels tend to be lost because of strong ICI from the stronger subchannels, which leads to a high error floor. In this subsection, we develop a PA scheme to further improve the SER performance.

Recall from Remark 4 that the input-output relationship of the Q-OFDM system can be decomposed into a bank of AWGN channels corresponding to $N$ subchannels with SNR $p_{j}|h_{j}|^{2}\eta$ for $j=1,\cdots,N$ . With this decoupling principle, we can allocate the total power $\sum_{j=1}^{N}p_{j}=N\bar{P}$ among the $N$ equivalent AWGN channels to optimize some performance metrics. In particular, we consider the subchannel power allocation that minimizes the SER. From [45, Proposition 1], the SER for an $M$ -QAM OFDM system under a given channel realization $\{h_{j}\}$ and noise power $\eta^{-1}$ is given by

[TABLE]

where $O(\cdot)$ is the big O notation and

[TABLE]

The SER expression is dominated by the first term, which is found to be a good approximation [45]. Therefore, our goal is to derive the optimal PA $\{p_{j}\}_{j=1}^{N}$ that minimizes the dominant term in (37) under the constraint $\sum_{j=1}^{N}p_{j}=N\bar{P}$ . Hereinafter we set $\bar{P}=1$ to simplify the set of simulation parameters. When $\bar{P}=1$ , the parameter $\sigma^{2}$ can be set as the reciprocal of the target SNR, and thus normalizing the channel gains $g_{i}$ is easier. However, solving the above problem directly involves an iterative procedure to obtain the solution of $N$ nonlinear equations, which suffers from slow convergence and high computational complexity [46]. Thus, we resort to an approximation for the Q-function given by $Q(x)\approx\frac{1}{2}\exp\left(-\frac{x^{2}}{2}\right)$ [29]. Accordingly, we formulate the PA problem as

[TABLE]
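The quality of the exponential approximation to the Q-function used in this formulation can be checked numerically. The short sketch below compares $Q(x)$ with $\frac{1}{2}\exp(-x^{2}/2)$ at a few arguments; the function names are illustrative.

```python
import math

def q_func(x):
    """Exact Gaussian tail probability Q(x) via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q_chernoff(x):
    """Exponential approximation Q(x) ~ 0.5 * exp(-x^2 / 2)."""
    return 0.5 * math.exp(-0.5 * x * x)

for x in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(f"x={x:.1f}  Q={q_func(x):.3e}  approx={q_chernoff(x):.3e}")
```

For $x\geq 0$ the approximation is an upper bound on $Q(x)$ that is exact at $x=0$ and captures the same exponential decay rate, which is what matters when the objective is dominated by the weakest subchannels.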

Define the Lagrangian function as

[TABLE]

where $\lambda$ is the Lagrange multiplier. Equating the partial derivatives of ${\cal L}$ w.r.t. $\{p_{j}\}_{j=1}^{N}$ to zero, we obtain (footnote 2: in fact, $\lambda$ in (40) is not identical to that in (39); we hope this slight abuse of notation will cause no confusion)

[TABLE]

where $(x)^{+}\triangleq\max\{x,0\}$ , $\gamma=\frac{g_{M}\eta}{2}$ , and $\lambda$ is the parameter selected to satisfy the constraint $\sum_{j=1}^{N}p_{j}=N$ . This PA is called the approximate minimum symbol error rate (AMSER) scheme. We develop a process that resembles water filling to determine $\{p_{j}\}_{j=1}^{N}$ and $\lambda$ , as expressed in (42) and (43) in Algorithm 2. We let $\mathcal{K}$ be the set of subchannel indices with non-zero power with initialization $\{1,2,\cdots,N\}$ . For a given $\mathcal{K}$ , $\lambda$ can be computed with (42a). If $\min_{j\in\mathcal{K}}\ln|h_{j}|^{2}\geq-\lambda$ is satisfied, then the process is terminated. Otherwise, we remove the subchannel $j_{0}=\arg\min_{j\in\mathcal{K}}\ln|h_{j}|^{2}$ from $\mathcal{K}$ and repeat the process.
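The water-filling-like procedure can be sketched for a fixed $\gamma$ as follows. The sketch assumes that the stationarity condition of the problem $\min\sum_{j}\exp(-\gamma p_{j}|h_{j}|^{2})$ subject to $\sum_{j}p_{j}=N$ gives $p_{j}=\frac{1}{\gamma|h_{j}|^{2}}\left(\ln|h_{j}|^{2}+\lambda\right)^{+}$, and that $\lambda$ follows from the power constraint over the active set as $\lambda=\bigl(\gamma N-\sum_{j\in\mathcal{K}}\ln|h_{j}|^{2}/|h_{j}|^{2}\bigr)/\sum_{j\in\mathcal{K}}1/|h_{j}|^{2}$. These expressions are derived here from the stated objective and are only assumed to play the role of (41) and (42a); the function name is hypothetical.

```python
import math

def amser_power_allocation(gains, gamma, total_power=None):
    """gains[j] = |h_j|^2 > 0. Returns (powers, lam) with powers[j] >= 0 and
    sum(powers) equal to total_power (default: number of subchannels)."""
    n = len(gains)
    total = float(n) if total_power is None else total_power
    active = sorted(range(n), key=lambda j: gains[j])   # weakest subchannel first
    while True:
        inv_sum = sum(1.0 / gains[j] for j in active)
        log_sum = sum(math.log(gains[j]) / gains[j] for j in active)
        lam = (gamma * total - log_sum) / inv_sum       # enforce the power constraint
        if math.log(gains[active[0]]) >= -lam:          # weakest active power is non-negative
            break
        active.pop(0)                                   # switch off the weakest subchannel
    powers = [0.0] * n
    for j in active:
        powers[j] = (math.log(gains[j]) + lam) / (gamma * gains[j])
    return powers, lam
```

The loop always terminates with a non-empty active set, because with a single active subchannel the allocated power equals the whole budget and is therefore positive. For equal gains the routine reduces to equal power allocation.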

Notably, $\eta$ cannot be directly determined before the PA process because $\eta$ is a function of the allocated power $\{p_{j}\}_{j=1}^{N}$ . Therefore, we embed the AMSER PA process into the iteration of SE equations and obtain Algorithm 2. Specifically, in the $t$ -th iteration, we compute $\eta^{t}$ through (41a) to (41c) based on the allocated power in the $(t-1)$ -th iteration $\{p^{t-1}_{j}\}_{j=1}^{N}$ . Then, with the fixed $\eta^{t}$ , we obtain the power $\{p^{t}_{j}\}_{j=1}^{N}$ in (42) and (43) and $\nu^{t}$ in (44). The algorithm requires adapting $\eta^{t}$ and $\{p^{t}_{j}\}_{j=1}^{N}$ separately in an iterative manner.

As $B\to\infty$ , we obtain the parallel channels (32) with $\eta^{t}=1/\sigma^{2}$ by the argument following Remark 4. In this case, Algorithm 2 is reduced to the AMSER PA performed for $N$ subchannels of the OFDM system with infinite-precision quantization as proposed in [46].

IV-C Computational Complexity

The computational complexity of the GTurbo-based detector, i.e., Algorithm 1, is dominated by the matrix multiplications in (19c) and (21d). Fortunately, they can be implemented with fast Fourier transform (FFT) processors with computational complexity $\mathcal{O}(N\log_{2}N)$ . The detector also converges within a few iterations, as discussed later in Section VI. The real bottleneck of the detector implementation comes from the computation of $\Phi(x)=1-Q(x)$ , which requires evaluating the integral of the Gaussian function in (18a) and (18b). A hardware-friendly approximation for $Q(x)$ and a pipelined and folding hardware architecture for a GTurbo algorithm have been proposed in [47]. The simulation results in [47] further demonstrate that the fixed-point setting combined with the Q-function approximation introduces only slight performance degradation relative to the original floating-point simulation. This complexity analysis is also valid for the channel estimation algorithm presented in the next section.

Algorithm 2 is for power allocation. The maximum possible number of inner iterations is $N$ . Through extensive simulation, we find that Algorithm 2 typically converges within 10 outer iterations. Therefore, the computational complexity of Algorithm 2 is $\mathcal{O}(N)$ . Furthermore, the integral operation in (41b) and (44) can be generally acquired using a look-up table (LUT). Consequently, the two algorithms are computationally efficient and hardware friendly.

V Channel Estimation

In this section, we develop a pilot-based channel estimation approach to obtain the CSI based on the GTurbo framework in Algorithm 1. The pilot sequences are known at the transmitter and receiver sides. In this study, we employ the comb-type pilot arrangement, as shown in Fig. 2, in which the pilot signals are uniformly inserted into the subchannels of an OFDM symbol.

We denote the interval of adjacent subchannels containing pilot signals by $S_{f}$ . We use $\mathcal{X}=\{1,2,\cdots,N\}$ to denote the index set of all subchannels, and we use $\mathcal{X}_{p}\subseteq\mathcal{X}$ and $\mathcal{X}_{d}\subseteq\mathcal{X}$ to denote the index subset of the subchannels containing pilot and data symbols, respectively. The pilots are transmitted periodically every $S_{t}$ OFDM symbols. During each interval of $S_{t}$ OFDM symbols, only one OFDM symbol contains the pilot signals (called the pilot OFDM symbol), whereas the other $S_{t}-1$ OFDM symbols are dedicated to data transmission. Notably, the pilots are contaminated by the data subchannels because of the use of the coarse quantization which results in severe ICI. The conventional pilot-based channel estimation schemes for OFDM systems do not consider this effect and thus cannot work well.
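The comb-type index sets can be illustrated with a few lines of Python. The sketch assumes 1-based subchannel indices and that the first pilot sits on subchannel 1; the actual offset used in Fig. 2 is not stated in the text, so `offset` is a hypothetical parameter.

```python
def comb_pilot_indices(num_subchannels, pilot_spacing, offset=1):
    """Return (pilot_indices, data_indices) for a comb-type pilot layout.
    Indices are 1-based; every pilot_spacing-th subchannel carries a pilot."""
    pilots = list(range(offset, num_subchannels + 1, pilot_spacing))
    pilot_set = set(pilots)
    data = [j for j in range(1, num_subchannels + 1) if j not in pilot_set]
    return pilots, data
```

For example, with $N=512$ and $S_{f}=16$ this layout reserves 32 subchannels for pilots and leaves 480 for data within the pilot OFDM symbol.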

Algorithm 3 is designed only for pilot OFDM symbols to output the estimated channel $\hat{\mathbf{h}}$ and data $\hat{\mathbf{s}}$ . The estimated channel in the pilot OFDM symbol is subsequently utilized as the CSI for data detection by applying Algorithm 1 to the remainder of the OFDM symbols dedicated to data transmission in each interval of $S_{t}$ OFDM symbols. Moreover, the estimated channel is sent back to the transmitter for PA (i.e., Algorithm 2). Notably, the power can be equally allocated in the pilot OFDM symbol.

The block diagram of Algorithm 3 is illustrated in Fig. 3. The operations of Module A in Algorithm 3 are identical to those in Algorithm 1. The output of Module A $\mathbf{x}_{B}^{{\rm pri}}$ can be viewed as the equivalent channel in (25), where each subchannel component is expressed by the product of the transmitted signal and the channel frequency response at the corresponding subchannel plus an AWGN with power $v_{B}^{\rm pri}$ . This decoupling property facilitates the subsequent channel estimation and data detection. In particular, through $\mathbf{x}_{B}^{{\rm pri}}$ , we can process the pilot subchannel $\mathcal{X}_{p}$ and the data subchannel $\mathcal{X}_{d}$ separately. For example, (45) employs the least squares method to obtain an initial channel estimation.

Once the initial channel estimate in the first iteration is obtained, the estimated channel is updated using the decision-directed (DD) technique in the subsequent iterations. Specifically, in the $t$ -th iteration, the DD technique uses the detected signal in the $(t-1)$ -th iteration $\hat{\mathbf{s}}^{t-1}$ to estimate $\mathbf{h}$ coarsely in (46). Afterward, we transform the coarsely estimated frequency channel response $\tilde{\mathbf{h}}$ to the time domain in (47a) and refine the estimate by eliminating the effect of noise outside the maximum channel delay $L$ in (47d). Finally, we transform $\mathbf{\hat{g}}$ back to the frequency domain in (47e).
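The time-domain refinement step can be sketched as below: transform the coarse frequency response to the time domain, keep only the first $L$ taps, and transform back. The sketch uses a plain (non-unitary) DFT written out explicitly so that it runs without external libraries; the paper's normalization of $\mathbf{F}$ may differ, and the function names are hypothetical. An FFT would replace the $\mathcal{O}(N^{2})$ loops in practice.

```python
import cmath

def dft(v, inverse=False):
    """Naive DFT (or inverse DFT with 1/n scaling) of a list of complex numbers."""
    n = len(v)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def refine_channel(h_tilde, num_taps):
    """Suppress noise outside the channel delay spread.
    h_tilde: coarse frequency-domain estimate; num_taps: maximum channel delay L."""
    g_tilde = dft(h_tilde, inverse=True)                      # to the time domain
    g_hat = [g if i < num_taps else 0j for i, g in enumerate(g_tilde)]
    return dft(g_hat)                                         # back to the frequency domain
```

Because the true impulse response occupies only $L$ of the $N$ time-domain taps, zeroing the remaining taps discards roughly a fraction $1-L/N$ of white estimation noise while leaving the channel itself untouched.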

Subsequently, we use the estimated channel $\hat{\mathbf{h}}$ for data detection. The posteriori mean and variance of the data symbols can be calculated similar to (27a) and (27b) while replacing the exact channel response $h_{j}$ with estimated channel response $\hat{h}_{j}$ as shown in (48a) and (48b). For $j\in{\mathcal{X}_{d}}$ , the decision ${{\hat{s}}^{t}_{j}}$ is made according to the rule (28), while for $j\in{\mathcal{X}_{p}}$ , ${{\hat{s}}^{t}_{j}}$ takes the pilot signal. In step (6) of Algorithm 3, the extrinsic mean and variance of $\mathbf{z}$ are computed and used as the input of Module A. Similar to Algorithm 1, two modules are executed iteratively until convergence.

VI Simulation Results

Computer simulations are conducted to evaluate the performance of the proposed algorithms and verify the accuracy of our analysis. In the simulations, the number of OFDM subchannels is $N=512$ and the number of channel taps is $L=4$ . The channel impulse response $g_{i}$ for $i=1,\cdots,L$ is assumed to be i.i.d. with PDF $\mathcal{CN}(g_{i};0,N/L)$ . Unless otherwise specified, each entry of the transmitted symbols $\mathbf{s}$ is drawn from the equiprobable QPSK constellation. We set ${\rm E}[|s_{j}|^{2}]=1$ for $j=1,\cdots,N$ , so that the average SNR is given by $1/\sigma^{2}$ . The SER, which is averaged over all subchannels, is obtained through Monte-Carlo simulations of 1,000 independent channel realizations.

Fig. 4a shows the SERs versus the iteration numbers of the proposed detector, that is, Algorithm 1, under the quantization precision of 1–3 bits. The simulated SERs are obtained by the Monte-Carlo simulations of Algorithm 1, while the SE predictions are evaluated using (31) and (34). The SERs under two different PA schemes, i.e., the ESPA and the AMSER PA proposed in Algorithm 2, are evaluated. Fig. 4a shows that the proposed detector evidently converges within five iterations, and the SE predictions match well with the simulated results for all quantization settings and PA schemes. Furthermore, we observe significant SER gaps between the AMSER PA and the ESPA, which validate the effectiveness of the PA scheme proposed in Algorithm 2. To analyze the asymptotic behavior, we show the simulated and SE results for Algorithm 1 with $N=64$ and 32 under ESPA in Fig. 4b. It is shown that the performance of the proposed detector is very close to the Bayesian optimal performance in the large-system limit, where $N\to\infty$ , even for a small number of subcarriers.

Fig. 5 compares the SERs of the proposed GTurbo-based detector with the existing detectors including the GAMP-based detector [35] and the conventional detector using the one-tap equalizer expressed in (6). The corresponding SERs under the AMSER PA and the ESPA are shown in Figs. 5a and 5b, respectively. Notably, the proposed detector significantly outperforms the other two detectors in terms of SER performance. The poor performance obtained by the conventional detector and the GAMP-based detector can be understood as follows: The conventional detector completely ignores the ICI effect caused by low-resolution ADCs. Although the GAMP-based detector considers the ICI effect, this detector regards the linear transformation matrix of the detection problem (5) as having i.i.d. entries, and it does not exploit the orthogonality property of the OFDM waveform. Notably, the proposed detector has already achieved the best performance of the Bayesian optimal detector, which indicates that no further improvement is required. The figures show the optimal SER performance of the OFDM system with infinite-resolution ADCs as the benchmark. We observe that the SER performance of the GTurbo-based detector with AMSER PA is similar to the optimal performance of the infinite-precision OFDM system. This result illustrates the feasibility of using very-low-resolution ADCs at the receiver in OFDM systems. Note that only the signs of the real and imaginary parts of the analog received signal are preserved under 1-bit quantization. The amplitude information of the analog received signal is completely lost. Under such cases with serious non-linear distortion, neither the GTurbo- nor the GAMP-based detector yields good detection performance without array gain arising from the large-scale antenna array at the receiver as in [19], or involving channel coding.

Particularly, the proposed GTurbo-based detector also works well for high-order modulations such as 16QAM, as shown in Fig. 5c. When advanced coding techniques, such as [48], are involved, the transmission of high-order modulation under lower quantization bits and SNR region can be properly supported. To keep the key advantages of the proposed detector from being obscured by the coding technique, we leave transmission strategies that support high-order modulation for future work.

In Module A of the GTurbo-based detector, we reconstruct $\mathbf{z}$ from the quantized observation $\mathbf{q}$ using the Bayesian MMSE estimate in (18). Another widely used way to deal with quantization noise is to model it as an additive and independent noise, that is, AQNM [49], which allows the use of linear detectors. Figs. 6a and 6b compare the optimal detection performances based on the exact quantization model and the AQNM. Notably, the optimal detection algorithm developed for the AQNM suffers from significant performance loss and severe error floor compared with that for the exact model. The main reason is that AQNM assumes that the input of the quantizer $y_{j}$ is a Gaussian variable and approximates the correlated quantization noise by an independent Gaussian noise, which cannot provide a satisfactory approximation to the strongly nonlinear relation (5) under the quantization resolution of 1–3 bits. Furthermore, the comparison of Figs. 5a and 5b and that of Figs. 6a and 6b illustrate that the use of AMSER PA substantially improves the SER performance. The decline of SER versus SNR becomes steeper when the PA is performed.

Finally, we examine the channel estimation of the pilot-based OFDM system where the pilot OFDM symbol is arranged as that in Fig. 2 with $S_{f}=16$ . The MSE of the channel estimate is defined as $\mathrm{MSE}=\frac{1}{N}\mathrm{E}\left[||\mathbf{h}-\hat{\mathbf{h}}||^{2}\right]$ . Fig. 7a shows the MSE of the channel estimation implemented in Algorithm 3 and the GAMP-based data detection combined with the least squares channel estimation method and the refinement technique in (47). We observe that the proposed channel estimation significantly outperforms the GAMP-based scheme, particularly for the quantization precision of 2–3 bits. To further evaluate the performance of the proposed channel estimation algorithm, we compare the detection performance under perfect and estimated CSI, as shown in Fig. 7b. The gap between the two cases is comparatively small, especially for 3-bit quantization. These results justify the feasibility of obtaining high-quality CSI with low-precision ADCs at the receiver without significant pilot overhead.

VII Conclusion

We proposed an efficient algorithm for optimal data detection in the Q-OFDM system emerging from mmWave communications. The SE equations of the proposed detector were derived and shown to be identical to those obtained from the Bayesian optimal detector via the replica theory. We described the decoupling principle, from which a PA scheme was developed to further improve the SER performance. Under a unified framework, we also developed a feasible method for channel estimation so that the Q-OFDM detector can be applied to a practical scenario without perfect CSI. The simulation results provided the following useful observations:

• The algorithm converges rapidly, and its SE prediction is consistent with the simulated result, which ensures the quick and efficient performance analysis for the Q-OFDM system.

• The proposed PA scheme improves the SER performance significantly and alleviates the error floor compared with the ESPA scheme.

• The optimal detector for the Q-OFDM system entails acceptable performance loss compared with that for the infinite-precision case, which confirms the feasibility of the proposed Q-OFDM receiver.

• Approximating the input-output relationship of a coarse quantizer by AQNM yields worse detection performance in the Q-OFDM system.

• High-quality CSI is available under the Q-OFDM system without significant pilot overhead.

Appendix A Proof of Proposition 1

In this Appendix, we present the derivation of the SE equations for Algorithm 1 by following [38]. In the large-system limit where $N\to\infty$ , $v_{A}^{\mathrm{post}}$ in (19a) converges to the expectation of $v_{j,A}^{\mathrm{post}}$ w.r.t. $z_{j,A}^{\mathrm{pri}}$ and $q_{j}$ according to the law of large numbers. For ease of computation, we first derive the expectation of the real-part variance $\mathrm{var}\left[z^{R}_{j}\mid q^{R}_{j}\right]$ and then add the expectations of $\mathrm{var}\left[z^{R}_{j}\mid q^{R}_{j}\right]$ and $\mathrm{var}\left[z^{I}_{j}\mid q^{I}_{j}\right]$ together. To obtain these expectations, we need the joint distribution $\mathrm{P}(z_{j,A}^{\mathrm{pri},\mathrm{R}},q^{\mathrm{R}}_{j})$ , which can be computed by $\mathrm{P}(z_{j,A}^{\mathrm{pri},\mathrm{R}},q^{\mathrm{R}}_{j})=\int\mathrm{P}(q^{\mathrm{R}}_{j}|z_{j,A}^{\mathrm{pri},\mathrm{R}},z^{\mathrm{R}}_{j})\mathrm{P}(z_{j,A}^{\mathrm{pri},\mathrm{R}},z^{\mathrm{R}}_{j})\mathrm{d}z^{\mathrm{R}}_{j}$ . The joint distribution of $z_{j,A}^{\mathrm{pri},\mathrm{R}}$ and $z^{\mathrm{R}}_{j}$ is given by [38]

[TABLE]

where $v_{x}=E(|x_{j}|^{2})=\frac{1}{N}\sum_{j=1}^{N}|h^{\prime}_{j}|^{2}$ . Given that $q^{\mathrm{R}}_{j}$ is conditionally independent of $z_{j,A}^{\mathrm{pri},\mathrm{R}}$ given $z^{\mathrm{R}}_{j}$ , we have $\mathrm{P}(q^{\mathrm{R}}_{j}|z_{j,A}^{\mathrm{pri},\mathrm{R}},z^{\mathrm{R}}_{j})=\mathrm{P}(q^{\mathrm{R}}_{j}|z^{\mathrm{R}}_{j})$ ; we therefore have the following:

[TABLE]

Combining (50) and (51), we have the following:

[TABLE]

where (a) is obtained according to the property given by [50, (A.7)] and the definition of $\Psi(\cdot)$ . To compute the expectation, we rewrite $\mathrm{var}{\left[z^{R}_{j}\mid q^{R}_{j}\right]}$ as follows:

[TABLE]

We then compute the expectations of $v_{1}$ and $v_{2}$ w.r.t. $(q^{\mathrm{R}}_{j},z_{j,A}^{\mathrm{pri},\mathrm{R}})$ as follows:

[TABLE]

where (b) is obtained by defining the transformation $z_{j,A}^{\mathrm{pri},\mathrm{R}}=\sqrt{\frac{v_{x}-v_{A}^{\mathrm{pri}}}{2}}z$ , and (c) follows from the fact that $\lim_{\eta\to\infty}\eta\phi(\eta)=0$ and $\lim_{\eta\to-\infty}\eta\phi(\eta)=0$ . The expectation of $\mathrm{var}{\left[z^{\mathrm{I}}_{j}\mid q^{\mathrm{I}}_{j}\right]}$ can be computed similarly, and then the expectation of $v^{\mathrm{post}}_{j,A}$ can be obtained by

[TABLE]

Substituting (29) into (19a) and (19b) yields (31a) and (31b).

In the same way, $v_{B}^{\rm post}$ converges to the expectation of $v_{j,B}^{\mathrm{post}}$ w.r.t. $x_{j,B}^{\mathrm{pri}}$ and $h^{\prime}_{j}$ . We first calculate the expectation of $v_{j,B}^{\mathrm{post}}$ w.r.t. $x_{j,B}^{\mathrm{pri}}$ for the given $h^{\prime}_{j}$ elementwise, i.e., $\mathrm{mmse}(|h^{\prime}_{j}|^{2}\eta)$ . Moreover, substituting $v_{j,B}^{\mathrm{post}}=\mathrm{mmse}(|h^{\prime}_{j}|^{2}\eta)$ into (21b) and (21c) yields (31c).

Appendix B Derivation of the Saddle-point of $\mathcal{F}$

In this Appendix, we adopt the replica theory in the field of statistical physics to calculate $\mathcal{F}$ in the large-system limit and derive its saddle points, which yield the following proposition.

Proposition 2

The saddle-point of $\mathcal{F}$ can be obtained from the iteration given by

[TABLE]

$\blacksquare$

Proof:

From [51], $\mathcal{F}$ can be rewritten as follows:

[TABLE]

The expectation operator is moved inside the log-function. We first evaluate $\mathrm{E}\left[\mathrm{P}^{\tau}(\mathbf{q};\mathbf{h}^{\prime})\right]$ for an integer-valued $\tau$ , and then generalize the result to any positive real number $\tau$ .

For ease of expression, we denote ${\bf A}={\bf F}^{H}\mathrm{diag}(\mathbf{h}){\bf P}^{\frac{1}{2}}$ and use ${\bf a}_{n}^{H}$ to denote the $n$ th row of ${\bf A}$ . Then we rewrite the likelihood (9) as follows:

[TABLE]

where $\delta(\cdot)$ denotes Dirac’s delta. Using the Fourier representation of the $\delta$ via auxiliary variables ${\bf w}=[w_{m}]\in{\mathbb{C}}^{N}$ to (58), we obtain

[TABLE]

Using (57), we compute the replicate partition function $\mathrm{E}\left[\mathrm{P}^{\tau}(\mathbf{q};\mathbf{h}^{\prime})\right]$ given by

[TABLE]

where ${\bf z}^{(a)}$ and ${\bf s}^{(a)}$ are the $a$ -th replica of ${\bf z}$ and ${\bf s}$ , respectively; and ${\bf Z}\triangleq\{{\bf z}^{(a)},\forall a\}$ , ${\bf W}\triangleq\{{\bf w}^{(a)},\forall a\}$ , ${\bf S}\triangleq\{{\bf s}^{(a)},\forall a\}$ . Here, $\{{\bf s}^{(a)}\}$ are random vectors taken from the distribution $\mathrm{P}(\mathbf{s})$ for $a=1,\dots,\tau$ . In addition, $\int{\rm d}{\bf q}$ denotes the integral w.r.t. a discrete measure because the quantized output ${\bf q}$ is a finite set.

To evaluate the expectation w.r.t. ${\bf A}$ and ${\bf S}$ in (63), we introduce two $\tau\times\tau$ matrices ${\bf Q}_{s}$ and ${\bf Q}_{w}$ whose elements are defined by $[{\bf Q}_{s}]_{a,b}\triangleq\frac{1}{N}\left({\bf s}^{(a)}\right)^{H}{\bf s}^{(b)}$ and $[{\bf Q}_{w}]_{a,b}\triangleq\frac{1}{N}\left({\bf w}^{(a)}\right)^{H}{\bf w}^{(b)}$ . The definitions of ${\bf Q}_{s}$ and ${\bf Q}_{w}$ are equivalent to

[TABLE]

where $\delta(\cdot)$ denotes Dirac’s delta. Inserting the above expressions into (60) yields

[TABLE]

where ${\cal G}^{(\tau)}({\bf Q}_{s},{\bf Q}_{w})$ , $\mu^{(\tau)}({\bf Q}_{s})$ , and $\mu^{(\tau)}({\bf Q}_{w})$ are given by

[TABLE]

We notice that by introducing the $\delta$ -functions, the expectations over ${\bf S}$ can be separated into an expectation over all possible covariance ${\bf Q}_{s}$ and all possible ${\bf S}$ configurations w.r.t. a prescribed set of ${\bf Q}_{s}$ . Therefore, we can separate the expectations over ${\bf A}$ and ${\bf S}$ respectively in (64a) and (64b). A similar concept applies to separating the expectations over ${\bf A}$ and ${\bf W}$ . We next calculate each term of (64).

First, we evaluate ${\cal G}^{\tau}({\bf Q}_{s},{\bf Q}_{w})$ by noticing

[TABLE]

where ${\boldsymbol{\Lambda}}^{\frac{1}{2}}=\mathrm{diag}(\mathbf{h}){\bf P}^{\frac{1}{2}}$ , $\tilde{\bf w}^{(a)}={\bf F}{\bf w}^{(a)}$ , and $\tilde{\bf s}^{(a)}={\boldsymbol{\Lambda}}^{\frac{1}{2}}{\bf s}^{(a)}$ . The covariances of $(\tilde{\bf s}^{(a)},\tilde{\bf s}^{(b)})$ and $(\tilde{\bf w}^{(a)},\tilde{\bf w}^{(b)})$ are given by the following:

[TABLE]

Notice that the dependence on the replica indices would not affect the physics of the system because replicas have been introduced artificially. Assuming replica symmetry (RS), i.e.,

[TABLE]

therefore seems natural. With the RS, we obtain the following [42]:

[TABLE]

where

[TABLE]

and $\mathop{\mathsf{Extr}}_{x}\{f(x)\}$ denotes the extreme value of $f(x)$ w.r.t. $x$ .

Next, we consider $\mu^{(\tau)}({\bf Q}_{s})$ in (64b). It can be shown that $\mu^{(\tau)}({\bf Q}_{s})=e^{N{\cal R}_{s}^{(\tau)}({\bf Q}_{s})+{\cal O}(1)}$ , where ${\cal R}_{s}^{(\tau)}({\bf Q}_{s})$ is the rate measure of $\mu^{(\tau)}({\bf Q}_{s})$ and is given by [52]

[TABLE]

with $\tilde{\bf Q}_{s}\in{\mathbb{R}}^{\tau\times\tau}$ being a symmetric matrix. Furthermore, we assume the RS, i.e., $\tilde{\bf Q}_{s}=\tilde{q}_{s}{\bf 11}^{H}-\tilde{c}_{s}{\bf I}_{\tau}$ . With the RS, and using the Hubbard-Stratonovich transformation and introducing the auxiliary vector ${\bf u}_{s}\in\mathbb{C}^{N}$ , the first term of (71) can be written as follows:

[TABLE]

With the RS assumption, the last term of (71) can now be expressed as follows:

[TABLE]

Substituting (72) and (73) into (71) and taking the derivative w.r.t. $\tau$ at $\tau=0$ , we obtain the following:

[TABLE]
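Taking the derivative w.r.t. $\tau$ at $\tau=0$ relies on the standard replica identity $\mathrm{E}[\log Z]=\lim_{\tau\rightarrow 0}\frac{\partial}{\partial\tau}\log\mathrm{E}[Z^{\tau}]$ . The following toy check, on a positive discrete random variable chosen purely for illustration, compares a central-difference estimate of the right-hand side with the directly computed left-hand side. All names and the step size `eps` are assumptions of this sketch.

```python
import math


def log_moment(values: list[float], probs: list[float], tau: float) -> float:
    """log E[Z^tau] for a positive discrete random variable Z."""
    return math.log(sum(p * v ** tau for v, p in zip(values, probs)))


def replica_derivative(values: list[float], probs: list[float], eps: float = 1e-5) -> float:
    """Central-difference estimate of d/dtau log E[Z^tau] at tau = 0."""
    up = log_moment(values, probs, eps)
    down = log_moment(values, probs, -eps)
    return (up - down) / (2.0 * eps)


def mean_log(values: list[float], probs: list[float]) -> float:
    """Direct evaluation of E[log Z]."""
    return sum(p * math.log(v) for v, p in zip(values, probs))


if __name__ == "__main__":
    vals, prs = [0.5, 2.0, 7.0], [0.2, 0.5, 0.3]
    print(replica_derivative(vals, prs), mean_log(vals, prs))
```

In the derivation itself the moment is evaluated for integer $\tau$ and then continued to real $\tau$ ; the toy example sidesteps that step because the moment is available in closed form for any real $\tau$ .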

Similarly, we calculate $\mu^{(\tau)}({\bf Q}_{w})$ in (64c) and assume the RS $\tilde{\bf Q}_{w}=-\tilde{q}_{w}{\bf 11}^{H}-\tilde{c}_{w}{\bf I}_{\tau}$ . It can be shown that $\mu^{(\tau)}({\bf Q}_{w})=e^{N{\cal R}_{w}^{(\tau)}({\bf Q}_{w})+{\cal O}(1)}$ , where ${\cal R}_{w}^{(\tau)}({\bf Q}_{w})$ is the rate measure of $\mu^{(\tau)}({\bf Q}_{w})$ and is given by the following:

[TABLE]

where we define

[TABLE]

By using the Hubbard-Stratonovich transformation and introducing the auxiliary vector ${\bf u}_{w}\in\mathbb{C}^{N}$ , we obtain

[TABLE]

where the last equality follows from the facts that ${\bf v}_{w}\triangleq\frac{1}{\sqrt{\tilde{c}_{w}}}\left(\sqrt{\tilde{q}_{w}}{\bf u}_{w}-{\bf z}\right)$ and ${\rm D}{\bf v}_{w}=\frac{1}{\pi^{N}}e^{-{\bf v}_{w}^{H}{\bf v}_{w}}{\rm d}{\bf v}_{w}$ . With the RS assumption, the last term of (75) can now be expressed as follows:

[TABLE]

Substituting (76) and (77) into (75) and taking the derivative w.r.t. $\tau$ at $\tau=0$ , we obtain the following:

[TABLE]

Applying (71) and (75), the integration over ${\bf Q}$ in (63) can be performed via the saddle-point method as $N\rightarrow\infty$ , which yields the following:

[TABLE]
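The saddle-point (Laplace) method rests on the fact that $\frac{1}{N}\log\int e^{Nf(x)}{\rm d}x\rightarrow\max_{x}f(x)$ as $N\rightarrow\infty$ , so the integral is dominated by the extremum of the exponent. The sketch below illustrates this on a one-dimensional toy exponent; `toy_exponent`, the integration interval, and the grid size are assumptions made for illustration and are unrelated to the actual free-energy functional.

```python
import math


def toy_exponent(x: float) -> float:
    """Toy exponent with maximum value 0.3 at x = 1."""
    return -(x - 1.0) ** 2 + 0.3


def log_integral_rate(f, n: float, lo: float = -4.0, hi: float = 6.0,
                      steps: int = 20_000) -> float:
    """(1/n) * log of the integral of exp(n * f(x)) over [lo, hi].

    Uses the trapezoidal rule with the largest exponent factored out
    (log-sum-exp) so that large n does not overflow.
    """
    h = (hi - lo) / steps
    exponents = [n * f(lo + k * h) for k in range(steps + 1)]
    peak = max(exponents)
    total = 0.0
    for k, e in enumerate(exponents):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(e - peak)
    return (peak + math.log(total * h)) / n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, log_integral_rate(toy_exponent, n))
```

For this toy exponent the integral is Gaussian, so the rate equals $0.3+\frac{1}{2N}\log(\pi/N)$ up to negligible truncation error; the correction term vanishes as $N$ grows, leaving only the maximum of the exponent.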

With the normalization constraint $\mathrm{E}\left[\mathrm{P}^{\tau}(\mathbf{q};\mathbf{h}^{\prime})\right]=1$ , we obtain $c_{s}+q_{s}=v_{x}$ , $c_{w}-q_{w}=0$ , $-\tilde{c}_{s}+\tilde{q}_{s}=0$ , and $\tilde{c}_{w}+\tilde{q}_{w}=v_{x}$ . Substituting (69), (74), and (78) into (79), and combining the result with the aforementioned relationships, we obtain $\partial{\cal F}^{(\tau)}/\partial\tau$ at $\tau=0$ as follows:

[TABLE]

where

[TABLE]

The saddle-point of (80) can be rewritten as

[TABLE]

From (69), we obtain that the extremum points should satisfy the following equality

[TABLE]

Substituting (85) into (84a), (84c) and (84d), we obtain Proposition 2. ∎

Bibliography


[1] H. Wang, T. Liu, C.-K. Wen, and S. Jin, “Optimal data detection for OFDM system with Low-Resolution quantization,” in Proc. IEEE Int. Conf. Commun. Syst. (ICCS), Shenzhen, P.R. China, Dec. 14-16, 2016.
[2] R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Apr. 2016.
[3] IEEE Std 802.11ad-2012 (Amendment to IEEE Std 802.11-2012), Std., 2012.
[4] IEEE Std 802.15.3c-2009 (Amendment to IEEE Std 802.15.3-2003), Std., Oct. 2009.
[5] T. S. Rappaport, S. Sun et al., “Millimeter wave mobile communications for 5G cellular: It will work!” IEEE Access, vol. 1, pp. 335–349, May 2013.
[6] V. Va, T. Shimizu et al., “Millimeter wave vehicular communications: A survey,” Foundations and Trends in Networking, vol. 10, no. 1, pp. 1–113, Jun. 2016.
[7] K. Venugopal and R. W. Heath, “Millimeter wave networked wearables in dense indoor environments,” IEEE Access, vol. 4, pp. 1205–1221, Mar. 2016.
[8] R. Walden, “Analog-to-digital converter survey and analysis,” IEEE J. Sel. Areas Commun., vol. 17, no. 4, pp. 539–550, Apr. 1999.
