Direct Data Detection of OFDM Signals Over Wireless Channels

Anas Saci; Arafat Al-Dweik; and Abdallah Shami

arXiv:1902.03382·cs.IT·December 6, 2019

TL;DR

This paper introduces a direct data detection (D^{3}) method for OFDM wireless signals that eliminates the need for channel estimation, achieving near-coherent performance with reduced complexity.

Contribution

The paper proposes a novel receiver design that combines channel estimation, equalization, and data detection into a single operation for OFDM systems.

Findings

01

D^{3} achieves BER within 3 dB of perfect CSI coherent detectors.

02

D^{3} outperforms traditional coherent detectors when CSI is imperfect.

03

Complexity can be significantly reduced using the Viterbi algorithm.

Tables (3)

Table I: Computational complexity comparison using different values of $N$, $N_{P}=N/4$, for BPSK.

$N$	$128$	$256$	$512$	$1024$	$2048$
$η_{R_{A}}$	$0.58$	$1.07$	$1.21$	$1.27$	$1.31$
$η_{R_{M}}$	$0.77$	$0.72$	$0.68$	$0.64$	$0.61$
$R_{D}$	$96$	$192$	$384$	$768$	$1536$
$η_{P}$	$0.20$	$0.21$	$0.22$	$0.26$	$0.31$

Table II: Computational complexity comparison using different values of $N$, $N_{P}=N/4$, for 16-QAM and 64-QAM.

$M$	$16$		$64$
$N$	$512$	$2048$	$512$	$2048$
$η_{R_{A}}$	$1.25$	$1.25$	$1.64$	$1.64$
$η_{R_{M}}$	$0.52$	$0.47$	$0.64$	$0.62$
$η_{R_{D}}$	$0.98$	$0.98$	$0.99$	$0.99$
$η_{P}$	$0.94$	$0.84$	$0.91$	$0.80$

Table III: Computational complexity comparison using hard and soft VA for different values of $\mathcal{K}$, $N=2048$.

$K$	$3$	$4$	$5$	$6$	$7$
Soft	$0.96$	$0.97$	$0.97$	$0.98$	$0.99$
Hard	$0.24$	$0.26$	$0.28$	$0.33$	$0.41$

Equations (164)

\mathbf{r} = \mathbf{H}\mathbf{d} + \mathbf{w}

\mathbf{H} = \mathrm{diag}\left\{\left[H_{0}, H_{1}, \dots, H_{N-1}\right]\right\}.

\hat{\mathbf{d}} = \arg\min_{\tilde{\mathbf{d}}} \left\|\mathbf{r} - \mathbf{H}\tilde{\mathbf{d}}\right\|^{2}

\hat{d}_{v} = \arg\min_{\tilde{d}_{v}} \left|r_{v} - H_{v}\tilde{d}_{v}\right|^{2}.

\check{\mathbf{r}} = \left[\hat{\mathbf{H}}^{H}\hat{\mathbf{H}}\right]^{-1}\hat{\mathbf{H}}^{H}\mathbf{r}

\hat{d}_{v} = \arg\min_{\tilde{d}_{v}} \left|\check{r}_{v} - \tilde{d}_{v}\right|^{2}, \quad \forall v.

\hat{\mathbf{d}} = \arg\max_{\tilde{\mathbf{d}}} \frac{\left|\tilde{\mathbf{d}}^{H}\mathbf{r}\right|^{2}}{\left\|\tilde{\mathbf{d}}\right\|}.

ϱ_{f}

\Delta_{f} = \mathrm{E}\left[H_{v} - H_{\acute{v}}\right] = \mathrm{E}\left[\sum_{m=0}^{\mathcal{D}_{\mathrm{h}}} h_{m} e^{-j2\pi\frac{mv}{N}}\left(1 - e^{-j2\pi\frac{m}{N}}\right)\right]

\varrho_{t} = \mathrm{E}\left[H_{v}^{\ell}\left(H_{v}^{\acute{\ell}}\right)^{*}\right] = J_{0}\left(2\pi f_{d}T_{\mathrm{s}}\right)

\hat{\mathbf{d}} = \arg\min_{\tilde{\mathbf{d}}} \sum_{v=0}^{N-2} \left|\frac{r_{v}}{\tilde{d}_{v}} - \frac{r_{\acute{v}}}{\tilde{d}_{\acute{v}}}\right|^{2}.

\hat{\mathbf{D}}_{\mathcal{L},\mathcal{K}} = \arg\min_{\tilde{\mathbf{D}}_{\mathcal{L},\mathcal{K}}} J\left(\tilde{\mathbf{D}}_{\mathcal{L},\mathcal{K}}\right)

J\left(\tilde{\mathbf{D}}_{\mathcal{L},\mathcal{K}}\right) = \sum_{\ell=0}^{\mathcal{L}-1}\sum_{v=0}^{\mathcal{K}-2} \left|\frac{r_{v}^{\ell}}{\tilde{d}_{v}^{\ell}} - \frac{r_{\acute{v}}^{\ell}}{\tilde{d}_{\acute{v}}^{\ell}}\right|^{2} + \left|\frac{r_{v}^{\ell}}{\tilde{d}_{v}^{\ell}} - \frac{r_{v}^{\acute{\ell}}}{\tilde{d}_{v}^{\acute{\ell}}}\right|^{2}.

\hat{D}

\begin{array}[]{ccc}\Gamma_{\acute{c}}^{U}=\min\left[\Gamma_{c}^{U}\text{, }\acute{\Gamma}_{c}^{U}\right]+J_{00}^{c}&&\Gamma_{\acute{c}}^{L}=\min\left[\Gamma_{c}^{L}\text{, }\acute{\Gamma}_{c}^{L}\right]+J_{01}^{c}\\ \acute{\Gamma}_{\acute{c}}^{U}=\min\left[\Gamma_{c}^{U}\text{, }\acute{\Gamma}_{c}^{U}\right]+J_{10}^{c}&&\acute{\Gamma}_{\acute{c}}^{L}=\min\left[\Gamma_{c}^{L}\text{, }\acute{\Gamma}_{c}^{L}\right]+J_{11}^{c}\end{array}

\hat{\mathbf{d}} = \arg\min_{\tilde{\mathbf{u}} \in U} \sum_{v=0}^{N-2} \left|\frac{r_{v}}{\tilde{u}_{v}} - \frac{r_{\acute{v}}}{\tilde{u}_{\acute{v}}}\right|^{2}

\hat{\mathbf{d}}_{l} = \arg\min_{\tilde{\mathbf{d}}} \sum_{v=l}^{\mathcal{K}-2+l} \left|\frac{r_{v}}{\tilde{d}_{v}} - \frac{r_{\acute{v}}}{\tilde{d}_{\acute{v}}}\right|^{2}, \quad \mathcal{K} \in \left\{2, 3, \dots, N-1\right\}

\hat{\mathbf{d}}_{0} = \arg\min_{\tilde{\mathbf{d}}} \left(\frac{r_{0}}{\tilde{d}_{0}} - \frac{r_{1}}{\tilde{d}_{1}}\right)\left(\frac{r_{0}}{\tilde{d}_{0}} - \frac{r_{1}}{\tilde{d}_{1}}\right)^{*} + \dots + \left(\frac{r_{\mathcal{K}-2}}{\tilde{d}_{\mathcal{K}-2}} - \frac{r_{\mathcal{K}-1}}{\tilde{d}_{\mathcal{K}-1}}\right)\left(\frac{r_{\mathcal{K}-2}}{\tilde{d}_{\mathcal{K}-2}} - \frac{r_{\mathcal{K}-1}}{\tilde{d}_{\mathcal{K}-1}}\right)^{*}

\hat{\mathbf{d}}_{0} = \arg\min_{\tilde{\mathbf{d}}} \left|\frac{r_{0}}{\tilde{d}_{0}}\right|^{2} + \left|\frac{r_{1}}{\tilde{d}_{1}}\right|^{2} + \dots + \left|\frac{r_{\mathcal{K}-1}}{\tilde{d}_{\mathcal{K}-1}}\right|^{2} - \frac{r_{0}}{\tilde{d}_{0}}\frac{r_{1}}{\tilde{d}_{1}^{*}} - \frac{r_{0}}{\tilde{d}_{0}^{*}}\frac{r_{1}}{\tilde{d}_{1}} - \dots - \frac{r_{\mathcal{K}-2}}{\tilde{d}_{\mathcal{K}-2}}\frac{r_{\mathcal{K}-1}}{\tilde{d}_{\mathcal{K}-1}^{*}} - \frac{r_{\mathcal{K}-2}}{\tilde{d}_{\mathcal{K}-2}^{*}}\frac{r_{\mathcal{K}-1}}{\tilde{d}_{\mathcal{K}-1}}.

\hat{\mathbf{d}}_{0} = \arg\max_{\tilde{\mathbf{d}}_{0}} \sum_{v=0}^{\mathcal{K}-2} \Re\left\{\frac{r_{v}r_{\acute{v}}}{\tilde{d}_{v}\tilde{d}_{\acute{v}}}\right\}.

\hat{\mathbf{d}}_{0} = \arg\max_{\tilde{d}_{0} \notin \tilde{\mathbf{d}}_{0}} \frac{1}{\tilde{d}_{1}}\Re\left\{r_{0}r_{1}\right\} + \sum_{v=1}^{\mathcal{K}-2} \frac{1}{\tilde{d}_{v}\tilde{d}_{\acute{v}}}\Re\left\{r_{v}r_{\acute{v}}\right\}.

P_{S}|_{\mathbf{H}_{0}, \mathbf{d}_{0}} \triangleq \left.\Pr\left(\hat{\mathbf{d}}_{0} \neq \mathbf{d}_{0}\right)\right|_{\mathbf{H}_{0}, \mathbf{d}_{0}}

P_{C}|_{\mathbf{H}_{0}, \mathbf{d}_{0}} = 1 - \left.\Pr\left(\hat{\mathbf{d}}_{0} \neq \mathbf{d}_{0}\right)\right|_{\mathbf{H}_{0}, \mathbf{d}_{0}}.

P_{C}|_{\mathbf{H}_{0}, 1} = \Pr\left(\sum_{v=0}^{\mathcal{K}-2} \Re\left\{r_{v}r_{\acute{v}}\right\} = \max_{\tilde{\mathbf{d}}_{0}} \left\{\sum_{v=0}^{\mathcal{K}-2} \frac{\Re\left\{r_{v}r_{\acute{v}}\right\}}{\tilde{d}_{v}\tilde{d}_{\acute{v}}}\right\}\right).

P_{C}|_{\mathbf{H}_{0}, 1} = \Pr\left(A_{\psi} > A_{\psi-1}, A_{\psi-2}, \dots, A_{0}\right)

P_{C}|_{\mathbf{H}_{0}, 1} = \prod_{v=0}^{\mathcal{K}-2} \Pr\left(\Re\left\{r_{v}r_{\acute{v}}\right\} > 0\right).

\Pr\left(\Re\left\{r_{v}r_{\acute{v}}\right\} > 0\right) = \Pr\Big(\underbrace{r_{v}^{I}r_{\acute{v}}^{I} - r_{v}^{Q}r_{\acute{v}}^{Q}}_{r_{v,\acute{v}}^{\mathrm{SP}}} > 0\Big).

P_{C}|_{\mathbf{H}_{0}, 1} = \prod_{v=0}^{\mathcal{K}-2} \Pr\left(r_{v,\acute{v}}^{\mathrm{SP}} > 0\right) = \prod_{v=0}^{\mathcal{K}-2} \left[1 - Q\left(\frac{2\bar{\mu}_{\mathrm{SP}}}{\bar{\sigma}_{\mathrm{SP}}^{2}}\right)\right]

P_{S}|_{\mathbf{H}_{0}, 1} = 1 - \prod_{v=0}^{\mathcal{K}-2} \left[1 - Q\left(\frac{2\bar{\mu}_{\mathrm{SP}}}{\bar{\sigma}_{\mathrm{SP}}^{2}}\right)\right]

\mathrm{SEP}|_{\mathbf{d}=1} = \underbrace{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\dots\int_{-\infty}^{\infty}}_{2\mathcal{K}\text{ fold}} \mathrm{SEP}|_{\mathbf{H}_{0}, \mathbf{d}=1}\, f_{\mathbf{H}_{0}^{I}}\left(H_{0}^{I}, H_{1}^{I}, \dots, H_{\mathcal{K}-1}^{I}\right) \times f_{\mathbf{H}_{0}^{Q}}\left(H_{0}^{Q}, H_{1}^{Q}, \dots, H_{\mathcal{K}-1}^{Q}\right) dH_{0}^{I}\, dH_{1}^{I} \dots dH_{\mathcal{K}-1}^{I}\, dH_{0}^{Q}\, dH_{1}^{Q} \dots dH_{\mathcal{K}-1}^{Q}.

Full text

A. Saci, A. Al-Dweik, A. Shami

A. Saci, A. Al-Dweik, and A. Shami are with the Department of Electrical and Computer Engineering, Western University, London, ON, Canada (e-mail: {asaci, aaldweik, abdallah.shami}@uwo.ca). A. Al-Dweik is also with the Department of Electrical and Computer Engineering, Khalifa University, Abu Dhabi, UAE (e-mail: [email protected]). Part of this work is protected by the US patent: A. Al-Dweik, “Signal detection in a communication system,” U.S. Patent No. 9,596,119, 14 Mar. 2017.

Abstract

This paper presents a novel efficient receiver design for wireless communication systems that incorporate orthogonal frequency division multiplexing (OFDM) transmission. The proposed receiver does not require channel estimation or equalization to perform coherent data detection. Instead, channel estimation, equalization, and data detection are combined into a single operation, and hence, the detector is denoted as a direct data detector ($D^{3}$). The performance of the proposed system is thoroughly analyzed theoretically in terms of bit error rate (BER), and validated by Monte Carlo simulations. The obtained theoretical and simulation results demonstrate that the BER of the proposed $D^{3}$ is only $3$ dB away from coherent detectors with perfect knowledge of the channel state information (CSI) in flat fading channels, and similarly in frequency-selective channels for a wide range of signal-to-noise ratios (SNRs). If CSI is not known perfectly, then the $D^{3}$ outperforms the coherent detector substantially, particularly at high SNRs with linear interpolation. The computational complexity of the $D^{3}$ depends on the length of the sequence to be detected; nevertheless, a significant complexity reduction can be achieved using the Viterbi algorithm.

Index Terms:

OFDM, fading channels, data detection, Viterbi, sequence detection, channel estimation, equalization.

I Introduction

Orthogonal frequency division multiplexing (OFDM) is widely adopted in several wired and wireless communication standards, such as worldwide interoperability for microwave access (WiMAX) technologies [1], the Long Term Evolution-Advanced (LTE-A) standard [2], Digital Video Broadcasting (DVB) in its Terrestrial (DVB-T) and Hand-held (DVB-H) variants [3], and optical wireless communications (OWC) [4], [5]; recently, it has also been adopted for the fifth-generation (5G) wireless networks [6]. The channel is typically modeled as frequency-selective for WiMAX and LTE-A, and as flat for OWC in the presence of atmospheric turbulence [4], [5]. OFDM has therefore become the leading modulation scheme at present, and is expected to remain so in the near future [7].

One of the main advantages of OFDM is that each subcarrier experiences flat fading even though the overall signal spectrum suffers from frequency-selective fading. Moreover, incorporating the concept of a cyclic prefix (CP), which is formed by copying a part of the OFDM symbol and prepending it to the transmitted OFDM block, prevents intersymbol interference (ISI) if the CP length is larger than the maximum delay spread of the channel. Consequently, a low-complexity single-tap equalizer can be utilized to eliminate the impact of the multipath fading channel. Under such circumstances, the OFDM demodulation process can be performed once the fading parameters at each subcarrier, commonly denoted as channel state information (CSI), are estimated.

In general, channel estimation can be classified into blind [8]-[13] and pilot-aided techniques [14]-[20]. Blind channel estimation techniques are spectrally efficient because they do not require any overhead to estimate the CSI; nevertheless, such techniques have not yet been adopted in practical OFDM systems. Conversely, pilot-based CSI estimation is preferred for practical systems, because it is typically more robust and less complex. In pilot-based CSI estimation, the pilot symbols are embedded within the subcarriers of the transmitted OFDM signal in the time and frequency domains; hence, the pilots form a two-dimensional (2-D) grid [2]. The channel response at the pilot symbols can be obtained using least-squares (LS) frequency-domain estimation, and the channel parameters at other subcarriers can be obtained using various interpolation techniques [21]. Optimal interpolation requires a 2-D Wiener filter that exploits the time and frequency correlation of the channel; however, it is substantially complex to implement [22], [23]. The complexity can be reduced by decomposing the 2-D interpolation process into two cascaded 1-D processes, and then using less computationally involved interpolation schemes [24], [25]. Low-complexity interpolation, however, is usually accompanied by error rate performance degradation [25]. It is also worth noting that most practical OFDM-based systems utilize a fixed grid pattern structure [2].

Once the channel parameters are obtained for all subcarriers, the received samples at the output of the fast Fourier transform (FFT) are equalized to compensate for the channel fading. Fortunately, the equalization for OFDM is performed in the frequency domain using single-tap equalizers. The equalizer output samples, which are denoted as the decision variables, will be applied to a maximum likelihood detector (MLD) to regenerate the information symbols.

In addition to the direct approach, several techniques have been proposed in the literature to estimate the CSI or detect the data symbols indirectly, by exploiting the correlation among the channel coefficients. For example, the per-survivor processing (PSP) approach has been widely used to approximate the maximum likelihood sequence estimator (MLSE) for coded and uncoded sequences [26], [27], [28]. The PSP utilizes the Viterbi algorithm (VA) to recursively estimate the CSI without interpolation using the least mean squares (LMS) algorithm. Although the PSP provides superior performance when the channel is flat over the entire sequence, its performance degrades severely if this condition is not satisfied, even when the LMS step size is adaptive [27]. Multiple symbol differential detection (MSDD) can be also used for sequence estimation without explicit channel estimation. In such systems, the information is embedded in the phase difference between adjacent symbols, and hence, differential encoding is needed. Although differential detection is only $3$ dB worse than coherent detection in flat fading channels, its performance may deteriorate significantly in frequency-selective channels [29], [30]. Consequently, Wu and Kam [31] proposed a generalized likelihood ratio test (GLRT) receiver whose performance without CSI is comparable to the coherent detector. Although the GLRT receiver is more robust than differential detectors in frequency-selective channels, its performance is significantly worse than coherent detectors.

The estimator-correlator (EC) cross-correlates the received signal with an estimate of the channel output signal corresponding to each possible transmitted signal [32], [33]. The signal at channel output is estimated with a minimum mean square error (MMSE) estimator from the knowledge of the received signal and the second order statistics of the channel and noise. The channel estimation (CE) may provide BER that is about $1$ dB from the ML coherent detector in flat fading channels but at the expense of a large number of pilots. Moreover, the BER performance of EC detectors is generally poor in frequency-selective channels where the CE BER is significantly worse than the ML coherent detector [33]. Decision-directed techniques can also be used to avoid conventional channel estimation. For example, the authors in [10] proposed a hybrid frame structure that enables blind decision-directed channel estimation. Although the proposed system manages to offer reliable channel estimates and BER in various channel conditions, the system structure follows the typical coherent detector design where equalization and symbol detection are required.

I-A Motivation and Key Contributions

Unlike conventional OFDM detectors, this work presents a new detector to regenerate the information symbols directly from the received samples at the FFT output, which is denoted as the direct data detector ($D^{3}$). By using the $D^{3}$, there is no need to perform channel estimation, interpolation, equalization, or symbol decision operations. The $D^{3}$ exploits the fact that channel coefficients over adjacent subcarriers are highly correlated and approximately equal. Consequently, the $D^{3}$ is derived by minimizing the difference between channel coefficients of adjacent subcarriers. The main limitation of the $D^{3}$ is that it suffers from a phase ambiguity problem, which can be solved using pilot symbols, which are part of a transmission frame in most practical standards [1], [2]. To the best of the authors’ knowledge, there is no work reported in the published literature that uses the proposed principle.

The $D^{3}$ performance is evaluated in terms of complexity, computational power, and bit error rate (BER), where analytic expressions are derived for several channel models and system configurations. The $D^{3}$ BER is compared to other widely used detectors such as the maximum likelihood (ML) coherent detector [34] with perfect and imperfect CSI, multiple symbol differential detector (MSDD) [29], the ML sequence detector (MLSD) with no CSI [31], and the per-survivor processing detector [26]. The obtained results show that the $D^{3}$ is more robust than all the other considered detectors in various cases of interest, particularly in frequency-selective channels at moderate and high SNRs. Moreover, the computational power comparison shows that the $D^{3}$ requires less than $35\%$ of the computational power required by the ML coherent detector.

I-B Paper Organization and Notations

The rest of this paper is organized as follows. The OFDM system and channel models are described in Section II. The proposed $D^{3}$ is presented in Section III, and the efficient implementation of the $D^{3}$ is explored in Section IV. The system error probability performance analysis is presented in Section V. Complexity analysis of the conventional pilot based OFDM and the $D^{3}$ are given in Section VI. Numerical results are discussed in Section VII, and finally, the conclusion is drawn in Section VIII.

In what follows, unless otherwise specified, uppercase boldface and blackboard letters such as $\mathbf{H}$ and $\mathbb{H}$ , will denote $N\times N$ matrices, whereas lowercase boldface letters such as $\mathbf{x}$ will denote row or column vectors with $N$ elements. Uppercase, lowercase, or bold letters with a tilde such as $\tilde{d}$ will denote trial values, and symbols with a hat, such as $\hat{\mathbf{x}}$ , will denote the estimate of $\mathbf{x}$ . Letters with apostrophe such as $\acute{v}$ are used to denote the next value, i.e., $\acute{v}\triangleq v+1$ . Furthermore, $\mathrm{E}\left[\cdot\right]$ denotes the expectation operation.

II Signal and Channel Models

Consider an OFDM system with $N$ subcarriers modulated by a sequence of $N$ complex data symbols $\mathbf{d}=[d_{0}, d_{1}, \ldots, d_{N-1}]^{T}$. The data symbols are selected uniformly from a general constellation such as $M$-ary phase shift keying (MPSK) or quadrature amplitude modulation (QAM). In conventional pilot-aided OFDM systems [35], $N_{P}$ of the subcarriers are allocated for pilot symbols, which can be used for channel estimation and synchronization purposes. The modulation process in OFDM can be implemented efficiently using an $N$-point inverse FFT (IFFT) algorithm, where its output during the $\ell$th OFDM block can be written as $\mathbf{x}(\ell)=\mathbf{F}^{H}\mathbf{d}(\ell)$, where $\mathbf{F}$ is the normalized $N\times N$ FFT matrix, and hence, $\mathbf{F}^{H}$ is the IFFT matrix. To simplify the notation, the block index $\ell$ is dropped for the remaining parts of the paper unless it is necessary to include it. Then, a CP of length $N_{\mathrm{CP}}$ samples, no less than the channel maximum delay spread ($\mathcal{D}_{\mathrm{h}}$), is appended to compose the OFDM symbol with a total length of $N_{\mathrm{t}}=N+N_{\mathrm{CP}}$ samples and a duration of $T_{\mathrm{t}}$ s.

At the receiver front-end, the received signal is down-converted to baseband and sampled with a sampling period $T_{\mathrm{s}}=T_{\mathrm{t}}/N_{\mathrm{t}}$. In this work, the channel is assumed to be composed of $\mathcal{D}_{\mathrm{h}}+1$ independent multipath components, each of which has a gain $h_{m}\sim\mathcal{CN}\left(0,2\sigma_{h_{m}}^{2}\right)$ and delay $m\times T_{\mathrm{s}}$, where $m\in\{0, 1, \ldots, \mathcal{D}_{\mathrm{h}}\}$. A quasi-static channel is assumed throughout this work, and thus, the channel taps are considered constant over one OFDM symbol, but they may change over two consecutive symbols. Therefore, the received sequence after dropping the CP samples and applying the FFT can be expressed as,

$$\mathbf{r}=\mathbf{H}\mathbf{d}+\mathbf{w} \tag{1}$$

where $\left\{\mathbf{r},\mathbf{w}\right\}\in\mathbb{C}^{N\times 1}$, $\mathbf{w}$ is the additive white Gaussian noise (AWGN) vector with elements $w_{v}\sim\mathcal{CN}\left(0\text{, }2\sigma_{w}^{2}\right)$, and $\mathbf{H}$ denotes the channel frequency response (CFR)

$$\mathbf{H}=\mathrm{diag}\left\{\left[H_{0}, H_{1}, \ldots, H_{N-1}\right]\right\}. \tag{2}$$
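The signal model above can be sketched numerically as follows. This is a minimal illustration, assuming the CFR is the $N$-point DFT of the channel taps, $H_{v}=\sum_{m}h_{m}e^{-j2\pi mv/N}$, which is the form implied by the adjacent-subcarrier difference used later in the paper. The function names and the example tap values are hypothetical and are not taken from the paper.

```python
import cmath
import random


def channel_frequency_response(taps, n_subcarriers):
    """Return [H_0, ..., H_{N-1}] with H_v = sum_m h_m * exp(-j*2*pi*m*v/N)."""
    return [
        sum(h * cmath.exp(-2j * cmath.pi * m * v / n_subcarriers)
            for m, h in enumerate(taps))
        for v in range(n_subcarriers)
    ]


def received_samples(cfr, data, noise_std, rng):
    """Return r_v = H_v * d_v + w_v for every subcarrier.

    Each noise sample is complex Gaussian with standard deviation
    noise_std per real dimension, i.e. total variance 2 * noise_std**2.
    """
    return [
        h * d + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        for h, d in zip(cfr, data)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    taps = [0.8, 0.5 + 0.3j, 0.2j]      # illustrative tap values only
    data = [rng.choice((-1, 1)) for _ in range(8)]
    cfr = channel_frequency_response(taps, len(data))
    print(received_samples(cfr, data, 0.05, rng))
```

Because $\mathbf{H}$ is diagonal, the sketch stores only its diagonal and applies it element by element rather than building an $N\times N$ matrix.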

By noting that $\mathbf{r}|_{\mathbf{H},\mathbf{d}}\sim\mathcal{CN}\left(\mathbf{Hd}\text{, }2\sigma_{w}^{2}\mathbf{I}_{N}\right)$, where $\mathbf{I}_{N}$ is an $N\times N$ identity matrix, it is straightforward to show that the MLD can be expressed as

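$$\hat{\mathbf{d}}=\arg\min_{\tilde{\mathbf{d}}}\left\|\mathbf{r}-\mathbf{H}\tilde{\mathbf{d}}\right\|^{2} \tag{3}$$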

where $\left\|\cdot\right\|$ denotes the Euclidean norm, and $\tilde{\mathbf{d}}=\left[\tilde{d}_{0}\text{, }\tilde{d}_{1}\text{, }\ldots\text{, }\tilde{d}_{N-1}\right]^{T}$ denotes the trial values of $\mathbf{d}$. As can be noted from (3), the MLD requires the knowledge of $\mathbf{H}$. Moreover, because (3) describes the detection of more than one symbol, it is typically denoted as maximum likelihood sequence detector (MLSD). If the elements of $\mathbf{d}$ are independent, the MLSD can be replaced by a symbol-by-symbol MLD

$$\hat{d}_{v}=\arg\min_{\tilde{d}_{v}}\left|r_{v}-H_{v}\tilde{d}_{v}\right|^{2}. \tag{4}$$

Since perfect knowledge of $\mathbf{H}$ is infeasible, an estimated version of $\mathbf{H}$, denoted as $\hat{\mathbf{H}}$, can be used in (3) and (4) instead of $\mathbf{H}$. Another possible approach to implement the detector is to equalize $\mathbf{r}$, and then use a symbol-by-symbol MLD. Because the considered system is assumed to have no ISI or intercarrier interference (ICI), a single-tap frequency-domain zero-forcing equalizer can be used. Therefore, the equalized received sequence can be expressed as,

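$$\check{\mathbf{r}}=\left[\hat{\mathbf{H}}^{H}\hat{\mathbf{H}}\right]^{-1}\hat{\mathbf{H}}^{H}\mathbf{r} \tag{5}$$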

and

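$$\hat{d}_{v}=\arg\min_{\tilde{d}_{v}}\left|\check{r}_{v}-\tilde{d}_{v}\right|^{2},\quad\forall v. \tag{6}$$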
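The equalize-then-detect approach of (5) and (6) can be sketched as below. The sketch relies on the fact that, for a diagonal $\hat{\mathbf{H}}$ with nonzero entries, the zero-forcing step reduces to dividing each received sample by its channel estimate. The function names are hypothetical, and the channel estimate is simply passed in as an argument because how it is obtained (pilots, interpolation) is outside the scope of this illustration.

```python
def zf_equalize(received, cfr_estimate):
    """Single-tap zero-forcing: divide each sample by its channel estimate.

    Assumes every estimate is nonzero.
    """
    return [r / h for r, h in zip(received, cfr_estimate)]


def nearest_symbol(sample, constellation):
    """Symbol-by-symbol MLD on an equalized sample: minimum squared distance."""
    return min(constellation, key=lambda s: abs(sample - s) ** 2)


def coherent_detect(received, cfr_estimate, constellation):
    """Equalize, then pick the closest constellation point per subcarrier."""
    equalized = zf_equalize(received, cfr_estimate)
    return [nearest_symbol(x, constellation) for x in equalized]


if __name__ == "__main__":
    qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
    cfr = [0.5 + 0.5j, -1j, 2 + 0j]          # illustrative values only
    sent = [1 + 1j, -1 - 1j, 1 - 1j]
    received = [h * d for h, d in zip(cfr, sent)]
    print(coherent_detect(received, cfr, qpsk))
```

With a perfect channel estimate and no noise, the detector returns the transmitted symbols; any error in `cfr_estimate` rotates and scales the equalized samples, which is the sensitivity to imperfect CSI that motivates the detector proposed in the paper.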

It is interesting to note that solving (3) does not necessarily require the explicit knowledge of $\mathbf{H}$ under some special circumstances. For example, Wu and Kam [31] noticed that in flat fading channels, i.e., $H_{v}=H$ $\forall v$ , it is possible to detect the data symbols using the following MLSD,

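$$\hat{\mathbf{d}}=\arg\max_{\tilde{\mathbf{d}}}\frac{\left|\tilde{\mathbf{d}}^{H}\mathbf{r}\right|^{2}}{\left\|\tilde{\mathbf{d}}\right\|}. \tag{7}$$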

Although the detector described in (7) is efficient in the sense that it does not require the knowledge of $\mathbf{H}$ , its BER is very sensitive to the channel variations.

III Proposed $D^{3}$ System Model

One of the distinctive features of OFDM is that its channel coefficients over adjacent subcarriers in the frequency domain are highly correlated and approximately equal. The correlation coefficient between two adjacent subcarriers can be defined as

[TABLE]

where $\sigma_{h_{m}}^{2}=\mathrm{E}\left[\left|h_{m}\right|^{2}\right]$ . The difference between two adjacent channel coefficients is

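$$\Delta_{f}=\mathrm{E}\left[H_{v}-H_{\acute{v}}\right]=\mathrm{E}\left[\sum_{m=0}^{\mathcal{D}_{\mathrm{h}}}h_{m}e^{-j2\pi\frac{mv}{N}}\left(1-e^{-j2\pi\frac{m}{N}}\right)\right] \tag{9}$$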

For large values of $N$ , it is straightforward to show that $\varrho_{f}\rightarrow 1$ and $\Delta_{f}\rightarrow 0$ . Similar to the frequency domain, the time domain correlation defined according to the Jakes’ model can be computed as [36],

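$$\varrho_{t}=\mathrm{E}\left[H_{v}^{\ell}\left(H_{v}^{\acute{\ell}}\right)^{*}\right]=J_{0}\left(2\pi f_{d}T_{\mathrm{s}}\right) \tag{10}$$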

where $J_{0}\left(\cdot\right)$ is the Bessel function of the first kind and zero order, and $f_{d}$ is the maximum Doppler frequency. For large values of $N$, $2\pi f_{d}T_{\mathrm{s}}\ll 1$, and hence $J_{0}\left(2\pi f_{d}T_{\mathrm{s}}\right)\approx 1$ and $\varrho_{t}\approx 1$. Using the same argument, the difference in the time domain is $\Delta_{t}\triangleq\mathrm{E}\left[H_{v}^{\ell}-H_{v}^{\acute{\ell}}\right]\approx 0$. Although the proposed system can be applied in the time domain, frequency domain, or both, the focus of this work is the frequency domain.
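The claim that adjacent channel coefficients become nearly equal for large $N$ can be checked numerically for a single channel realization. The sketch below is an illustration only: it looks at the per-realization gap $\left|H_{v}-H_{\acute{v}}\right|$ rather than the expectation in the text, it assumes $H_{v}=\sum_{m}h_{m}e^{-j2\pi mv/N}$, and the tap values and function names are hypothetical.

```python
import cmath
import math


def cfr(taps, n):
    """H_v = sum_m h_m * exp(-j*2*pi*m*v/N), v = 0..N-1."""
    return [
        sum(h * cmath.exp(-2j * math.pi * m * v / n) for m, h in enumerate(taps))
        for v in range(n)
    ]


def max_adjacent_gap(taps, n):
    """Largest |H_v - H_{v+1}| over v = 0..N-2 for this realization."""
    h = cfr(taps, n)
    return max(abs(h[v] - h[v + 1]) for v in range(n - 1))


def gap_upper_bound(taps, n):
    """Triangle-inequality bound: sum_m |h_m| * |1 - exp(-j*2*pi*m/N)|.

    Uses |1 - exp(-j*theta)| = 2*|sin(theta/2)| with theta = 2*pi*m/N.
    """
    return sum(abs(h) * 2.0 * abs(math.sin(math.pi * m / n))
               for m, h in enumerate(taps))


EXAMPLE_TAPS = [0.8, 0.5 + 0.3j, 0.2j]   # illustrative values only

if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, max_adjacent_gap(EXAMPLE_TAPS, n), gap_upper_bound(EXAMPLE_TAPS, n))
```

Each term of the bound scales roughly as $2\pi m\left|h_{m}\right|/N$ when $m\ll N$, so for a fixed delay spread the gap shrinks approximately in proportion to $1/N$.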

Based on the aforementioned properties of OFDM, a simple approach to extract the information symbols from the received sequence $\mathbf{r}$ can be designed by minimizing the difference of the channel coefficients between adjacent subcarriers, which can be expressed as

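$$\hat{\mathbf{d}}=\arg\min_{\tilde{\mathbf{d}}}\sum_{v=0}^{N-2}\left|\frac{r_{v}}{\tilde{d}_{v}}-\frac{r_{\acute{v}}}{\tilde{d}_{\acute{v}}}\right|^{2}. \tag{11}$$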

As can be noted from (11), the estimated data sequence $\hat{\mathbf{d}}$ can be obtained without the knowledge of $\mathbf{H}$. Moreover, there is no requirement for the channel coefficients over the considered sequence to be equal, and hence, the $D^{3}$ should perform fairly well even in frequency-selective fading channels. Nevertheless, it can be noted that (11) does not have a unique solution because $\mathbf{d}$ and $-\mathbf{d}$ can minimize (11). To resolve the phase ambiguity problem, one or more pilot symbols can be used as a part of the sequence $\mathbf{d}$. In such scenarios, the performance of the $D^{3}$ will be affected indirectly by the frequency selectivity of the channel because the capability of the pilot to resolve the phase ambiguity depends on its fading coefficient. Another advantage of using pilot symbols is that it will not be necessary to detect the $N$ symbols simultaneously. Instead, it will be sufficient to detect $\mathcal{K}$ symbols at a time, which can be exploited to simplify the system design and analysis.
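The detector in (11), together with the use of a pilot to remove the sign ambiguity, can be illustrated with an exhaustive search over a short segment. This is a minimal sketch for small sequence lengths only, since the search is exponential in the number of unknown symbols. The function names and the `pilots` dictionary (subcarrier index mapped to the known symbol) are hypothetical conventions introduced for the example.

```python
import itertools


def d3_metric(received, trial):
    """Sum over v of |r_v / d_v - r_{v+1} / d_{v+1}|^2 for one trial sequence."""
    return sum(
        abs(received[v] / trial[v] - received[v + 1] / trial[v + 1]) ** 2
        for v in range(len(received) - 1)
    )


def d3_brute_force(received, constellation, pilots):
    """Exhaustive minimisation of the metric with pilot positions held fixed.

    pilots maps a subcarrier index to its known symbol.
    """
    n = len(received)
    free = [v for v in range(n) if v not in pilots]
    best, best_cost = None, float("inf")
    for combo in itertools.product(constellation, repeat=len(free)):
        trial = [None] * n
        for v, symbol in pilots.items():
            trial[v] = symbol
        for v, symbol in zip(free, combo):
            trial[v] = symbol
        cost = d3_metric(received, trial)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best


if __name__ == "__main__":
    channel = 0.7 - 0.4j                      # flat channel, illustrative value
    sent = [1, -1, -1, 1, 1, -1]              # first symbol acts as the pilot
    received = [channel * d for d in sent]
    negated = [-d for d in sent]
    # Both the sequence and its negation give the same metric (phase ambiguity).
    print(d3_metric(received, sent), d3_metric(received, negated))
    # Fixing the pilot at index 0 selects the correct sign.
    print(d3_brute_force(received, (-1, 1), {0: 1}))
```

The search never uses the channel value, only ratios of received samples to trial symbols, which is the sense in which detection proceeds without knowledge of $\mathbf{H}$.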

Using the same approach of the frequency domain, the $D^{3}$ can be designed to work in the time domain as well by minimizing the difference between the channel coefficients over two consecutive subcarriers, i.e., two subcarriers with the same index over two consecutive OFDM symbols, which is also applicable to single-carrier systems. It can also be designed to work in both the time and frequency domains, where the detector can be described as

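$$\hat{\mathbf{D}}_{\mathcal{L},\mathcal{K}}=\arg\min_{\tilde{\mathbf{D}}_{\mathcal{L},\mathcal{K}}}J\left(\tilde{\mathbf{D}}_{\mathcal{L},\mathcal{K}}\right) \tag{12}$$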

where $\mathbf{D}_{\mathcal{L}\text{,}\mathcal{K}}$ is an $\mathcal{L}\times\mathcal{K}$ data matrix, $\mathcal{L}$ and $\mathcal{K}$ are the time and frequency detection window size, and the objective function $J\left(\tilde{\mathbf{D}}\right)$ is given by

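$$J\left(\tilde{\mathbf{D}}_{\mathcal{L},\mathcal{K}}\right)=\sum_{\ell=0}^{\mathcal{L}-1}\sum_{v=0}^{\mathcal{K}-2}\left|\frac{r_{v}^{\ell}}{\tilde{d}_{v}^{\ell}}-\frac{r_{\acute{v}}^{\ell}}{\tilde{d}_{\acute{v}}^{\ell}}\right|^{2}+\left|\frac{r_{v}^{\ell}}{\tilde{d}_{v}^{\ell}}-\frac{r_{v}^{\acute{\ell}}}{\tilde{d}_{v}^{\acute{\ell}}}\right|^{2}. \tag{13}$$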

For example, if the detection window size is chosen to be the LTE resource block, then $\mathcal{L}=14$ and $\mathcal{K}=12$. Moreover, the system presented in (13) can be extended to multi-branch receiver scenarios, i.e., single-input multiple-output (SIMO), as

[TABLE]

where $\mathcal{N}$ is the number of receiving antennas.

IV Efficient Implementation of $D^{3}$

It can be noted from (12) and (13) that solving for $\hat{\mathbf{D}}$, given that $N_{P}$ pilot symbols are used, requires $M^{\mathcal{K}\mathcal{L}-N_{P}}$ trials if a brute-force search is adopted, which is prohibitively complex, and thus, reducing the computational complexity is crucial. Towards this goal, the two-dimensional (2-D) resource block (RB) can be divided into a number of one-dimensional (1-D) segments in the time and frequency domains in order to reduce the complexity from order $\mathcal{O}\left(M^{\mathcal{K}\mathcal{L}-N_{P}}\right)$ to $\mathcal{O}\left(M\times\left(\mathcal{K}\mathcal{L}-N_{P}\right)\right)$. In other words, the time complexity grows exponentially with the detection window size in the 2-D block, while it grows linearly in the cascaded 1-D segments, which is a significant complexity reduction. Fig. 1 shows an example of decomposing the 2-D LTE-A RB into several 1-D segments over time and frequency. As can be noted from the figure, the RB consists of $168$ subcarriers among which $8$ subcarriers are pilots. It is worth noting that there are some rows and columns in the RB that do not have pilots, and thus, the detection of the entire block can be performed as described in Subsection IV-B.
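To give a sense of scale for the two complexity orders, the short sketch below evaluates $M^{\mathcal{K}\mathcal{L}-N_{P}}$ and $M\times\left(\mathcal{K}\mathcal{L}-N_{P}\right)$ for the LTE resource block dimensions mentioned in the text. The function names are hypothetical, and the second figure should be read as an order-of-growth indicator rather than an exact operation count.

```python
def brute_force_trials(m, k, l, n_p):
    """Number of candidate blocks for an exhaustive 2-D search: M^(K*L - N_P)."""
    return m ** (k * l - n_p)


def segmented_order(m, k, l, n_p):
    """Growth order of the cascaded 1-D approach: M * (K*L - N_P)."""
    return m * (k * l - n_p)


if __name__ == "__main__":
    # BPSK (M = 2) over a 14 x 12 resource block with 8 pilots.
    print(brute_force_trials(2, 12, 14, 8))
    print(segmented_order(2, 12, 14, 8))
```

For BPSK the exhaustive count is $2^{160}$, whereas the segmented order is $320$, which makes the gap between exponential and linear growth concrete.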

IV-A The Viterbi Algorithm (VA)

By noting that the expression in (11) corresponds to the sum of correlated terms, which can be modeled as a first-order Markov process, then MLSD techniques such as the VA can be used to implement the $D^{3}$ efficiently. For example, the trellis diagram of the VA with binary phase shift keying (BPSK) is shown in Fig. 2, and can be implemented as follows:

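1. Initialize the path metrics $\left\{\Gamma_{0}^{U},\acute{\Gamma}_{0}^{U},\Gamma_{0}^{L},\acute{\Gamma}_{0}^{L}\right\}=0$, where $U$ and $L$ denote the upper and lower branches, respectively. Since BPSK is used, the number of states is $2$.

2. Initialize the counter, $c=0$.

3. Compute the branch metric $J_{m,n}^{c}=\left|\frac{r_{c}}{m}-\frac{r_{\acute{c}}}{n}\right|^{2}$, where $m$ is the current symbol index, $m=0\rightarrow\tilde{d}=-1$ and $m=1\rightarrow\tilde{d}=1$, and $n$ is the next symbol index using the same mapping as $m$.

4. Compute the path metrics using the following rules,

$$\begin{array}{ccc}\Gamma_{\acute{c}}^{U}=\min\left[\Gamma_{c}^{U}\text{, }\acute{\Gamma}_{c}^{U}\right]+J_{00}^{c}&&\Gamma_{\acute{c}}^{L}=\min\left[\Gamma_{c}^{L}\text{, }\acute{\Gamma}_{c}^{L}\right]+J_{01}^{c}\\ \acute{\Gamma}_{\acute{c}}^{U}=\min\left[\Gamma_{c}^{U}\text{, }\acute{\Gamma}_{c}^{U}\right]+J_{10}^{c}&&\acute{\Gamma}_{\acute{c}}^{L}=\min\left[\Gamma_{c}^{L}\text{, }\acute{\Gamma}_{c}^{L}\right]+J_{11}^{c}\end{array}$$

5. Track the surviving paths, $2$ paths in the case of BPSK.

6. Increase the counter, $c=c+1$.

7. If $c=\mathcal{K}$, the algorithm ends. Otherwise, go to step 3.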

It is worth mentioning that placing a pilot symbol at the edge of a segment terminates the trellis. To simplify the discussion, assume that the pilot value is $-1$ , and thus we compute only $J_{0,0}$ and $J_{1,0}$ . Consequently, long data sequences can be divided into smaller segments bounded by pilots, which can reduce the delay by performing the detection over the sub-segments in parallel without sacrificing the error rate performance.
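The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the branch metric $\left|r_{c}/\tilde{d}_{m}-r_{\acute{c}}/\tilde{d}_{n}\right|^{2}$ between adjacent positions, a single pilot at the first position of the segment, and BPSK symbols. The names `d3_viterbi_bpsk` and `SYMBOLS` are hypothetical.

```python
import numpy as np

SYMBOLS = np.array([-1.0, 1.0])  # BPSK mapping: index 0 -> -1, index 1 -> +1


def d3_viterbi_bpsk(r, pilot=1.0):
    """Detect a 1-D segment r whose first symbol is a known pilot.

    Assumed branch metric: J[m, n] = |r_c / d_m - r_{c+1} / d_n|^2.
    """
    K = len(r)
    # Only the pilot state is allowed at position 0 (resolves the sign ambiguity).
    gamma = np.where(SYMBOLS == pilot, 0.0, np.inf)
    back = np.zeros((K, len(SYMBOLS)), dtype=int)
    for c in range(K - 1):
        J = np.abs(r[c] / SYMBOLS[:, None] - r[c + 1] / SYMBOLS[None, :]) ** 2
        cand = gamma[:, None] + J            # cand[m, n]: arrive at state n from m
        back[c + 1] = np.argmin(cand, axis=0)  # surviving predecessor of each state
        gamma = np.min(cand, axis=0)           # updated path metrics
    # Trace back from the best final state.
    state = int(np.argmin(gamma))
    idx = np.empty(K, dtype=int)
    for c in range(K - 1, -1, -1):
        idx[c] = state
        state = back[c, state]
    return SYMBOLS[idx]


if __name__ == "__main__":
    H = 0.8 * np.exp(1j * 0.7)               # unknown flat channel gain
    d = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0])  # d[0] is the pilot
    print(d3_viterbi_bpsk(H * d))
```

Note that the channel gain `H` is never estimated: the pilot only fixes the starting state, and the metric compares adjacent received samples after dividing each by its trial symbol.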

IV-B Resource Block Detection

As can be noted from Fig. 1, the segmentation process can be applied directly to any row or column that has one or more pilots. Nevertheless, there are some rows and columns that do not have pilots. In such scenarios, the detection can be performed, for example, in two steps as follows:

1. Detect all rows (frequency domain subcarriers) with pilots, i.e., rows 1, 5, 8 and 12.

2. As a result of the first step, each column (time domain subcarrier) has either pilots, data symbols whose values are known as a result of the detection in the first step, or both, as in the case of columns 1, 4, 7 and 10. Therefore, all remaining subcarriers can be detected using the symbols detected in the first step.

It is worth noting that the number and distribution of the pilot symbols in the RB impact the error rate performance, power and spectral efficiency of the system. For example, the first frequency segment shown in Fig. 1 consists of seven subcarriers, two of which are allocated for pilots. By defining the throughput, or the spectral efficiency, as the ratio of the number of information symbols to the total number of symbols per segment, then the throughput of the first frequency and time segments in Fig. 1 is about 83.3% and 85.7%, respectively. Nevertheless, the system throughput is determined by the total number of pilots and information subcarriers within an RB rather than a segment. By noting that there are only eight pilots among the 168 resource elements, then the throughput loss is about $4.7\%$ and the throughput is about 95.2%. The same argument can be applied to the power efficiency of the system where 4.7% of the power will be allocated to pilots.
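The RB-level figures follow from simple ratios. The short sketch below (with a hypothetical helper name) reproduces the arithmetic for eight pilots among 168 resource elements; the pilot fraction is $8/168\approx 0.0476$ and the throughput is $160/168\approx 0.952$.

```python
def pilot_overhead(n_total, n_pilots):
    """Return (throughput, pilot fraction) for a block with n_pilots pilots."""
    loss = n_pilots / n_total
    return 1.0 - loss, loss


if __name__ == "__main__":
    throughput, loss = pilot_overhead(n_total=168, n_pilots=8)
    print(f"throughput = {throughput:.4f}, pilot fraction = {loss:.4f}")
```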

IV-C System Design with an Error Control Coding

Forward error correction (FEC) coding can be integrated with the $D^{3}$ in two ways, based on the decoding process, i.e., hard or soft decision decoding. For the hard decision decoding, the integration of FEC coding is straightforward where the output of the $D^{3}$ is applied directly to the hard decision decoder (HDD).

For the soft decision decoding, we can exploit the coded data to enhance the performance of the $D^{3}$ , and then use the $D^{3}$ output to estimate the channel coefficients in a decision-directed manner. The $D^{3}$ with coded data can be expressed as

[TABLE]

where $\mathbb{U}$ is the set of all codewords modulated using the same modulation used at the transmitter. Therefore, the trial sequences $\tilde{\mathbf{u}}$ are restricted to particular sequences. For the case of convolutional codes, the detection and decoding processes can be integrated smoothly since both of them are using the VA. Such an approach can be adopted with linear block codes as well because trellis-based decoding can be also applied to block codes [37].

V Error Rate Analysis of the $D^{3}$

The system BER analysis is presented for several cases according to the pilot and data arrangements. For simplicity, each case is discussed in a separate subsection. To make the analysis tractable, we consider BPSK modulation in the analysis while the BER of higher-order modulations is obtained via Monte Carlo simulations.

V-A Single-Sided Pilot

To detect a data segment that contains $\mathcal{K}$ symbols, at least one pilot symbol should be part of the segment in order to resolve the phase ambiguity problem. Consequently, the analysis in this subsection considers the case where there is only one pilot within the $\mathcal{K}$ symbols, as shown in Fig. 4. Given that the FFT output vector $\mathbf{r}=\left[r_{0}\text{, }r_{1}\text{,}\ldots,r_{N-1}\right]$ is divided into $L$ segments each of which consists of $\mathcal{K}$ symbols, including the pilot symbol, then the frequency domain $D^{3}$ detector can be written as,

[TABLE]

where $l$ denotes the index of the first subcarrier in the segment, and without loss of generality, we consider that $l=0$ . Therefore, by expanding (16) we obtain,

[TABLE]

which can be simplified to,

[TABLE]

For BPSK, $\left|r_{v}/\tilde{d_{v}}\right|^{2}=\left|r_{v}\right|^{2}$ , which is a constant term with respect to the optimization in (18), and thus, it can be dropped. Therefore, the detector is reduced to

[TABLE]

Given that the pilot symbol is placed in the first subcarrier and noting that $d_{v}\in\left\{-1,1\right\}$ , then $\tilde{d_{0}}=1$ and $\hat{\mathbf{d}}_{0}$ can be written as

[TABLE]

The sequence error probability ( $P_{S}$ ), conditioned on the channel frequency response over the $\mathcal{K}$ symbols ( $\mathbf{H}_{0})$ and the transmitted data sequence $\mathbf{d}_{0}$ can be defined as,

[TABLE]

which can be also written in terms of the conditional probability of correct detection $P_{C}$ as,

[TABLE]

Without loss of generality, we assume that $\mathbf{d}_{0}\mathbf{=}[1$ , $1$ ,… $,1]\triangleq\mathbf{1}$ . Therefore,

[TABLE]

Since $\mathbf{d}_{0}$ has $\mathcal{K-}1$ data symbols, then there are $2^{\mathcal{K-}1}$ trial sequences, $\tilde{\mathbf{d}}_{0}^{(0)}$ , $\tilde{\mathbf{d}}_{0}^{(1)}$ , $\ldots$ , $\tilde{\mathbf{d}}_{0}^{(\psi)}$ , where $\psi=2^{\mathcal{K-}1}-1$ , and $\tilde{\mathbf{d}}_{0}^{(\psi)}\mathbf{=}[1$ , $1$ ,… $,1]$ . The first symbol in every sequence is set to $1$ , which is the pilot symbol. By defining $\sum_{v=0}^{\mathcal{K-}2}\frac{\Re\left\{r_{v}r_{\acute{v}}\right\}}{\tilde{d_{v}}\tilde{d}_{\acute{v}}}\triangleq A_{n}$ , where $\tilde{d_{v}}\tilde{d}_{\acute{v}}\in\tilde{\mathbf{d}}_{0}^{(n)}$ , then (23) can be written as,

[TABLE]

which, as depicted in Appendix I, can be simplified to

[TABLE]

To evaluate $P_{C}|_{\mathbf{H}_{0},\mathbf{\mathbf{1}}}$ given in (25), it is necessary to compute $\Pr\left(\Re\left\{r_{v}r_{\acute{v}}\right\}>0\right)$ , which can be written as

[TABLE]

Given that $\mathbf{d}_{0}\mathbf{=}[1$ , $1$ ,… $,1]$ , then $r_{v}^{I}=\Re\left\{r_{v}\right\}=H_{v}^{I}+w_{v}^{I}$ and $r_{v}^{Q}=\Im\left\{r_{v}\right\}=H_{v}^{Q}+w_{v}^{Q}$ . Therefore, $r_{v}^{I},$ $r_{v}^{Q}$ , $r_{\acute{v}}^{I}$ and $r_{\acute{v}}^{Q}$ are independent conditionally Gaussian random variables with averages $H_{v}^{I}$ , $H_{v}^{Q}$ , $H_{\acute{v}}^{I}$ and $H_{\acute{v}}^{Q}$ , respectively, and the variance for all elements is $\sigma_{w}^{2}$ . To derive the PDF of $r_{v,\acute{v}}^{\mathrm{SP}}$ , the PDFs of $r_{v}^{I}r_{\acute{v}}^{I}$ and $r_{v}^{Q}r_{\acute{v}}^{Q}$ should be evaluated, each of which corresponds to the product of two Gaussian random variables. Although the product of two Gaussian variables is not usually Gaussian, the limit of the moment-generating function of the product has a Gaussian distribution. Therefore, the product of two variables $X\sim\mathcal{N}(\mu_{x},\sigma_{x}^{2})$ and $Y\sim\mathcal{N}(\mu_{y},\sigma_{y}^{2})$ tends to be $\mathcal{N}(\mu_{x}\mu_{y},\mu_{x}^{2}\sigma_{y}^{2}+\mu_{y}^{2}\sigma_{x}^{2})$ as the ratios $\mu_{x}/\sigma_{x}$ and $\mu_{y}/\sigma_{y}$ increase [38]. By noting that in (26) $\mathrm{E}\left[r_{y}^{x}\right]=H_{y}^{x}$ , $x\in\left\{I,Q\right\}$ and $y\in\left\{v,\acute{v}\right\}$ and $\sigma_{r_{y}^{x}}=\sigma_{w}$ , thus $\mathrm{E}\left[r_{y}^{x}\right]/\sigma_{r_{y}^{x}}\gg 1$ $\forall\left\{x,y\right\}$ . Moreover, because the PDF of the sum or difference of two Gaussian random variables is also Gaussian, then, $r_{v,\acute{v}}^{\mathrm{SP}}\sim\mathcal{N}\left(\bar{\mu}_{\mathrm{SP}},\bar{\sigma}_{\mathrm{SP}}^{2}\right)$ where $\bar{\mu}_{\mathrm{SP}}=H_{v}^{I}H_{\acute{v}}^{I}+H_{v}^{Q}H_{\acute{v}}^{Q}$ and $\bar{\sigma}_{\mathrm{SP}}^{2}=\sigma_{w}^{2}\left(\left|H_{v}\right|^{2}+\left|H_{\acute{v}}\right|^{2}+\sigma_{w}^{2}\right)$ . Consequently,
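The Gaussian approximation of the product can be checked numerically. The sketch below is only an illustration with hypothetical function names and arbitrarily chosen parameters: it draws independent samples of $X$ and $Y$, forms $XY$, and compares the sample mean and variance with $\mu_{x}\mu_{y}$ and $\mu_{x}^{2}\sigma_{y}^{2}+\mu_{y}^{2}\sigma_{x}^{2}$. For independent variables the exact variance carries an extra term $\sigma_{x}^{2}\sigma_{y}^{2}$, which becomes negligible when the mean-to-standard-deviation ratios are large.

```python
import numpy as np


def product_sample_moments(mu_x, sigma_x, mu_y, sigma_y, n=200_000, seed=0):
    """Sample mean and variance of X*Y for independent Gaussian X and Y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(mu_x, sigma_x, n) * rng.normal(mu_y, sigma_y, n)
    return z.mean(), z.var()


def product_approx_moments(mu_x, sigma_x, mu_y, sigma_y):
    """Mean and variance of the limiting Gaussian quoted in the text."""
    return mu_x * mu_y, mu_x**2 * sigma_y**2 + mu_y**2 * sigma_x**2


if __name__ == "__main__":
    params = (5.0, 0.5, 5.0, 0.5)  # mean/std ratio of 10 for both variables
    print("simulated    :", product_sample_moments(*params))
    print("approximation:", product_approx_moments(*params))
```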

[TABLE]

and

[TABLE]

where $Q\left(x\right)\triangleq\frac{1}{\sqrt{2\pi}}\int_{x}^{\infty}\exp\left(-\frac{t^{2}}{2}\right)dt$ . Since $H_{v}^{I}$ and $H_{v}^{Q}$ are independent, then, the condition on $\mathbf{H}_{0}$ in (28) can be removed by averaging $P_{S}$ over the PDF of $\mathbf{H}_{0}^{I}$ and $\mathbf{H}_{0}^{Q}$ as,

[TABLE]

Because the random variables $H_{i}^{I}$ and $H_{i}^{Q}$ $\forall i$ in (29) are real and Gaussian, their PDFs are multivariate Gaussian distributions [34],

[TABLE]

where $\boldsymbol{\mu}$ is the mean vector, which is defined as,

[TABLE]

and $\boldsymbol{\Sigma}$ is the covariance matrix, $\boldsymbol{\Sigma}=\mathrm{E}\left[\left(\mathbf{X}-\boldsymbol{\mu}\right)\left(\mathbf{X}-\boldsymbol{\mu}\right)^{T}\right].$

Due to the difficulty of evaluating $2\mathcal{K}$ integrals, we consider the special case of flat fading, which implies that $H_{v}=H_{\acute{v}}\triangleq H$ and $\left(H^{I}\right)^{2}+\left(H^{Q}\right)^{2}\triangleq\alpha^{2}$ , where $\alpha$ is the channel fading envelope, $\alpha=\left|H\right|$ . Therefore, the SEP expression in (28) becomes,

[TABLE]

Recalling the Binomial Theorem, we get

[TABLE]

Then the SEP formula in (32) using the Binomial Theorem in (33) can be written as,

[TABLE]

The conditioning on $\alpha$ can be removed by averaging over the PDF of $\alpha$ , which is Rayleigh. Therefore,

[TABLE]

And hence,

[TABLE]

Because the expression in (32) contains high-order powers of the $Q$ -function, $Q^{n}\left(x\right)$ , evaluating the integral analytically becomes intractable for $\mathcal{K}>2$ . For the special case of $\mathcal{K}=2$ , $P_{S}$ can be evaluated by substituting (34) and (35) into (36); evaluating the integral yields the following simple expression,

[TABLE]

where $\bar{\gamma}_{s}$ is the average signal-to-noise ratio (SNR). Moreover, because all data sequences have an equal probability of error, then $P_{S}|_{\mathbf{1}}=P_{S}$ , which is also equivalent to the bit error rate (BER). It is interesting to note that (37) is similar to the BER of differential binary phase shift keying (DBPSK) [34]. However, the two techniques are essentially different as $D^{3}$ does not require differential encoding, has no constraints on the shape of the signal constellation, and performs well even in frequency-selective fading channels.
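The $\mathcal{K}=2$ case is easy to simulate. The sketch below is a minimal Monte Carlo illustration under stated assumptions: a flat Rayleigh channel with $\mathrm{E}\left[\left|H\right|^{2}\right]=1$, a pilot equal to $1$ followed by one BPSK data symbol, and the detector $\min_{\tilde{d}}\left|r_{0}-r_{1}/\tilde{d}\right|^{2}$, which for BPSK reduces to the sign of $\Re\left\{r_{0}r_{1}^{*}\right\}$. As a reference it uses the textbook DBPSK result in Rayleigh fading, $1/\left(2\left(1+\bar{\gamma}_{s}\right)\right)$, rather than (37) itself; the function names are hypothetical.

```python
import numpy as np


def simulate_d3_k2(snr_db, n_trials=400_000, seed=1):
    """Monte Carlo BER of a pilot + one BPSK symbol segment over flat Rayleigh fading."""
    rng = np.random.default_rng(seed)
    gamma_bar = 10 ** (snr_db / 10)
    sigma_w = np.sqrt(1 / (2 * gamma_bar))  # noise std per real dimension
    H = (rng.standard_normal(n_trials) + 1j * rng.standard_normal(n_trials)) / np.sqrt(2)
    d1 = rng.choice([-1.0, 1.0], n_trials)
    w = sigma_w * (rng.standard_normal((2, n_trials)) + 1j * rng.standard_normal((2, n_trials)))
    r0 = H + w[0]        # pilot position, pilot value +1
    r1 = H * d1 + w[1]   # data position
    # min over d of |r0 - r1/d|^2  is equivalent to  d = sign(Re{r0 * conj(r1)})
    d_hat = np.where(np.real(r0 * np.conj(r1)) >= 0, 1.0, -1.0)
    return float(np.mean(d_hat != d1))


def dbpsk_rayleigh_ber(snr_db):
    """Textbook DBPSK BER over Rayleigh fading, used here only as a reference."""
    return 1 / (2 * (1 + 10 ** (snr_db / 10)))


if __name__ == "__main__":
    for snr in (0, 10, 20):
        print(snr, simulate_d3_k2(snr), dbpsk_rayleigh_ber(snr))
```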

To evaluate $P_{S}$ for $\mathcal{K}>2$ , we use an approximation for $Q\left(x\right)$ in [39], which is given by

[TABLE]

Therefore, by substituting (38) into the conditional SEP (34) and averaging over the Rayleigh PDF (35), the evaluation of the SEP becomes straightforward. For example, evaluating the integral for $\mathcal{K}=3$ gives,

[TABLE]

where $\mathrm{Ei}\left(x\right)$ is the exponential integral (EI), $\mathrm{Ei}\left(x\right)\triangleq-\int_{-x}^{\infty}\frac{e^{-t}}{t}dt$ . Similarly, $P_{S}$ for $\mathcal{K}=7$ can be evaluated to,

[TABLE]

Although the SEP is a very useful indicator for the system error probability performance, the BER is actually more informative. For a sequence that contains $\mathcal{K}_{D}$ information bits, the BER can be expressed as $P_{B}=\frac{1}{\Lambda}P_{S}$ , where $\Lambda$ denotes the average number of bit errors given a sequence error, which can be defined as

[TABLE]

Because the SEP is independent of the transmitted data sequence, then, without loss of generality, we assume that the transmitted data sequence is $\mathbf{d}_{0}^{(0)}$ . Therefore,

[TABLE]

where $\left\|\mathbf{\hat{d}}_{0}\right\|^{2}$ , in this case, corresponds to the Hamming weight of the detected sequence $\mathbf{\hat{d}}_{0}$ , which can be expressed as

[TABLE]

where $\mathbf{d}_{0}^{(0)}\rightarrow\mathbf{d}_{0}^{(i)}$ denotes the pairwise error probability (PEP). By noting that $\Pr\left(\mathbf{d}_{0}^{(0)}\rightarrow\mathbf{d}_{0}^{(i)}\right)\neq\Pr\left(\mathbf{d}_{0}^{(0)}\rightarrow\mathbf{d}_{0}^{(j)}\right)$ $\forall i\neq j$ , then deriving the PEP for all cases of interest is intractable. As an alternative, a simple approximation is derived.

For a sequence that consists of $\mathcal{K}_{D}$ information bits, the BER is bounded by

[TABLE]

In practical systems, the number of bits in the detected sequence is generally not large, which implies that the upper and lower bounds in (44) are relatively tight, and hence, the BER can be approximated as the middle point between the two bounds as,

[TABLE]
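As a rough numerical illustration of the midpoint idea, the sketch below assumes that the bounds take the standard form $P_{S}/\mathcal{K}_{D}\leq P_{B}\leq P_{S}$ (a sequence error corrupts at least one and at most all $\mathcal{K}_{D}$ bits). This form is an assumption made for the example and is not copied from (44) or (45); the function names are hypothetical.

```python
def ber_bounds(p_s, k_d):
    """Assumed bounds on the BER given the sequence error probability p_s."""
    lower = p_s / k_d  # each sequence error flips at least one of k_d bits
    upper = p_s        # each sequence error flips at most all k_d bits
    return lower, upper


def ber_midpoint(p_s, k_d):
    """Midpoint between the assumed lower and upper bounds."""
    lower, upper = ber_bounds(p_s, k_d)
    return 0.5 * (lower + upper)


if __name__ == "__main__":
    for k_d in (1, 2, 5):
        print(k_d, ber_bounds(1e-3, k_d), ber_midpoint(1e-3, k_d))
```

For $\mathcal{K}_{D}=1$ the two bounds coincide, so the BER equals the SEP.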

The analysis of the general $1\times\mathcal{N}$ SIMO system is a straightforward extension of the single-input single-output (SISO) case. To simplify the analysis, we consider the flat channel case where the conditional SEP can be written as,

[TABLE]

Given that all the receiving branches are independent, the fading envelopes will have Rayleigh distribution $\alpha_{i}\sim\mathcal{R}\left(2\sigma_{H}^{2}\right)$ $\forall i$ , and thus, $\sum_{i=1}^{\mathcal{N}}\alpha_{i}^{2}\triangleq a$ will have Gamma distribution, $a\sim\mathcal{G}\left(\mathcal{N},2\sigma_{H}^{2}\right)$ ,

[TABLE]
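The Gamma model for the combined gain can be sanity-checked by simulation. The sketch below assumes that each envelope is Rayleigh with scale parameter $\sigma_{H}$, so that $\mathrm{E}\left[\alpha_{i}^{2}\right]=2\sigma_{H}^{2}$, and compares the sample mean and variance of $a$ with those of a Gamma distribution with shape $\mathcal{N}$ and scale $2\sigma_{H}^{2}$, namely $2\mathcal{N}\sigma_{H}^{2}$ and $4\mathcal{N}\sigma_{H}^{4}$. The function name is hypothetical.

```python
import numpy as np


def combined_gain_samples(n_branches, sigma_h, n=200_000, seed=2):
    """Samples of a = sum_i alpha_i^2 for independent Rayleigh envelopes."""
    rng = np.random.default_rng(seed)
    alpha = rng.rayleigh(scale=sigma_h, size=(n, n_branches))
    return np.sum(alpha**2, axis=1)


if __name__ == "__main__":
    n_branches, sigma_h = 2, 1 / np.sqrt(2)
    a = combined_gain_samples(n_branches, sigma_h)
    print("sample mean/var:", a.mean(), a.var())
    print("gamma mean/var :", 2 * n_branches * sigma_h**2, 4 * n_branches * sigma_h**4)
```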

Therefore, the unconditional SEP can be evaluated as,

[TABLE]

For the special case of $\mathcal{N=}2$ , $\mathcal{K}=2$ , $P_{S}$ can be evaluated as,

[TABLE]

where $\varkappa\triangleq\sqrt{2+\bar{\gamma}_{s}}.$ Closed-form formulas for other values of $\mathcal{N}$ and $\mathcal{K}$ can be obtained by following the same approach used in the SISO case.

V-B Double-Sided Pilot

Embedding more pilots in the detection segment can improve the detector’s performance. Consequently, it is worth investigating the effect of embedding more pilots in the SEP analysis. More specifically, we consider a double-sided segment, $\tilde{d}_{0}=1$ , $\tilde{d}_{\mathcal{K}-1}=1$ , as illustrated in Fig. 4. In this case, the detector can be expressed as,

[TABLE]

From the definition in (50), the probability of receiving the correct sequence can be derived based on the reduced number of trials as compared to (20). Therefore,

[TABLE]

which, similar to the single-sided case, can be written as,

[TABLE]

Therefore,

[TABLE]

For flat fading channels, the SEP expression in (53) can be simplified by following the same procedure as in Subsection V-A. For the special case of $\mathcal{K}=3$ , the SEP becomes,

[TABLE]

For $\mathcal{K}>3$ , the approximation of $Q^{n}\left(x\right)$ , as illustrated in Subsection V-A, can be used in (53) to average over the PDF in (35). For example, the case $\mathcal{K}=4$ can be evaluated as,

[TABLE]

For $\mathcal{K}=6$ ,

[TABLE]

For the double-sided pilot, $P_{B}=P_{S}$ for the case of $\mathcal{K}=3$ , while it can be computed using (45) for $\mathcal{K}>3$ .

VI Complexity Analysis

The computational complexity is evaluated as the total number of primitive operations needed to perform the detection. The operations that will be used are the number of real additions ( $R_{A}$ ), real multiplications ( $R_{M}$ ), and real divisions ( $R_{D}$ ) required to produce the set of detected symbols $\hat{\mathbf{d}}$ for each technique. It is worth noting that one complex multiplication ( $C_{M}$ ) is equivalent to four $R_{M}$ and three $R_{A}$ operations, while one complex addition ( $C_{A}$ ) requires two $R_{A}$ . To simplify the analysis, we first assume that constant modulus (CM) constellations such as MPSK are used, and then evaluate the complexity for higher-order modulation such as quadrature amplitude modulation (QAM).

VI-A Complexity of Conventional OFDM Detectors

The conventional OFDM receiver consists of the following main steps, listed with their corresponding computational complexities:

1. Channel estimation at the pilot symbols, which computes $\hat{H}_{k}$ at all pilot subcarriers. Assuming that the pilot symbol $d_{k}$ is selected from a CM constellation, then $\hat{H}_{k}=r_{k}d_{k}^{*}$ and hence, $N_{P}$ complex multiplications are required. Therefore, $R_{A}^{\left(1\right)}=4N_{P}$ and $R_{M}^{\left(1\right)}=4N_{P}$ .

2. Interpolation, which is used to estimate the channel at the non-pilot subcarriers. The complexity of the interpolation process depends on the interpolation algorithm used. For comparison purposes, we assume that linear interpolation is used, which is the least complex interpolation algorithm. The linear interpolation requires one complex multiplication and two complex additions per interpolated sample. Therefore, the number of complex multiplications required is $N-N_{P}$ and the number of complex additions is $2\left(N-N_{P}\right)$ . Hence, $R_{A}^{\left(2\right)}=7\left(N-N_{P}\right)$ and $R_{M}^{\left(2\right)}=4\left(N-N_{P}\right)$ .

3. Equalization. A single-tap equalizer requires $N-N_{P}$ complex divisions to compute the decision variables $\check{r}_{k}=\frac{r_{k}}{\hat{H}_{k}}=r_{k}\frac{\hat{H}_{k}^{*}}{\left|\hat{H}_{k}^{*}\right|^{2}}$ . One complex division thus requires two complex multiplications and one real division. Therefore, $R_{A}^{\left(3\right)}=6\left(N-N_{P}\right)$ , $R_{M}^{\left(3\right)}=8\left(N-N_{P}\right)$ and $R_{D}^{\left(3\right)}=\left(N-N_{P}\right)$ .

4. Detection. Assuming symbol-by-symbol minimum distance detection, the detector can be expressed as $\hat{d}_{k}=\arg\min_{\tilde{d}_{i}}J\left(\tilde{d}_{i}\right),\,\,\forall i\in\left\{0,1,\dots,M-1\right\}$ where $J\left(\tilde{d}_{i}\right)=\left|\check{r}_{k}-\tilde{d}_{i}\right|^{2}$ . Assuming CM modulation is used, expanding the cost function and dropping the constant terms we can write $J\left(\tilde{d}_{k}\right)=-\check{r}_{k}\tilde{d}_{k}^{*}-\check{r}_{k}^{*}\tilde{d}_{k}$ . We can also drop the minus sign from the cost function, and thus, the objective becomes a maximization, $\hat{d}_{k}=\arg\max_{\tilde{d}_{i}}\left(\check{r}_{k}\tilde{d}_{i}^{*}+\check{r}_{k}^{*}\tilde{d}_{i}\right)$ . Since the two terms are a complex conjugate pair, then $\check{r}_{k}\tilde{d}_{k}^{*}+\check{r}_{k}^{*}\tilde{d}_{k}=2\Re\left\{\check{r}_{k}\tilde{d}_{k}^{*}\right\}$ , and thus we can write the detected symbols as,

[TABLE]

Therefore, the number of real multiplications required for each information symbol is $2M$ , and the number of additions is $M$ . Therefore, $R_{A}^{\left(4\right)}=\left(N-N_{P}\right)M$ and $R_{M}^{\left(4\right)}=2\left(N-N_{P}\right)M$ .
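The equivalence between minimum distance detection and the correlation rule for CM constellations can be illustrated as follows. This is a minimal sketch with hypothetical function names; it assumes an MPSK constellation on the unit circle and already-equalized samples $\check{r}_{k}$.

```python
import numpy as np


def mpsk_constellation(M):
    """Unit-modulus MPSK points."""
    return np.exp(2j * np.pi * np.arange(M) / M)


def detect_min_distance(r_eq, constellation):
    """argmin_i |r_k - d_i|^2 for every equalized sample r_k."""
    dist = np.abs(r_eq[:, None] - constellation[None, :]) ** 2
    return constellation[np.argmin(dist, axis=1)]


def detect_correlation(r_eq, constellation):
    """argmax_i Re{r_k * conj(d_i)}; valid for constant modulus constellations."""
    corr = np.real(r_eq[:, None] * np.conj(constellation[None, :]))
    return constellation[np.argmax(corr, axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    qpsk = mpsk_constellation(4)
    r_eq = rng.standard_normal(1000) + 1j * rng.standard_normal(1000)
    same = np.array_equal(detect_min_distance(r_eq, qpsk), detect_correlation(r_eq, qpsk))
    print("decisions agree:", same)
```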

Finally, the total computational complexity per OFDM symbol can be obtained by adding the complexities of the individual steps $1\rightarrow 4$ , as:

[TABLE]
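The per-step counts listed above can be tallied directly. The sketch below simply adds the counts of steps 1 to 4 exactly as stated in the text for a CM constellation; it does not reproduce the closed-form total, and the function name and dictionary keys are hypothetical.

```python
def conventional_complexity_cm(N, N_P, M):
    """Sum the per-step real operation counts of the conventional receiver (CM case)."""
    n_data = N - N_P
    steps = {
        "channel_estimation": {"RA": 4 * N_P, "RM": 4 * N_P, "RD": 0},
        "interpolation": {"RA": 7 * n_data, "RM": 4 * n_data, "RD": 0},
        "equalization": {"RA": 6 * n_data, "RM": 8 * n_data, "RD": n_data},
        "detection": {"RA": M * n_data, "RM": 2 * M * n_data, "RD": 0},
    }
    return {op: sum(step[op] for step in steps.values()) for op in ("RA", "RM", "RD")}


if __name__ == "__main__":
    # BPSK with N_P = N / 4, the setting used for Table I.
    for N in (128, 256, 512, 1024, 2048):
        print(N, conventional_complexity_cm(N, N // 4, M=2))
```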

For higher modulation orders, such as QAM, the complexity of the conventional OFDM receivers, considering the additional division operations, is computed following the same steps $1\rightarrow 4$ above, and is found to be:

[TABLE]

VI-B Complexity of the $D^{3}$

The complexity of the $D^{3}$ based on the VA is mostly determined by the branch and path metrics calculation. The branch metrics can be computed as

[TABLE]

For CM constellation, the first and last terms are constants, and hence, can be dropped. Therefore,

[TABLE]

By noting that the two terms in (65) are a complex conjugate pair, then

[TABLE]

From the expression in (66), the constant “ $-2$ ” can be dropped from the cost function; however, the problem will then be flipped to a maximization problem. Therefore, by expanding (66), we get,

[TABLE]

By defining $\tilde{d}_{m}\tilde{d}_{n}^{\ast}\triangleq\tilde{u}_{m,n},$ and using complex numbers identities, we get (68),

[TABLE]

For CM, $\Re\left\{\tilde{u}_{m,n}\right\}^{2}+\Im\left\{\tilde{u}_{m,n}\right\}^{2}$ is constant, and hence, it can be dropped from the cost function, which implies that no division operations are required.

To compute $J_{m,n}^{c}$ , it is worth noting that the two terms in brackets are independent of $\left\{m,n\right\}$ , and hence, they are computed only once for each value of $c$ . Therefore, the complexity at each step in the trellis can be computed as $R_{A}=3\times 2^{M}$ , $R_{M}=4+2\times 2^{M}$ and $R_{D}=0$ , where $2^{M}$ is the number of branches at each step in the trellis. However, if the trellis starts or ends with a pilot, then only $M$ computations are required. By noting that the number of full steps is $N-2N_{P}-1$ , and the number of steps that require $M$ computations is $2\left(N_{P}-1\right)$ , then the total computations of the branch metrics (BM) are:

[TABLE]

The path metrics (PM) require $R_{A}^{PM}=\left(N-2N_{P}-1\right)+M\left(N_{P}-1\right)$ real additions. Therefore, the total complexity is:

[TABLE]

For QAM modulation, the most general case for the branch metrics of the $D^{3}$ will be used as,

[TABLE]

The branch metric in (72) requires one complex addition, $C_{A}=1$ , one complex multiplication, $C_{M}=1$ , and two complex divisions, $C_{D}=2$ , per branch metric. Therefore, the total path metric complexity is:

[TABLE]

To compare the complexity of the $D^{3}$ , we use the conventional detector using LS channel estimation, linear interpolation, zero-forcing (ZF) equalization, and MLD, denoted as coherent-L, as a benchmark due to its low complexity. The relative complexity is denoted by $\eta$ , which corresponds to the ratio of the $D^{3}$ complexity to that of the conventional detector, i.e., $\eta_{R_{A}}$ denotes the ratio of real additions and $\eta_{R_{M}}$ corresponds to the ratio of real multiplications. As depicted in Table I, $R_{A}$ for the $D^{3}$ using BPSK is less than that of coherent-L only for $N=128$ , and it becomes larger for all the other considered values of $N$ . For $R_{M}$ , the $D^{3}$ always requires fewer operations than coherent-L, particularly for high values of $N$ , where $\eta_{R_{M}}$ becomes 0.61 for $N=2048$ . It is worth noting that $R_{D}$ in the table corresponds to the number of divisions in the conventional OFDM since the $D^{3}$ does not require any division operations. For a more informative comparison between the two systems, we use the computational power analysis presented in [40], where the total power for each detector is estimated based on the total number of operations. Table I shows the relative computational power $\eta_{P}$ , which shows that the $D^{3}$ detector requires only $0.2$ of the power required by the coherent-L detector for $N=128$ and $0.31\%$ for $N=2048$ .

It is also worth considering the complexity analysis for higher modulation orders that require division operations, such as 16-QAM and 64-QAM, since they are widely used in modern wireless broadband systems [1], [2]. Table II shows the ratios of real additions, multiplications, divisions, and lastly the ratio of the overall computational power for 16-QAM and 64-QAM considering $N=512$ and $N=2048$ . Unlike the CM case, the $D^{3}$ requires division operations, where it is very comparable to conventional OFDM receivers in terms of the division computational resources. Although the total number of addition operations needed is higher in the $D^{3}$ by $25\%-65\%$ , the overall computational resources required by the $D^{3}$ are less than those of the conventional OFDM receivers by $6\%-20\%$ due to the significant saving in the multiplication operations of the $D^{3}$ .

Besides, it is worth noting that linear interpolation has lower complexity as compared to more accurate interpolation schemes such as the spline interpolation [41], [42], which comes at the expense of the error rate performance. Therefore, the results presented in Table I can be generally considered as upper bounds on the relative complexity of the $D^{3}$ . When more accurate interpolation schemes are used, the relative complexity will drop even further as compared to the results in Table I.

VI-C Complexity with Error Correction Coding

To evaluate the impact of the complexity reduction of the $D^{3}$ in the presence of FEC coding, convolutional codes are considered with soft and hard decision decoding using the VA. BPSK is the modulation considered for the complexity evaluation and the code rate is assumed to be $1/2$ . For decoding of convolutional codes, the soft VA requires $n\times 2^{K}$ additions or subtractions and multiplications per decoded bit, where $1/n$ is the code rate and $K$ is the constraint length [43]. Therefore, for $1/2$ code rate, $R_{A}=R_{M}=2^{K+1}$ . Given that each OFDM symbol has $N$ coded bits and $N/2$ information bits, the complexity per OFDM symbol becomes $R_{A}=R_{M}=N\times 2^{K}$ . For the hard VA, $N\times 2^{K}$ XOR operations are required for the branch metric computation, while $N\times 2^{K-1}$ additions are required for the path metric computations. Because the XOR operation is a bit operation, its complexity is much less than that of the addition. Assuming that addition is using an 8-bit representation, then the complexity of an addition operation is about eight times that of the XOR. Therefore, $R_{A}$ , in this case, can be approximated as $N\left(2^{K}+2^{K-2}\right)$ .

As can be noted from Table III, the complexity reduction when the soft VA is used is less significant as compared to the hard VA. Such a result is obtained because the soft VA requires the CSI to compute the reliability factors, which requires $N-N_{P}$ division operations when the $D^{3}$ is used. For hard decoding, the advantage of the $D^{3}$ is significant even for high constraint length values.

VII Numerical Results

This section presents the performance of the $D^{3}$ detector in terms of BER for several operating scenarios. The system model follows the LTE-A physical layer (PHY) specifications [2], where the adopted OFDM symbol has $N=512$ , $N_{\mathrm{CP}}=64$ , the sampling frequency $f_{s}=7.68$ MHz, the subcarrier spacing $\Delta f=15$ kHz, and the pilot grid follows that of Fig. 1. The total OFDM symbol period is $75$ $\mu$s, and the CP period is $4.69$ $\mu$s. The channel models used are the flat Rayleigh fading channel and the typical urban (TUx) multipath fading model [44], which consists of $6$ taps with normalized delays of $\left[0,2,3,9,13,29\right]$ and average tap gains of $\left[0.2,0.398,0.2,0.1,0.063,0.039\right]$ , and which corresponds to a severe frequency-selective channel. The TUx model is also used to model a moderate frequency-selective channel where the number of taps in the channel is $9$ with normalized delays of $[0$ , $1$ , $\ldots$ , $8]$ samples, and the average tap gains are $[0.269$ , $0.174$ , $0.289$ , $0.117$ , $0.023$ , $0.058$ , $0.036$ , $0.026$ , $0.008]$ . The channel tap gains are assumed to be independent and Rayleigh distributed. The Monte Carlo simulation results included in this work are obtained by generating $10^{6}$ OFDM symbols per simulation run. Throughout this section, the ML coherent detector with perfect CSI will be denoted as coherent, while the coherent detectors with linear and spline interpolation will be denoted as coherent-L and coherent-S, respectively. Moreover, the results are presented for the SISO system, $\mathcal{N}=1$ , unless it is mentioned otherwise. The SNR in the obtained results is defined as the ratio of the average received signal power to the average noise power regardless of the number of pilots. Such an approach is followed because the proposed system in this work is evaluated in the context of the LTE RB, which has a fixed structure. For more general comparisons, the power and spectral efficiency of all considered systems should be identical.
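One way to draw a realization of the 6-tap channel described above is sketched below. It is an illustration only, not the simulator used for the reported results: each tap is assumed to be an independent zero-mean complex Gaussian variable whose variance equals the listed average gain (so its envelope is Rayleigh), and the frequency response is obtained with an $N$-point FFT. All names are hypothetical.

```python
import numpy as np

N_FFT = 512
DELAYS_6TAP = np.array([0, 2, 3, 9, 13, 29])                  # in samples
GAINS_6TAP = np.array([0.2, 0.398, 0.2, 0.1, 0.063, 0.039])   # average tap powers


def draw_channel_frequency_response(delays, gains, n_fft=N_FFT, rng=None):
    """Draw one channel realization and return its n_fft-point frequency response."""
    rng = np.random.default_rng() if rng is None else rng
    n_taps = len(gains)
    # Complex Gaussian taps with E|h_l|^2 = gains[l].
    taps = np.sqrt(gains / 2) * (rng.standard_normal(n_taps) + 1j * rng.standard_normal(n_taps))
    h = np.zeros(n_fft, dtype=complex)
    h[delays] = taps
    return np.fft.fft(h)


if __name__ == "__main__":
    H = draw_channel_frequency_response(DELAYS_6TAP, GAINS_6TAP, rng=np.random.default_rng(0))
    print("total average tap power:", GAINS_6TAP.sum())
    print("mean |H_k|^2 for this realization:", np.mean(np.abs(H) ** 2))
```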

Fig. 6 shows the BER of the single-sided (SS) and double-sided (DS) $D^{3}$ over flat fading channels for $\mathcal{K}=2,6$ and $3,7$ , respectively, and using BPSK. The number of data symbols is $\mathcal{K}_{D}=\mathcal{K}-1$ for the SS and $\mathcal{K}_{D}=\mathcal{K}-2$ for the DS because there are two pilot symbols at both ends of the data segment for the DS case. The results in the figure for the SS show that $\mathcal{K}$ has a noticeable impact on the BER where the difference between the $\mathcal{K}=2$ and $6$ cases is about $1.6$ dB at BER of $10^{-3}$ . For the DS segment, the BER has the same trends as the SS except that it becomes closer to the coherent case because having more pilots reduces the probability of sequence inversion due to the phase ambiguity problem. The figure shows that the approximated and simulation results match very well for all cases, which confirms the accuracy of the derived approximations.

The effect of the frequency selectivity is illustrated in Fig. 6 for the SS and DS configurations using $\mathcal{K}_{D}=1$ . As can be noted from the figure, frequency-selective channels introduce error floors at high SNRs, which is due to the difference between adjacent channel values caused by the channel frequency selectivity. Furthermore, the figure shows a close match between the simulation and the derived approximations. The approximation results are presented only for $\mathcal{K}=2$ because evaluating the BER for $\mathcal{K}>2$ becomes computationally prohibitive. For example, evaluating the integral (29) for $\mathcal{K}=3$ requires solving a $6$ -fold integral. The results for the frequency-selective channels are quite different from the flat fading cases. In particular, the BER performance drastically changes when the DS pilot segment is used. Moreover, the impact of the frequency selectivity is significant, particularly for the SS pilot case.

Fig. 8 shows the BER of the $1\times 2$ SIMO $D^{3}$ over flat fading channels for SS and DS pilot segments. It can be noted from the figure that the maximum ratio combiner (MRC) BER with perfect CSI outperforms the DS and SS systems by about $2$ and $3$ dB, respectively. Moreover, the figure shows that the MLSD [31] and the $D^{3}$ have equivalent BER for the SISO and SIMO scenarios. The figure also shows the BER of the 1×2 SIMO systems as compared to the SISO case.

Fig. 8 shows the BER of the SISO and $1\times 2$ SIMO MLSD, coherent, coherent-S and coherent-L systems over frequency-selective channels. For both SISO and SIMO, the BER of all the considered techniques converges at low SNRs because the AWGN dominates the BER in the low SNR range. For moderate and high SNRs, the $D^{3}$ outperforms all the other considered techniques except for the coherent, where the difference is about $3.5$ and $2.75$ dB at BER of $10^{-3}$ for the SISO and SIMO systems, respectively.

Fig. 10 compares the BER of the $D^{3}$ , PSP [26], MLSD [31], MSDD [29], and the coherent detector over the 6-tap channel using BPSK. As can be noted from the figure, the $D^{3}$ noticeably outperforms all other detectors for $\mathrm{SNR}\gtrsim 15$ dB, which indicates that the $D^{3}$ is more robust to the frequency selectivity of the channel. Moreover, the figure shows the $D^{3}$ BER using the VA which, as expected, is identical to the BER obtained using (11). It is worth noting that all the systems considered in the figure are implemented using the DS segment where $\mathcal{K}=7$ , and thus, they are evaluated under similar throughput conditions. However, the BER sensitivity of each technique to the number of pilot symbols could be different from other techniques, which implies that some of these techniques might be able to provide roughly the same BER but using fewer pilot symbols. The same argument applies to the power efficiency as well, because the power allocated per information bit becomes different for various systems. However, because the LTE RB is used as the basis for testing all systems, the current comparison can be considered generally fair. In the worst case scenario, i.e., considering that all other systems are fully blind, the throughput and power loss is only 4.7% as described in Subsection IV-B, which has a negligible effect on the BER.

Fig. 10 shows the BER for the $D^{3}$ , MLSD [31], coherent, coherent-L and coherent-S using $16$ -QAM. As can be noted from the figure, the MLSD slightly outperforms the $D^{3}$ at low SNRs, and the coherent-S outperforms the $D^{3}$ at high SNRs. However, the coherent-S has generally much higher complexity.

Fig. 12 shows the simulated BER of the $D^{3}$ system when it is used to detect a complete RB as described in Subsection IV-B. The channel model is similar to the 6-tap channel described above, and the channel gain variation over consecutive OFDM symbols is generated using Jakes' model with maximum Doppler frequency $f_{d}=\frac{V}{c}\,f_{c}$, where $V$ is the speed of the vehicle, $c=3\times 10^{8}$ m/s is the speed of light, and $f_{c}=1.9$ GHz is the carrier frequency. The channel is considered quasi-static, i.e., it remains constant over one OFDM symbol period but changes over consecutive symbols. As the figure indicates, the $D^{3}$ is more immune to channel mobility at $50$ km/h than pilot-based systems, as it does not exhibit an error floor. For the high-mobility case, $V=300$ km/h, the $D^{3}$ BER exhibits an error floor at about $6\times 10^{-4}$, which is much lower than the error floor of the coherent detector with linear and spline interpolation.
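To make the two mobility cases concrete, the following minimal sketch evaluates $f_{d}=\frac{V}{c}\,f_{c}$ for the speeds and carrier frequency quoted above. The function name `max_doppler_hz` and the km/h to m/s conversion are illustrative choices and do not come from the paper.

```python
# Maximum Doppler frequency f_d = (V / c) * f_c for the two mobility cases.
# max_doppler_hz is a hypothetical helper name used only for illustration.

SPEED_OF_LIGHT = 3e8      # m/s, value used in the text
CARRIER_HZ = 1.9e9        # 1.9 GHz carrier, value used in the text


def max_doppler_hz(speed_kmh, carrier_hz=CARRIER_HZ, c=SPEED_OF_LIGHT):
    """Return the maximum Doppler shift in Hz for a vehicle speed in km/h."""
    speed_ms = speed_kmh / 3.6          # convert km/h to m/s
    return speed_ms / c * carrier_hz


if __name__ == "__main__":
    for v in (50, 300):
        print(f"V = {v} km/h -> f_d = {max_doppler_hz(v):.1f} Hz")
```

With these values the Doppler shift is roughly 88 Hz at $50$ km/h and roughly 528 Hz at $300$ km/h, i.e., the high-mobility case is six times faster-varying.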

Fig. 12 shows the simulated BER of the $D^{3}$ using convolutional codes with hard-decision decoding, using the widely used $(171, 131)$ convolutional code with a block length of $256$ bits and a $512\times 512$ channel block interleaver. Moreover, the results without an interleaver are considered, which corresponds to the case of slow fading channels with very long coherence time. As can be noted from the figure, the BERs of the $D^{3}$ and coherent-L are comparable over the considered range of SNR when the block interleaver is used. In contrast, with no interleaving, the $D^{3}$ offers about a $5$ dB advantage at a BER of $10^{-6}$. Both detectors are approximately $3$ dB away from the coherent detector with perfect CSI.
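The coding chain described above can be sketched as follows. This is only an illustration under stated assumptions: the generators are read as octal numbers (`0o171`, `0o131`) with constraint length 7, the newest input bit is placed in the least significant bit of the shift register (bit-ordering conventions vary between references), and the interleaver is a plain write-by-rows, read-by-columns block interleaver. The function names are hypothetical, and small dimensions are used instead of $512\times 512$.

```python
# Rate-1/2 convolutional encoder and a row/column block interleaver.
# Assumptions (not from the paper): octal generators, constraint length 7,
# newest bit stored in the LSB of the shift register.


def conv_encode(bits, generators=(0o171, 0o131), constraint_len=7):
    """Encode a bit sequence; emits one output bit per generator per input bit."""
    mask = (1 << constraint_len) - 1
    state = 0
    out = []
    for b in bits:
        # Shift the new bit into the register and keep constraint_len bits.
        state = ((state << 1) | (b & 1)) & mask
        for g in generators:
            # Output bit is the parity of the tapped register positions.
            out.append(bin(state & g).count("1") & 1)
    return out


def block_interleave(symbols, rows, cols):
    """Write row by row into a rows x cols array, read column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("input length must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(symbols, rows, cols):
    """Invert block_interleave for the same rows and cols."""
    return block_interleave(symbols, cols, rows)


if __name__ == "__main__":
    coded = conv_encode([1, 0, 1, 1, 0, 0])        # 12 coded bits
    mixed = block_interleave(coded, rows=3, cols=4)
    assert block_deinterleave(mixed, rows=3, cols=4) == coded
    print(coded, mixed)
```

The interleaver spreads adjacent coded bits across the block so that a deep fade affects bits that are far apart after deinterleaving, which is why the no-interleaver case corresponds to slowly varying channels.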

VIII Conclusion and Future Work

This work proposed a new receiver design for OFDM-based broadband communication systems. The new receiver performs detection directly from the FFT output symbols without the conventional steps of channel estimation, interpolation, and equalization, which leads to a considerable complexity reduction. Moreover, the $D^{3}$ system can be deployed efficiently using the VA. The proposed system was analyzed theoretically, and simple closed-form expressions were derived for the BER in several cases of interest. The analytical and simulation results show that the $D^{3}$ outperforms the coherent pilot-based receiver in terms of BER in various channel conditions, particularly in frequency-selective channels, where the $D^{3}$ demonstrated high robustness.

Although the $D^{3}$ may perform well even in severe fading conditions, it is crucial to evaluate its sensitivity to various practical imperfections. Thus, we will consider in our future work the performance of the $D^{3}$ in the presence of various system imperfections such as phase noise, synchronization errors and IQ imbalance. Moreover, we will evaluate the $D^{3}$ performance in mobile fading channels, where the channel variation may introduce intercarrier interference.

Appendix I

By defining the events $A_{\psi}>A_{n}\triangleq E_{\psi,n}$ , $n\in\left\{0\text{, }1\text{, }\ldots,\psi-1\right\}$ , then,

[TABLE]

Using the chain rule, $P_{C}|_{\mathbf{H}_{0},\mathbf{1}}$ can be written as,

[TABLE]

For $\mathcal{K}=2$, $\psi=1$, $\tilde{\mathbf{d}}_{0}^{(0)}=[1, -1]$, $\tilde{\mathbf{d}}_{0}^{(1)}=[1, 1]$, and thus,

[TABLE]

For $\mathcal{K}=3$, $\psi=3$, $\tilde{\mathbf{d}}_{0}^{(0)}=[1, 1, -1]$, $\tilde{\mathbf{d}}_{0}^{(1)}=[1, -1, -1]$, $\tilde{\mathbf{d}}_{0}^{(2)}=[1, -1, 1]$ and $\tilde{\mathbf{d}}_{0}^{(3)}=[1, 1, \ldots, 1]$. Using the chain rule

[TABLE]

However, $\Pr\left(E_{3,0}\right)=\Pr\left(A_{3}>A_{0}\right)$ , and thus

[TABLE]

The second term in (79) can be evaluated by noting that the events $E_{3,1}$ and $E_{3,0}$ are independent. Therefore $\Pr\left(E_{3,1}|E_{3,0}\right)=\Pr\left(E_{3,1}\right)$ , which can be computed as

[TABLE]

The first term in (79) $\Pr\left(E_{3,2}|E_{3,1}\text{, }E_{3,0}\right)=1$ because if $A_{3}>\left\{A_{1},A_{0}\right\}$ , then $A_{3}>A_{2}$ as well. Consequently,

[TABLE]
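As a compact restatement of the $\mathcal{K}=3$ argument, the factorization implied by the three steps described in the text can be sketched as follows. It is assembled from the surrounding prose under the assumption that $P_{C}|_{\mathbf{H}_{0},\mathbf{1}}$ is the joint probability of the events $E_{3,0}$, $E_{3,1}$ and $E_{3,2}$; it is not a reproduction of the paper's numbered equations.

$$
\begin{aligned}
P_{C}|_{\mathbf{H}_{0},\mathbf{1}} &= \Pr\left(E_{3,2},E_{3,1},E_{3,0}\right)\\
&= \Pr\left(E_{3,2}|E_{3,1},E_{3,0}\right)\Pr\left(E_{3,1}|E_{3,0}\right)\Pr\left(E_{3,0}\right)\\
&= 1\cdot\Pr\left(E_{3,1}\right)\Pr\left(E_{3,0}\right).
\end{aligned}
$$

The last line uses the independence of $E_{3,1}$ and $E_{3,0}$ and the fact that $A_{3}>\left\{A_{1},A_{0}\right\}$ implies $A_{3}>A_{2}$.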

By induction, it is straightforward to show that $P_{C}|_{\mathbf{H}_{0},\mathbf{1}}$ can be written as,

[TABLE]

Bibliography

[1] IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Broadband Wireless Access Systems Amendment 3: Advanced Air Interface, IEEE Std. 802.16m, 2011.
[2] LTE; Evolved Universal Terrestrial Radio Access (E-UTRA), LTE physical layer, 3GPP TS 36.300, 2011.
[3] T. Hwang, C. Yang, G. Wu, S. Li, and G. Y. Li, “OFDM and its wireless applications: A survey,” IEEE Trans. Veh. Technol., vol. 58, no. 4, pp. 1673–1694, May 2009.
[4] D. Tsonev et al., “A 3-Gb/s single-LED OFDM-based wireless VLC link using a gallium nitride $\mu$LED,” IEEE Photon. Technol. Lett., vol. 26, no. 7, pp. 637-40, Apr. 2014.
[5] S. Dissanayake and J. Armstrong, “Comparison of ACO-OFDM, DCO-OFDM and ADO-OFDM in IM/DD systems,” J. Lightw. Technol., vol. 31, no. 7, pp. 1063-72, Apr. 2013.
[6] P. Guan et al., “5G field trials: OFDM-based waveforms and mixed numerologies,” IEEE J. Sel. Areas Commun., vol. 35, no. 6, pp. 1234-1243, June 2017.
[7] M. Agiwal, A. Roy, and N. Saxena, “Next generation 5G wireless networks: a comprehensive survey,” IEEE Commun. Surveys & Tutorials, vol. 18, no. 3, pp. 1617-1655, thirdquarter 2016.
[8] W. Zhang, Q. Yin, W. Wang, and F. Gao, “One-shot blind CFO and channel estimation for OFDM with multi-antenna receiver,” IEEE Trans. Signal Process., vol. 62, no. 15, pp. 3799-3808, Aug. 2014.
