Precoding for the Sparsely Spread MC-CDMA Downlink with Discrete-Alphabet Inputs

Min Li; Chunshan Liu; and Stephen V. Hanly

arXiv:1702.02634·cs.IT·February 10, 2017

TL;DR

This paper introduces a power-efficient precoding algorithm for sparse MC-CDMA downlink systems with discrete inputs, reducing complexity while maintaining performance through sparse signatures.

Contribution

It proposes a novel precoding method leveraging sparse signatures for downlink MC-CDMA, improving power efficiency and reducing computational complexity compared to traditional methods.

Findings

01 Significance of signature sparsity for reducing precoding complexity.

02 Power reduction gains over zero-forcing precoding.

03 Sparse MC-CDMA achieves similar throughput with lower complexity.

Abstract

Sparse signatures have been proposed for the CDMA uplink to reduce multi-user detection complexity, but they have not yet been fully exploited for its downlink counterpart. In this work, we propose a Multi-Carrier CDMA (MC-CDMA) downlink communication, where regular sparse signatures are deployed in the frequency domain. Taking the symbol detection point of view, we formulate a problem appropriate for the downlink with discrete alphabets as inputs. The solution to the problem provides a power-efficient precoding algorithm for the base station, subject to minimum symbol error probability (SEP) requirements at the mobile stations. In the algorithm, signature sparsity is shown to be crucial for reducing precoding complexity. Numerical results confirm system-load-dependent power reduction gain from the proposed precoding over the zero-forcing precoding and the regularized zero-forcing…

Tables (3)

Table 1. TABLE I: Precoding Algorithm with Parallel Computation Units

1) Initialize: dual variables $\boldsymbol{\lambda}^{(0)}>\mathbf{0}$, $\boldsymbol{\nu}^{(0)}$;

2) Repeat for $T$ iterations until convergence criterion is met:

2.1) For $n=1,\dots,N$: PSN unit $n$ computes its primal variables $\tilde{\mathbf{x}}_{[2n-1:2n]}$ using (V-A), and broadcasts the updated values to its neighboring OSN units;

2.2) For $k=1,\dots,K$: OSN unit $k$ computes its dual variables $\{\boldsymbol{\lambda}_{[4k-3:4k]},\boldsymbol{\nu}_{[2k-1:2k]}\}$ using (V-A) and broadcasts the updated values to its neighboring PSN units.

Table 2. TABLE II: Computational complexity of the proposed algorithm

Schemes \ Operations	Step	“ $+$ ” / iteration	“ $\times$ ” / iteration	Complexity ( $T$ iterations)
4-QAM	compute (22)	$4KL-2N$	$4KL$	$\mathcal{O}(16KLT)$
4-QAM	compute (24)	$12KL+10K$	$8KL+4K$	$\mathcal{O}(16KLT)$
16-QAM	compute (22)	$8KL-2N$	$8KL$	$\mathcal{O}(20KLT)$
16-QAM	compute (24)	$12KL+10K$	$8KL+4K$	$\mathcal{O}(20KLT)$

Table 3. TABLE III: Integral Parameters for the 16-QAM Signaling

	$+ \infty / β_{k}$	$β_{k} / β_{k}$	$β_{k} / + \infty$
$+ \infty / β_{k}$	$D_{1}$	$D_{{2, 4}}$	$D_{3}$
$β_{k} / β_{k}$	$D_{{5, 13}}$	$D_{{6, 8, 14, 16}}$	$D_{{7, 15}}$
$β_{k} / + \infty$	$D_{9}$	$D_{{10, 12}}$	$D_{11}$

Equations

$$\tilde{\mathbf{y}}_{k}=\tilde{\mathbf{h}}_{k}\circ\mathbf{x}+\tilde{\mathbf{z}}_{k},$$

$$y_{k}=\sum_{n=1}^{N}\underbrace{(s_{k,n}\tilde{h}_{k,n})}_{h_{k,n}}x_{n}+\underbrace{\mathbf{s}_{k}^{T}\tilde{\mathbf{z}}_{k}}_{z_{k}}=\sum_{n=1}^{N}h_{k,n}x_{n}+z_{k},$$

$$\underbrace{\begin{bmatrix}y_{1}\\ y_{2}\\ \vdots\\ y_{K}\end{bmatrix}}_{\mathbf{y}}=\underbrace{\begin{bmatrix}h_{1,1}&\cdots&\cdots&h_{1,N}\\ h_{2,1}&\cdots&\cdots&h_{2,N}\\ \vdots&\vdots&\vdots&\vdots\\ h_{K,1}&\cdots&\cdots&h_{K,N}\end{bmatrix}}_{\mathbf{H}}\underbrace{\begin{bmatrix}x_{1}\\ x_{2}\\ \vdots\\ x_{N}\end{bmatrix}}_{\mathbf{x}}+\underbrace{\begin{bmatrix}z_{1}\\ z_{2}\\ \vdots\\ z_{K}\end{bmatrix}}_{\mathbf{z}}.$$

$$\Pr\left(y_{k}=(\bar{y}_{k}+z_{k})\notin\mathcal{A}(d_{k})\right)\leq Pe_{k}.$$

$$\mathbf{x}=\mathbf{H}^{\dagger}(\mathbf{H}\mathbf{H}^{\dagger})^{-1}\mathbf{d},$$

$$\mathcal{D}_{\text{S}}=\left\{a_{R}+ja_{I}\;\middle|\;a_{R},a_{I}\in\{\pm 1,\pm 3,\dots,\pm(\sqrt{M}-1)\}\right\}$$

$$\frac{1}{\sqrt{2\pi}\sigma}\int_{-\beta_{k}}^{+\infty}e^{-\frac{(z_{k}^{(r)})^{2}}{2\sigma^{2}}}\,dz_{k}^{(r)}\geq\sqrt{1-Pe_{k}},$$

$$\beta_{k}^{-}=-\sigma Q^{-1}\left(\sqrt{1-Pe_{k}}\right),$$

$$\frac{1}{\sqrt{2\pi}\sigma}\int_{-\beta_{k}}^{+\beta_{k}}e^{-\frac{(z_{k}^{(r)})^{2}}{2\sigma^{2}}}\,dz_{k}^{(r)}\geq\sqrt{1-Pe_{k}},$$

$$\beta_{k}^{-}=\sigma Q^{-1}\left(0.5-0.5\sqrt{1-Pe_{k}}\right).$$

$$\mathcal{P}:\left\{\begin{array}{l}\min\limits_{\mathbf{x}\in\mathbb{C}^{N\times 1}}P(\mathbf{x})=\mathbf{x}^{\dagger}\mathbf{x}\\ \text{subject to }\Pr\left((\bar{y}_{k}+z_{k})\notin\mathcal{A}(d_{k})\right)\leq Pe_{k},\\ \text{for the transmitted data set }\{d_{k}\in\mathcal{D}_{k},\;k=1,\ldots,K\}.\end{array}\right.$$

$$\underbrace{\frac{1}{\sqrt{2\pi}\sigma}\int_{I_{-}^{(r)}(d_{k})}^{I_{+}^{(r)}(d_{k})}e^{-\frac{(z_{k}^{(r)})^{2}}{2\sigma^{2}}}\,dz_{k}^{(r)}}_{O^{(r)}}\times\underbrace{\frac{1}{\sqrt{2\pi}\sigma}\int_{I_{-}^{(i)}(d_{k})}^{I_{+}^{(i)}(d_{k})}e^{-\frac{(z_{k}^{(i)})^{2}}{2\sigma^{2}}}\,dz_{k}^{(i)}}_{O^{(i)}}\geq 1-Pe_{k},$$

$$\mathcal{P}_{1}:\left\{\begin{array}{l}\min\limits_{\tilde{\mathbf{x}}\in\mathbb{R}^{2N\times 1}}P(\tilde{\mathbf{x}})=\tilde{\mathbf{x}}^{T}\tilde{\mathbf{x}}\\ \text{s. t. }\mathbf{A}\tilde{\mathbf{x}}-\mathbf{c}\preceq\mathbf{0},\\ \quad\;\;\mathbf{B}\tilde{\mathbf{x}}-\mathbf{e}=\mathbf{0},\end{array}\right.$$

$$\mathbf{H}=\begin{bmatrix}1+j&-1+j&0\\ 0&1+j&1-j\end{bmatrix}.$$

$$\begin{bmatrix}\bar{y}_{1}\\ \bar{y}_{2}\end{bmatrix}=\mathbf{H}\mathbf{x}=\begin{bmatrix}(1+j)x_{1}+(-1+j)x_{2}\\ (1+j)x_{2}+(1-j)x_{3}\end{bmatrix}.$$

$$\beta-\delta_{0}\leq\bar{y}_{1}^{(r)}\leq\beta+\delta_{0},\quad\beta-\delta_{0}\leq\bar{y}_{1}^{(i)}\leq\beta+\delta_{0},$$

$$-\bar{y}_{2}^{(r)}\leq-\left(2\beta-\sigma Q^{-1}\left(\sqrt{1-Pe}\right)\right),\quad-\bar{y}_{2}^{(i)}\leq-\left(2\beta-\sigma Q^{-1}\left(\sqrt{1-Pe}\right)\right),$$

$$\mathbf{A}=\begin{bmatrix}1&-1&-1&-1&0&0\\ -1&1&1&1&0&0\\ 1&1&1&-1&0&0\\ -1&-1&-1&1&0&0\\ 0&0&-1&1&-1&-1\\ 0&0&0&0&0&0\\ 0&0&-1&-1&1&-1\\ 0&0&0&0&0&0\end{bmatrix},\quad\mathbf{c}=\begin{bmatrix}\beta+\delta_{0}\\ -(\beta-\delta_{0})\\ \beta+\delta_{0}\\ -(\beta-\delta_{0})\\ -\left(2\beta-\sigma Q^{-1}\left(\sqrt{1-Pe}\right)\right)\\ 0\\ -\left(2\beta-\sigma Q^{-1}\left(\sqrt{1-Pe}\right)\right)\\ 0\end{bmatrix}.$$

$$L(\tilde{\mathbf{x}},\boldsymbol{\lambda},\boldsymbol{\nu})=\tilde{\mathbf{x}}^{T}\tilde{\mathbf{x}}+\boldsymbol{\lambda}^{T}(\mathbf{A}\tilde{\mathbf{x}}-\mathbf{c})+\boldsymbol{\nu}^{T}(\mathbf{B}\tilde{\mathbf{x}}-\mathbf{e}),$$

$$\max_{\boldsymbol{\lambda},\boldsymbol{\nu}}\;g(\boldsymbol{\lambda},\boldsymbol{\nu}),\quad\text{subject to }\boldsymbol{\lambda}\succeq\mathbf{0},$$

$$\tilde{\mathbf{x}}^{*(t)}=-\frac{1}{2}\left(\mathbf{A}^{T}\boldsymbol{\lambda}^{(t)}+\mathbf{B}^{T}\boldsymbol{\nu}^{(t)}\right),$$

$$\tilde{x}_{l}^{*(t)}=-\frac{1}{2}\left(\sum_{i\in\mathcal{I}(\tilde{x}_{l})}a_{i,l}\lambda_{i}^{(t)}+\sum_{i\in\mathcal{I}^{\prime}(\tilde{x}_{l})}b_{i,l}\nu_{i}^{(t)}\right),\quad l=1,\dots,2N,$$

$$g(\boldsymbol{\lambda},\boldsymbol{\nu})=\tilde{\mathbf{x}}^{*(t)T}\tilde{\mathbf{x}}^{*(t)}+\boldsymbol{\lambda}^{T}(\mathbf{A}\tilde{\mathbf{x}}^{*(t)}-\mathbf{c})+\boldsymbol{\nu}^{T}(\mathbf{B}\tilde{\mathbf{x}}^{*(t)}-\mathbf{e}).$$

$$\lambda_{i}^{(t+1)}=\left[\lambda_{i}^{(t)}+\frac{t-1}{t+2}\left(\lambda_{i}^{(t)}-\lambda_{i}^{(t-1)}\right)+\frac{1}{2\kappa}\left(\sum_{l\in\mathcal{I}(\lambda_{i})}a_{i,l}\hat{x}_{l}^{(t)}-c_{i}\right)\right]^{+},\quad i=1,\dots,4K,$$

$$\nu_{j}^{(t+1)}=\nu_{j}^{(t)}+\frac{t-1}{t+2}\left(\nu_{j}^{(t)}-\nu_{j}^{(t-1)}\right)+\frac{1}{2\kappa}\left(\sum_{l\in\mathcal{I}(\nu_{j})}b_{j,l}\hat{x}_{l}^{(t)}-e_{j}\right),\quad j=1,\dots,2K,$$

$$\hat{x}_{l}^{(t)}=\tilde{x}_{l}^{*(t)}+\frac{t-1}{t+2}\left(\tilde{x}_{l}^{*(t)}-\tilde{x}_{l}^{*(t-1)}\right),\quad l=1,\dots,2N.$$
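The primal and dual updates listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it runs the projected, momentum-weighted dual iteration on the toy $\mathbf{A}$ and $\mathbf{c}$ from the two-user example, which has no equality constraints. The values of `beta`, `delta0`, `sigma` and `pe`, the all-ones dual initialization, and the step-size rule `kappa = ||A||_F^2 / 4` are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist


def q_inv(p):
    """Inverse Gaussian Q-function, where Q(x) = 1 - Phi(x)."""
    return NormalDist().inv_cdf(1.0 - p)


# Assumed illustrative parameters (not taken from the paper).
beta, delta0, sigma, pe = 1.0, 0.2, 0.25, 1e-2
tau = 2 * beta - sigma * q_inv(math.sqrt(1 - pe))

# Toy constraints A x~ <= c, with x~ = [Re x1, Im x1, Re x2, Im x2, Re x3, Im x3].
A = [
    [1, -1, -1, -1, 0, 0],
    [-1, 1, 1, 1, 0, 0],
    [1, 1, 1, -1, 0, 0],
    [-1, -1, -1, 1, 0, 0],
    [0, 0, -1, 1, -1, -1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, -1, -1, 1, -1],
    [0, 0, 0, 0, 0, 0],
]
c = [beta + delta0, -(beta - delta0), beta + delta0, -(beta - delta0),
     -tau, 0.0, -tau, 0.0]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def primal_from_dual(M, lam):
    """x*(lambda) = -(1/2) A^T lambda, the minimizer of the Lagrangian."""
    return [-0.5 * sum(M[i][l] * lam[i] for i in range(len(M)))
            for l in range(len(M[0]))]


def dual_ascent(M, c_vec, T=5000):
    # Assumed step rule: 1/(2*kappa) stays below the inverse Lipschitz
    # constant of the dual gradient because ||A||_F^2 >= lambda_max(A A^T).
    kappa = sum(a * a for row in M for a in row) / 4.0
    lam = [1.0] * len(M)            # strictly positive initial duals
    lam_prev = lam[:]
    x = primal_from_dual(M, lam)
    x_prev = x[:]
    for t in range(1, T + 1):
        w = (t - 1) / (t + 2)       # momentum weight from the update rules
        x_hat = [xi + w * (xi - xp) for xi, xp in zip(x, x_prev)]
        slack = [r - ci for r, ci in zip(matvec(M, x_hat), c_vec)]
        lam_next = [max(0.0, li + w * (li - lp) + s / (2.0 * kappa))
                    for li, lp, s in zip(lam, lam_prev, slack)]
        lam_prev, lam = lam, lam_next
        x_prev, x = x, primal_from_dual(M, lam)
    return x, lam


x_opt, lam_opt = dual_ascent(A, c)
power = sum(v * v for v in x_opt)
```

Each entry of `x_hat` and `lam_next` depends only on the nonzero entries of one column or row of `A`, which is what lets the PSN and OSN units of Table I work in parallel. Under the assumed parameters, the imaginary part of $\bar{y}_{1}$ should settle near the upper edge of its box rather than at the nominal point, a small instance of the relaxation lowering transmit power.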

$$d=D_{l}+2\beta_{k}\sqrt{M}\,(a_{R}+ja_{I}),$$

Full text

Precoding for the Sparsely Spread MC-CDMA Downlink with Discrete-Alphabet Inputs

Min Li, Chunshan Liu, and Stephen V. Hanly

Copyright (c) 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]. This work was presented in part in Proceedings of the IEEE International Conference on Communications (ICC), Sydney, Australia, June 2014. This research was supported in part by the Australian Research Council under grant DP130101760, and by the CSIRO Macquarie University Chair in Wireless Communications. This Chair has been established with funding provided by the Science and Industry Endowment Fund. The authors are with the Department of Engineering, Macquarie University, Macquarie Park, NSW 2113, Australia (e-mail: {min.li, chunshan.liu, stephen.hanly}@mq.edu.au).

Abstract

Sparse signatures have been proposed for the CDMA uplink to reduce multi-user detection complexity, but they have not yet been fully exploited for its downlink counterpart. In this work, we propose a Multi-Carrier CDMA (MC-CDMA) downlink communication, where regular sparse signatures are deployed in the frequency domain. Taking the symbol detection point of view, we formulate a problem appropriate for the downlink with discrete alphabets as inputs. The solution to the problem provides a power-efficient precoding algorithm for the base station, subject to minimum symbol error probability (SEP) requirements at the mobile stations. In the algorithm, signature sparsity is shown to be crucial for reducing precoding complexity. Numerical results confirm system-load-dependent power reduction gain from the proposed precoding over the zero-forcing precoding and the regularized zero-forcing precoding with optimized regularization parameter under the same SEP targets. For a fixed system load, it is also demonstrated that sparse MC-CDMA with a proper choice of sparsity level attains almost the same power efficiency and link throughput as that of dense MC-CDMA yet with reduced precoding complexity, thanks to the sparse signatures.

Index Terms:

CDMA, discrete alphabets, MC-CDMA, power efficiency, precoding, sparse signature, symbol error probability.

I Introduction

I-A Motivations and Contributions

Multi-Carrier Code Division Multiple Access (MC-CDMA) is a multi-access scheme based on the Orthogonal Frequency Division Multiplexing (OFDM) method. Since its invention, MC-CDMA has attracted broad interest, see, e.g., [1, 2, 3, 4] and the references therein. MC-CDMA naturally integrates CDMA’s flexible multiuser access with interference suppression capability and the advantages of multicarrier OFDM, including robustness against frequency-selective fading. Therefore, it has the potential to be one of the candidates to support massive access and provide reliable data communication and better coverage for future-generation wireless systems.

As in all CDMA systems, MC-CDMA may experience severe multi-access interference due to the loss of user orthogonality, which may occur, in particular, in frequency-selective channel environments. For such systems, optimal detection entails an exponential number of hypothesis tests on the data symbols of all users and thus could be computationally demanding, time-consuming and even infeasible in a large system with conventional dense signature design. To circumvent the complexity issue, sparse signatures, whose fraction of non-zero entries is small, have been introduced and exploited for the CDMA uplink multi-user detection [5, 6, 7, 8, 9, 10]. In particular, the belief-propagation algorithm has been proposed for such a system. This algorithm has a natural implementation using parallel computation units, and it is fast and provably optimal for different ensembles of sparse signatures in the large system limits [5, 6, 7, 8]. Inspired by the belief-propagation algorithm, references [9] and [10] have developed reduced-complexity soft-in soft-out (SISO) and Turbo iterative multiuser detection algorithms for the sparse CDMA uplink and a sparse-signature OFDM uplink, respectively.

In this work, we formulate a different problem, appropriate for the MC-CDMA downlink counterpart, from the symbol detection point of view. The solution to the problem provides a power-efficient precoding algorithm for the base station (BS). The precoding is implemented at the BS, allowing the mobile stations (MSs) to use simple conventional single-user matched filters and standard single-user symbol detection. This simplifies the implementation of the receiver, as compared to the conventional MC-CDMA downlink transceiver design, where multi-user interference is normally mitigated by a frequency-domain equalization at the receiver [2]. Moreover, the proposed algorithm optimizes the signals transmitted on different subcarriers so that they can be constructively combined at each receiver, leading to a power-efficient precoding. The use of random sparse signatures was suggested for the MC-CDMA downlink in [11] to allow low-complexity iterative multiuser detection at each MS. Here, in contrast, we show that sparsity can be exploited to reduce the complexity of the precoding, as compared to MC-CDMA with dense signatures. In addition, using sparse signatures simplifies channel measurement, since each MS only needs to estimate channels for the small number of subcarriers it occupies.

The contributions of this paper are summarized as follows:

•

We introduce the MC-CDMA downlink communication with regular sparse signatures, where each MS has access to an equal number of subcarriers and each subcarrier has (roughly) equal load. We also consider a bipartite graph representation for the system studied and use it to facilitate algorithm design and complexity analysis.

•

Assume that data symbols intended for MSs are drawn from discrete alphabets. We take the symbol detection point of view and introduce the minimum Symbol Error Probability (SEP) as a Quality of Service (QoS) metric for each MS. We formulate the precoding problem as a transmit power optimization problem subject to minimum SEP requirements at MSs. We translate the SEP targets into constraint regions on noiseless received signal components at MSs and characterize them via a conservative approximation to make the problem tractable. Detailed formulation procedures are provided for systems with both standard 4/16-QAM constellations and Tomlinson-Harashima replica points.

•

We develop a precoding algorithm that accommodates parallel computation units via the dual-decomposition theory. Aided by the graph representation of the system, the complexity of the algorithm is characterized in terms of the number of message passings between computation units and the number of additions and multiplications for precoding calculation. Signature sparsity is shown to play a vital role in reducing precoding complexity.

•

We demonstrate that the proposed optimized precoding generally outperforms the conventional zero-forcing (ZF) and the optimized Regularized ZF (RZF) precodings in terms of power efficiency under the same SEP requirements. The exact gain depends on the system load and is very significant for a fully loaded system. We also demonstrate that, for a fixed system load, sparse MC-CDMA, e.g., with a suitably chosen, relatively small number of subcarriers allocated to each MS, attains almost the same power efficiency and link throughput as dense MC-CDMA under our proposed precoding scheme. This important observation, in conjunction with the fact that sparsity reduces precoding complexity, promotes the practicality of sparse MC-CDMA.

I-B Other Related Work

Precoding is a relatively mature concept in multi-antenna communication systems, enabling multiuser multiplexing in the spatial domain, see, e.g., [12, 13, 14, 15, 16, 17, 18, 19, 20]. A precoder has to balance two conflicting interests: maximizing the useful signal power at the intended user and minimizing interference leakage towards non-intended users. The same concept can be applied to other systems such as direct-sequence CDMA or MC-CDMA, where multiplexing takes place in the time or frequency domain, and multi-access interference, if it arises, has to be dealt with [21, 22, 23, 24, 25].

Existing precoding techniques can be divided into two categories, linear precoding and non-linear precoding, where both require channel state information while the latter requires additional symbol-based processing. Within the first category, matched filtering, ZF and RZF [12] are three commonly known precodings that maintain different levels of balance between the two conflicting goals. Other power-aware linear precodings of general form have also been proposed in the literature subject to different QoS metrics, such as signal-to-noise-plus-interference ratio [20].

Compared with linear precoding, non-linear precoding may offer higher power efficiency and transmission rate, but the gain comes at the cost of incorporating more sophisticated signal processing [13, 15, 16, 17, 23, 24, 25]. In the multi-antenna broadcasting setup, capacity-achieving non-linear Dirty-Paper Coding (DPC) entails a successive pre-cancelation of known intra-user interference at the BS. The encoding of data relies on codewords of infinite length and involves a high-dimensional sphere-search algorithm, which renders DPC unattractive in practical systems. Tomlinson-Harashima precoding (THP) is a simplified version of DPC, where the codebook is comprised of periodic extension of standard constellations (replica points) in the two-dimensional space and a transmit modulo-operation is introduced in the interference pre-cancelation process in order to reduce transmit power. Built on ZF or RZF, reference [13] generalizes the single-user-based symbol extension idea of THP and introduces a joint perturbation of user symbol vector to further reduce transmit power.

The optimized precoding proposed in this work belongs to the second category. As in existing works, power consumption is one of the primary concerns in our optimized precoding. However, the optimization criterion, a minimum SEP constraint, has not been considered before, except in our own works [26, 27, 28] in the MIMO (or distributed MIMO) setup. This criterion appears to be natural when we consider a system with discrete alphabets as inputs. In addition, in our formulation, we fix the information-bearing alphabets at the BS but allow a certain relaxation of received signals at MSs through precoding, as long as they reside in detection-favorable regions and the SEP targets are met. This is distinguished from related works [16, 18, 19, 25], where non-linear relaxation of input alphabets has been adopted. In [16, 18], the relaxation is required to maintain the minimum signalling distance, while in [25, 19], the relaxation ensures that the corresponding symbol-energy-to-noise ratio is above a certain threshold. However, in all these works, no explicit SEP targets are imposed at MSs.

Notation: Boldface uppercase and lowercase letters denote matrices and vectors, respectively, e.g., $\mathbf{A}$ is a matrix and $\mathbf{a}$ is a vector; ${\bf I}_{N}$ is an $N\times N$ identity matrix; for integers $i\leq j$ , $[i:j]=\{i,i+1,\dots,j\}$ , is the discrete interval between $i$ and $j$ , and $\mathbf{a}_{[i:j]}=\{a_{i},\dots,a_{j}\}$ , is the collection of $[i:j]$ th components of vector $\mathbf{a}$ ; $(\cdot)^{T}$ denotes the matrix transpose, while $(\cdot)^{\dagger}$ denotes the conjugate transpose; notation ${\mathsf{E}}[X]$ denotes the expectation operation on random variable $X$ , and $\left\lfloor{x}\right\rfloor$ denotes a floor function of real number $x$ ; $\Re\{\cdot\}$ and $\Im\{\cdot\}$ denote the real and imaginary part of a complex number, respectively; finally, $\|\mathbf{A}\|_{p}$ is the standard $l_{p}$ norm of $\mathbf{A}$ .

II System Model

II-A Signaling Model

We consider a downlink communication, where a single-antenna BS is simultaneously serving $K$ single-antenna MSs via MC-CDMA. Specifically, data symbols intended for MSs are all drawn from discrete alphabet sets, e.g., $M$ -QAM constellations, common in practical deployments. The downlink communication takes place over a set of $N$ orthogonal subcarriers where we assume $N\geq K$ and thus the load $\alpha={K}/{N}\in\left(0,1\right]$ . In conventional MC-CDMA, the data symbol intended for an MS is transmitted over all parallel subcarriers, where each is encoded with a binary phase-offset [1]. Here, however, information associated with each MS is assumed to be spread onto only a small subset of the available subcarriers, which leads to the sparse MC-CDMA as originally studied by [5, 6, 7, 8] for the uplink.

Let ${\bf s}_{k}=\frac{1}{\sqrt{L_{k}}}\left[{\tilde{s}}_{k,1},{\tilde{s}}_{k,2},\ldots,{\tilde{s}}_{k,N}\right]^{T}$ be the signature for MS $k$ , where normalization factor $L_{k}$ corresponds to the total number of subcarriers allocated to MS $k$ . In the signature, components ${\tilde{s}}_{k,n}$ are i.i.d. drawn from a distribution $P_{S}$ with zero-mean and unit-variance if MS $k$ has access to subcarrier $n$ , and ${\tilde{s}}_{k,n}=0$ otherwise. The collection of all signatures corresponds to a sparse signature matrix ${\bf S}=\left[{{\bf s}_{1},{\bf s}_{2},\ldots,{\bf s}_{K}}\right]$ , which is perfectly known at the BS.

The transceiver architecture for the downlink transmission is depicted in Fig. 1 and is elaborated as follows. Let ${\bf d}=\left[d_{1},\ldots,d_{K}\right]^{T}$ be the transmitted data symbol vector, where component $d_{k}$ denotes the symbol intended for MS $k$ and is drawn from a discrete finite-alphabet set ${\cal D}_{k}$ . The transmission takes place by first forming appropriate frequency-domain signals and then converting them into time-domain signals by the inverse fast Fourier transform (IFFT) at the BS. Specifically, symbol vector ${\bf d}$ is passed through a precoder and mapped to coded vector ${\bf x}\in{\mathbb{C}}^{N}$ . An IFFT is then applied over the coded vector in order to generate time-domain signal vector ${\bf x}^{\prime}\in{\mathbb{C}}^{N}$ that is subsequently transmitted over the wireless channel.

Upon observing channel output, each MS $k$ first performs an FFT and produces the frequency domain signal ${\bf\tilde{y}}_{k}$ as

$${\bf\tilde{y}}_{k}={\bf\tilde{h}}_{k}\circ{\bf x}+{\bf\tilde{z}}_{k},$$

where notation “ $\circ$ ” denotes the Hadamard product; vector ${\bf\tilde{h}}_{k}=[{\tilde{h}}_{k,1},{\tilde{h}}_{k,2},\ldots,{\tilde{h}}_{k,N}]^{T}$ is the collection of frequency-domain channel gains from BS to MS $k$ ; vector ${\bf\tilde{z}}_{k}$ is a circularly symmetric complex Gaussian noise with ${\mathsf{E}}[{\bf\tilde{z}}_{k}{\bf\tilde{z}}_{k}^{{\dagger}}]=N_{0}{\bf I}_{N}$ . Despreading is then performed at MS $k$ based on its own signature ${\bf s}_{k}$ , followed by a simple single-user detection. The corresponding output signal $y_{k}$ after despreading is given by

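$$y_{k}=\sum_{n=1}^{N}\underbrace{(s_{k,n}\tilde{h}_{k,n})}_{h_{k,n}}x_{n}+\underbrace{{\bf s}_{k}^{T}{\bf\tilde{z}}_{k}}_{z_{k}}=\sum_{n=1}^{N}h_{k,n}x_{n}+z_{k},$$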

where $s_{k,n}={\tilde{s}}_{k,n}/\sqrt{L_{k}}$ is the $n$ th component of ${\bf s}_{k}$ , and each equivalent channel noise ${z}_{k}$ is circularly symmetric complex Gaussian with zero mean and variance $N_{0}$ . Collecting all outputs at MSs, we obtain the equivalent system input-output relationship as

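$$\underbrace{\begin{bmatrix}y_{1}\\ y_{2}\\ \vdots\\ y_{K}\end{bmatrix}}_{\bf y}=\underbrace{\begin{bmatrix}h_{1,1}&\cdots&\cdots&h_{1,N}\\ h_{2,1}&\cdots&\cdots&h_{2,N}\\ \vdots&\vdots&\vdots&\vdots\\ h_{K,1}&\cdots&\cdots&h_{K,N}\end{bmatrix}}_{\bf H}\underbrace{\begin{bmatrix}x_{1}\\ x_{2}\\ \vdots\\ x_{N}\end{bmatrix}}_{\bf x}+\underbrace{\begin{bmatrix}z_{1}\\ z_{2}\\ \vdots\\ z_{K}\end{bmatrix}}_{\bf z}.\tag{19}$$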

It is straightforward to observe that in matrix ${\bf H}$ , $h_{k,n}=0$ as long as $s_{k,n}=0$ , and thus row ${\bf h}_{k}$ maintains the same level of sparsity as the corresponding signature ${\bf s}_{k}$ . This also means the BS only needs to know the small number of $h_{k,n}$ s for which $s_{k,n}\neq 0$ for the purpose of precoding.
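The relation between the per-subcarrier channel and the equivalent sparse channel can be checked numerically. The sketch below is illustrative only: the helper names, the small $K=2$, $N=3$ signature pattern and the randomly drawn channel gains are assumptions, and signatures are stored as rows (one row per MS) for convenience.

```python
import math
import random


def equivalent_channel(S_tilde, H_tilde):
    """Return H with h[k][n] = s[k][n] * h_tilde[k][n], s = s_tilde / sqrt(L_k)."""
    H = []
    for s_row, h_row in zip(S_tilde, H_tilde):
        L_k = sum(1 for s in s_row if s != 0)   # subcarriers allocated to MS k
        H.append([s / math.sqrt(L_k) * h for s, h in zip(s_row, h_row)])
    return H


def despread_noiseless(s_row, h_row, x):
    """Noiseless despreading at one MS: s_k^T (h_tilde_k o x)."""
    L_k = sum(1 for s in s_row if s != 0)
    received = [h * xn for h, xn in zip(h_row, x)]   # Hadamard product
    return sum(s / math.sqrt(L_k) * r for s, r in zip(s_row, received))


rng = random.Random(1)
S_tilde = [[1, -1, 0], [0, 1, -1]]                   # row k = signature of MS k
H_tilde = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)]
           for _ in range(2)]
x = [1 + 2j, -0.5j, 0.3 - 1j]

H = equivalent_channel(S_tilde, H_tilde)
y_bar = [sum(h * xn for h, xn in zip(row, x)) for row in H]
```

Here `y_bar[k]` agrees with `despread_noiseless(S_tilde[k], H_tilde[k], x)`, and the zeros of `H` sit exactly where the signatures are zero, so only the nonzero gains are needed at the BS.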

II-B Graph Representation

Given the matrix ${\bf H}$ from (19), we can alternatively construct a bipartite graph representation of the sparse MC-CDMA system. Assume that each symbol $x_{n}$ in the graph is represented by a precoded symbol node (PSN), and each output $y_{k}$ is represented by an output symbol node (OSN). PSNs will be drawn as circles and OSNs will be drawn as squares in the graph. PSN $x_{n}$ is connected with OSN $y_{k}$ only if $s_{k,n}\neq 0$ , and $h_{k,n}$ is the weight associated with the edge. Fig. 2 depicts an instance of the graph ${\cal G}$ for $L_{k}=2$ , where each OSN is connected with two PSNs. We use ${\cal I}(x_{n})$ to denote the collection of OSNs connected to $x_{n}$ and define the node degree of $x_{n}$ as the cardinality $\left|{\cal I}(x_{n})\right|$ . Similarly, we use ${\cal I}(y_{k})$ to denote the collection of PSNs connected to $y_{k}$ and define the node degree of $y_{k}$ as the cardinality $\left|{\cal I}(y_{k})\right|$ . This graph representation will facilitate the description of the precoding algorithm and the corresponding complexity analysis in Section V.
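As a small illustration of the neighbor sets ${\cal I}(x_{n})$ and ${\cal I}(y_{k})$, the sketch below builds them from the sparsity pattern of a $2\times 3$ example channel. The function name and the zero-based indexing are assumptions made for this example.

```python
def build_graph(H):
    """Return (psn_neighbors, osn_neighbors) from the nonzero pattern of H."""
    K, N = len(H), len(H[0])
    # I(y_k): PSNs (subcarriers) connected to OSN k.
    osn_neighbors = {k: [n for n in range(N) if H[k][n] != 0] for k in range(K)}
    # I(x_n): OSNs (mobile stations) connected to PSN n.
    psn_neighbors = {n: [k for k in range(K) if H[k][n] != 0] for n in range(N)}
    return psn_neighbors, osn_neighbors


H = [[1 + 1j, -1 + 1j, 0j],
     [0j, 1 + 1j, 1 - 1j]]
psn, osn = build_graph(H)
psn_degrees = {n: len(ks) for n, ks in psn.items()}
osn_degrees = {k: len(ns) for k, ns in osn.items()}
```

Both OSNs have degree 2 here, matching the $L_{k}=2$ case described above, while the middle PSN is shared by the two MSs.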

II-C Sparse Signature Ensemble

In the signature matrix, we assume that the non-zero elements $\{{\tilde{s}}_{k,n}\}$ are i.i.d. drawn from a uniform distribution on $\{+1,-1\}$ . It is observed that generating non-zero elements according to other distributions, e.g., Gaussian distribution, has little impact on the averaged system performance. Hence, we stick to the binary uniform distribution, which leads to a binary phase-offset for the symbol transmitted as in [6, 7].

Depending on the number of subcarriers allocated across MSs and the load per subcarrier, we have three common signature ensembles suggested for the uplink [7]: i) irregular ensemble, where a Poisson-distributed number of subcarriers is allocated to each MS and the load per subcarrier is also Poisson-distributed; ii) semi-regular ensemble, where each MS is allocated a fixed positive integer number of subcarriers and the load per subcarrier is Poisson-distributed; and iii) regular ensemble, where the number of subcarriers allocated to each MS and the load per subcarrier both take fixed positive integer values. In particular, [7] advocates the regular ensemble because, among other benefits, it prevents the systematic inefficiency due to leaving some subcarriers unoccupied by any MS.

In this work, we follow [7] and deploy the regular-type ensemble to ensure that the system enjoys full utilization of resources and provides user fairness. Specifically, when the system is fully-loaded ( $\alpha=1$ ), a perfectly regular signature matrix is randomly generated in the sense that each MS is allocated $L$ subcarriers, and each subcarrier is accessed by exactly $L$ MSs. When the system is under-loaded ( $\alpha<1$ ), a nearly regular signature matrix is randomly generated such that each MS is allocated $L$ subcarriers, and each subcarrier is of almost equal load, namely, accessed by either $\left\lfloor{\alpha L}\right\rfloor$ or $(\left\lfloor{\alpha L}\right\rfloor+1)$ MSs.
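One simple way to obtain a (nearly) regular support pattern is sketched below. The round-robin assignment over a randomly permuted subcarrier order is an assumption made for illustration; the text above only states that the matrix is randomly generated, so this should not be read as the authors' generator. It requires $L\leq N$.

```python
import math
import random


def regular_support(K, N, L, rng):
    """support[k] = subcarriers of MS k, assigned round-robin over a random order."""
    order = list(range(N))
    rng.shuffle(order)
    # Edge number k*L + i goes to position (k*L + i) mod N, so subcarrier loads
    # differ by at most one and each MS gets L distinct subcarriers (L <= N).
    return [sorted(order[(k * L + i) % N] for i in range(L)) for k in range(K)]


def signature_matrix(K, N, L, seed=0):
    """Rows are signatures s_k with entries +-1/sqrt(L) on the support, 0 elsewhere."""
    rng = random.Random(seed)
    S = [[0.0] * N for _ in range(K)]
    for k, cols in enumerate(regular_support(K, N, L, rng)):
        for n in cols:
            S[k][n] = rng.choice((-1.0, 1.0)) / math.sqrt(L)
    return S


def subcarrier_loads(S):
    return [sum(1 for row in S if row[n] != 0) for n in range(len(S[0]))]


S_full = signature_matrix(8, 8, 3, seed=7)    # alpha = 1
S_under = signature_matrix(6, 8, 3, seed=7)   # alpha = 0.75
```

With these sizes, `subcarrier_loads(S_full)` is 3 on every subcarrier, and `subcarrier_loads(S_under)` takes only the values $\lfloor\alpha L\rfloor=2$ and $\lfloor\alpha L\rfloor+1=3$.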

III Optimized Precoding with SEP Targets

For notational convenience, we define $\overline{y}_{k}=\mathbf{h}^{T}_{k}\mathbf{x}$ as the noiseless received component at MS $k$ ( $\mathbf{h}^{T}_{k}$ is the $k$ th row of matrix ${\bf H}$ ) with real part $\overline{y}^{(r)}_{k}=\Re\{\overline{y}_{k}\}$ and imaginary part $\overline{y}^{(i)}_{k}=\Im\{\overline{y}_{k}\}$ ; similarly, denote the real and imaginary parts of data symbol and noise as $d^{(r)}_{k}=\Re\{d_{k}\}$ and $d^{(i)}_{k}=\Im\{d_{k}\}$ , and $z^{(r)}_{k}=\Re\{z_{k}\}$ and $z^{(i)}_{k}=\Im\{z_{k}\}$ , respectively. In addition, define $\sigma^{2}=N_{0}/{2}$ as the fixed noise variance per signal dimension.

Assuming that all data symbols transmitted are selected from discrete alphabets, we take a symbol detection point of view and impose minimum Symbol Error Probabilities (SEPs) as the user QoS constraints. Specifically, let ${\cal A}(d_{k})$ denote the decision region associated with data symbol $d_{k}$ intended for MS $k$ and $Pe_{k}$ denote the SEP target. Detection error happens when the output signal $y_{k}$ lies outside decision region ${\cal A}(d_{k})$ . According to the SEP requirement, the probability of error events should be no greater than the target, i.e.,

$$\Pr\left(y_{k}=(\bar{y}_{k}+z_{k})\notin{\cal A}(d_{k})\right)\leq Pe_{k}.\tag{20}$$

A question we then ask is: How do we design a precoder that efficiently maps a symbol vector ${\bf d}$ into a precoded vector ${\bf x}$ such that the SEP requirements at MSs can be met?

Following the conventional zero-forcing (ZF) approach, one could form ${\bf x}$ according to

$${\bf x}={\bf H}^{\dagger}({\bf H}{\bf H}^{\dagger})^{-1}{\bf d},$$

which inverts the channel matrix and forces noiseless component $\bar{y}_{k}$ to sit exactly at the constellation point $d_{k}$ . In order to meet a given SEP target, $Pe_{k}$ , data symbol $d_{k}$ has to be chosen from a discrete alphabet set whose minimum distance between any two neighboring points (denoted by ${\sf d}_{\text{min}}$ ) is above a certain threshold.
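The zero-forcing map ${\bf x}={\bf H}^{\dagger}({\bf H}{\bf H}^{\dagger})^{-1}{\bf d}$ can be illustrated on a tiny example. The sketch below uses only the standard library; the small Gauss-Jordan solver, the $2\times 3$ channel and the chosen symbols are assumptions for illustration.

```python
def solve(G, b):
    """Solve G u = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(G)
    M = [row[:] + [bi] for row, bi in zip(G, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def zf_precode(H, d):
    """x = H^dagger (H H^dagger)^{-1} d for a full-row-rank K x N matrix H."""
    K, N = len(H), len(H[0])
    gram = [[sum(H[i][n] * H[j][n].conjugate() for n in range(N))
             for j in range(K)] for i in range(K)]
    u = solve(gram, d)
    return [sum(H[k][n].conjugate() * u[k] for k in range(K)) for n in range(N)]


H = [[1 + 1j, -1 + 1j, 0j],
     [0j, 1 + 1j, 1 - 1j]]
d = [1 + 1j, -1 + 1j]                      # scaled 4-QAM symbols with beta = 1
x_zf = zf_precode(H, d)
y_bar = [sum(h * xn for h, xn in zip(row, x_zf)) for row in H]
power_zf = sum(abs(xn) ** 2 for xn in x_zf)
```

By construction `y_bar` reproduces `d`, so all of the SEP margin has to come from the scaling of the constellation itself.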

For instance, consider a system with $M$ -QAM modulation whose standard constellation is represented by

[TABLE]

with ${\sf d}_{\text{min}}=2$ . We need to scale the constellation points (increasing the minimum distance ${\sf d}_{\text{min}}$ , but also the transmit power) in order to meet the SEP target, $Pe_{k}$ . Considering the 4-QAM constellation (see Fig. 3), the scaling factor, $\beta_{k}$ , to use for MS $k$ , must satisfy

[TABLE]

as implied by (20). Thus the minimum scaling factor under the conventional ZF approach for the 4-QAM system is given by

[TABLE]

where $Q^{-1}(.)$ denotes the inverse of the standard $Q$ -function [29]. When $M\geq 16$ (see Fig. 4 for 16-QAM), the standard constellation ${\cal D}_{\text{S}}$ should be scaled so that

[TABLE]

as implied by (20), considering the dominant scenario in which one of the inner most points is transmitted. Thus the minimum scaling factor under the conventional ZF approach is given by

[TABLE]
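As a numerical illustration of the minimum ZF scaling, the sketch below assumes independent Gaussian noise of standard deviation $\sigma$ per signal dimension, so that a zero-forced 4-QAM point is detected correctly with probability $(1-Q(\beta_{k}/\sigma))^{2}$ and an innermost point of a larger QAM constellation with probability $(1-2Q(\beta_{k}/\sigma))^{2}$. The closed forms in the code follow from setting these equal to $1-Pe_{k}$; they are derived here under that assumption and may be written differently from the expressions in the text. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 1.0 - _STD.cdf(x)

def q_inv(p):
    """Inverse of the Q-function."""
    return _STD.inv_cdf(1.0 - p)

def zf_min_scaling(pe, sigma, M):
    """Smallest scaling that meets SEP target pe when the symbol is zero-forced.

    Assumes d_min = 2 in the standard constellation and Gaussian noise with
    standard deviation sigma per real dimension.
    """
    if M == 4:
        # Each point has one decision boundary per dimension.
        return sigma * q_inv(1.0 - sqrt(1.0 - pe))
    # Innermost points of 16-QAM and above have two boundaries per dimension.
    return sigma * q_inv((1.0 - sqrt(1.0 - pe)) / 2.0)

sigma = sqrt(0.5)                       # N0 = 1, so sigma^2 = N0 / 2
beta_4 = zf_min_scaling(1e-3, sigma, 4)
beta_16 = zf_min_scaling(1e-3, sigma, 16)
print(beta_4, beta_16)
```

The 16-QAM scaling is larger than the 4-QAM one for the same target, since an inner point can be pushed across a boundary in either direction along each axis.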

In general, however, we do not have to zero-force ${\bar{y}}_{k}$ , and in fact, it is sufficient to ensure ${\bar{y}}_{k}$ falls into a region that favours correct symbol detection. This relaxation introduces room to optimize the choice of ${\bf x}$ , leading to the following transmit power minimization problem:

[TABLE]

IV Sparse MC-CDMA with Standard $M$ -QAM Constellations

In this section, we show how to translate the set of SEP targets in (30) into constraints on noiseless output components at MSs. In particular, we begin with the 4-QAM signaling case and then generalize to the 16-QAM signaling case. A similar approach can be applied to systems with higher-order QAM constellations.

IV-A Translate SEP Targets to Constraints on Noiseless Received Signal Components

IV-A1 4-QAM

Assume that each $d_{k}$ is drawn from the 4-QAM constellation set ${\cal{D}}_{k}=\{D_{m}:m=1,\dots,4\}$ as shown in Fig. 3, where the green dashed lines partition the complex plane into four symmetric decision regions each occupying an open quarter plane. Any received signals falling outside the correct region lead to detection error. Thus, the SEP requirement (20) becomes

[TABLE]

where the tuple $(I^{(r)}_{-}(d_{k}),I^{(r)}_{+}(d_{k}),I^{(i)}_{-}(d_{k}),I^{(i)}_{+}(d_{k}))$ depends on the data transmitted and equals:

(i) $(-\infty,-\overline{y}^{(r)}_{k},-\infty,-\overline{y}^{(i)}_{k})$ for $D_{1}$;

(ii) $(-\infty,-\overline{y}^{(r)}_{k},-\overline{y}^{(i)}_{k},+\infty)$ for $D_{2}$;

(iii) $(-\overline{y}^{(r)}_{k},+\infty,-\infty,-\overline{y}^{(i)}_{k})$ for $D_{3}$;

(iv) $(-\overline{y}^{(r)}_{k},+\infty,-\overline{y}^{(i)}_{k},+\infty)$ for $D_{4}$.

Given symbol $d_{k}\in{\cal D}_{k}$ and a target $Pe_{k}$ , one can determine the precise constraint region ${\cal B}\left(d_{k}\right)$ on noiseless output $\overline{y}_{k}$ at MS $k$ from inequality (IV-A1). In particular, the boundary of the region is determined by equality in (IV-A1). For example, when $d_{k}=D_{4}$ , three points on the boundary of ${\cal B}\left(d_{k}\right)$ are identified by considering combinations of $\left(\mathcal{O}^{(r)},\mathcal{O}^{(i)}\right)$ :

(i) $\left(1,1-Pe_{k}\right)$: $\overline{y}^{(r)}_{k}=+\infty$, $\overline{y}^{(i)}_{k}=-{\sigma}{Q}^{-1}(1-Pe_{k})$;

(ii) $\left(1-Pe_{k},1\right)$: $\overline{y}^{(r)}_{k}=-{\sigma}{Q}^{-1}(1-Pe_{k})$, $\overline{y}^{(i)}_{k}=+\infty$;

(iii) $\left(\sqrt{1-Pe_{k}},\sqrt{1-Pe_{k}}\right)$: $\overline{y}^{(r)}_{k}=\overline{y}^{(i)}_{k}=-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$.

In principle, the curved boundary can be determined by traversing all possible combinations of $\mathcal{O}^{(r)}$ and $\mathcal{O}^{(i)}$ . The constraint region ${\cal B}(d_{k})$ then includes all the points on and within the boundary (see the red curve in Fig. 3). Note that point $({\text{iii}})$ is exactly a scaled constellation point. Also note that this region generally implies non-linear constraints on input signal ${\bf x}$ , which makes the optimization problem less tractable. Alternatively, we can find a polytope contained in ${\cal B}(d_{k})$ , i.e., we conservatively approximate the region using the area bounded by line segments between a finite number of points on or within the boundary. A simple approximation for $d_{k}=D_{4}$ is given by: $\overline{y}^{(r)}_{k}\geq-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$ and $\overline{y}^{(i)}_{k}\geq-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$ , which lead to linear constraints on input signal $\mathbf{x}$ . This conservatively approximated region is denoted as ${\cal C}(d_{k}=D_{4})$ ; see the open shaded area in Fig. 3.

By the same approach, relaxed constraint regions associated with the other constellation points can be derived and characterized as:

(i) ${\cal C}(d_{k}=D_{1})=\left\{(\overline{y}^{(r)}_{k},\overline{y}^{(i)}_{k}):\overline{y}^{(r)}_{k}\leq-I,\overline{y}^{(i)}_{k}\leq-I\right\}$;

(ii) ${\cal C}(d_{k}=D_{2})=\left\{(\overline{y}^{(r)}_{k},\overline{y}^{(i)}_{k}):\overline{y}^{(r)}_{k}\leq-I,-\overline{y}^{(i)}_{k}\leq-I\right\}$;

(iii) ${\cal C}(d_{k}=D_{3})=\left\{(\overline{y}^{(r)}_{k},\overline{y}^{(i)}_{k}):-\overline{y}^{(r)}_{k}\leq-I,\overline{y}^{(i)}_{k}\leq-I\right\}$,

with definition $I=-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$ . Note that the exact areas of these regions only depend on the SEP targets.
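The relaxed 4-QAM regions can be checked numerically. The sketch below computes the threshold $I$ and tests whether a noiseless output lies in ${\cal C}(d_{k})$; the quadrant signs are read off the four region definitions above, and the helper names (`threshold_I`, `in_relaxed_region`, `prob_correct`) are hypothetical. It assumes independent Gaussian noise with standard deviation $\sigma$ per dimension.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()

def threshold_I(pe, sigma):
    # I = -sigma * Q^{-1}(sqrt(1 - Pe)); since Q^{-1}(p) = -Phi^{-1}(p),
    # this equals sigma * Phi^{-1}(sqrt(1 - Pe)), which is positive.
    return sigma * _STD.inv_cdf(sqrt(1.0 - pe))

# Signs (real, imag) of the quadrant occupied by D_1..D_4.
QUADRANT = {1: (-1, -1), 2: (-1, +1), 3: (+1, -1), 4: (+1, +1)}

def in_relaxed_region(y_bar, m, I):
    """True if the noiseless output y_bar lies in C(d_k = D_m)."""
    s_r, s_i = QUADRANT[m]
    return s_r * y_bar.real >= I and s_i * y_bar.imag >= I

def prob_correct(y_bar, m, sigma):
    """Probability that y_bar plus noise stays in the quadrant of D_m."""
    s_r, s_i = QUADRANT[m]
    return _STD.cdf(s_r * y_bar.real / sigma) * _STD.cdf(s_i * y_bar.imag / sigma)

sigma = sqrt(0.5)
I = threshold_I(1e-3, sigma)
corner = complex(I, I)                  # corner of C(d_k = D_4)
print(in_relaxed_region(corner, 4, I), 1.0 - prob_correct(corner, 4, sigma))
```

At the corner of the region the error probability equals the target, and any point further inside the quadrant does at least as well, which is why the approximation is conservative.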

IV-A2 16-QAM

We now turn to the case where $d_{k}$ is drawn from the 16-QAM constellation set ${\cal{D}}_{k}=\left\{D_{m}:m=1,\dots,16\right\}$ as shown in Fig. 4. The constraint region on $\overline{y}_{k}$ can be characterized based on procedures similar to those for the 4-QAM case. The calculation requires some care, however, since the decision regions ${\cal A}(d_{k})$ for inner points and outer points are of different shapes (see the regions partitioned by the green dashed lines in Fig. 4) and the exact areas of these regions depend on the scaling factor $\beta_{k}$ . For the sake of conciseness, the derivation of the constraint region ${\cal B}(d_{k})$ is deferred to Appendix A. Taking symbols $\left\{D_{11},D_{12},D_{16}\right\}$ as examples, we plot the resulting regions ${\cal B}\left(d_{k}\right)$ in Fig. 4. Again, one can approximate each of these regions with a polytope ${\cal C}(d_{k})$ , leading to linear constraints on input signal ${\bf x}$ .

In general, the exact areas of ${\cal C}(d_{k})$ depend on the scaling factor $\beta_{k}$ . It is clear that, with minimum $\beta_{k}^{-}$ as defined by (26), constraint region ${\cal C}(d_{k})$ corresponds to a strict equality constraint on $\overline{y}_{k}$ , i.e., $\overline{y}_{k}=D_{m}$ , for the center points $m\in\{6,8,14,16\}$ ; for the side points, e.g., $D_{12}$ , ${\cal C}(d_{k})$ shrinks to a line: $-\overline{y}^{(r)}_{k}\leq-2\beta_{k}^{-}-I$ and $\overline{y}^{(i)}_{k}=\beta_{k}^{-}$ , which implies doing a zero-forcing for the imaginary part while having a relaxed constraint on the real part; and for the corner points, e.g., $D_{11}$ , ${\cal C}(d_{k})$ becomes: $-\overline{y}^{(r)}_{k}\leq-2\beta_{k}^{-}-I,-\overline{y}^{(i)}_{k}\leq-2\beta_{k}^{-}-I$ , where we recall that $I=-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$ .

To meet a given SEP target $Pe_{k}$ for MS $k$ , one could certainly adopt a scaling factor larger than the minimum $\beta_{k}^{-}$ for the transmission. But such a choice may affect the power efficiency of the system. Fig. 5 plots two instances of constraint regions ${\cal B}(d_{k})$ for $Pe_{k}=10^{-3}$ when the constellation is scaled up with $\beta_{k}=1.05\beta_{k}^{-}$ (blue curves) and $\beta_{k}=1.20\beta_{k}^{-}$ (red curves). It can be seen that with a larger $\beta_{k}$ above $\beta_{k}^{-}$ , when a center constellation point is transmitted, the constraint region is relaxed from a single point to a circle-type region centered on the symbol. Potential benefits can be accrued from the resulting enlarged feasible region. However, when the corner points are transmitted, the corresponding constraint regions always shrink as the constellation is scaled up. In this case, power efficiency loss may be induced because of the reduced feasible optimization space. When one of the side points is transmitted, it is unclear how the performance reacts, as the constraint region with larger $\beta_{k}$ partially overlaps with that for a smaller $\beta_{k}$ . Nevertheless our experiments have indicated that when the transmitted data symbols are randomly and uniformly generated, scaling up the constellation with $\beta_{k}>\beta_{k}^{-}$ brings little further power saving. Hence we will use the minimum scaling, $\beta_{k}^{-}$ , for the standard 16-QAM in what follows.

IV-B Problem Reformulation with Conservative Approximation

With the conservative approximation, the SEP constraints in (30) can be translated into a set of linear inequality/equality constraints on vector ${\bf x}$ . We now present the resulting optimization problem.

For ease of exposition, we stack the real and the imaginary parts of each $x_{n}$ into a real vector $\tilde{\mathbf{x}}$ , i.e., $\tilde{\mathbf{x}}=\left[\Re{\{x_{1}\}},\Im{\{x_{1}\}},\ldots,\Re{\{x_{N}\}},\Im{\{x_{N}\}}\right]^{T}\in{\mathbb{R}}^{2N\times 1}$ . For the $4$ -QAM signaling, the real and imaginary parts of coded signal $x_{n}$ are associated with different inequalities. For the $16$ -QAM signaling, with $\beta_{k}=\beta_{k}^{-}$ , the real and the imaginary parts of $x_{n}$ are associated with either one equality or an inequality depending on the data symbols; when $\beta_{k}>\beta_{k}^{-}$ , at most two inequalities are introduced for either the real or the imaginary part. Therefore, the optimization problem of (30) can be generally reformulated as follows:

[TABLE]

where for the 4-QAM system, we have $\mathbf{A}=\{a_{i,j}\}\in\mathbb{R}^{2K\times 2N}$ , $\mathbf{c}=\{c_{i}\}\in\mathbb{R}^{2K\times 1}$ , $\mathbf{B}={\bf 0}$ and ${\bf e}=0$ , while for the 16-QAM system, we have $\mathbf{A}=\{a_{i,j}\}\in\mathbb{R}^{4K\times 2N}$ , $\mathbf{c}=\{c_{i}\}\in\mathbb{R}^{4K\times 1}$ , $\mathbf{B}=\{b_{i,j}\}\in\mathbb{R}^{2K\times 2N}$ and $\mathbf{e}=\{e_{i}\}\in\mathbb{R}^{2K\times 1}$ . The precise definitions of the matrices and the vectors involved depend on constraint regions ${\cal C}(d_{k})$ , $\forall k$ , as defined in Section IV-A1 and Section IV-A2 for the 4-QAM and 16-QAM signalling, respectively. Whether or not the inequality/equality constraints are active will depend on the transmitted data associated with the constellation. It is possible that some of the constraints may not be active, in which case the corresponding row entries of $\mathbf{A}$ / $\mathbf{B}$ and $\mathbf{c}$ / $\mathbf{e}$ are padded with zeros. In addition, matrices $\mathbf{A}$ / $\mathbf{B}$ enjoy the same sparsity as the system matrix ${\bf H}$ . In particular, when indexing matrices $\mathbf{A}$ / $\mathbf{B}$ , the index pair $\left(i,j\right)$ corresponds to a particular MS-subcarrier pair, and thus $a_{i,j}/b_{i,j}=0$ whenever the particular subcarrier is not allocated to the MS.
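To make the structure of $\mathbf{A}$ and $\mathbf{c}$ concrete for the 4-QAM case, the sketch below builds them from a sparse channel matrix. It assumes the inequality constraints are written as $\mathbf{A}\tilde{\mathbf{x}}\leq\mathbf{c}$, that rows $2k-1$ and $2k$ hold the real-part and imaginary-part constraints of MS $k$, and that all MSs share the same threshold $I$; the row ordering, the function name and the example channel values are assumptions made for illustration.

```python
def build_4qam_constraints(H, symbols, I):
    """Build (A, c) such that A @ x_tilde <= c encodes y_bar_k in C(d_k).

    H       : K x N list of complex gains (0 where a subcarrier is not allocated)
    symbols : list of constellation indices m in {1, 2, 3, 4}, one per MS
    I       : common threshold -sigma * Q^{-1}(sqrt(1 - Pe))
    """
    quadrant = {1: (-1, -1), 2: (-1, +1), 3: (+1, -1), 4: (+1, +1)}
    K, N = len(H), len(H[0])
    A = [[0.0] * (2 * N) for _ in range(2 * K)]
    c = [-I] * (2 * K)
    for k in range(K):
        s_r, s_i = quadrant[symbols[k]]
        for n in range(N):
            h = H[k][n]
            # Re(h x) = Re(h) Re(x) - Im(h) Im(x)
            A[2 * k][2 * n] = -s_r * h.real
            A[2 * k][2 * n + 1] = s_r * h.imag
            # Im(h x) = Im(h) Re(x) + Re(h) Im(x)
            A[2 * k + 1][2 * n] = -s_i * h.imag
            A[2 * k + 1][2 * n + 1] = -s_i * h.real
    return A, c

# Hypothetical channel: MS 1 uses subcarriers 1-2, MS 2 uses subcarriers 2-3.
H = [[1 + 1j, 0.5 - 0.2j, 0j],
     [0j, -0.3 + 0.8j, 1 - 0.5j]]
A, c = build_4qam_constraints(H, symbols=[4, 1], I=1.0)
for row in A:
    print(row)
```

Every entry of $\mathbf{A}$ that corresponds to an unallocated MS-subcarrier pair is zero, so $\mathbf{A}$ has the same sparsity pattern as ${\bf H}$ with each complex entry expanded into a $2\times 2$ real block.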

As an illustrative example, we consider a 16-QAM system with $N=3$ subcarriers and $K=2$ MSs. Each MS is allocated $L=2$ subcarriers: The first MS is allocated subcarriers $1$ and $2$ , while the second MS is allocated subcarriers $2$ and $3$ . The effective channel matrix ${\bf H}$ is assumed to be

[TABLE]

The data symbol intended for MS $1$ and $2$ is $d_{1}=D_{16}$ and $d_{2}=D_{11}$ , respectively. Both MSs require the same SEP target $Pe$ and thus employ the same scaling factor $\beta=\beta_{1}=\beta_{2}$ . The noiseless received components at MSs are calculated as

[TABLE]

Therefore, according to the constraint regions constructed (see Section IV-A2 and also Appendix A), we have:

[TABLE]

which, by simple algebra, can be further translated into a set of general linear constraints as in (35) with ${\bf B}={\bf 0}$ , ${\bf e}={\bf 0}$ ,

[TABLE]

As we can observe, matrix ${\bf A}$ inherits sparsity from the effective channel matrix ${\bf H}$ ; transmission of $d_{1}=D_{16}$ and $d_{2}=D_{11}$ invokes four and two inequality constraints, respectively, and there is no strict equality constraint in this example.

V Precoding Algorithm Design with Parallel Computation Units

Problem ${\bf\cal P}_{1}$ is strictly convex and can be solved via a number of standard algorithms, such as the interior-point algorithm [30]. Most of these algorithms are designed for centralized implementation and could be efficient enough for a small-scale problem. However, as the problem dimension increases, the computational complexity may be prohibitive. More importantly, the sparsity inherent to the problem may not be well exploited in standard solvers. Given these observations, we are interested in developing a precoding algorithm that leverages the sparsity to reduce complexity and is suitable for solving large-dimension problems using parallel computation units. These units may correspond to parallel processor cores (threads) at the BS computer [31] or parallel processors at the cloud to which the BS is connected [32].

V-A Algorithm Design

We now detail the precoding algorithm with a focus on the general problem as if all inequalities and equalities in (35) were activated. The key technique is the dual decomposition approach, see, e.g., [33]. Recalling the graph representation ${\cal G}$ introduced (see Fig. 2), we can map all PSNs and OSNs to parallel computation units.

We start with forming the Lagrangian function:

[TABLE]

where ${\bm{\lambda}}\in\mathbb{R}^{4K\times 1}\succeq 0$ and ${\bm{\nu}}\in\mathbb{R}^{2K\times 1}$ are Lagrangian multipliers (dual variables), among which each pair of primal variables ${\mathbf{\tilde{x}}}_{[2n-1:2n]}$ is associated with PSN computation unit $n$ ( $n=1,\dots,N$ ), while each tuple of dual variables $\{{\bm{\lambda}}_{[4k-3:4k]},{\bm{\nu}}_{[2k-1:2k]}\}$ is associated with OSN computation unit $k$ ( $k=1,\dots,K$ ). The dual problem is then defined as

[TABLE]

with $g(\bm{\lambda},\bm{\nu})=\min_{\mathbf{\tilde{x}}}\cal{L}\left(\mathbf{\tilde{x}},\bm{\lambda},\bm{\nu}\right)$ being the dual objective function.

Then one can solve the original problem by finding the optimal dual variables in an iterative manner. Specifically, at the $t$ th iteration, for fixed dual variables $\bm{\lambda}^{(t)}$ and $\bm{\nu}^{(t)}$ , to attain the minimization of Lagrangian, one sets the first-order derivative of the Lagrangian to zero, which leads to

[TABLE]

or more explicitly,

[TABLE]

where ${\cal I}(\tilde{x}_{l})$ and ${\cal I}^{\prime}(\tilde{x}_{l})$ denote the collection of indices of dual variables $\lambda$ and $\nu$ that have interactions with $\tilde{x}_{l}$ , respectively, according to the graph ${\cal G}$ . Note that PSN unit $n$ is in charge of computing the pair $\tilde{{\bf x}}^{*(t)}_{[2n-1:2n]}$ , $n=1,\dots,N$ . The corresponding dual function is given by:

[TABLE]

The dual variables are then updated by the OSN units in a parallel manner according to

[TABLE]

where ${\cal I}(\lambda_{i})$ and ${\cal I}(\nu_{j})$ denote the collection of indices of primal variables that have interactions with $\lambda_{i}$ and $\nu_{j}$ , respectively; notation $\left[.\right]^{+}$ denotes the projection onto the nonnegative orthant, $\kappa=\left(\|\bar{\mathbf{A}}\bar{\mathbf{A}}^{T}\|_{1}\|\bar{\mathbf{A}}\bar{\mathbf{A}}^{T}\|_{\infty}\right)^{1/2}$ with $\bar{\mathbf{A}}=[\mathbf{A}^{T},\mathbf{B}^{T}]^{T}$ and

[TABLE]

This dual-variable updating rule offers faster convergence speed than the conventional gradient updating [33] as shown in [34]. The algorithm described is summarized in Table I.
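The flow of the iteration can be sketched in a few lines. The code below is a simplified stand-in, not the algorithm of Table I: it assumes the objective is $\|\tilde{\mathbf{x}}\|^{2}$, keeps only inequality constraints $\mathbf{A}\tilde{\mathbf{x}}\leq\mathbf{c}$ (no $\bm{\nu}$), and uses the conventional projected gradient update with step size $1/\kappa$ instead of the accelerated rule of [34]. Rows of $\mathbf{A}$ are stored as dictionaries of non-zeros so that each update only touches graph neighbours; all names are hypothetical.

```python
def dual_ascent(A_rows, c, n_vars, iters=200):
    """Minimise ||x||^2 subject to A x <= c by projected gradient ascent on the dual.

    A_rows[i] is a dict {j: a_ij} holding only the non-zeros of row i.
    """
    m = len(A_rows)
    # Column view: for each primal variable, the dual variables it interacts with.
    cols = [dict() for _ in range(n_vars)]
    for i, row in enumerate(A_rows):
        for j, a in row.items():
            cols[j][i] = a

    def primal(lam):
        # Stationarity of the Lagrangian: 2 x + A^T lam = 0.
        return [-0.5 * sum(a * lam[i] for i, a in cols[j].items())
                for j in range(n_vars)]

    # kappa = sqrt(||A A^T||_1 ||A A^T||_inf); A A^T is symmetric, so both norms
    # equal the largest absolute row sum.
    gram = [[sum(a * A_rows[p].get(j, 0.0) for j, a in A_rows[i].items())
             for p in range(m)] for i in range(m)]
    kappa = max(sum(abs(v) for v in r) for r in gram)

    lam = [0.0] * m
    for _ in range(iters):
        x = primal(lam)
        # Move each multiplier along its constraint violation, then project onto lam >= 0.
        lam = [max(0.0, lam[i] + (sum(a * x[j] for j, a in A_rows[i].items()) - c[i]) / kappa)
               for i in range(m)]
    return primal(lam), lam

# Toy problem: minimise x1^2 + x2^2 subject to x1 + x2 >= 2 and x1 >= 0.5.
x, lam = dual_ascent([{0: -1.0, 1: -1.0}, {0: -1.0}], c=[-2.0, -0.5], n_vars=2)
print(x, lam)
```

In the toy problem only the first constraint is active at the optimum, so its multiplier converges to a positive value while the second is driven to zero by the projection.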

V-B Complexity Analysis

To quantify the complexity of the algorithm, we distinguish the communication overhead and computational complexity.

In the algorithm, to update its primal variables ${\mathbf{\tilde{x}}}_{[2n-1:2n]}$ via (V-A), PSN unit $n$ only needs to gather dual variables from its neighboring OSNs; therefore, the number of messages passed to PSN unit $n$ depends on the number of active dual variables and is at most ${4{\left|{{\cal I}(x_{n})}\right|}}$ for the 16-QAM, and $2{\left|{{\cal I}(x_{n})}\right|}$ for the 4-QAM. To update dual variables $\{{\bm{\lambda}}_{[4k-3:4k]},{\bm{\nu}}_{[2k-1:2k]}\}$ , OSN unit $k$ only needs to collect primal variables from its neighboring PSNs; therefore, the number of messages passed to OSN unit $k$ is at most $2{\left|{{\cal I}(y_{k})}\right|}$ for both the 4/16-QAM. Thus, with $T$ iterations, the total number of message-passings across computation units is ${\mathcal{O}}(4KLT)$ and ${\mathcal{O}}(6KLT)$ for the 4-QAM and 16-QAM system, respectively, where we have used the fact that $\sum\nolimits_{n=1}^{N}{{\left|{{\cal I}(x_{n})}\right|}}=\sum\nolimits_{k=1}^{K}{{\left|{{\cal I}(y_{k})}\right|}}=KL$ with $L$ being the number of non-zeroes in each signature.

Table II summarizes the computational complexity of the proposed algorithm. To update each of its primal variables via (V-A), PSN unit $n$ needs at most $(4\left|{{\cal I}(x_{n})}\right|-1)$ additions and $4\left|{{\cal I}(x_{n})}\right|$ multiplications for the 16-QAM, and $(2\left|{{\cal I}(x_{n})}\right|-1)$ additions and $2\left|{{\cal I}(x_{n})}\right|$ multiplications for the 4-QAM. On the other hand, to update each of its dual variables via (V-A), OSN unit $k$ needs at most $(6\left|{{\cal I}(y_{k})}\right|+5)$ additions and $(4\left|{{\cal I}(y_{k})}\right|+2)$ multiplications for both the 4/16-QAM. Overall, the algorithm involves ${\mathcal{O}}(16KLT)$ additions and multiplications for the 4-QAM system, and ${\mathcal{O}}(20KLT)$ additions and multiplications for the 16-QAM system, where $T$ is the number of iterations. Both counts scale linearly in $L$ , so the sparser the signatures are, the less communication overhead and computational complexity the proposed algorithm requires to generate the precoded symbols.

For comparison, we note that the conventional ZF precoding of (21) has a computational complexity of ${\mathcal{O}}(\frac{8}{3}K^{3}+4NK^{2})$ to compute precoding matrix ${\bf W}={\bf H}^{\dagger}({\bf H}{\bf H}^{\dagger})^{-1}$ and additional complexity of ${\mathcal{O}}(4KN)$ to generate each precoded symbol vector via ${\bf x}={\bf W}{\bf d}$ . Consider a transmission frame that consists of $T_{s}$ 4-QAM symbols intended for each MS and assume channel ${\bf H}$ remains unchanged during the frame. The ratio of the complexity of the proposed scheme to that of the ZF approach is thus quantified by $\rho={\mathcal{O}}(16KLTT_{s})/{\mathcal{O}}(\frac{8}{3}K^{3}+4NK^{2}+4KNT_{s})$ . It is clear that the smaller $L$ , the smaller $\rho$ will be. In particular, $\rho\approx 4LT/K$ for a fully loaded system with $N=K$ and sufficiently large $T_{s}$ . As an example with $K=32$ , $L=4$ and $T=100$ iterations, ratio $\rho\approx 50$ , which indicates the proposed scheme has approximately $50$ times the complexity of ZF precoding. Despite the increase in complexity, the proposed scheme is able to provide enormous transmit power reduction over ZF precoding and is more robust against imperfect channel state estimation, as will be shown later in Section VII.
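The ratio can be evaluated directly from the operation counts quoted above. The short sketch below treats the ${\mathcal{O}}(\cdot)$ expressions as exact counts, which is an assumption made only to reproduce the arithmetic; the function name is hypothetical.

```python
def complexity_ratio(K, N, L, T, Ts):
    """Ratio of proposed-precoder operations to ZF operations over a frame of Ts symbols."""
    proposed = 16 * K * L * T * Ts
    zf = (8.0 / 3.0) * K**3 + 4 * N * K**2 + 4 * K * N * Ts
    return proposed / zf

rho = complexity_ratio(K=32, N=32, L=4, T=100, Ts=10**6)
asymptote = 4 * 4 * 100 / 32            # 4LT/K for N = K and large Ts
print(rho, asymptote)
```

For long frames the one-off cost of forming ${\bf W}$ is amortized and the ratio approaches $4LT/K$; for short frames the ZF matrix inversion dominates and the ratio is smaller.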

VI Sparse MC-CDMA with Replica Constellations

In the previous sections, we have mainly focused on the system with standard QAM constellations for which convex constraints on the precoded vector are constructed according to the SEP targets. In this section, we assume the system adopts replica constellations, where each ${\cal D}_{k}$ is the periodic extension of a regular QAM constellation along the real and imaginary axes. We propose an optimized Tomlinson-Harashima Precoding (THP) under the SEP constraints by applying a similar approach we have used for the system with standard QAM constellations.

VI-A THP-Basics

We first briefly review some basic concepts related to THP, see, e.g., [35].

In general, the replica constellation point ${d}\in{\cal D}_{k}$ can be represented as:

[TABLE]

where $D_{l}$ corresponds to a regular point in the scaled $M$ -QAM constellation under scaling $\beta_{k}$ ( $l=1,\dots,M$ ), and $\left\{a_{\text{R}},a_{\text{I}}\right\}$ corresponds to an arbitrary integer pair, see Fig. 6 for a visual illustration when $M=4$ . It is noted that decision regions associated with all replica points are identical closed squares with side length $2\beta_{k}$ .

The THP is normally done in a successive manner in which interference created by previous users’ transmissions is pre-cancelled to facilitate the transmission for the current user at each stage. The encoding is accommodated by the replica constellation and modulo-operation at the transmitter. Specifically, let the channel matrix be represented as $\mathbf{H}^{\dagger}=\mathbf{F}\mathbf{R}$ as a result of QR factorization, where $\mathbf{F}$ is a unitary matrix and $\mathbf{R}$ is an upper triangular matrix. Then $\mathbf{B}=\mathbf{H}\mathbf{F}=\mathbf{R}^{\dagger}$ is a lower triangular matrix. The successive precoding operates as

[TABLE]

where ${d}_{k}\in{\cal D}_{k}$ is the replica point carrying information for MS $k$ and $\left[u\right]_{p_{k}}$ is the modulo operation operated on complex number $u$ with respect to basis ${p_{k}}$ and is defined as:

[TABLE]

with ${p_{k}}=\sqrt{M}\beta_{k}$ . The transmit signal is then formed by multiplying $\mathbf{F}$ with $\bar{\mathbf{x}}$ , i.e., $\mathbf{x}=\mathbf{F}\bar{\mathbf{x}}$ . In this way, at the receiver side, no MS experiences inter-user interference because of the pre-cancellation operations done at the BS. Since THP is performed in a successive manner, different user orderings may lead to different performance. Finding the optimal ordering requires an exhaustive search over all possible orderings, which is generally infeasible as $K$ grows large. In this work we simply adopt the suboptimal V-BLAST (VB) ordering [36].
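For readers unfamiliar with the modulo step, the sketch below implements the symmetric modulo that is commonly used in THP, wrapping the real and imaginary parts of a complex number into $[-p_{k},p_{k})$. This is the usual convention and is assumed here for illustration; it may differ in boundary handling from the definition used in the text, and the function name is hypothetical.

```python
from math import floor

def thp_modulo(u, p):
    """Wrap the real and imaginary parts of u into [-p, p) by shifts of 2p."""
    re = u.real - 2 * p * floor((u.real + p) / (2 * p))
    im = u.imag - 2 * p * floor((u.imag + p) / (2 * p))
    return complex(re, im)

# 4-QAM replica with beta = 1, so p = sqrt(4) * 1 = 2.
print(thp_modulo(3 - 2.5j, 2.0))
print(thp_modulo(1 + 1j, 2.0))
```

A point already inside the fundamental square is left unchanged, while a point outside is shifted by an integer multiple of $2p_{k}$ along each axis, which is exactly the offset between a regular constellation point and its replicas.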

VI-B Optimized THP

Under the SEP constraints, we can formulate a precoding optimization problem similar to ${\cal P}_{1}$ . The idea is that instead of choosing a minimum scaling factor $\beta_{k}^{-}$ for the constellation and performing a zero-forcing THP, one can scale up the constellation, which relaxes the constraints on the noiseless output components and introduces room for optimizing the input signals.

In particular, with the replica constellation scaled up, the resulting constraint regions become boxes centered at each replica point as constructed and approximated similarly for the inner points of the 16-QAM constellation, see Fig. 6. The optimization problem is then formulated as:

[TABLE]

where matrices $\mathbf{A}=\{a_{i,j}\}\in\mathbb{R}^{4K\times 2N}$ and $\mathbf{c}=\{c_{i}\}\in\mathbb{R}^{4K\times 1}$ are formed according to the constraints

[TABLE]

where parameter $\delta_{0}$ determines the size of the constraint box and is chosen to satisfy:

[TABLE]

and the set of information-carrying replica points $\{{d}_{k},k=1,\dots,K\}$ is determined from the ZF-THP encoding procedure. Therefore, the precoding algorithm proposed in Section V can be applied here to calculate the optimized THP precoded vector.

VII Simulation Results

We now present numerical results to demonstrate the effectiveness of the precoding schemes proposed for the sparse MC-CDMA system.

VII-A Simulation Setup

In the simulation, the noise variance $N_{0}$ is set to unity. The total number of subcarriers is fixed with $N=32$ and the number of MSs $K\leq N$ is allowed to vary. Different MSs experience different frequency-selective fading channels. Specifically, the channel frequency response between the BS and MS $k$ is generated according to

[TABLE]

where ${\bf{g}}_{k}=\left[{g_{k,0},\cdots,g_{k,{\tilde{Q}}-1}}\right]^{T}$ represents the discrete-time channel response consisting of ${\tilde{Q}}$ taps; components $\{g_{k,q}\}$ are modeled as independent zero-mean Gaussian random variables, whose individual variance equals $\{\bar{\lambda}e^{-\frac{q}{4}}\}$ with normalization factor $\bar{\lambda}$ chosen such that ${\mathsf{E}}\left[{\left\|{{\bf{g}}_{k}}\right\|^{2}}\right]=1$ . In the simulation, ${\tilde{Q}}=8$ is adopted for the channel generation. It is assumed that there is no inter-symbol-interference and inter-carrier-interference in the system.
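A channel realization of this kind can be generated as follows. The sketch assumes that the frequency response is the $N$-point DFT of the tap vector and that the taps are circularly symmetric complex Gaussian; both are assumptions made for illustration, as are the function names. The normalization of the exponential power-delay profile follows the description above.

```python
import cmath
import math
import random

def power_delay_profile(Q):
    """Tap variances proportional to exp(-q/4), normalised to sum to one."""
    w = [math.exp(-q / 4.0) for q in range(Q)]
    total = sum(w)
    return [v / total for v in w]

def channel_frequency_response(N, Q, rng):
    """Draw Q taps and return (g, h) with h the N-point DFT of g (assumed form)."""
    pdp = power_delay_profile(Q)
    # Assumed circularly symmetric: half of each tap variance per real dimension.
    g = [complex(rng.gauss(0.0, math.sqrt(v / 2.0)), rng.gauss(0.0, math.sqrt(v / 2.0)))
         for v in pdp]
    h = [sum(g[q] * cmath.exp(-2j * math.pi * q * n / N) for q in range(Q))
         for n in range(N)]
    return g, h

rng = random.Random(0)
g, h = channel_frequency_response(N=32, Q=8, rng=rng)
print(sum(power_delay_profile(8)), len(h))
```

Because the tap variances sum to one, the expected energy of ${\bf{g}}_{k}$ is one, and by Parseval's relation the subcarrier gains carry $N$ times the energy of the taps.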

Unless stated otherwise, for any fixed system configuration, we simulate $1000$ transmission slots, under each of which random data and random channel are independently generated for each MS. In addition, we produce $10$ random regular signature matrix realizations as defined in Section II-C and thus every $100$ transmission slots share the same signature matrix. The system transmitted power consumption presented shortly is averaged over all transmission slots. For the precoding algorithm proposed, the calculation terminates at iteration $t$ if the normalized improvement of dual objective $\left|{g^{(t)}-g^{*}}\right|/g^{*}\leq\delta=10^{-4}$ , where $g^{*}\buildrel\Delta\over{=}\mathop{\max}\limits_{i\in\left\{{1,...,t-1}\right\}}g^{(i)}$ .

VII-B System with Standard Constellation

Fix SEP target $Pe_{k}=10^{-3}$ for all MSs and vary $K\in[24:32]$ . In the sparse MC-CDMA, different levels of sparsity, e.g., $L=4$ and $L=8$ , are considered. The case with $L=32$ , referred to as the dense MC-CDMA, is also considered for the purpose of comparison. In addition, we have also compared with two conventional precoding schemes including ZF and the optimized RZF in form of:

[TABLE]

where $\mathbf{d}=[d_{1},\dots,d_{K}]^{T}$ , $d_{k}$ denotes the transmitted data symbol for MS $k$ and is drawn from the standard constellation scaled by $\beta^{-}_{k}$ , ${\mathbf{I}}_{K}$ is a $K\times K$ identity matrix, and $k_{1},k_{2}$ are two non-negative parameters to be optimized subject to the SEP constraints in (35). Note that the optimized RZF encompasses the conventional regularized ZF precoder [12] with $k_{1}=1$ and also the minimum-mean-square-error (MMSE) precoder [20] with $k_{1}=1$ and $k_{2}=K\sigma^{2}/P.$

Fig. 7-(a) and Fig. 7-(b) plot the transmit power consumption at the BS versus $K$ under different setups and precoding schemes for systems with 4-QAM and 16-QAM, respectively. Two important observations are made as follows.

First, power reduction from the proposed precoding over ZF and RZF is clearly evident for all load and sparsity combinations considered. In particular, for any fixed $L$ , the reduction increases as $K$ grows. For instance, when $K=N=32$ and $L=8$ , for the 4-QAM, we have $18.8$ dB and $13.6$ dB reduction compared with ZF and RZF, respectively, while for the 16-QAM, we have $16$ dB reduction compared with both schemes, noting that the optimized RZF solutions degenerate to the ZF solutions in this case. As $K\to N$ , the effective channel matrix ${\bf H}$ is increasingly likely to be poorly conditioned. Hence, the inefficiency of conventional schemes (in particular ZF) becomes pronounced. However, the proposed precoding is not sensitive to the condition number of ${\bf H}$ and always attains the best performance.

Second, for the sparse MC-CDMA system, there is a trend that a denser signature (a larger value of $L$ ) leads to a smaller power consumption needed. For instance, the system with $L=4$ consumes slightly more power than the system with $L=8$ to achieve the same SEP target under both the 4-QAM and 16-QAM systems. However, to attain comparable power efficiency to the dense MC-CDMA, the signatures can still be relatively sparse ( $L=8$ in our examples), yielding considerable reduction in precoding complexity. This observation also indicates that the sparse MC-CDMA system with proper choice of $L$ would attain almost the same link throughput as that of the dense MC-CDMA system under the same transmit power budget.

Fig. 8 plots the power consumption versus different SEPs with $K=N=32$ and $L=8$ , which further confirms the superiority of the proposed scheme as compared to baselines ZF and RZF under different SEP targets for both 4-QAM and 16-QAM systems.

VII-C System with Replica Constellation

We now consider the system with replica constellation, with system parameters $N=32$ and $L=8$ . All MSs request the same minimum SEP target $Pe=10^{-3}$ . Under this SEP requirement, a uniform scaling factor across all MSs is chosen such that the power consumption is minimized for the proposed optimized THP. To perform the precoding optimization, we use an algorithm similar to that for systems with standard constellations. Therefore, signature sparsity is again leveraged to reduce precoding complexity.

Fig. 9-(a) and Fig. 9-(b) plot the transmit power consumption versus $K$ under both the ZF- and optimized THP schemes for the system with 4-QAM and 16-QAM replica points, respectively. The performance of the proposed scheme under standard constellations is also included here for the purpose of comparison. It is seen that the optimized THP is able to provide significant power reduction over the proposed scheme under standard constellations. This further reduction, albeit appealing, does not come for free and has to be paid with more sophisticated encoding and decoding operations in THP schemes as described in Section VI. It is also observed that the optimized THP generally outperforms ZF-THP in power efficiency and the exact gain depends on the system load. In particular, the former provides roughly $1.5$ dB power reduction over the latter for 4-QAM replica and roughly $0.85$ dB reduction for 16-QAM replica in a full-load system. The ZF-THP is already very power-efficient, yet the proposed THP is seen here to provide further reduction in transmit power.

VII-D Bit Error Rate (BER) Results and Impact of Imperfect Channel Estimation

So far, we have demonstrated the power efficiency of the proposed precoding under different uncoded SEP targets. We now evaluate the impact of the proposed scheme on another practically important performance metric, the uncoded bit error rate (BER). The BER is calculated and averaged over $10^{6}$ realizations of transmissions. Fig. 10 (a) and (b) depict the average BER (at a typical MS) as a function of power consumption for a system with standard/replica 4-QAM and 16-QAM, respectively. Consistent with the previous observations, the proposed optimized precoding significantly outperforms ZF precoding in terms of power efficiency to attain the same BER target under standard QAM constellations. The optimized THP is more power-efficient than the ZF-THP, and both of them generally outperform the optimized precoding under standard constellations but at the cost of increased complexity as explained before.

Next, we evaluate the impact of channel estimation error (i.e., channel uncertainty) on the performance of the schemes considered. Let ${\bf{\hat{H}}}$ denote the estimated sparse channel matrix. Each nonzero entry ${\hat{h}}_{k,n}$ of ${\bf{\hat{H}}}$ is a noisy version of the perfect ${h}_{k,n}$ of ${\bf{H}}$ . To model the uncertainty, we assume ${\hat{h}}_{k,n}$ is generated according to: ${\hat{h}}_{k,n}={h}_{k,n}+z_{k,n}$ , where $z_{k,n}\sim{\cal CN}(0,\sigma_{e}^{2})$ represents the complex Gaussian estimation error with variance $\sigma_{e}^{2}$ and $\{z_{k,n},\forall k,\forall n\}$ are independently and identically distributed. The average normalized channel uncertainty is then defined as $\tau={\mathsf{E}}[10\log_{10}(\|{\bf\hat{H}}-{\bf H}\|_{2}^{2}/\|{\bf H}\|_{2}^{2})]$ in dB. For all schemes evaluated, the SEP target $Pe_{k}$ is set to $10^{-2}$ so that the corresponding BER is on the order of ${10^{-3}}$ , if perfect channel state information is available. Fig. 11 (a) and (b) depict the actual BER (at a typical MS) versus different levels of channel uncertainty for a sparse MC-CDMA system with standard/replica 4-QAM and 16-QAM signaling, respectively. It can be seen that as the channel uncertainty increases, the actual BER of both the ZF approach and the proposed precoding scheme degrades. However, the proposed scheme always outperforms its ZF counterpart and exhibits much better robustness against imperfect channel estimation, particularly for a system with standard constellations. The proposed scheme is thus not only more power-efficient but also more robust against channel uncertainty.

VIII Conclusions

In this work, we have introduced a sparse MC-CDMA downlink model and proposed a power-efficient precoding method under minimum symbol error probability (SEP) requirements at the MSs. It has been shown that when the system load is high, the proposed precoder significantly reduces the transmission power relative to (regularized) zero-forcing based precoders under the same SEP targets. It has also been shown that, with the proposed precoder, the sparse MC-CDMA system with a proper choice of sparsity level attains almost the same power efficiency and link throughput as dense MC-CDMA, but with lower complexity. These features, along with the fact that channel measurements may be simplified with sparse signatures, add to the practical appeal of sparse MC-CDMA and make it a valuable candidate for future-generation wireless communication systems.

Appendix A Critical Points on the Boundary of ${\cal B}(d_{k})$ for the 16-QAM Signaling

For the 16-QAM signaling, the SEP target (20) indicates that

[TABLE]

where $\rho^{(r)}_{+}/\rho^{(i)}_{+}$ and $\rho^{(r)}_{-}/\rho^{(i)}_{-}$ relate to the decision regions and parameterize the upper/lower bounds for the integrals. Table III specifies these parameters for different constellation points.

Given symbol $d_{k}\in{\cal D}_{k}$ and a target $Pe_{k}$, one can determine the precise constraint region ${\cal B}(d_{k})$ on the noiseless output $\overline{y}_{k}$ at MS $k$ from inequality (A). In particular, the boundary of the region is determined by the equality ${\mathcal{O}^{(r)}}{\mathcal{O}^{(i)}}=1-Pe_{k}$ in (A). In the following, we explain how to determine a set of critical points on the boundary of the corresponding regions for three representative constellation points $\{D_{11},D_{12},D_{16}\}$.

For the corner constellation point $D_{11}$, the critical boundary points can be determined according to three different combinations of $\left(\mathcal{O}^{(r)},\mathcal{O}^{(i)}\right)$:

(i) $\left(1-Pe_{k},1\right)$: $\overline{y}^{(r)}_{k}=2\beta_{k}-{\sigma}{Q}^{-1}(1-Pe_{k})$, $\overline{y}^{(i)}_{k}=+\infty$;

(ii) $\left(1,1-Pe_{k}\right)$: $\overline{y}^{(r)}_{k}=+\infty$, $\overline{y}^{(i)}_{k}=2\beta_{k}-{\sigma}{Q}^{-1}(1-Pe_{k})$;

(iii) $\left(\sqrt{1-Pe_{k}},\sqrt{1-Pe_{k}}\right)$: $\overline{y}^{(r)}_{k}=\overline{y}^{(i)}_{k}=2\beta_{k}-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$,

where $Q^{-1}(.)$ denotes the inverse of the standard $Q$ -function.
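As a numerical illustration of the three corner-point cases, the following sketch evaluates the critical points for given $\beta_{k}$, $\sigma$ and $Pe_{k}$, and checks that each one lies on the boundary ${\mathcal{O}^{(r)}}{\mathcal{O}^{(i)}}=1-Pe_{k}$. The function names are hypothetical. The per-dimension probability $Q\left((2\beta_{k}-\overline{y})/\sigma\right)$ used in the check is an assumption inferred from the closed forms above (decision threshold at $2\beta_{k}$, Gaussian noise with standard deviation $\sigma$ per real dimension); it is not copied from the paper's integral expression.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used to build Q and its inverse

def q_func(x):
    """Standard Q-function: tail probability of a standard normal."""
    return 1.0 - _STD.cdf(x)

def q_inv(p):
    """Inverse of the standard Q-function, valid for 0 < p < 1."""
    return _STD.inv_cdf(1.0 - p)

def corner_critical_points(beta, sigma, pe):
    """Critical boundary points (y_r, y_i) for cases (i), (ii), (iii) of the corner point."""
    edge = 2.0 * beta - sigma * q_inv(1.0 - pe)             # cases (i) and (ii)
    diag = 2.0 * beta - sigma * q_inv(math.sqrt(1.0 - pe))  # case (iii)
    return [(edge, math.inf), (math.inf, edge), (diag, diag)]

def corner_success_prob(y_r, y_i, beta, sigma):
    """Assumed O^(r) * O^(i) for the corner point: both components exceed 2*beta."""
    o_r = q_func((2.0 * beta - y_r) / sigma)
    o_i = q_func((2.0 * beta - y_i) / sigma)
    return o_r * o_i
```

For example, `corner_critical_points(1.0, 0.5, 1e-2)` returns two points with one coordinate at infinity and one point on the diagonal; the diagonal coordinate is larger than the finite coordinate of the other two, since both components must then meet the tighter per-dimension target $\sqrt{1-Pe_{k}}$.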

For the center constellation point $D_{16}$, the critical boundary points can be found by examining the following combinations of $\left(\mathcal{O}^{(r)},\mathcal{O}^{(i)}\right)$:

(i) $\left(\frac{1-Pe_{k}}{\alpha},\alpha\right)$: $\overline{y}^{(r)}_{k}=\beta_{k}\pm\delta_{1}$, $\overline{y}^{(i)}_{k}=\beta_{k}$;

(ii) $\left(\alpha,\frac{1-Pe_{k}}{\alpha}\right)$: $\overline{y}^{(r)}_{k}=\beta_{k}$, $\overline{y}^{(i)}_{k}=\beta_{k}\pm\delta_{1}$;

(iii) $\left(\sqrt{1-Pe_{k}},\sqrt{1-Pe_{k}}\right)$: $\overline{y}^{(r)}_{k}=\beta_{k}\pm\delta_{0}$, $\overline{y}^{(i)}_{k}=\beta_{k}\pm\delta_{0}$,

where $\alpha=\frac{1}{\sqrt{2\pi}\sigma}\int_{-\beta_{k}}^{\beta_{k}}e^{-\frac{v^{2}}{2\sigma^{2}}}dv\geq\sqrt{1-Pe_{k}}$, and the parameters $\delta_{0}$ and $\delta_{1}$ are chosen to satisfy:

[TABLE]
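The quantity $\alpha$ defined above is a Gaussian integral over $[-\beta_{k},\beta_{k}]$, which by a standard identity equals $\mathrm{erf}\left(\beta_{k}/(\sqrt{2}\sigma)\right)=1-2Q(\beta_{k}/\sigma)$. The sketch below evaluates it in closed form, cross-checks it against a simple midpoint-rule integration, and tests the stated condition $\alpha\geq\sqrt{1-Pe_{k}}$. The function names and the step count are hypothetical, and the sketch does not attempt to reproduce the defining equations for $\delta_{0}$ and $\delta_{1}$.

```python
import math

def alpha_closed_form(beta, sigma):
    """alpha = P(|n| <= beta) for n ~ N(0, sigma^2), via the error function."""
    return math.erf(beta / (sigma * math.sqrt(2.0)))

def alpha_numeric(beta, sigma, steps=20000):
    """Midpoint-rule evaluation of the integral that defines alpha."""
    h = 2.0 * beta / steps
    total = 0.0
    for i in range(steps):
        v = -beta + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.exp(-v * v / (2.0 * sigma * sigma))
    return total * h / (math.sqrt(2.0 * math.pi) * sigma)

def alpha_meets_target(beta, sigma, pe):
    """Check the condition alpha >= sqrt(1 - Pe) stated for the center point."""
    return alpha_closed_form(beta, sigma) >= math.sqrt(1.0 - pe)
```

With $Pe_{k}=10^{-2}$, for instance, the condition holds for $\beta_{k}/\sigma\approx 3.33$ but not for $\beta_{k}/\sigma=2$, which shows how the ratio of constellation spacing to noise level governs whether the condition can be met.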

For the side constellation point $D_{12}$, the critical boundary points can be found similarly by examining the following combinations of $\left(\mathcal{O}^{(r)},\mathcal{O}^{(i)}\right)$:

(i) $\left(\alpha,\frac{1-Pe_{k}}{\alpha}\right)$: $\overline{y}^{(r)}_{k}=2\beta_{k}-{\sigma}{Q}^{-1}\left(\frac{1-Pe_{k}}{\alpha}\right)$, $\overline{y}^{(i)}_{k}=\beta_{k}$;

(ii) $\left(\sqrt{1-Pe_{k}},\sqrt{1-Pe_{k}}\right)$: $\overline{y}^{(r)}_{k}=2\beta_{k}-{\sigma}{Q}^{-1}\left(\sqrt{1-Pe_{k}}\right)$, $\overline{y}^{(i)}_{k}=\beta_{k}\pm\delta_{0}$;

(iii) $\left(1,1-Pe_{k}\right)$: $\overline{y}^{(r)}_{k}=+\infty$, $\overline{y}^{(i)}_{k}=\beta_{k}\pm\delta_{2}$,

where parameter $\delta_{2}$ is chosen to satisfy:

[TABLE]

