Deep Learning Assisted User Identification in Massive Machine-Type Communications

Bryan Liu; Zhiqiang Wei; Jinhong Yuan; and Milutin Pajovic

arXiv:1907.09735·cs.IT·July 24, 2019

TL;DR

This paper introduces a deep learning-enhanced list AMP algorithm for user identification in massive machine-type communications, improving accuracy by identifying and mitigating false alarms through neural network predictions and list decoding techniques.

Contribution

It presents a novel deep learning aided list AMP algorithm that detects and suppresses suspicious devices to enhance user identification in massive MTC systems.

Findings

01 Improved mean squared error performance over conventional AMP.

02 Effective suppression of false alarms in user detection.

03 Enhanced robustness in massive device scenarios.

Tables

Table I: NMSE performance of AMP-MMSE and GA-LAMP.

Number of iterations	AMP-MMSE (dB)	GA-LAMP (dB)
3	-3.61	-3.60
5	-4.55	-4.80
10	-5.34	-6.22
20	-5.67	-6.99

Table II: The structure of the DNN and the hyper-parameters for training.

Number of hidden layers	2
Hidden layer size	$2 \times (M + N)$
Hidden layer activation function	Hyperbolic tangent
Output layer activation function	Sigmoid
Optimizer	Root Mean Square Propagation
Learning rate	0.001
Batch size	600
AMP-MMSE iterations	20

Equations

$$\mathcal{Q}(\kappa,b)=\Delta\cdot\left(\left\lfloor\frac{\kappa_{r}}{\Delta}+\frac{1}{2}\right\rfloor\right)+\Delta\cdot\left(\left\lfloor\frac{\kappa_{\jmath}}{\Delta}+\frac{1}{2}\right\rfloor\right)\jmath,$$

$$\mathbf{y}=\sum_{n=1}^{N}\mathbf{p}_{n}a_{n}h_{n}+\mathbf{z}=\sum_{n=1}^{N}\mathbf{p}_{n}x_{n}+\mathbf{z}=\mathbf{P}\mathbf{x}+\mathbf{z},$$

$$\mathbf{\hat{x}}^{t+1}$$

$$\mathbf{v}^{t+1}$$

$$\eta(r_{n}^{t},\sigma_{h})=\frac{\alpha r_{n}^{t}}{1+\frac{1-\rho}{\rho}\beta\exp(-\gamma|r_{n}^{t}|^{2})},$$

$$\mathbf{\hat{x}}_{(f)}=\arg\min_{\mathbf{\hat{x}}_{(i)}}\|\mathbf{y}-\mathbf{P}\mathbf{\hat{x}}_{(i)}\|^{2},\quad\text{where } i\in\{1,2\}.$$

$$\hat{a}_{n}=\Lambda\left(|\hat{x}_{n}|,\delta=\xi\left(\mathbf{\hat{x}},\rho\right)\right),$$

$$e_{\rm{FAL},n}=\frac{B_{n}-\min(\mathbf{B})}{\max(\mathbf{B})-\min(\mathbf{B})},$$

$$L(\mathbf{\hat{e}}_{\rm{FAL}},\mathbf{e}_{\rm{FAL}})$$

$$P_{f}=\frac{\sum_{n=1}^{N}\mathbbm{1}\{\hat{a}_{n}=1,a_{n}=0\}}{\sum_{n=1}^{N}\mathbbm{1}\{a_{n}=0\}}\quad\text{and}\quad P_{m}=\frac{\sum_{n=1}^{N}\mathbbm{1}\{\hat{a}_{n}=0,a_{n}=1\}}{\sum_{n=1}^{N}\mathbbm{1}\{a_{n}=1\}},$$


Taxonomy

Topics: Advanced Wireless Communication Technologies · Cooperative Communication and Network Coding · Wireless Communication Security Techniques

Full text

Deep Learning Assisted User Identification in Massive Machine-Type Communications

Bryan Liu¹, Zhiqiang Wei¹, Jinhong Yuan¹, and Milutin Pajovic²

¹School of Electrical Engineering and Telecommunications, the University of New South Wales

²Mitsubishi Electric Research Laboratories, Cambridge, USA

Abstract

In this paper, we propose a deep learning aided list approximate message passing (AMP) algorithm to further improve the user identification performance in massive machine type communications. A neural network is employed to identify a suspicious device which is most likely to be falsely alarmed during the first round of the AMP algorithm. The neural network returns the false alarm likelihood and it is expected to learn the unknown features of the false alarm event and the implicit correlation structure in the quantized pilot matrix. Then, via employing the idea of list decoding in the field of error control coding, we propose to enforce the suspicious device to be inactive in every iteration of the AMP algorithm in the second round. The proposed scheme can effectively combat the interference caused by the suspicious device and thus improve the user identification performance. Simulations demonstrate that the proposed algorithm improves the mean squared error performance of recovering the sparse unknown signals in comparison to the conventional AMP algorithm with the minimum mean squared error denoiser.

I Introduction

Triggered by the explosive growth of Internet-of-Things (IoT) applications, massive machine-type communications (mMTC), where a large number of sensors are envisioned to transmit short messages sporadically [1, 2, 3], have become one of the dominant communication paradigms for future wireless networks. To accommodate such massive connectivity, grant-free random access schemes were proposed, and industry and academia have reached a consensus on their applicability to mMTC [4, 5, 6]. In contrast to grant-based schemes, in grant-free access schemes each device directly transmits its pilot and payload data in one shot once it has a transmission demand [5]. In addition, because the channel coherence time is limited, it is impossible to allocate orthogonal pilot sequences to a massive number of IoT devices [5]. The use of non-orthogonal pilot sequences makes device activity detection and channel estimation a critical but challenging task [7, 8]. Fortunately, the sparsity of device activity makes it possible to tackle these issues with compressed sensing (CS) techniques [5].

Most recently, one CS technique, approximate message passing (AMP) with a minimum mean squared error (MMSE) denoiser, has been employed to identify device activity [9] by recasting the task as a sparse linear inverse problem and exploiting the statistics of the wireless channels. The authors of [10] demonstrated a remarkable performance gain in user identification by combining the AMP algorithm with the massive multiple-input multiple-output (MIMO) technique. However, AMP-based user identification relies heavily on the assumption that the pilot sequences of each device are independent and identically distributed (i.i.d.) Gaussian. In practice, IoT devices are usually equipped with a low-resolution digital-to-analog converter (DAC) to save cost and power consumption [11], which makes i.i.d. Gaussian pilot sequences idealistic or unattainable. In addition, this assumption implies that the correlation structure among pilot sequences is not exploited by AMP, which leaves room to further improve the user identification performance.

Deep learning has emerged as a powerful tool to augment existing wireless communication technologies [12, 13, 14, 15]. Furthermore, deep learning has been applied to conventional CS algorithms [13, 14, 15], which resulted in significantly improved accuracy and reduced computational complexity. In particular, it was shown in [13] that the iterations of the AMP algorithm can be unfolded into several layers of a neural network. After training the constructed neural network, the “learned AMP” provides increased accuracy. Moreover, in [16], a learned denoising-based AMP (LDAMP) network was proposed. By replacing the denoiser in the AMP algorithm with a neural network, LDAMP has shown enhanced performance compared to the conventional denoising-based AMP (D-AMP) [17]. In summary, the aforementioned algorithms utilize deep learning as a tool to optimize the system performance with respect to a certain training metric, which is commonly the normalized mean squared error (NMSE). Therefore, deep learning has the potential to improve the user identification performance by learning the underlying correlation structure among quantized pilot sequences in mMTC.

In this paper, we propose a deep learning based false alarm likelihood (FAL) estimator to assist the conventional AMP algorithm in user identification and channel estimation. In particular, the neural network is designed and trained to estimate the FAL of each device based on the observations at the receiver and the estimates obtained by an AMP algorithm. Then, by comparing all the obtained FALs, we find the suspicious device, i.e., the one most likely to be falsely alarmed. Employing the idea of list decoding from the field of error control coding [18], we propose to restart the AMP algorithm while forcing the suspicious device to be inactive. We therefore name the proposed scheme the deep learning assisted list AMP (DL-LAMP) algorithm. It is worth noting that, compared with finding the most likely missed device and forcing it to be active, finding the suspicious device and forcing it to be inactive is a simpler approach that does not require any channel state information. Simulation results show that the proposed DL-LAMP algorithm provides up to 0.8 dB NMSE gain at a signal-to-noise ratio (SNR) of 40 dB compared to the conventional AMP-MMSE algorithm.

II Preliminaries

II-A System Model

Assume $N$ potential devices transmit packets to a common receiver through a multiple access channel. Each device has a transmission probability Pr $(a_{n}=1)=\rho$ , $0\leq\rho\leq 1$ . Here, $a_{n}$ denotes the activity of device $n$ , where $a_{n}\in\{0,1\}$ . Specifically, $a_{n}=0$ indicates that device $n$ is inactive and the device is active if $a_{n}=1$ . We use $K$ to denote the number of active devices. Any active device will transmit a packet which contains a complex $M$ -length pilot sequence $\mathbf{u}_{n}\in\mathbb{C}^{M\times 1}$ . Each element $u_{m,n}$ in the pilot matrix $\mathbf{U}$ is chosen by $u_{m,n}\sim\mathcal{CN}(0,\frac{1}{M})$ , $m\in\{1,2,...,M\}$ and $n\in\{1,2,...,N\}$ , where $\mathcal{CN}(0,\frac{1}{M})$ denotes the circularly symmetric complex Gaussian distribution with zero mean and variance $\frac{1}{M}$ . Consider the complex channel fading coefficient $h_{n}\in\mathbb{C}$ from device $n$ to the receiver at a time slot as $h_{n}\sim\mathcal{CN}(0,1)$ . We have $x_{n}=a_{n}h_{n}$ which captures the joint effect of device activity and channel fading.

The sensors in mMTC are assumed to be low-cost and battery-limited, and thus can only be equipped with a low-resolution DAC. Hence, a quantized pilot matrix is further considered. Define an element-wise quantization function $\mathcal{Q}(\cdot,b)$ acting on its first argument, where $b$ indicates the number of quantization bits. We consider a uniform quantization function $\mathcal{Q}(\cdot,b)$ which has $2^{b}$ discrete output levels with equal steps between adjacent output levels. Since each entry in the pilot matrix obeys a complex Gaussian distribution with a variance of $\frac{1}{M}$ , the real and imaginary components each have a standard deviation of $\frac{1}{\sqrt{2M}}$ , and the quantization outputs are assumed to lie in the range $[-\frac{3}{\sqrt{2M}},+\frac{3}{\sqrt{2M}}]$ for each component, which covers approximately $99.7\%$ of the pilot realizations [19]. Pilot realizations outside this range are saturated to the upper or lower bound accordingly. The quantization function follows the mid-riser rule [20]. Given a complex value $\kappa=\kappa_{r}+(\kappa_{\jmath})\jmath$ , $\mathcal{Q}(\cdot,b)$ returns a quantized complex value by

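$$\mathcal{Q}(\kappa,b)=\Delta\cdot\left(\left\lfloor\frac{\kappa_{r}}{\Delta}+\frac{1}{2}\right\rfloor\right)+\Delta\cdot\left(\left\lfloor\frac{\kappa_{\jmath}}{\Delta}+\frac{1}{2}\right\rfloor\right)\jmath,$$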

where $\Delta=\frac{6}{\sqrt{2M}}\times\frac{1}{2^{b}-1}$ . As a result, the quantized pilot sequence of device $n$ is defined as $\mathbf{p}_{n}=\mathcal{Q}(\mathbf{u}_{n},b)$ and the received signal during pilot transmission is obtained as:

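$$\mathbf{y}=\sum_{n=1}^{N}\mathbf{p}_{n}a_{n}h_{n}+\mathbf{z}=\sum_{n=1}^{N}\mathbf{p}_{n}x_{n}+\mathbf{z}=\mathbf{P}\mathbf{x}+\mathbf{z},$$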

where $\mathbf{z}\in\mathbb{C}^{M\times 1}$ refers to the additive white Gaussian noise (AWGN) with each entry $z_{m}\sim\mathcal{CN}(0,\sigma_{z}^{2})$ . The pilot matrix $\mathbf{P}=[\mathbf{p}_{1},\mathbf{p}_{2},...,\mathbf{p}_{N}]\in\mathbb{C}^{M\times N}$ gathers the pilot sequences of the devices and the unknown vector $\mathbf{x}=[x_{1},x_{2},...,x_{N}]^{T}\in\mathbb{C}^{N\times 1}$ collects the variables $x_{n}$ to be recovered. The receiver identifies the users’ activity and estimates their channels, i.e., estimates the unknown vector $\mathbf{x}\in\mathbb{C}^{N\times 1}$ , based on the observation $\mathbf{y}\in\mathbb{C}^{M\times 1}$ and the pilot matrix $\mathbf{P}$ . This leads to an underdetermined problem because $M\ll N$ . However, since the number of active devices is much smaller than the number of potential devices, i.e., $K\ll N$ , CS techniques such as AMP can be employed to solve this problem.
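The system model above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' simulation code: the function names, the noise variance default, and the random generator are assumptions. The quantizer is written in the mid-riser form $\Delta(\lfloor\kappa/\Delta\rfloor+\frac{1}{2})$ with saturation, which is an assumption chosen because it yields exactly $2^{b}$ levels whose outermost values are $\pm\frac{3}{\sqrt{2M}}$ for the stated $\Delta$; the rounding form as printed in the quantization formula would instead place a level at zero.

```python
import numpy as np

def quantize_mid_riser(u, b, M):
    """Uniform mid-riser quantizer applied to real and imaginary parts separately.

    Assumed form: Delta * (floor(v / Delta) + 1/2), saturated to 2**b levels.
    """
    delta = 6.0 / np.sqrt(2.0 * M) / (2 ** b - 1)  # step size Delta from the text

    def q(v):
        k = np.floor(v / delta)                            # index of the quantization cell
        k = np.clip(k, -2 ** (b - 1), 2 ** (b - 1) - 1)    # saturate out-of-range samples
        return (k + 0.5) * delta

    return q(u.real) + 1j * q(u.imag)

def simulate_pilot_phase(N=150, M=30, rho=0.1, b=3, sigma_z2=1e-4, rng=None):
    """Draw one realization of y = P x + z (sigma_z2 is a hypothetical default)."""
    rng = np.random.default_rng() if rng is None else rng
    # Unquantized pilots u_{m,n} ~ CN(0, 1/M)
    U = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
    P = quantize_mid_riser(U, b, M)
    a = (rng.random(N) < rho).astype(int)                                   # activity a_n
    h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)  # h_n ~ CN(0, 1)
    x = a * h                                                               # x_n = a_n h_n
    z = np.sqrt(sigma_z2 / 2) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    return P @ x + z, P, x, a
```

With $b=3$, each component of $\mathbf{P}$ takes one of eight values $\pm\frac{\Delta}{2},\pm\frac{3\Delta}{2},\pm\frac{5\Delta}{2},\pm\frac{7\Delta}{2}$.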

II-B Overview of AMP Algorithm

First proposed in [21], AMP has been broadly investigated for solving sparse linear inverse problems. As a low-complexity algorithm, AMP performs iterative updates to recover the sparse unknown signals $\mathbf{x}$ . Define $\mathbf{v}^{t}\in\mathbb{C}^{M\times 1}$ as the residual between the observations $\mathbf{y}$ and the signals reconstructed from the estimates $\mathbf{\hat{x}}^{t}=[\hat{x}_{1}^{t},\hat{x}_{2}^{t},...,\hat{x}_{N}^{t}]^{T}$ in the $t$ -th iteration. Then, by initializing $\mathbf{\hat{x}}^{0}=\mathbf{0}$ and $\mathbf{v}^{0}=\mathbf{y}$ , the AMP algorithm mainly comprises two update steps:

[TABLE]

where $\langle\cdot\rangle$ denotes the empirical averaging operation, $\eta(\cdot)$ refers to the denoiser function, $\eta^{\prime}(\cdot)$ expresses the first-order derivative of $\eta(\cdot)$ with respect to the first argument, $\mathbf{P}^{*}$ indicates the Hermitian transpose of matrix $\mathbf{P}$ , and $\sigma_{h}$ denotes the standard deviation of the channel fading coefficient.
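For reference, a common textbook form of the two AMP updates that is consistent with the symbols defined above is sketched below. It is offered as an assumption about the shape of Eqs. (3) and (4), not as a transcription of them; in particular, the $\frac{N}{M}$ scaling of the correction term is the usual choice for pilots with variance $\frac{1}{M}$.

```latex
\begin{align}
\mathbf{\hat{x}}^{t+1} &= \eta\left(\mathbf{P}^{*}\mathbf{v}^{t}+\mathbf{\hat{x}}^{t},\,\sigma_{h}\right),\\
\mathbf{v}^{t+1} &= \mathbf{y}-\mathbf{P}\mathbf{\hat{x}}^{t+1}
  +\frac{N}{M}\,\mathbf{v}^{t}\left\langle \eta^{\prime}\left(\mathbf{P}^{*}\mathbf{v}^{t}+\mathbf{\hat{x}}^{t},\,\sigma_{h}\right)\right\rangle .
\end{align}
```

The first line denoises the matched filtered output, and the second line updates the residual with the so-called Onsager correction term, which distinguishes AMP from plain iterative thresholding.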

The denoiser in the AMP algorithm plays an important role in reducing the estimation error and maintaining the sparsity of the estimate of the unknown vector $\mathbf{x}$ . The MMSE denoiser exploits the prior distribution of the unknown vector $\mathbf{x}$ and thus outperforms the well-known soft thresholding denoiser [22]. In this paper, we consider the AMP algorithm with the MMSE denoiser. In particular, the MMSE denoiser [9, 23] in the $t$ -th iteration can be expressed as:

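$$\eta(r_{n}^{t},\sigma_{h})=\frac{\alpha r_{n}^{t}}{1+\frac{1-\rho}{\rho}\beta\exp(-\gamma|r_{n}^{t}|^{2})},$$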

where $\alpha=\frac{\sigma^{2}_{h}}{\sigma^{2}_{h}+\tau_{t}^{2}}$ , $\beta=\frac{\sigma^{2}_{h}+\tau_{t}^{2}}{\tau_{t}^{2}}$ and $\gamma=\tau_{t}^{-2}-(\sigma^{2}_{h}+\tau_{t}^{2})^{-1}$ . In addition, the first input of the MMSE denoiser $\mathbf{r}^{t}=[r_{1}^{t},r_{2}^{t},...,r_{N}^{t}]^{T}$ can be interpreted as the matched filtered output $\mathbf{r}^{t}=\mathbf{P}^{*}\mathbf{v}^{t}+\mathbf{\hat{x}}^{t}$ , which can be approximately modelled as the unknown signals $\mathbf{x}$ impaired by AWGN with variance $\tau_{t}^{2}$ . Invoking the state evolution technique [23], this variance evolves as $\tau_{t+1}^{2}=\sigma_{z}^{2}+\frac{N}{M}\mathbb{E}[|\eta(\mathbf{r}^{t},\sigma_{h})-\mathbf{x}|^{2}]$ with an initialization of $\tau_{0}^{2}=\sigma_{z}^{2}+\frac{N}{M}\mathbb{E}[|x_{n}|^{2}]$ . In practice, an empirical estimate of $\tau_{t+1}^{2}$ can be computed by $\tau_{t+1}^{2}=\frac{1}{M}||\mathbf{v}^{t}||^{2}_{2}$ , where $||\cdot||_{2}^{2}$ denotes the squared $\ell_{2}$ -norm. In fact, the state evolution technique [23] can characterize and predict the performance of the AMP algorithm during each iteration. Interested readers are referred to [23] for more details.
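A compact NumPy sketch of AMP with the MMSE denoiser described above is given below. It is illustrative only: the function names, the `forced_inactive` argument, and the small floor on the variance estimate are assumptions, and the residual update uses the standard AMP form with the $\frac{N}{M}$ correction term. The derivative follows from writing the denoiser as $\eta(r)=g(|r|^{2})\,r$, which gives $\eta^{\prime}=g+|r|^{2}g^{\prime}$.

```python
import numpy as np

def mmse_denoiser(r, tau2, rho, sigma_h2=1.0):
    """MMSE denoiser eta(r) and its derivative for the prior x = a*h, h ~ CN(0, sigma_h2)."""
    alpha = sigma_h2 / (sigma_h2 + tau2)
    beta = (sigma_h2 + tau2) / tau2
    gamma = 1.0 / tau2 - 1.0 / (sigma_h2 + tau2)
    s = np.abs(r) ** 2
    e = (1.0 - rho) / rho * beta * np.exp(-gamma * s)
    g = alpha / (1.0 + e)                     # shrinkage gain, eta(r) = g * r
    dg = alpha * gamma * e / (1.0 + e) ** 2   # derivative of g with respect to |r|^2
    return g * r, g + s * dg

def amp_mmse(y, P, rho, n_iter=20, sigma_h2=1.0, forced_inactive=None):
    """Run AMP-MMSE; optionally pin one device to zero in every iteration."""
    M, N = P.shape
    x_hat = np.zeros(N, dtype=complex)        # x^0 = 0
    v = y.astype(complex)                     # v^0 = y
    for _ in range(n_iter):
        # Empirical noise-variance estimate from the current residual (floored, an assumption)
        tau2 = max(np.linalg.norm(v) ** 2 / M, 1e-12)
        r = P.conj().T @ v + x_hat            # matched filtered output
        x_hat, d_eta = mmse_denoiser(r, tau2, rho, sigma_h2)
        if forced_inactive is not None:
            x_hat[forced_inactive] = 0.0      # second-round constraint on the suspicious device
        v = y - P @ x_hat + (N / M) * v * np.mean(d_eta)  # residual with Onsager term
    return x_hat
```

Calling `amp_mmse(y, P, rho)` corresponds to an ordinary AMP-MMSE run, while passing `forced_inactive=s` zeroes entry `s` between the denoising step and the residual update.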

II-C Overview of Deep Neural Network

In this section, we briefly describe the essentials of a deep neural network (DNN). Fig. 1 depicts an ordinary structure of a fully connected deep neural network, which mainly comprises three types of layers. The input layer takes the features to be processed and conveys them to several hidden layers. The outputs of each hidden layer are the inputs to the following layer. The last hidden layer is connected to the output layer, which yields the network’s output. Each layer comprises neurons, and each neuron computes an affine combination of its inputs, where the weights and bias of the affine combination are tunable parameters that are calibrated during the neural network “learning” process. Activation functions such as the hyperbolic tangent and sigmoid functions can be further applied in the neurons to improve the expressiveness beyond what is possible with only linear processing [24].

In the case of supervised learning, the use of a neural network usually involves three phases: the training phase, the validation phase, and the application phase. During the training phase, a data set of input features and output labels (the desired outputs corresponding to the neural network inputs) is fed into the neural network to adjust the values of the tunable parameters by backpropagating the losses between the labels and the neural network outputs. The network’s performance is evaluated on a validation data set in order to select its parameters before over-fitting occurs, as in early stopping, or to further refine the network’s architecture. Once the neural network is fully trained, it is able to generate the corresponding output for any given input features in the application phase.

The capability of a neural network to approximate a fairly large family of input-output mappings stems from a flexible design of connections between neurons and selection of activation functions associated with them. This motivates us to employ a neural network as a FAL estimator to learn the features of false alarm event during the AMP detection and the implicit correlation structure in the quantized pilot matrix.

III Proposed DL-LAMP algorithm

Non-orthogonal pilot sequences cause inter-user interference, which has a detrimental impact on user identification and channel estimation in mMTC. This is further exacerbated by low-resolution quantization of the pilot sequences. To combat the interference, based on the observations and the estimated user activity, we propose to employ a neural network to find the suspicious device which is most likely to be falsely alarmed. Then, the suspicious device is forced to be inactive in each iteration of the second round of the AMP to avoid its interference with other devices. In this section, we first describe the proposed DL-LAMP algorithm and then introduce the training scheme of the neural network.

III-A Algorithm Structure

According to the list decoding technique [18], generating a list of estimates $\mathbf{\hat{x}}$ by flipping some of the unknown variables is an effective method to further enhance the estimation performance. In this paper, we generate two estimates $\mathbf{\hat{x}}_{(1)}$ and $\mathbf{\hat{x}}_{(2)}$ by executing the AMP for two rounds, as illustrated in Fig. 2. In the first round, the observations $\mathbf{y}$ are fed to the AMP-MMSE processor to obtain the first estimate $\mathbf{\hat{x}}_{(1)}$ . Based on $\mathbf{\hat{x}}_{(1)}$ and $\mathbf{y}$ , a DNN-based FAL estimator predicts the likelihood of each device being falsely alarmed, i.e., wrongly detected as active. Then, the index of the suspicious device $s$ is obtained by comparing the FALs of all the devices. In the second round, the unknown variable associated with the suspicious device $s$ is set to 0 in every iteration of the second-round AMP-MMSE, i.e., $\hat{x}_{s}=0$ is imposed between Eqs. (3) and (4). Consequently, the suspicious device is forced to be inactive and the corresponding signal $\hat{x}_{s}$ no longer interferes with other users. Once $\mathbf{\hat{x}}_{(1)}$ and $\mathbf{\hat{x}}_{(2)}$ are both estimated by the AMP-MMSE algorithm, the final estimate $\mathbf{\hat{x}}_{(f)}$ is chosen by a least mean squared error (LMSE) selector as

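$$\mathbf{\hat{x}}_{(f)}=\arg\min_{\mathbf{\hat{x}}_{(i)}}\|\mathbf{y}-\mathbf{P}\mathbf{\hat{x}}_{(i)}\|^{2},\quad\text{where } i\in\{1,2\}.$$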
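The LMSE selector can be illustrated with a short NumPy sketch. The function name `lmse_select` and its return convention are assumptions made for this example; the rule itself, picking the candidate with the smallest residual energy, is the one stated above.

```python
import numpy as np

def lmse_select(y, P, candidates):
    """Pick the candidate estimate with the smallest residual energy ||y - P x_hat||^2."""
    residuals = [np.linalg.norm(y - P @ x_hat) ** 2 for x_hat in candidates]
    best = int(np.argmin(residuals))   # 0 for the first-round estimate, 1 for the second
    return candidates[best], best
```

Because the selector only needs $\mathbf{y}$ and $\mathbf{P}$, it can be applied at the receiver without knowledge of the true $\mathbf{x}$, and it falls back to the first-round estimate whenever forcing the suspicious device to zero does not reduce the residual.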

III-B Genie-aided List AMP

To validate the effectiveness of the list decoding idea in improving the estimation performance of the AMP-MMSE algorithm, we employ a genie-aided selector to identify the suspicious device. Note that the user activity $\mathbf{a}=\left[a_{1},a_{2},\ldots,a_{N}\right]^{{T}}$ and the signals $\mathbf{x}$ are known to the genie-aided selector. Additionally, we note that this is identical to the training phase of a neural network, where $\mathbf{x}$ and $\mathbf{a}$ are given.

For a genie-aided selector, the vector of element-wise squared Euclidean distances between $\mathbf{\hat{x}}_{(1)}=[\hat{x}_{(1),1},\hat{x}_{(1),2},\ldots,\hat{x}_{(1),N}]^{{T}}$ and $\mathbf{x}$ is $E(\mathbf{\hat{x}}_{(1)},\mathbf{x})=[|\hat{x}_{(1),1}-x_{1}|^{2},|\hat{x}_{(1),2}-x_{2}|^{2},...,|\hat{x}_{(1),N}-x_{N}|^{2}]^{T}$ . Define $(\sim\mathbf{a})$ as the vector obtained by flipping every 0 to 1 and every 1 to 0 in the activity vector $\mathbf{a}$ . The vector $\mathbf{B}=(\sim\mathbf{a})\odot E(\mathbf{\hat{x}}_{(1)},\mathbf{x})=[B_{1},B_{2},\ldots,B_{N}]$ characterizes the element-wise errors of the inactive devices, where $\odot$ denotes point-wise multiplication. The larger $B_{n}$ is, the more likely inactive device $n$ is to be falsely alarmed. Accordingly, $s=\text{argmax}(\mathbf{B})$ identifies the suspicious device, which suffers the highest estimation error. Then, in the second round of the AMP-MMSE estimation, the signal $\hat{x}_{s}$ is set to 0 in each iteration update of $\mathbf{\hat{x}}$ .
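As a small illustration of the genie-aided selection rule, the following NumPy sketch computes $\mathbf{B}$ and the index $s$. The function name is hypothetical, and the sketch assumes the true $\mathbf{x}$ and $\mathbf{a}$ are available, which holds only for the genie or during training.

```python
import numpy as np

def genie_suspicious_device(x_hat, x, a):
    """Return s = argmax(B) and B, the squared errors masked to the inactive devices."""
    err = np.abs(x_hat - x) ** 2   # element-wise squared error E(x_hat, x)
    B = (1 - a) * err              # (~a) keeps only devices with a_n = 0
    return int(np.argmax(B)), B
```

For example, with $\mathbf{a}=[1,0,0,1]^{T}$ only the second and third entries of the error vector survive the masking, so the suspicious device is whichever of those two inactive devices has the larger first-round estimate error.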

Table I shows the NMSE of AMP-MMSE and of the LAMP algorithm with the genie-aided selector (GA-LAMP). In our simulations, the number of devices is $N=150$ and the pilot length is $M=30$ . The elements of the pilot matrix $\mathbf{P}$ are first drawn i.i.d. from $\mathcal{CN}(0,\frac{1}{M})$ and then quantized to 3-bit resolution. The transmission probability is set to $\rho=0.1$ and the SNR is 40 dB. It can be seen that GA-LAMP provides an NMSE gain of up to 1.32 dB over the conventional AMP-MMSE, reached at 20 iterations ( $-6.99$ dB versus $-5.67$ dB). This confirms the effectiveness of the proposed list AMP algorithm structure and motivates us to construct a DNN to achieve the same functionality as the genie-aided selector.

III-C DNN-based FAL Estimator

The appeal of employing a neural network is that it can learn, in the training phase, the features that indicate how likely a device is to be falsely alarmed. Then, based on the “learned” experience, the neural network is able to predict the most likely falsely alarmed device in the application phase.

DNN Input Preprocessing:

In our model, the neural network is expected to estimate the FAL of each device based on the AMP estimates $\mathbf{\hat{x}}_{(1)}$ and the observation $\mathbf{y}$ . Each entry of the first-round AMP output combines the channel estimate $\hat{h}_{(1),n}$ and the detected user activity $\hat{a}_{(1),n}$ , i.e., $\hat{x}_{(1),n}=\hat{a}_{(1),n}\hat{h}_{(1),n}$ , so $\mathbf{\hat{x}}_{(1)}$ can be interpreted as soft information while $\hat{a}_{(1),n}$ is the hard decision for user identification. To facilitate the neural network training, we prefer to use the hard decision $\hat{a}_{(1),n}$ as its input rather than the soft information $\mathbf{\hat{x}}_{(1)}$ , because the values of $\mathbf{\hat{x}}_{(1)}$ vary significantly with different realizations of the noise $\mathbf{z}$ and the unknown signals $\mathbf{x}$ , which disrupts the training phase. Therefore, we propose to transform the first-round AMP estimates $\mathbf{\hat{x}}_{(1)}$ into the user activity sequence $\mathbf{\hat{a}}=[\hat{a}_{1},\hat{a}_{2},...,\hat{a}_{N}]^{T}$ by a user activity estimator as follows:

$$\hat{a}_{n}=\Lambda\left(|\hat{x}_{n}|,\delta=\xi\left(\mathbf{\hat{x}},\rho\right)\right),$$

which returns 1 if $|\hat{x}_{n}|>\delta$ and 0 otherwise. Its second argument $\delta$ denotes the detection threshold, which is a percentile function [25] of the AMP estimates $\mathbf{\hat{x}}$ and the transmission probability $\rho$ . In particular, the percentile function $\xi\left(|\mathbf{\hat{x}}|,\rho\right)$ returns a threshold $\delta$ such that only $\mathrm{Round}\left(\rho N\right)$ entries of $|\mathbf{\hat{x}}|$ are above $\delta$ . Here, $\mathrm{Round}\left(\cdot\right)$ returns the integer nearest to its input.

The threshold returned from $\xi\left(\mathbf{\hat{x}},\rho\right)$ splits all the sensors into $\mathrm{Round}\left(\rho N\right)$ active users and $N-\mathrm{Round}\left(\rho N\right)$ inactive users based on the AMP estimates $\mathbf{\hat{x}}$ . In other words, the resulting user activity sequence $\mathbf{\hat{a}}$ has $\mathrm{Round}\left(\rho N\right)$ entries equal to 1 and $N-\mathrm{Round}\left(\rho N\right)$ entries equal to 0. As a result, the input preprocessing generates user activity estimates whose distribution is approximately Pr $(\hat{a}_{n}=1)=\rho$ and Pr $(\hat{a}_{n}=0)=1-\rho$ , $\forall n$ , which is consistent with the prior user activity distribution.
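The input preprocessing can be sketched as follows. Instead of computing the threshold $\delta$ explicitly, this illustrative version marks the $\mathrm{Round}(\rho N)$ largest magnitudes as active, which gives the same activity sequence whenever the magnitudes are distinct. The function name and the half-up rounding convention are assumptions.

```python
import numpy as np

def estimate_activity(x_hat, rho):
    """Hard activity decisions: the Round(rho * N) largest |x_hat_n| are declared active."""
    N = x_hat.size
    k = int(np.floor(rho * N + 0.5))   # nearest integer, rounding halves up (assumed)
    a_hat = np.zeros(N, dtype=int)
    if k > 0:
        top = np.argsort(np.abs(x_hat))[-k:]   # indices of the k largest magnitudes
        a_hat[top] = 1
    return a_hat
```

With $N=150$ and $\rho=0.1$ as in the simulations, the resulting sequence always contains 15 ones, regardless of the scale of $\mathbf{\hat{x}}$.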

DNN Labels Preprocessing:

In the previous section, it has been shown that the vector $\mathbf{B}$ can be interpreted as the FAL of the devices. However, the estimate $\mathbf{\hat{x}}$ varies significantly with different channel realizations, and the vector $\mathbf{B}$ changes correspondingly. In practice, to facilitate the training process of the neural network, we normalize each entry of the vector $\mathbf{B}$ to lie between 0 and 1. To that end, min-max normalization is applied to the vector $\mathbf{B}$ . Define $\mathbf{e}_{\rm{FAL}}=[e_{\rm{FAL},1},e_{\rm{FAL},2},\ldots,e_{\rm{FAL},N}]^{T}$ as the labels for the neural network, which is the normalized version of $\mathbf{B}$ , i.e.,

$$e_{\rm{FAL},n}=\frac{B_{n}-\min(\mathbf{B})}{\max(\mathbf{B})-\min(\mathbf{B})},$$

where min( $\cdot$ ) and max( $\cdot$ ) find the minimum and maximum value of a vector, respectively. Note that, in the application phase, the trained neural network outputs a sequence of soft information $\mathbf{\hat{e}}_{\rm{FAL}}$ on FAL. This motivates us to employ the mean squared error as the loss function for training, i.e.,

[TABLE]

A backpropagation optimizer to minimize the loss function $L(\mathbf{\hat{e}}_{\rm{FAL}},\mathbf{e}_{\rm{FAL}})$ is employed to adjust the values of the tunable parameters in the neural network.
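The label preprocessing and the training loss can be sketched as below. The function names are hypothetical, the handling of a constant $\mathbf{B}$ is an added safeguard, and averaging the squared error over the $N$ devices is an assumption about the normalization of the mean squared error loss.

```python
import numpy as np

def fal_labels(B):
    """Min-max normalize B to [0, 1] to obtain the FAL labels e_FAL."""
    B = np.asarray(B, dtype=float)
    span = B.max() - B.min()
    if span == 0:                 # degenerate case: all entries equal (assumed handling)
        return np.zeros_like(B)
    return (B - B.min()) / span

def mse_loss(e_hat, e):
    """Mean squared error between predicted and target FALs (averaged over devices)."""
    return float(np.mean((np.asarray(e_hat) - np.asarray(e)) ** 2))
```

After normalization the device with the largest $B_{n}$ always receives the label 1, so the position of the largest label coincides with the genie-aided choice $s=\text{argmax}(\mathbf{B})$.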

The overall DL-LAMP algorithm is summarized in Algorithm 1. In Step 2, the observations are normalized to have zero mean and unit variance before feeding to the neural network to accelerate the training process.

IV Simulation Results

In this section, we evaluate the performance of the proposed DL-LAMP algorithm and compare it with AMP-MMSE. In our simulations, we construct a simple DNN to demonstrate the concept of the proposed algorithm. We consider a system of 150 devices that transmit pilot sequences of length 30. The entries of the pilot matrix are drawn i.i.d. from $\mathcal{CN}(0,\frac{1}{M})$ and then quantized to 3-bit resolution. The fading coefficient of each device is drawn i.i.d. from $\mathcal{CN}(0,1)$ . We employ the AMP-MMSE algorithm as proposed in [9, 23]. The neural network is constructed and trained using TensorFlow. The structure and the hyper-parameters of the DNN are listed in Table II. Since $\mathbf{y}$ is complex-valued, its real and imaginary parts are formatted into corresponding two-column matrices and then simultaneously fed into the DNN.

To reduce the training time, the neural network is trained with the outputs of the AMP-MMSE after the 20-th iteration, while it is tested for different numbers of iterations. A separate neural network is trained for each SNR setting.

The neural network is trained off-line, and in the application phase all the tunable parameters, i.e., the weights and biases in the fully connected layers, remain fixed. In our simulation, 2 hidden layers are employed in the fully connected neural network. In total, regarding the linear computations, there are $8M^{2}+8N^{2}+16MN$ multiplications and $4M+5N$ additions in the application phase. Moreover, there are $4M+5N$ non-linear computations, where the non-linear function for each layer is stated in Table II. Note that the computations for the neurons in each layer run in parallel, so the processing delay in the application stage is negligible.
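The operation counts above can be checked with a plain NumPy sketch of the forward pass. This is not the authors' TensorFlow model: the function names and the random initialization are placeholders, and the input size $2M+N$ (real and imaginary parts of $\mathbf{y}$ plus the activity sequence $\mathbf{\hat{a}}$) is inferred from the stated multiplication count rather than given explicitly. With hidden size $2(M+N)$ and $N$ outputs, the three weight matrices contain $2(M+N)\left[(2M+N)+2(M+N)+N\right]=8(M+N)^{2}$ entries, and the bias vectors contain $4M+5N$ entries.

```python
import numpy as np

def build_fal_dnn(M, N, rng):
    """Random placeholder weights with the layer sizes of Table II (input size assumed 2M+N)."""
    d_in, d_h, d_out = 2 * M + N, 2 * (M + N), N
    shapes = [(d_in, d_h), (d_h, d_h), (d_h, d_out)]
    return [(rng.standard_normal(s) / np.sqrt(s[0]), np.zeros(s[1])) for s in shapes]

def fal_dnn_forward(params, y, a_hat):
    """Forward pass: two tanh hidden layers followed by a sigmoid output layer."""
    z = np.concatenate([y.real, y.imag, a_hat.astype(float)])
    (W1, b1), (W2, b2), (W3, b3) = params
    z = np.tanh(z @ W1 + b1)
    z = np.tanh(z @ W2 + b2)
    return 1.0 / (1.0 + np.exp(-(z @ W3 + b3)))   # one FAL estimate per device

def count_multiplications(params):
    """Each weight entry contributes one multiplication per forward pass."""
    return sum(W.size for W, _ in params)
```

For $M=30$ and $N=150$ this gives $8\times 180^{2}=259200$ multiplications per forward pass. The normalization of the observations to zero mean and unit variance mentioned for Algorithm 1 is omitted here.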

The simulated performance results are shown in Fig. 3, where AMP-MMSE indicates the AMP algorithm with the MMSE denoiser, DL-LAMP refers to the proposed algorithm, and the label “Unquantized” indicates that the pilot matrix is employed without quantization. We can observe that, at the $100$ -th iteration, the proposed DL-LAMP algorithm provides $0.8$ dB and $0.43$ dB NMSE gain over AMP-MMSE at SNRs of $40$ dB and $15$ dB, respectively. Moreover, at SNR = $40$ dB, the proposed algorithm provides a larger performance gain for the quantized pilot matrix than for the unquantized one, and the quantized case is the more practical one for IoT sensors. This is because the pilot sequences of different users are more correlated for a quantized pilot matrix. As a result, forcing the suspicious device to be inactive provides a larger benefit.

We further investigate the corresponding performance of the average false alarm probability $\bar{P}_{f}$ versus the average missed detection probability $\bar{P}_{m}$ , as shown in Fig. 4. In particular, the false alarm and missed detection probability are defined as:

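$$P_{f}=\frac{\sum_{n=1}^{N}\mathbbm{1}\{\hat{a}_{n}=1,a_{n}=0\}}{\sum_{n=1}^{N}\mathbbm{1}\{a_{n}=0\}}\quad\text{and}\quad P_{m}=\frac{\sum_{n=1}^{N}\mathbbm{1}\{\hat{a}_{n}=0,a_{n}=1\}}{\sum_{n=1}^{N}\mathbbm{1}\{a_{n}=1\}},$$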

respectively, where $\mathbbm{1}\{\cdot\}$ is the indicator function. The average probabilities $\bar{P}_{f}$ and $\bar{P}_{m}$ are calculated as the mean values of $P_{f}$ and $P_{m}$ over all the realizations, respectively. The number of AMP iterations is set to 100. The false alarm probability is varied between 0.008 and 0.1. Fig. 4 shows that the proposed algorithm decreases the missed detection probability compared to the conventional AMP-MMSE algorithm, especially for the case with a quantized pilot matrix.
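The two per-realization error rates can be computed as in the following sketch. The function name is hypothetical, and returning 0 when a realization has no inactive or no active devices is an assumption made to avoid division by zero.

```python
import numpy as np

def detection_error_rates(a_hat, a):
    """False alarm probability P_f and missed detection probability P_m for one realization."""
    a_hat, a = np.asarray(a_hat), np.asarray(a)
    n_inactive = np.sum(a == 0)
    n_active = np.sum(a == 1)
    p_f = np.sum((a_hat == 1) & (a == 0)) / n_inactive if n_inactive else 0.0
    p_m = np.sum((a_hat == 0) & (a == 1)) / n_active if n_active else 0.0
    return float(p_f), float(p_m)
```

Averaging the returned pairs over many independent realizations gives $\bar{P}_{f}$ and $\bar{P}_{m}$.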

V Conclusion

This paper proposed a DL-LAMP algorithm framework to improve the user identification performance in mMTC. In particular, we constructed a deep neural network which serves as a FAL estimator. Owing to the flexibility of the neural network design, the proposed DNN-based FAL estimator is able to learn the features of the false alarm event during the AMP detection and the implicit correlation structure in the quantized pilot matrix. Based on the FAL obtained from the DNN, we identified the suspicious device which is most likely to be falsely alarmed. Following the idea of list decoding, the proposed algorithm performs two rounds of AMP estimation, where in the second round the suspicious device is forced to be inactive. Simulation results have shown that the proposed algorithm provides an NMSE performance gain and decreases the missed detection probability.

Bibliography

[1] “Study on new radio (NR) access technology physical layer aspects,” 3GPP TR 38.802, Tech. Rep., 2017.
[2] Z. Wei, L. Yang, D. W. K. Ng, J. Yuan, and L. Hanzo, “On the performance gain of NOMA over OMA in uplink communication systems,” arXiv preprint arXiv:1903.01683, 2019.
[3] Z. Wei, D. W. K. Ng, and J. Yuan, “NOMA for hybrid mmWave communication systems with beamwidth control,” IEEE J. Select. Topics Signal Process., vol. 13, no. 3, pp. 567–583, Jun. 2019.
[4] Z. Sun, L. Yang, J. Yuan, and D. W. K. Ng, “Physical-layer network coding based decoding scheme for random access,” IEEE Trans. Veh. Technol., vol. 68, no. 4, pp. 3550–3564, Apr. 2019.
[5] L. Liu, E. G. Larsson, W. Yu, P. Popovski, C. Stefanovic, and E. de Carvalho, “Sparse signal processing for grant-free massive connectivity: A future paradigm for random access protocols in the Internet of Things,” IEEE Signal Process. Mag., vol. 35, no. 5, pp. 88–99, Sep. 2018.
[6] R. Abbas, M. Shirvanimoghaddam, Y. Li, and B. Vucetic, “A novel analytical framework for massive grant-free NOMA,” IEEE Trans. Commun., vol. 67, no. 3, pp. 2436–2449, Mar. 2019.
[7] Z. Sun, Z. Wei, L. Yang, J. Yuan, X. Cheng, and L. Wan, “Joint user identification and channel estimation in massive connectivity with transmission control,” in Proc. IEEE Intern. Sympos. on Turbo Codes Iterative Information Process., 2018, pp. 1–5.
[8] Z. Sun, Z. Wei, L. Yang, J. Yuan, X. Cheng, and L. Wan, “Exploiting transmission control for joint user identification and channel estimation in massive connectivity,” IEEE Trans. Commun., early access, 2019.