Wireless Interference Identification with Convolutional Neural Networks
Malte Schmidt, Dimitri Block, Uwe Meier

TL;DR
This paper introduces a novel deep convolutional neural network-based approach for wireless interference identification, achieving high accuracy in classifying signals within the 2.4 GHz ISM band, which enhances coexistence management.
Contribution
It presents the first CNN-based method for wireless interference identification, demonstrating superior performance over existing techniques with a data-driven training process.
Findings
CNN achieves over 95% accuracy at -5 dB SNR
Method distinguishes 15 classes of wireless signals
Outperforms state-of-the-art interference identification approaches
Abstract
The steadily growing use of license-free frequency bands requires reliable coexistence management for deterministic medium utilization. For interference mitigation, proper wireless interference identification (WII) is essential. In this work we propose the first WII approach based upon deep convolutional neural networks (CNNs). The CNN naively learns its features through self-optimization during an extensive data-driven GPU-based training process. We propose a CNN example which is based upon sensing snapshots with a limited duration of 12.8 µs and an acquisition bandwidth of 10 MHz. The CNN distinguishes between 15 classes. They represent packet transmissions of IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1 with overlapping frequency channels within the 2.4 GHz ISM band. We show that the CNN outperforms state-of-the-art WII approaches and has a classification accuracy greater than 95%…
Tab. I
| Wireless technology | WT | IEEE 802.15.1 | IEEE 802.15.4 | IEEE 802.11 |
|---|---|---|---|---|
| Absolute channel numbers | | | | |
| Relative channel numbers | | | | |
| Absolute offset | | | | |
| Relative offset | | | | |
Tab. II
| Layer type | Input size | Parameters | Activation fct. |
|---|---|---|---|
| Convolutional layer | | filter kernel, feature maps | Rectified linear |
| Convolutional layer | | filter kernel, feature maps | Rectified linear |
| Dropout | | | |
| Dense layer | | neurons | Rectified linear |
| Dropout | | | |
| Dense layer | | neurons | Softmax |
Tab. III
| Layer type | Input size | Parameters | Activation fct. |
|---|---|---|---|
| Convolutional layer | | filter kernel, feature maps | Rectified linear |
| Convolutional layer | | filter kernel, feature maps | Rectified linear |
| Dropout | | | |
| Dense layer | | neurons | Rectified linear |
| Dropout | | | |
| Dense layer | | neurons | Softmax |
inIT - Institute Industrial IT
Ostwestfalen-Lippe University of Applied Sciences
Lemgo, Germany
Abstract
The steadily growing use of license-free frequency bands requires reliable coexistence management for deterministic medium utilization. For interference mitigation, proper wireless interference identification (WII) is essential.
In this work we propose the first WII approach based upon deep convolutional neural networks (CNNs). The CNN naively learns its features through self-optimization during an extensive data-driven GPU-based training process. We propose a CNN example which is based upon sensing snapshots with a limited duration of 12.8 µs and an acquisition bandwidth of 10 MHz. The CNN distinguishes between 15 classes. They represent packet transmissions of IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1 with overlapping frequency channels within the 2.4 GHz ISM band. We show that the CNN outperforms state-of-the-art WII approaches and has a classification accuracy greater than 95 % for signal-to-noise ratios (SNRs) of at least -5 dB.
I Introduction
Artificial neural networks, and especially CNNs, have achieved excellent results on different benchmarks in recent years [1, 2, 3]. For example, neural networks achieve the best performance for character recognition on the Modified National Institute of Standards and Technology (MNIST) database [4]. The results achieved by the CNNs of Cireşan et al. [1] are comparable to human performance. Inspired by these results and the progress in deep learning, CNNs are used as classifiers in a growing number of research fields.
One of these research fields is WII for coexistence management of license-free frequency bands such as the 2.4 GHz ISM band. Such bands are shared between incompatible heterogeneous wireless communication systems. In industrial environments, typical standardized wireless communication systems within the 2.4 GHz ISM band are wide-band high-rate IEEE 802.11 b/g/n, narrow-band low-rate IEEE 802.15.4-based WirelessHART and ISA 100.11a, and IEEE 802.15.1-related PNO WSAN-FA and Bluetooth. Additionally, the band is shared with many proprietary wireless technologies which target specific application requirements, such as the IEEE 802.11-based industrial WLAN (iWLAN) from Siemens AG, FHSS-based Trusted Wireless from Phoenix Contact and IEEE 802.15.1-based WISA from ABB Group.
Any radio interference can cause packet loss and transmission latency for industrial radio communication systems. Both effects have to be mitigated to meet real-time requirements on the medium. Therefore, the standard IEC 62657-2 [5] for industrial radio communication systems recommends an active coexistence management for reliable medium utilization. It recommends (i) manual, (ii) automatic non-cooperative or (iii) automatic cooperative coexistence management. The first approach is the most inefficient one, due to its time-consuming and complex configuration effort. The automatic approaches (ii) and (iii) enable efficient self-reconfiguration without manual intervention and radio-specific expertise. An automatic cooperative coexistence management (iii) requires a control channel, i.e. a logical common communication connection between all coexisting wireless systems, to enable deterministic medium access. If even a single legacy coexisting wireless system lacks such a connection, the non-cooperative approach (ii) is recommended. Non-cooperative coexistence management approaches become aware of coexisting wireless systems through independent WII and mitigation.
In this paper, we propose the first WII approach based upon deep CNNs. In order to reflect realistic wireless device capabilities, the approach is limited to a sensing bandwidth of 10 MHz, and a sensing snapshot is limited to 128 IQ samples with a duration of 12.8 µs. The evaluation is performed with standardized wireless communication systems based upon IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1, which share the 2.4 GHz ISM band. In total, 19 different variants of modulation types and symbol rates are utilized. Thereby, the WII approach has to distinguish between 15 classes which represent the allocated frequency channel and the wireless technology.
In Section II we present and discuss three approaches related to our work: compressed sensing, the neuro-fuzzy signal classifier (NFSC) and convolutional modulation recognition. In Section III we explain how we generated our data set, and in Section IV we discuss our CNN design. In Section V we evaluate the performance of our CNNs and compare it to the performance of the NFSC. Finally, in Section VI we conclude and suggest future work.
II Related Work
II-A Compressed Sensing
Compressed sensing is utilized for sub-sampled signal reconstruction. In [6, 7] it is also applied for classification of frequency bands into white, gray and black sections. Such separation depends on the spectral power density and shows in which sections of the frequency band a transmission without interference is likely.
This method tries to reduce the run-time of the decision process as much as possible. Therefore, the power spectral density (PSD) is computed from a compressed vector of input samples, trading accuracy for speed. The goal of this method is not to classify exactly which radio signals are present but to find spectral spaces which are sparsely used. For this task a wavelet-based edge detector is used.
In contrast to our approach, compressed sensing promises faster run-times and can make use of samples recorded with a sample rate lower than the Nyquist rate. This means lower hardware requirements. On the other hand, sub-sampling discards information which might be helpful for decision making.
II-B Neuro-Fuzzy Signal Classifier
The NFSC [8] is an expert system which classifies frequency bands with respect to known wireless technologies. The purpose of the NFSC is similar to the proposed CNN. The performance of the two classifiers will be compared later on.
The NFSC is separated into six layers. The input of the first layer, the input layer, consists of the IQ samples of the signal. This layer computes the PSD of the input.
In the second layer, the fuzzification layer, the PSD is normalized to
[TABLE]
with the PSD in dBm.
In the third layer, the filtering layer, the signal is filtered with a predefined filter shape. In the simplest case the filter shape is a rectangle that is greater than the channel bandwidth of the radio signal that has to be classified.
After the filtering layer, the similarity layer computes the similarity of the filtered signal with a predefined reference shape. The radio signal is classified by comparing this similarity to a threshold. Given the reference shape and the fuzzified, filtered signal, the similarity is given by
[TABLE]
Thus it is possible to define many different filter and reference shapes to classify different radio signals at the same time.
The fifth layer, the statistics layer, aggregates consecutive results for temporal evaluation. This information is handed over to the sixth layer, the interference layer, which decides if the radio signal belongs to a frequency-hopping system.
In this paper the NFSC up to the fourth layer is compared to the CNN because the CNN has only a very short measurement period and temporal statistics must be collected by a following processing unit.
The purpose of the NFSC is exactly the same as that of the CNN presented in this paper: to classify radio signals with respect to known standards. The approach, however, is quite different. While the NFSC relies on pre-defined features and a fixed decision process, the CNN learns its feature extraction and decision process during training. On the one hand, this leads to more flexible decision making. On the other hand, it becomes more difficult to analyze the decision-making process.
II-C Convolutional Modulation Recognition
O’Shea et al. [9] compared the performance of different classifiers for classifying 11 modulations: 8 digital and 3 analog. The classifiers used were CNNs with different parameters, deep neural networks with different parameters, a decision tree, a naive Bayes classifier, a k-nearest-neighbor classifier and a support vector machine. For this task a CNN achieved the best results for low signal-to-noise ratios (SNRs). For high SNRs the performance of all classifiers was comparable, except for the naive Bayes classifier and a deep neural network with high regularization.
O’Shea et al. investigated the complex-valued temporal radio signal domain, whereas in this paper the spectral radio signal domain is investigated. Nevertheless, there are many parallels between these domains, and we adapted the best-performing CNN from O’Shea et al. as a starting point for our work. This also shows how flexibly such self-learning classifiers can be used. The network structure of this CNN was inspired by networks for the visual domain, such as those used for the MNIST data set.
For data set generation O’Shea et al. used GNU Radio [10], a toolkit for software-defined radios. The CNN was trained on approximately 96,000 snapshots, each consisting of 128 IQ samples. These IQ samples were given to the CNN as a matrix: one column consisted of the I-samples and the other column of the Q-samples. No information about the link between the I- and Q-samples was given to the network. The snapshots were taken in the time domain. The data set is available at [11].
III Generation of Data Set
The training and validation data was generated with the vector signal generator (VSG) SMBV100A from Rohde & Schwarz and recorded with the real-time spectrum analyzer (RSA) RSA6114A from Tektronix.
The measurement was triggered, so that only data in which a radio signal was transmitted was recorded. For example, the inter-frame gaps of the signals were not recorded. Every packet that was sent by the VSG carried the maximum allowed number of payload bytes. The payload was chosen at random.
For the IEEE 802.11 b/g frames the Physical Layer Mode was varied between CCK, PBCC and OFDM, and every allowed bit rate for each mode was used. For the IEEE 802.15.1 frames the Transport Mode was varied between ACL, eSCO and SCO, and different packet types were used. For the IEEE 802.15.4 frames the ACK frame was used.
In the presented work some restrictions concerning the training and validation data were made:
- Single-label classification
- Flat fading channel model
- Thermal noise reception distortions
A single-label problem is considered, so signals of exactly one class are present in each input sensing snapshot for the CNN. This means no concurrent signals were allowed. Moreover, only one emitter, the VSG, was used for generating the data.
Further, a flat fading channel was utilized because the VSG and RSA were connected via a coaxial cable. These restrictions were made to keep the problem simple for a first prototype.
The third restriction is the assumption of thermal noise reception distortions, modeled as additive white Gaussian noise (AWGN) with SNRs from -20 dB to 20 dB in steps of 2 dB. The noise was added with a SIMULINK [12] model in post-processing.
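The noise step can be illustrated with a minimal NumPy sketch. It is not the SIMULINK model used in this work; the function name `add_awgn`, the placeholder tone, and the choice to measure the signal power over the snapshot itself are assumptions made only for illustration.

```python
import numpy as np

def add_awgn(iq, snr_db, rng):
    """Return iq plus complex white Gaussian noise at the requested SNR (in dB)."""
    # Assumption: the signal power is estimated over the given samples.
    signal_power = np.mean(np.abs(iq) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    # Complex noise: half of the noise power goes into each of I and Q.
    sigma = np.sqrt(noise_power / 2.0)
    noise = sigma * (rng.standard_normal(iq.shape) + 1j * rng.standard_normal(iq.shape))
    return iq + noise

# SNR grid described in the text: -20 dB to 20 dB in steps of 2 dB.
snr_grid_db = np.arange(-20, 21, 2)

rng = np.random.default_rng(0)
# Hypothetical clean snapshot: a unit-amplitude complex tone with 128 samples.
clean = np.exp(2j * np.pi * 0.1 * np.arange(128))
noisy = {int(snr): add_awgn(clean, snr, rng) for snr in snr_grid_db}
```

Each recorded snapshot would be passed through such a function once per SNR value, so the grid above yields 21 noise levels.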
In total 151,200 sensing snapshots were used for training and 74,025 for validation.
IV Convolutional Neural Network Design
The CNN design targets radio signal classification. The radio signals are compliant with IEEE 802.11 b/g, IEEE 802.15.4 and IEEE 802.15.1 packet transmissions. Thereby, the CNN shall classify the allocated frequency channel and the corresponding wireless technology.
In order to reflect realistic wireless device capabilities, the CNN is limited to a sensing bandwidth of 10 MHz. Hence, eight parallel classifiers are required to observe the whole 2.4 GHz ISM band. A technology-specific relative channel number can then be mapped to its absolute channel number with the index of the utilized CNN:
[TABLE]
with technology-specific absolute and relative channel offsets. The offset values and the channel number sets are listed in Tab. I. The limited sensing bandwidth comprises ten, two, and three frequency channels of IEEE 802.15.1, IEEE 802.15.4, and IEEE 802.11 b/g compliant signals, respectively. In total, 10 + 2 + 3 = 15 different classes have to be distinguished.
Fig. 1 illustrates the classes which represent frequency channels of the corresponding wireless technologies when the third CNN, with a center frequency of 2426.5 MHz, is utilized. It is important to mention that the frequency channels of the selected IEEE 802.15.1 and IEEE 802.15.4 compliant signals lie within the sensing bandwidth, while the signals compliant with IEEE 802.11 b/g lie only partly within it.
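The channel counts for the third CNN can be checked with a short sketch. The channel center frequencies follow the standard rasters (IEEE 802.15.1: 2402 MHz + k MHz for k = 0..78; IEEE 802.15.4: 2405 MHz + 5 MHz · (k − 11) for k = 11..26). The helper names and the rule for the center frequency of the other CNN indices are assumptions extrapolated from the single stated value of 2426.5 MHz, and the channel numbering may differ from the one in Tab. I.

```python
SENSING_BANDWIDTH_MHZ = 10.0

def cnn_center_mhz(index):
    """Assumed center frequency of the index-th parallel CNN (index 1..8)."""
    # Anchored on the one value stated in the text: CNN 3 sits at 2426.5 MHz.
    return 2426.5 + SENSING_BANDWIDTH_MHZ * (index - 3)

# Standard channel center frequencies in MHz, keyed by absolute channel number.
ieee_802_15_1 = {k: 2402.0 + k for k in range(79)}                    # 1 MHz raster
ieee_802_15_4 = {k: 2405.0 + 5.0 * (k - 11) for k in range(11, 27)}   # 5 MHz raster

def channels_in_band(centers, index):
    """Absolute channel numbers whose center lies inside the CNN's sensing band."""
    low = cnn_center_mhz(index) - SENSING_BANDWIDTH_MHZ / 2
    high = cnn_center_mhz(index) + SENSING_BANDWIDTH_MHZ / 2
    return [k for k, f in centers.items() if low <= f <= high]

bt = channels_in_band(ieee_802_15_1, 3)   # IEEE 802.15.1 channels seen by CNN 3
zb = channels_in_band(ieee_802_15_4, 3)   # IEEE 802.15.4 channels seen by CNN 3
# Center frequencies that both technologies share inside this band.
shared = sorted({ieee_802_15_1[k] for k in bt} & {ieee_802_15_4[k] for k in zb})
```

IEEE 802.11 b/g is left out on purpose: its channels are only partly in-band, so a center-in-band test would not reproduce the three classes used here.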
The sensing bandwidth is exemplary and other bandwidths are possible. A greater sensing bandwidth with a fixed sensing snapshot duration increases the input data size proportionally. Additionally, the number of observable frequency channels, and therefore of distinguishable classes, increases. Therefore, the CNN requires more neurons in at least the first and last layer, which increases the computational complexity and also the required number of IQ samples per sensing snapshot.
IV-A Frequency-Domain Sensing Snapshots
The sensing snapshot is limited to a duration of 12.8 µs and therefore consists of 128 IQ samples. It has to be longer than the symbol durations of the utilized wireless technologies for a meaningful classification. With the minimal applied symbol duration of 1 µs, a single sensing snapshot contains up to 12.8 symbols.
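These numbers are consistent with each other, as a short worked check shows. It assumes complex baseband sampling at a rate equal to the 10 MHz acquisition bandwidth; the symbols $f_s$, $N$, $T$ and $T_\mathrm{sym}$ are introduced here only for illustration.

```latex
\[
T = \frac{N}{f_s} = \frac{128}{10\,\mathrm{MHz}} = 12.8\,\mu\mathrm{s},
\qquad
\frac{T}{T_\mathrm{sym}} = \frac{12.8\,\mu\mathrm{s}}{1\,\mu\mathrm{s}} = 12.8 .
\]
```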
Danev et al. [13] showed for emission-based wireless device identification that frequency-based features outperform their time-based equivalents. Therefore, the IQ samples are transformed into the frequency domain by a fast Fourier transform (FFT). The resulting snapshot contains 128 complex-valued frequency bins.
IV-B Network Structure
The network structure of the CNN is derived from O’Shea et al. [9] as listed in Tab. II.
The input data is therefore a sensing snapshot with 128 complex-valued frequency bins, whereby the real and imaginary parts are considered as independent floating-point values. The input data thus forms a real-valued matrix with 2 × 128 entries.
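The path from raw IQ samples to the network input can be sketched as follows. This is an illustrative sketch, not the preprocessing code of this work: the function name `snapshot_to_input`, the 2 × 128 row layout, the `fftshift` bin ordering and the `float32` data type are assumptions, and the normalization applied before training is not reproduced.

```python
import numpy as np

def snapshot_to_input(iq):
    """Map 128 complex IQ samples to a real-valued 2 x 128 frequency-domain matrix."""
    iq = np.asarray(iq, dtype=np.complex128)
    if iq.shape != (128,):
        raise ValueError("expected a snapshot of 128 IQ samples")
    # 128 complex frequency bins; fftshift moves the zero-frequency bin to the middle.
    bins = np.fft.fftshift(np.fft.fft(iq))
    # Real and imaginary parts are stored as independent rows.
    return np.stack([bins.real, bins.imag]).astype(np.float32)

rng = np.random.default_rng(0)
snapshot = rng.standard_normal(128) + 1j * rng.standard_normal(128)  # placeholder data
x = snapshot_to_input(snapshot)
```

Keeping the real and imaginary parts in separate rows mirrors the I/Q layout that O’Shea et al. used for time-domain input.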
The output is a vector of length 15, as there are 15 classes to classify. Each entry of the vector has a value between 0 and 1 and describes how likely it is that the input data belongs to the corresponding class in relation to the other classes.
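The output vector described above is what a softmax activation produces. A minimal sketch, with randomly drawn placeholder values standing in for the pre-activation outputs of the last dense layer:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.standard_normal(15)      # hypothetical outputs, one value per class
probabilities = softmax(logits)       # 15 values in (0, 1) that sum to 1
predicted_class = int(np.argmax(probabilities))
```

Because the entries sum to 1, each value expresses the likelihood of one class relative to the other 14.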
IV-C Network Training
For the training process the Adam optimizer [14] was used and the input data was normalized. This optimizer showed the best results in the work of O’Shea et al. The default parameters of the Adam optimizer were used, except for the learning rate. The learning rate for the CNN was . The CNN was trained for epochs. This setup showed the best results among small parameter variations. No hyperparameter optimization was performed, but it could be done in the future to improve the results. A batch size of 1024 was used for training, which was near the limit of the graphics card memory.
IV-D Network Size Reduction
As a rule of thumb it is often assumed that the degrees of freedom of the CNN should be fewer than the number of sensing snapshots for training. In practice it is difficult to apply this rule, e.g. because it is not applicable if the number of sensing snapshots used for training also has to be determined.
As the size of the data set was chosen following the work of O’Shea et al. [9], we reduced the size of the CNN so that its degrees of freedom approximately matched the number of sensing snapshots used for training. The idea of this rule is that the CNN learns to extract and recognize more reliable features in the sensing snapshots and does not try to learn random processes like noise in the data. This should lead to a better generalization of the CNN. The network structure of the reduced CNN is listed in Tab. III.
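The rule of thumb can be made concrete by counting trainable parameters. The sketch below is only an illustration: the layer layout (two unpadded convolutions with 1 × 3 and 2 × 3 kernels followed by two dense layers) and every layer size passed to `count_params` are hypothetical placeholders, not the values of Tab. II or Tab. III.

```python
TRAINING_SNAPSHOTS = 151_200  # number of training snapshots stated in the text

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter of a 2-D convolutional layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, neurons):
    """Weights plus one bias per neuron of a dense layer."""
    return (inputs + 1) * neurons

def count_params(conv1_filters, conv2_filters, dense_neurons, classes=15, width=128):
    """Degrees of freedom of a hypothetical conv-conv-dense-dense network."""
    total = conv2d_params(1, 3, 1, conv1_filters)               # input 2 x 128
    width -= 2                                                  # 2 x 126 after 1 x 3
    total += conv2d_params(2, 3, conv1_filters, conv2_filters)
    width -= 2                                                  # 1 x 124 after 2 x 3
    total += dense_params(conv2_filters * width, dense_neurons)
    total += dense_params(dense_neurons, classes)               # softmax output layer
    return total

large = count_params(64, 16, 128)   # hypothetical "big" configuration
small = count_params(8, 4, 64)      # hypothetical reduced configuration
small_fits_rule = small < TRAINING_SNAPSHOTS
```

Dropout layers add no parameters, so they do not appear in the count. In such a layout the first dense layer dominates, which is why shrinking the second convolution and the dense layer reduces the degrees of freedom most.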
The learning rate of the reduced CNN was and it was trained for epochs. A batch size of 1024 was used.
IV-E Implementation
The CNN was implemented, trained and validated on a high-end computation platform with the central processing unit (CPU) Intel Xeon E5-1660 v3, 16 GBit RAM and the graphics processing unit (GPU) Nvidia GTX 960. The CNN implementation utilizes the abstract modular deep learning library Keras [15] with the computation library Theano [16] as its backend.
Training the CNN on the GPU takes approximately 74 s per epoch and therefore 501 ms per batch. For the reduced CNN the duration drops to approximately 3 s per epoch and therefore 20 ms per batch.
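The per-batch figures follow from the per-epoch durations, the 151,200 training snapshots and the batch size of 1024 given in the conclusion. A small check, assuming that the last, partially filled batch counts as a full batch:

```python
import math

TRAINING_SNAPSHOTS = 151_200   # from the data set section
BATCH_SIZE = 1024              # batch size stated in the conclusion

def ms_per_batch(seconds_per_epoch):
    """Average duration of one batch in milliseconds."""
    batches = math.ceil(TRAINING_SNAPSHOTS / BATCH_SIZE)
    return 1000.0 * seconds_per_epoch / batches

batches_per_epoch = math.ceil(TRAINING_SNAPSHOTS / BATCH_SIZE)
original_ms = ms_per_batch(74.0)   # original CNN, about 74 s per epoch
reduced_ms = ms_per_batch(3.0)     # reduced CNN, about 3 s per epoch
```

With the rounded epoch durations this gives roughly 500 ms and 20 ms per batch, in line with the reported 501 ms and 20 ms.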
V Results
The accuracy of the CNN on the validation data is shown in Fig. 2. The accuracy is worst for the IEEE 802.11 b/g channels. The IEEE 802.11 b/g signals have the largest channel width and the greatest variety of modulations and bit rates. Therefore they are the most complex signals for the investigated classification problem.
The best accuracy is achieved for the different IEEE 802.15.1 channels. The four classes and have a slightly worse accuracy than the other IEEE 802.15.1 classes. The channels at the borders of the observed bandwidth, the classes and , are deformed by the anti-aliasing filter and the channel bandwidth of the class overlaps with the channel bandwidth of the class . The classes and have the same center frequency as the classes and . This causes a slightly worse recognition rate by the CNN because there are fewer features to distinguish between these classes.
The accuracy for all SNRs is clearly better than the accuracy achieved by the CNN for modulation recognition used by O’Shea et al. [9]. There are two possible reasons for this result. The first possible reason is that the training and validation data is too synthetic and does not represent real-world effects well enough. The other possible reason, which seems more likely, is that a mainly frequency-selective classification is a much easier task for a CNN. Another hint pointing to this conclusion is that the CNN could be reduced significantly without a big loss of accuracy.
The significant reduction of the CNN indicates that a frequency-selective classification problem is less complex than the modulation-selective classification investigated by O’Shea et al. [9]. Further support for this assumption is the slightly worse recognition rate of the IEEE 802.15.1 signals which have the same center frequency as the IEEE 802.15.4 signals. If all signals that have to be classified use the same center frequency, as is the case in the work of O’Shea et al., the accuracy will likely get worse.
V-A Comparison of CNN and NFSC
The accuracy of the CNN was compared to the accuracy of the NFSC. For this purpose, rectangles were used as the filter and reference shapes of the NFSC. The filter shape was twice as wide as the reference shape, whose width was the channel bandwidth. Further, a threshold of was used for the similarity. For a fair comparison, the classification of IEEE 802.11 b/g compliant signals was ignored due to the suboptimal performance for out-of-band signal identification.
The performance of the NFSC under these constraints is compared with that of the CNN in Fig. 3. It is important to note that the resulting classification accuracies are averaged over all utilized classes. For the applied validation data, the CNN outperforms the NFSC in terms of classification accuracy independent of the SNR. The CNN shows an average performance gain of at least 5.32 dB and a classification accuracy improvement of at least 8.19 %. However, the data-driven CNN approach is limited to sensing snapshots similar to the training data, while the NFSC can also be utilized for real-world scenarios. Nevertheless, the CNN’s promising results should be investigated further.
V-B Frequency- and Time-Domain Sensing Snapshots
The CNN can be trained with sensing snapshots in either the time or the frequency domain. Hence, the training was also performed with same-sized time-domain equivalents, using the in-phase and quadrature components of the raw 128 IQ samples as input matrix.
The classification accuracy with both frequency- and time-domain sensing snapshots as input is shown in Fig. 4. As expected, the frequency-domain sensing snapshots outperform their time-domain equivalents, because the CNN’s differentiation has to be primarily frequency selective.
V-C Reduced CNN
The reduced CNN has about 99 % fewer degrees of freedom than the big CNN and is therefore much faster. The accuracy of both CNNs on the training and validation data is shown in Fig. 5.
Although the degrees of freedom of the small CNN were reduced significantly, the accuracy on the validation data is similar. While the training accuracy of the reduced CNN drops, especially for low SNR values, the validation performance of both CNNs is comparable. The difference in training performance shows that the original CNN memorizes the sensing snapshot patterns including noise, while the reduced CNN generalizes much better.
VI Conclusion
The steadily growing use of license-free frequency bands requires reliable coexistence management and therefore proper WII.
In this work we propose the first WII approach based upon a deep CNN. The CNN naively learns its features through self-optimization during an extensive data-driven training process. The design of the CNN was derived from the network of O’Shea et al. [9] for the related field of modulation recognition. Further, the network size was reduced by 99 % of the total degrees of freedom. The reduced network showed better generalization during the training phase.
In order to reflect realistic wireless device capabilities, the CNN identifies time- and frequency-limited sensing snapshots with a duration of 12.8 µs and an acquisition bandwidth of 10 MHz. Thereby, it distinguishes between 15 classes, which represent allocated frequency channels of IEEE 802.15.4, IEEE 802.15.1, and partly in-band IEEE 802.11 b/g compliant packet transmissions. In total, 151,200 sensing snapshots were used for training and 74,025 for validation, containing 19 different variants of modulation types and symbol rates.
For implementation, the Python library Keras was utilized in combination with Theano, providing a high level of abstraction. GPU-based training reduces the training duration to approximately 501 ms per batch for the original CNN and 20 ms per batch for the reduced CNN, with a batch size of 1024.
The proposed CNN shows promising results with high classification accuracy for signals with low SNR. On average, the accuracy exceeds 95 % for SNRs greater than -5 dB. The performance drops for wideband signals which are clipped by the limited acquisition bandwidth, such as IEEE 802.11 b/g compliant packet transmissions. Secondly, it also shows minor performance issues with transmission signals sharing the same center frequency, such as the fourth IEEE 802.15.1 and first IEEE 802.15.4 channel. Nevertheless, the CNN outperforms state-of-the-art WII approaches such as the NFSC [8] under similar constraints. Thereby, the CNN approach shows an average performance gain of at least 5.32 dB and a classification accuracy improvement of at least 8.19 %.
Nevertheless, the prototype has to be enhanced and validated in the field to become a viable classifier for coexistence management. One such enhancement is the design of a CNN suitable for multi-label WII of concurrent transmissions in the same frequency range. Secondly, the training data has to be extended and diversified to better represent channel and hardware impairments.
References
[1] D. C. Cireşan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” CoRR, vol. abs/1202.2745, 2012.
[2] A. Krizhevsky, I. Sutskever, and G. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc, 2012, pp. 1097–1105. [Online]. Available: www.papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[3] M. Lin, Q. Chen, and S. Yan, “Network in network,” CoRR, vol. abs/1312.4400, 2013.
[4] Y. LeCun, C. Cortes, and J. Burges, “Mnist handwritten digit database.” [Online]. Available: www.yann.lecun.com/exdb/mnist/
[5] IEC, “Industrial communication networks – wireless communication networks – part 2: Coexistence management,” 2013.
[6] Z. Tian and G. B. Giannakis, “Compressed sensing for wideband cognitive radios,” in 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP ’07, 2007, pp. 1357–1360.
[7] D. Wieruch, P. Jung, T. Wirth, A. Dekorsy, and T. Haustein, “Cognitive radios exploiting gray spaces via compressed sensing,” Frequenz, vol. 70, no. 7-8, pp. 289–300, 2016.
[8] K. Ahmad, G. Shresta, U. Meier, and H. Kwasnicka, “Neuro-fuzzy signal classifier (NFSC) for standard wireless technologies,” in 2010 7th International Symposium on Wireless Communication Systems, Sept 2010, pp. 616–620.
