Assessment of the Relative Importance of different hyper-parameters of LSTM for an IDS
Mohit Sewak, Sanjay K. Sahay, Hemant Rathore

TL;DR
This paper investigates how hyper-parameters of LSTM networks affect malware detection accuracy using op-code sequences, highlighting the importance of tuning for machine-language data and comparing LSTM with non-sequential models.
Contribution
It identifies the most critical hyper-parameters for LSTM in malware detection and demonstrates the necessity of tuning these parameters for effective performance on op-code sequences.
Findings
Performance is highly sensitive to the number of hidden layers.
Input sequence length significantly impacts detection accuracy.
Activation function choice affects model effectiveness.
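To make the three findings above concrete, here is a minimal sketch of how such a sensitivity study could be organised. The search space, its values, and the helper names (`enumerate_configs`, `one_at_a_time`) are assumptions chosen for illustration; this summary does not state the grid or procedure the authors used.

```python
from itertools import product

# Hypothetical search space: the names mirror the findings above,
# but the candidate values are illustrative and not taken from the paper.
SEARCH_SPACE = {
    "hidden_layers": [1, 2, 3],
    "sequence_length": [50, 100, 200],
    "activation": ["tanh", "sigmoid"],
}

# Hypothetical baseline configuration to vary from.
BASELINE = {"hidden_layers": 1, "sequence_length": 100, "activation": "tanh"}


def enumerate_configs(space):
    """Yield one dict per combination of hyper-parameter values (full grid)."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def one_at_a_time(space, baseline):
    """Yield configs that differ from the baseline in exactly one hyper-parameter.

    Comparing each of these against the baseline isolates the effect of a
    single hyper-parameter, which is one simple way to rank their importance.
    """
    for name, values in space.items():
        for value in values:
            if value != baseline[name]:
                yield {**baseline, name: value}


grid = list(enumerate_configs(SEARCH_SPACE))
ablations = list(one_at_a_time(SEARCH_SPACE, BASELINE))
print(len(grid), len(ablations))
```

Each configuration would then be trained and scored by whatever model and metric the study uses; that step is deliberately left out because it depends on details not given here.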
Abstract
Recurrent deep learning language models such as the LSTM are often used to provide advanced cyber-defense for high-value assets. The underlying assumption for using LSTM networks for malware detection is that the op-code sequence of malware can be treated as a (spoken) language representation. However, a spoken language (a sequence of words and sentences) differs from a machine language (a sequence of op-codes). In this paper, we demonstrate that, because of these inherent differences, an LSTM model with its default configuration, as tuned for a spoken language, may not work well for detecting malware from its op-code sequence unless the network's essential hyper-parameters are tuned appropriately. In the process, we also determine the relative importance of all the different hyper-parameters of an LSTM network as applied to malware detection using their op-code sequence…
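As a rough illustration of treating an op-code sequence like a sentence, the sketch below maps op-codes to integer IDs and then truncates or pads each sample to a fixed input length, the same sequence-length hyper-parameter discussed in the findings. The op-code samples, the helper names (`build_vocab`, `encode`), and the padding scheme are assumptions; the paper's actual preprocessing is not described in this summary.

```python
PAD_ID = 0  # Reserved ID used to pad short sequences (an assumed convention).


def build_vocab(sequences):
    """Assign each distinct op-code an integer ID, starting at 1."""
    vocab = {}
    for sequence in sequences:
        for opcode in sequence:
            if opcode not in vocab:
                vocab[opcode] = len(vocab) + 1
    return vocab


def encode(sequence, vocab, sequence_length):
    """Convert op-codes to IDs, then truncate or right-pad to sequence_length.

    Op-codes missing from the vocabulary share one extra 'unknown' ID.
    """
    unknown_id = len(vocab) + 1
    ids = [vocab.get(opcode, unknown_id) for opcode in sequence[:sequence_length]]
    return ids + [PAD_ID] * (sequence_length - len(ids))


# Made-up op-code sequences, for illustration only.
samples = [
    ["push", "mov", "call", "pop", "ret"],
    ["mov", "xor", "jmp"],
]

vocab = build_vocab(samples)
encoded = [encode(sample, vocab, sequence_length=4) for sample in samples]
print(encoded)
```

With a sequence length of 4, the first sample loses its final op-code and the second gains one padding ID, which shows how this single setting changes what the network actually sees.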
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
