How to Configure Masked Event Anomaly Detection on Software Logs?
Jesse Nyyssölä, Mika Mäntylä, Martín Varela

TL;DR
This paper establishes a baseline configuration for masked event anomaly detection in software logs, comparing NLP and deep learning models across multiple datasets to guide future research.
Contribution
It provides a systematic analysis of parameter settings and model performance, offering a generalizable baseline for masked event prediction in log anomaly detection.
Findings
DL models maintain performance with larger window sizes.
N-Gram model performance declines with larger windows on some datasets.
Predicting events away from sequence edges yields better results.
Abstract
Anomaly detection in software logs with masked event prediction admits various technical approaches, each with numerous configurations and parameters. Our objective is to provide a baseline of settings for similar studies in the future. We use three models: the N-Gram model, a classic approach in natural language processing (NLP), and two deep learning (DL) models, long short-term memory (LSTM) and convolutional neural network (CNN). We use four datasets: Profilence, BlueGene/L (BGL), Hadoop Distributed File System (HDFS), and Hadoop. The other settings are the size of the sliding window, which determines how many surrounding events are used to predict a given event; the mask position (the position within the window that is predicted); the use of only unique sequences; and the portion of data used for training. The results show clear indications of…
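To make the window size, mask position, and unique-sequence settings concrete, here is a minimal Python sketch of masked event prediction using a simple count-based predictor. It is not the authors' implementation: the function names (masked_windows, train, is_anomalous), the top_k threshold, and the toy event names are all assumptions made for illustration, and the counting model only stands in for the N-Gram, LSTM, and CNN models the paper compares.

```python
from collections import Counter, defaultdict


def masked_windows(events, window_size, mask_pos):
    """Slide a window over a list of events and yield (context, target) pairs.

    The event at index mask_pos inside each window is the masked target;
    the remaining events in the window form the context.
    """
    for start in range(len(events) - window_size + 1):
        window = events[start:start + window_size]
        context = tuple(window[:mask_pos] + window[mask_pos + 1:])
        yield context, window[mask_pos]


def train(events, window_size, mask_pos, unique_only=False):
    """Count which target events were seen for each context (hypothetical helper)."""
    pairs = list(masked_windows(events, window_size, mask_pos))
    if unique_only:
        # Keep one copy of each distinct window, preserving first-seen order.
        pairs = list(dict.fromkeys(pairs))
    counts = defaultdict(Counter)
    for context, target in pairs:
        counts[context][target] += 1
    return counts


def is_anomalous(counts, context, actual, top_k=1):
    """Flag the event if it is not among the top_k predictions for its context."""
    seen = counts.get(context, Counter())
    predicted = [event for event, _ in seen.most_common(top_k)]
    return actual not in predicted


# Toy log: a repeating normal pattern (assumed data, not from the paper).
normal_log = ["open", "read", "write", "close"] * 3
model = train(normal_log, window_size=3, mask_pos=1, unique_only=True)

# The middle event of each test window is masked and compared to the prediction.
normal_flag = is_anomalous(model, ("open", "write"), "read")
odd_flag = is_anomalous(model, ("open", "write"), "delete")
```

With window_size=3 and mask_pos=1, the masked event sits in the middle of the window, so the prediction draws on one event of context on each side; setting mask_pos to 0 or window_size - 1 would instead predict an event at a sequence edge.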
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Engineering Techniques and Practices
