Anomaly Detection for Network Connection Logs
Swapneel Mehta, Prasanth Kothuri, Daniel Lanza Garcia

TL;DR
This paper presents a streaming architecture built on ELK, Spark, and Hadoop that analyzes network connection logs in near real time. It detects anomalies with unsupervised learning, visualizes the resulting outliers, and is designed to scale to large infrastructures.
Contribution
It introduces a novel approach for evaluating untagged, unfiltered connection logs with unsupervised learning, one intended to scale to extensive network infrastructures.
Findings
Effective anomaly detection in large-scale logs
Visualization aids in understanding outliers
Scalable system for real-time log analysis
Abstract
We leverage a streaming architecture based on ELK, Spark and Hadoop to collect, store, and analyse database connection logs in near real-time. The proposed system investigates outliers using unsupervised learning, applying widely adopted clustering and classification algorithms to log data and highlighting the subtle variances between models by visualising the outliers. Having arrived at a novel solution for evaluating untagged, unfiltered connection logs, we propose an approach that can be extended to a generalised system for analysing connection logs across a large infrastructure comprising thousands of individual nodes and generating hundreds of log lines per second.
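The abstract does not name the specific algorithms or log features used, so the following is only a minimal sketch of unsupervised outlier scoring on untagged connection records, not the authors' method. It swaps in a simple k-nearest-neighbour distance score as a stand-in technique. The client names, the two features (connections per minute, distinct hosts contacted), and the threshold factor are all hypothetical.

```python
import math

# Hypothetical per-client features: (connections per minute, distinct hosts contacted).
# These names and values are illustrative assumptions, not data from the paper.
records = {
    "app01": (12.0, 2.0),
    "app02": (11.0, 3.0),
    "app03": (13.0, 2.0),
    "app04": (10.0, 2.0),
    "batch01": (14.0, 3.0),
    "unknown17": (95.0, 40.0),
}


def knn_scores(data, k=2):
    """Score each record by its mean Euclidean distance to its k nearest neighbours."""
    scores = {}
    for name, point in data.items():
        dists = sorted(
            math.dist(point, other) for n, other in data.items() if n != name
        )
        scores[name] = sum(dists[:k]) / k
    return scores


def flag_outliers(scores, factor=3.0):
    """Flag records whose score exceeds `factor` times the (upper) median score."""
    ordered = sorted(scores.values())
    median = ordered[len(ordered) // 2]
    return sorted(name for name, s in scores.items() if s > factor * median)


scores = knn_scores(records)
outliers = flag_outliers(scores)
print(outliers)
```

No labels are needed: a record is flagged purely because it sits far from its neighbours in feature space. In a streaming setting of the kind the abstract describes, the same scoring would presumably be applied per time window to features aggregated from incoming log lines.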
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Software System Performance and Reliability
