On the Scalability of Big Data Cyber Security Analytics Systems
Faheem Ullah, Muhammad Ali Babar

TL;DR
This paper investigates the scalability challenges of Spark-based Big Data Cyber Security Analytics systems, identifies key configuration parameters affecting performance, and proposes an adaptive approach, SCALER, to optimize scalability based on experimental results.
Contribution
It provides the first detailed analysis of Spark configuration impacts on BDCA system scalability and introduces SCALER, a parameter-driven adaptation method to enhance scalability.
Findings
Default Spark settings cause significant scalability deviation.
Most Spark parameters significantly influence scalability.
SCALER improves scalability by over 20%.
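The findings above concern Spark configuration parameters. As a minimal sketch of what tuning such parameters looks like in practice, the snippet below turns a dictionary of Spark properties into `--conf` arguments for `spark-submit`. The property names are standard Spark settings, but the values, the `to_submit_args` helper, and the choice of parameters are assumptions made for illustration; they are not taken from the paper and do not represent SCALER.

```python
# Illustrative Spark properties. The values are placeholders chosen for
# this sketch, not settings reported or recommended by the paper.
spark_conf = {
    "spark.executor.memory": "4g",      # heap size per executor
    "spark.executor.cores": "2",        # cores per executor
    "spark.memory.fraction": "0.6",     # share of heap for execution and storage
    "spark.default.parallelism": "16",  # default number of RDD partitions
}


def to_submit_args(conf):
    """Turn a dict of Spark properties into spark-submit `--conf` arguments.

    Hypothetical helper: keys are emitted in sorted order so the result
    is deterministic.
    """
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


if __name__ == "__main__":
    # Print the flags that would be appended to a spark-submit command line.
    print(" ".join(to_submit_args(spark_conf)))
```

Keeping the settings in one dictionary makes it straightforward to vary a single parameter between runs and compare how the system scales under each configuration.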
Abstract
Big Data Cyber Security Analytics (BDCA) systems use big data technologies (e.g., Apache Spark) to collect, store, and analyze a large volume of security event data for detecting cyber-attacks. The volume of digital data in general, and of security event data in particular, is increasing exponentially. The velocity with which security event data is generated and fed into a BDCA system is unpredictable. Therefore, a BDCA system should be highly scalable to deal with unpredictable increases and decreases in the velocity of security event data. However, there has been little effort to investigate the scalability of BDCA systems in order to identify and exploit sources of scalability improvement. In this paper, we first investigate the scalability of a Spark-based BDCA system with default Spark settings. We then identify Spark configuration parameters (e.g., execution memory) that can significantly…
Taxonomy
Topics: Software System Performance and Reliability · Cloud Computing and Resource Management · Network Security and Intrusion Detection
