Deep Learning based Frameworks for Handling Imbalance in DGA, Email, and URL Data Analysis
Simran K, Prathiksha Balakrishna, Vinayakumar Ravi, Soman KP

TL;DR
This paper introduces cost-sensitive deep learning frameworks for handling class imbalance in three cybersecurity tasks (DGA, email, and URL analysis) and reports that they outperform their cost-insensitive counterparts in every experiment.
Contribution
It proposes novel cost-sensitive deep learning frameworks and evaluates their effectiveness across multiple cybersecurity use cases, highlighting their advantages over cost-insensitive approaches.
Findings
Cost-sensitive methods outperform cost-insensitive ones in all experiments.
Parameters for both the cost-sensitive and the cost-insensitive methods were set through hyperparameter tuning.
Cost-sensitive learning gives more importance to minority classes during training.
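As a rough illustration of what "cost-sensitive" can mean in practice, the sketch below weights each sample's cross-entropy loss by the inverse frequency of its class. This is a common heuristic, not necessarily the cost scheme used in the paper, which this summary does not specify. The function names `class_weights` and `weighted_cross_entropy` and the toy 9:1 dataset are assumptions made for illustration.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed heuristic for illustration; rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalised by the total weight."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


# Toy imbalanced data: 9 benign samples (class 0), 1 malicious sample (class 1).
labels = [0] * 9 + [1]
weights = class_weights(labels)

# A model that leans towards the majority class on every sample.
probs = [{0: 0.9, 1: 0.1} for _ in labels]

plain_loss = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
cost_sensitive_loss = weighted_cross_entropy(probs, labels, weights)
```

On this toy data the majority-biased model gets a noticeably higher loss under the weighted objective than under the unweighted one, because its single minority-class mistake counts for more. That extra penalty is what pushes training away from ignoring the rare class.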
Abstract
Deep learning is a state-of-the-art method for many applications. The main issue is that most real-time data is highly imbalanced. To avoid bias in training, a cost-sensitive approach can be used. In this paper, we propose cost-sensitive deep learning based frameworks and evaluate their performance on three Cyber Security use cases: Domain Generation Algorithm (DGA), Electronic mail (Email), and Uniform Resource Locator (URL). Various experiments were performed using cost-insensitive as well as cost-sensitive methods, and the parameters for both were set based on hyperparameter tuning. In all experiments, the cost-sensitive deep learning methods performed better than the cost-insensitive approaches. This is mainly because the cost-sensitive approach gives importance to the classes which have a very…
Taxonomy
Topics: Network Security and Intrusion Detection · Spam and Phishing Detection · Internet Traffic Analysis and Secure E-voting
