The More, the Better? A Study on Collaborative Machine Learning for DGA Detection
Arthur Drichel, Benedikt Holmes, Justus von Brandt, Ulrike Meyer

TL;DR
This study investigates whether collaborative machine learning improves DGA detection. It reports a false positive rate (FPR) reduction of up to 51.7% and discusses privacy concerns, based on three state-of-the-art classifiers evaluated in two real-world scenarios.
Contribution
The paper presents a comprehensive study of collaborative ML for DGA detection, comprising 13,440 evaluation runs over eleven variations of collaborative learning, and highlights the benefits and limitations of the approach in real-world settings.
Findings
Collaborative ML can reduce FPR by up to 51.7%.
Not all approaches and classifiers benefit equally.
Privacy threats are significant in collaborative ML scenarios.
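To make the FPR figure concrete, the following minimal sketch shows how a false positive rate and a relative reduction between two classifiers could be computed. It assumes the reported 51.7% is a relative reduction, which the summary does not state explicitly. The function names and all counts are hypothetical and are not taken from the paper.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN): the share of benign domains wrongly flagged as DGA."""
    benign_total = false_positives + true_negatives
    if benign_total == 0:
        raise ValueError("need at least one benign sample")
    return false_positives / benign_total


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional drop from the baseline value to the improved value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Hypothetical counts for illustration only (not results from the paper).
baseline_fpr = false_positive_rate(false_positives=100, true_negatives=9_900)
collab_fpr = false_positive_rate(false_positives=50, true_negatives=9_950)
reduction = relative_reduction(baseline_fpr, collab_fpr)

print(f"baseline FPR: {baseline_fpr:.4f}")
print(f"collaborative FPR: {collab_fpr:.4f}")
print(f"relative reduction: {reduction:.1%}")
```

Under this reading, a large relative reduction can still correspond to a small absolute change when the baseline FPR is already low.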
Abstract
Domain generation algorithms (DGAs) prevent the connection between a botnet and its master from being blocked by generating a large number of domain names. Promising single-data-source approaches have been proposed for separating benign from DGA-generated domains. Collaborative machine learning (ML) can be used to enhance a classifier's detection rate, reduce its false positive rate (FPR), and improve the classifier's generalization capability to different networks. In this paper, we complement the research area of DGA detection by conducting a comprehensive collaborative learning study, including a total of 13,440 evaluation runs. In two real-world scenarios we evaluate a total of eleven different variations of collaborative learning using three different state-of-the-art classifiers. We show that collaborative ML can lead to a reduction in FPR by up to 51.7%. However,…
