Data-Driven Characterization and Detection of COVID-19 Themed Malicious Websites
Mir Mehedi Ahsan Pritom, Kristin M. Schweitzer, Raymond M. Bateman, Min Xu, Shouhuai Xu

TL;DR
This paper presents a data-driven approach to characterize and detect COVID-19 themed malicious websites, revealing attacker strategies and achieving high detection accuracy using machine learning.
Contribution
It introduces a novel dataset and features for detecting COVID-19 related malicious websites, demonstrating the effectiveness of Random Forest classifiers.
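As a rough illustration of what name-based features for such a detector might look like, the sketch below derives a few simple lexical features from a domain name. The keyword list, the function name `lexical_features`, and the specific features chosen are assumptions for illustration only; they are not the feature set defined in the paper.

```python
# Hypothetical keyword list; the paper's actual keyword set is not given here.
COVID_KEYWORDS = ("covid", "corona", "pandemic", "vaccine")


def lexical_features(domain: str) -> dict:
    """Return a few illustrative lexical features for a domain name."""
    name = domain.lower().strip(".")
    labels = name.split(".")
    return {
        "length": len(name),                              # total characters
        "num_labels": len(labels),                        # dot-separated parts
        "num_digits": sum(ch.isdigit() for ch in name),   # digit count
        "num_hyphens": name.count("-"),                   # hyphen count
        "tld": labels[-1],                                # top-level domain
        "has_covid_keyword": any(kw in name for kw in COVID_KEYWORDS),
    }
```

A feature dictionary like this could be turned into a numeric vector and passed to a classifier such as a Random Forest; how the authors actually encode their features is not stated in this summary.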
Findings
Attackers craft geolocation-targeted COVID-19 websites.
Random Forest achieves 98% accuracy in detection.
The false-positive rate is 2.7%.
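Accuracy and false-positive rate are computed over different denominators, which is why both are reported. The minimal sketch below shows the standard definitions in terms of confusion-matrix counts; the counts used in the usage note are made up for illustration and are not the paper's results.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all websites (malicious and benign) classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign websites that were wrongly flagged as malicious."""
    return fp / (fp + tn)
```

For example, with hypothetical counts of 480 true positives, 490 true negatives, 10 false positives, and 20 false negatives, `accuracy(480, 490, 10, 20)` gives 0.97 while `false_positive_rate(10, 490)` gives 0.02.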
Abstract
COVID-19 has hit hard on the global community, and organizations are working diligently to cope with the new norm of "work from home". However, the volume of remote work is unprecedented and creates opportunities for cyber attackers to penetrate home computers. Attackers have been leveraging websites with COVID-19 related names, dubbed COVID-19 themed malicious websites. These websites mostly contain false information, fake forms, fraudulent payments, scams, or malicious payloads to steal sensitive information or infect victims' computers. In this paper, we present a data-driven study on characterizing and detecting COVID-19 themed malicious websites. Our characterization study shows that attackers are agile and are deceptively crafty in designing geolocation targeted websites, often leveraging popular domain registrars and top-level domains. Our detection study shows that the Random…
