Identifying Malicious Web Domains Using Machine Learning Techniques with Online Credibility and Performance Data

Zhongyi Hu; Raymond Chiong; Ilung Pranata; Willy Susilo; Yukun Bao

arXiv:1902.08792·cs.CR·February 26, 2019

TL;DR

This paper explores the effectiveness of machine learning techniques combined with online domain data to accurately identify malicious web domains, demonstrating improved performance with feature selection methods.

Contribution

It introduces the use of online domain popularity and performance data as features for machine learning classifiers, and applies binary particle swarm optimisation (BPSO) for feature selection to enhance detection accuracy.

Findings

01

Machine learning techniques can effectively identify malicious domains using freely available online data about domain popularity and performance.

02

BPSO-based feature selection improves the performance of single classifiers.

03

Ensemble classifiers outperform single classifiers in detection accuracy.

Abstract

Malicious web domains represent a big threat to web users' privacy and security. With so much freely available data on the Internet about web domains' popularity and performance, this study investigated the performance of well-known machine learning techniques used in conjunction with this type of online data to identify malicious web domains. Two datasets consisting of malware and phishing domains were collected to build and evaluate the machine learning classifiers. Five single classifiers and four ensemble classifiers were applied to distinguish malicious domains from benign ones. In addition, a binary particle swarm optimisation (BPSO) based feature selection method was used to improve the performance of single classifiers. Experimental results show that, based on the web domains' popularity and performance data features, the examined machine learning techniques can accurately…
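To make the BPSO-based feature selection step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard binary PSO formulation (real-valued velocities passed through a sigmoid to give the probability of a bit being 1), and every name, parameter value, and the synthetic data are hypothetical. A simple nearest-centroid classifier stands in for the single classifiers mentioned in the abstract, and the fitness (holdout accuracy minus a small penalty per selected feature) is likewise an assumption.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=12, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO: return (best_mask, best_score) maximising fitness(mask)."""
    rng = random.Random(seed)
    # Each particle is a 0/1 feature mask with a real-valued velocity per bit.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # Sigmoid transfer: velocity gives the probability of bit = 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def make_toy_data(n=200, n_features=6, seed=1):
    """Synthetic stand-in data: only the first two features are informative."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label if d < 2 else 0.0, 1.0)
                  for d in range(n_features)])
        y.append(label)
    return X, y


def centroid_accuracy(mask, X_train, y_train, X_test, y_test):
    """Holdout accuracy of a nearest-centroid classifier on selected columns."""
    cols = [d for d, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[d] for r in rows) / len(rows) for d in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {label: sum((x[d] - c) ** 2 for d, c in zip(cols, cen))
                for label, cen in centroids.items()}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_test)


X, y = make_toy_data()
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]


def fitness(mask):
    # Assumed objective: accuracy minus a small penalty per selected feature.
    acc = centroid_accuracy(mask, X_train, y_train, X_test, y_test)
    return acc - 0.01 * sum(mask)


best_mask, best_score = bpso_select(fitness, n_features=6)
print("selected features:", [d for d, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_score, 3))
```

In a real experiment the fitness would typically wrap cross-validated performance of the classifier being tuned, computed on the domain popularity and performance features, rather than a single holdout split on synthetic data.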
