A Hybrid Deep Learning and Anomaly Detection Framework for Real-Time Malicious URL Classification

Berkani Khaled; Zeraoulia Rafik

arXiv:2512.03462 · cs.CR · December 4, 2025

TL;DR

This paper introduces a hybrid deep learning framework that combines feature extraction, anomaly detection, and neural classification to achieve fast, accurate, and scalable real-time malicious URL detection with multilingual support.

Contribution

It presents a novel multi-stage pipeline integrating hashing, SMOTE, isolation forest, and neural networks for efficient real-time URL classification, outperforming traditional models in speed and accuracy.

Findings

01 Achieves 96.4% accuracy and 95.4% F1-score.

02 Provides a 20 ms prediction latency.

03 Outperforms CNN and SVM baselines in speed and accuracy.

Abstract

Malicious URLs remain a primary vector for phishing, malware, and cyberthreats. This study proposes a hybrid deep learning framework combining HashingVectorizer n-gram analysis, SMOTE balancing, Isolation Forest anomaly filtering, and a lightweight neural network classifier for real-time URL classification. The multi-stage pipeline processes URLs from open-source repositories with statistical features (length, dot count, entropy), achieving $O(NL + EBdh)$ training complexity and a 20 ms prediction latency. Empirical evaluation yields 96.4% accuracy, 95.4% F1-score, and 97.3% ROC-AUC, outperforming CNN (94.8%) and SVM baselines with a 50× to 100× speedup (Table~\ref{tab:comp-complexity}). A multilingual Tkinter GUI (Arabic/English/French) enables real-time threat assessment with clipboard integration. The framework demonstrates superior scalability and…
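
The feature-extraction stage named in the abstract (statistical features plus hashed n-grams) can be illustrated with a minimal standard-library sketch. This is not the authors' code. The function names, the character trigram size, the 1024-bucket vector, and the use of CRC32 as the hash are all assumptions made for illustration. scikit-learn's HashingVectorizer, which the paper uses, relies on a different hash function and signs its counts by default, so its output will not match this sketch.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # sum of p * log2(1/p) over distinct characters
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def statistical_features(url: str) -> dict:
    """The three statistical features listed in the abstract."""
    return {
        "length": len(url),
        "dot_count": url.count("."),
        "entropy": shannon_entropy(url),
    }


def hashed_char_ngrams(url: str, n: int = 3, n_buckets: int = 1024) -> list:
    """Hashing trick (illustrative): count character n-grams in a fixed-size vector.

    n, n_buckets and CRC32 are assumed values, not taken from the paper.
    """
    vec = [0] * n_buckets
    for i in range(len(url) - n + 1):
        gram = url[i:i + n]
        bucket = zlib.crc32(gram.encode("utf-8")) % n_buckets
        vec[bucket] += 1
    return vec


if __name__ == "__main__":
    url = "http://example.com/login"  # hypothetical input
    print(statistical_features(url))
    print(sum(hashed_char_ngrams(url)), "trigrams hashed")
```

Because the hashed vector has a fixed size regardless of vocabulary, no dictionary of n-grams has to be stored or fitted, which is one plausible reason such a representation suits low-latency prediction.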

Taxonomy

Topics: Spam and Phishing Detection · Cybercrime and Law Enforcement Studies · Misinformation and Its Impacts