Fed-urlBERT: Client-side Lightweight Federated Transformers for URL Threat Analysis
Yujie Li, Yanbin Wang, Haitao Xu, Zhenhao Guo, Fan Zhang, Ruitong Liu, Wenrui Ma

TL;DR
Fed-urlBERT introduces a lightweight federated transformer model for URL threat detection that preserves privacy, reduces computational and bandwidth costs, and maintains high performance across diverse data scenarios.
Contribution
The paper presents Fed-urlBERT, a novel split learning-based federated transformer model tailored for URL threat analysis, balancing privacy, efficiency, and accuracy.
Findings
Achieves comparable performance to centralized models in IID and non-IID scenarios.
Reduces false positive rate by approximately 7% compared to centralized models.
Demonstrates effective mitigation of client heterogeneity through adaptive local aggregation.
Abstract
In evolving cyber landscapes, the detection of malicious URLs calls for cooperation and knowledge sharing across domains. However, collaboration is often hindered by concerns over privacy and business sensitivities. Federated learning addresses these issues by enabling multi-client collaboration without direct data exchange. Unfortunately, if highly expressive Transformer models are used, clients may face intolerable computational burdens, and the exchange of weights could quickly deplete network bandwidth. In this paper, we propose Fed-urlBERT, a federated URL pre-trained model designed to address both privacy concerns and the need for cross-domain collaboration in cybersecurity. Fed-urlBERT leverages split learning to divide the pre-training model into client and server parts, so that the client part requires fewer computational resources and less bandwidth. Our approach achieves…
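The split-learning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 12-layer stack, the cut point of 2, and the helper names `make_scale_layer`, `split_model`, and `run` are all hypothetical, and each "layer" is a toy scaling function standing in for a Transformer block. The sketch only shows the forward pass, where the client runs its few layers locally and forwards intermediate activations, not raw URLs, to the server.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def make_scale_layer(factor: float) -> Layer:
    """Toy stand-in for one Transformer block (assumption): scales each element."""
    def layer(x: Vector) -> Vector:
        return [factor * v for v in x]
    return layer


def split_model(layers: List[Layer], cut: int) -> Tuple[List[Layer], List[Layer]]:
    """Split a layer stack at `cut`: the first `cut` layers stay on the client."""
    if not 0 < cut < len(layers):
        raise ValueError("cut must leave at least one layer on each side")
    return layers[:cut], layers[cut:]


def run(layers: List[Layer], x: Vector) -> Vector:
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


# Hypothetical 12-block model; the client keeps only the first 2 blocks.
layers = [make_scale_layer(1.0 + 0.1 * i) for i in range(12)]
client_part, server_part = split_model(layers, cut=2)

features = [1.0, 2.0, 3.0]               # stand-in for an embedded URL
activations = run(client_part, features)  # computed locally on the client
output = run(server_part, activations)    # computed on the server
```

In a real split-learning setup, gradients would also flow back from the server to the client across the same cut during training; that step is omitted here. The point of the design is that the client holds and updates only a small share of the layers, which is what keeps its compute and communication load low.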
Taxonomy
Topics: Spam and Phishing Detection · Network Security and Intrusion Detection · HIV, Drug Use, Sexual Risk
