BERTector: An Intrusion Detection Framework Constructed via Joint-dataset Learning Based on Language Model
Haoyang Hu, Xun Huang, Chenyu Wu, Shiwen Liu, Zhichao Lian, Shuangquan Zhang

TL;DR
BERTector is a novel intrusion detection framework that leverages joint-dataset learning and BERT to achieve high accuracy, robustness, and generalizability across diverse network traffic scenarios.
Contribution
It introduces a new IDS framework combining semantic tokenization, hybrid dataset fine-tuning, and low-rank adaptation for improved detection performance.
Findings
Achieves its highest accuracy, 99.28%, on the NSL-KDD dataset.
Reaches an average detection success rate of 80% against four perturbations.
Demonstrates strong generalizability and robustness.
Abstract
Intrusion detection systems (IDS) are widely used to maintain the stability of network environments, but their generalizability remains limited by the heterogeneity of network traffic. In this work, we propose BERTector, a new joint-dataset learning framework for IDS based on BERT. BERTector integrates three key components: NSS-Tokenizer for traffic-aware semantic tokenization, supervised fine-tuning with a hybrid dataset, and low-rank adaptation for efficient fine-tuning. Experiments show that BERTector achieves state-of-the-art detection accuracy, strong generalizability, and excellent robustness. Its highest accuracy is 99.28% on NSL-KDD, and it reaches an average detection success rate of 80% against four perturbations. These results establish a unified and efficient solution for modern IDS in complex and dynamic network environments.
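To make the "low-rank adaptation for efficient fine-tuning" component concrete, here is a minimal sketch of the general low-rank adaptation idea: the pretrained weight matrix W stays frozen, and only two small matrices A and B are trained, so that the layer output is W x + (alpha / r) B (A x). This illustrates the standard technique, not the authors' implementation. The function names, the 768-dimensional layer size, the rank r = 8, and the scaling factor alpha are all assumptions chosen for illustration; the abstract does not state them.

```python
# Illustrative sketch of low-rank adaptation (LoRA) in plain Python.
# All names and hyperparameters here are assumptions, not taken from the paper.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora): trainable parameters when updating the whole
    d_out x d_in matrix versus only the low-rank factors B and A."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return full, lora

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, rank):
    """Compute h = W x + (alpha / rank) * B (A x), where W is frozen
    and only A and B would be updated during fine-tuning."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

# Parameter savings for one assumed 768 x 768 projection with rank 8.
full, lora = lora_param_counts(768, 768, 8)

# Tiny forward pass: B starts at zero, so the adapted layer initially
# reproduces the frozen layer exactly.
W = [[1.0, 2.0], [3.0, 4.0]]
A = [[0.5, -0.5]]           # rank 1: shape 1 x 2
B_zero = [[0.0], [0.0]]     # shape 2 x 1
x = [1.0, 1.0]
h_init = lora_forward(W, A, B_zero, x, alpha=2.0, rank=1)
```

Under these assumed sizes, the low-rank factors hold 12,288 trainable parameters against 589,824 for the full matrix, roughly 2% of the original. Initializing B to zero is the usual convention because it leaves the pretrained model's behavior unchanged at the start of fine-tuning.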
Taxonomy
Topics: Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
