HTTP2vec: Embedding of HTTP Requests for Detection of Anomalous Traffic
Mateusz Gniewkowski, Henryk Maciejewski, Tomasz R. Surmacz, Wiktor Walentynowicz

TL;DR
This paper introduces HTTP2vec, an unsupervised embedding approach using RoBERTa to detect anomalous HTTP traffic, improving interpretability and performance over existing methods.
Contribution
The paper applies a novel NLP-based embedding technique with RoBERTa for HTTP anomaly detection, emphasizing interpretability and real-world applicability.
Findings
Comparable or better detection accuracy than existing methods
Effective clustering of HTTP requests in embedding space
Model trained solely on legitimate traffic for anomaly detection
Abstract
Hypertext transfer protocol (HTTP) is one of the most widely used protocols on the Internet. As a consequence, most attacks (e.g., SQL injection, XSS) use HTTP as the transport mechanism. It is therefore crucial to develop an intelligent solution that can effectively detect and filter out anomalies in HTTP traffic. Currently, most anomaly detection systems are either rule-based or trained using manually selected features. We propose utilizing a modern unsupervised language representation model to embed HTTP requests and then using the embeddings to classify anomalies in the traffic. The solution is motivated by methods used in Natural Language Processing (NLP), such as Doc2Vec, which could potentially capture the true understanding of HTTP messages and therefore improve the efficiency of an Intrusion Detection System. In our work, we not only aim at generating a suitable embedding…
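The workflow described above (embed each request, learn what legitimate traffic looks like, flag requests that fall far from it) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a RoBERTa language model, whereas the sketch swaps in L2-normalized character trigram counts as a stand-in embedding so that it runs with the standard library alone. The function names, the sample requests, the centroid-distance score, and the `margin` parameter are all assumptions made for illustration.

```python
import math
from collections import Counter

def embed(request, n=3):
    """Stand-in embedding (NOT RoBERTa): L2-normalized character n-gram counts."""
    counts = Counter(request[i:i + n] for i in range(len(request) - n + 1))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {gram: c / norm for gram, c in counts.items()}

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def fit(legit_requests, margin=2.0):
    """Learn a centroid and threshold from legitimate traffic only.

    The threshold is `margin` times the largest training distance;
    `margin` is a hypothetical tuning knob, not a value from the paper.
    """
    vectors = [embed(r) for r in legit_requests]
    centroid = {}
    for vec in vectors:
        for gram, value in vec.items():
            centroid[gram] = centroid.get(gram, 0.0) + value / len(vectors)
    threshold = margin * max(cosine_distance(v, centroid) for v in vectors)
    return centroid, threshold

def anomaly_score(request, centroid):
    """Distance from the centroid of legitimate traffic; higher is more anomalous."""
    return cosine_distance(embed(request), centroid)

# Hypothetical legitimate requests used as the only training data.
LEGIT = [
    "GET /shop/item?id=12 HTTP/1.1",
    "GET /shop/item?id=47 HTTP/1.1",
    "GET /shop/item?id=301 HTTP/1.1",
    "GET /shop/item?id=9 HTTP/1.1",
    "GET /shop/item?id=256 HTTP/1.1",
]
centroid, threshold = fit(LEGIT)

for req in [
    "GET /shop/item?id=88 HTTP/1.1",
    "GET /shop/item?id=12' UNION SELECT password FROM users-- HTTP/1.1",
]:
    score = anomaly_score(req, centroid)
    print(f"{score:.3f} flagged={score > threshold} {req}")
```

Replacing `embed` with vectors produced by a trained language model would keep the rest of the sketch unchanged; the key property it shares with the paper's setting is that no attack samples are needed at training time.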
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Spam and Phishing Detection
Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Layer Normalization · WordPiece · Dropout · Dense Connections · Adam
