RoBERTa-Augmented Synthesis for Detecting Malicious API Requests

Udi Aharon, Revital Marbel, Ran Dubin, Amit Dvir, and Chen Hajaj

arXiv:2405.11258 · cs.CR · May 16, 2025

TL;DR

This paper introduces a RoBERTa-based data synthesis framework that augments limited API traffic datasets with realistic, domain-aware synthetic requests, raising the F1 score of malicious API request detection models by up to 4.94% on CSIC 2010 and up to 21.10% on ATRDF 2023.

Contribution

The paper presents a novel GAN-inspired, Transformer-based data augmentation method tailored to API security that improves detection performance on the CSIC 2010 and ATRDF 2023 benchmark datasets.

Findings

01. Up to 4.94% increase in F1 score on CSIC 2010

02. Up to 21.10% increase in F1 score on ATRDF 2023

03. Improved detection robustness with synthetic data augmentation

Abstract

Web applications and APIs face constant threats from malicious actors seeking to exploit vulnerabilities for illicit gains. To defend against these threats, it is essential to have anomaly detection systems that can identify a variety of malicious behaviors. However, a significant challenge in this area is the limited availability of training data. Existing datasets often do not provide sufficient coverage of the diverse API structures, parameter formats, and usage patterns encountered in real-world scenarios. As a result, models trained on these datasets often struggle to generalize and may fail to detect less common or emerging attack vectors. To enhance detection accuracy and robustness, it is crucial to access larger and more representative datasets that capture the true variability of API traffic. To address this, we introduce a GAN-inspired learning framework that extends limited…

Taxonomy

Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques · Web Application Security Vulnerabilities