Can We Generate Shellcodes via Natural Language? An Empirical Study

Pietro Liguori; Erfan Al-Hossami; Domenico Cotroneo; Roberto Natella; Bojan Cukic; Samira Shaikh

arXiv:2202.03755 · cs.SE · March 9, 2022

TL;DR

This paper explores the feasibility of automatically generating shellcodes from natural language descriptions using neural machine translation, demonstrating high accuracy in an empirical study with a novel dataset.

Contribution

The paper introduces a neural machine translation approach to generating shellcodes from natural language, along with a new dataset (Shellcode_IA32) and new metrics for evaluating the generated code.

Findings

01. NMT can generate shellcodes with high accuracy
02. Many shellcodes are generated with no errors
03. Proposes novel evaluation metrics for this task

Abstract

Writing software exploits is an important practice for offensive security analysts to investigate and prevent attacks. In particular, writing shellcodes is especially time-consuming and technically challenging, as they are written in assembly language. In this work, we address the task of automatically generating shellcodes, starting purely from descriptions in natural language, by proposing an approach based on Neural Machine Translation (NMT). We then present an empirical study using a novel dataset (Shellcode_IA32), which consists of 3,200 assembly code snippets of real Linux/x86 shellcodes from public databases, annotated using natural language. Moreover, we propose novel metrics to evaluate the accuracy of NMT at generating shellcodes. The empirical analysis shows that NMT can generate assembly code snippets from natural language with high accuracy and that in many cases it can generate…
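To make the setup in the abstract concrete, the sketch below pairs natural-language descriptions with assembly snippets and scores generated snippets by exact match after light normalization. It is only an illustration under stated assumptions: the three pairs are hand-written examples and are not taken from Shellcode_IA32, the names `Pair`, `normalize`, and `exact_match_accuracy` are hypothetical, and exact match is a generic baseline measure, not one of the novel metrics the paper proposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    intent: str   # natural-language description of one step
    snippet: str  # reference assembly for that step


def normalize(asm: str) -> str:
    """Lowercase, space out commas, and collapse whitespace.

    Simplification: this would also alter string literals inside assembly,
    which a real evaluation would need to treat more carefully.
    """
    return " ".join(asm.lower().replace(",", " , ").split())


def exact_match_accuracy(references: list[str], predictions: list[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(r) == normalize(p) for r, p in zip(references, predictions))
    return hits / len(references)


# Hand-written illustrative pairs (assumption: NOT entries from Shellcode_IA32).
PAIRS = [
    Pair("clear the eax register", "xor eax, eax"),
    Pair("push the contents of ebx onto the stack", "push ebx"),
    Pair("make the system call", "int 0x80"),
]

# Hypothetical model outputs: the second one names the wrong register.
predictions = ["xor eax,eax", "push ecx", "int 0x80"]

accuracy = exact_match_accuracy([p.snippet for p in PAIRS], predictions)
print(f"exact-match accuracy: {accuracy:.2f}")
```

Here the first prediction differs from its reference only in spacing, so normalization counts it as correct, while the second is a genuine error. A stricter or looser notion of correctness (for example, checking whether the snippet assembles) would need different tooling.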

Code & Models

Repositories

dessertlab/Shellcode_IA32 (official)
