Automatic Simplification of Common Vulnerabilities and Exposures Descriptions
Varpu Vehomäki, Kimmo K. Kaski

TL;DR
This paper explores the use of large language models to automatically simplify complex cybersecurity vulnerability descriptions, creating a baseline and dataset for future research in this domain.
Contribution
It introduces the first study on applying automatic text simplification to cybersecurity texts, with a new dataset and baseline evaluation using expert feedback.
Findings
LLMs can make CVE descriptions appear simpler
Current models struggle with preserving original meaning
The study provides a new dataset for cybersecurity text simplification
Abstract
Understanding cyber security is increasingly important for individuals and organizations. However, much of the information related to cyber security can be difficult to understand for those not familiar with the topic. In this study, we investigate how large language models (LLMs) could be utilized in automatic text simplification (ATS) of Common Vulnerabilities and Exposures (CVE) descriptions. Automatic text simplification has been studied in several contexts, such as medical, scientific, and news texts, but it has not yet been applied to texts in the rapidly changing and complex domain of cyber security. We created a baseline for cyber security ATS and a test dataset of 40 CVE descriptions, evaluated by two groups of cyber security experts in two survey rounds. We have found that while out-of-the-box LLMs can make the text appear simpler, they struggle with meaning…
Taxonomy
Topics: Text Readability and Simplification · Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection
