Automated Identification of Security-Relevant Configuration Settings Using NLP

Patrick Stöckle; Theresa Wasserer; Bernd Grobauer; Alexander Pretschner

arXiv:2209.08853·cs.CR·August 3, 2023

TL;DR

This paper explores using NLP techniques to classify security-relevant configuration settings, aiming to assist security experts in identifying critical settings more efficiently.

Contribution

It introduces a machine learning approach with labeled datasets and code to aid security experts, though it does not fully replace human judgment.

Findings

01 Classifiers assist but do not replace experts

02 Labeled datasets and code are publicly available

03 Further research needed to improve accuracy

Abstract

To secure computer infrastructure, we need to configure all security-relevant settings. We need security experts to identify security-relevant settings, but this process is time-consuming and expensive. Our proposed solution uses state-of-the-art natural language processing to classify settings as security-relevant based on their description. Our evaluation shows that our trained classifiers do not perform well enough to replace the human security experts but can help them classify the settings. By publishing our labeled data sets and the code of our trained model, we want to help security experts analyze configuration settings and enable further research in this area.
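To make the idea of classifying a setting from its description concrete, here is a minimal sketch in Python. It is not the authors' model: the abstract mentions state-of-the-art NLP, whereas this example swaps in a tiny bag-of-words Naive Bayes classifier built on the standard library alone. The training descriptions, the `TRAIN` list, and the `tokenize`, `train`, and `predict` helpers are all hypothetical and are not drawn from the published data sets or code.

```python
import math
import re
from collections import Counter

# Hypothetical toy examples; NOT taken from the paper's labeled data sets.
# Each entry is (setting description, is_security_relevant).
TRAIN = [
    ("Enforce password history for user accounts", True),
    ("Require encryption for remote desktop connections", True),
    ("Disable anonymous access to network shares", True),
    ("Set the desktop wallpaper image", False),
    ("Choose the default font size for menus", False),
    ("Show the weather widget on the taskbar", False),
]


def tokenize(text):
    """Lowercase the description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per class and documents per class."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return True if the description scores higher as security-relevant."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        # Log prior for the class.
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # Laplace-smoothed log likelihood of the token.
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return scores[True] > scores[False]


model = train(TRAIN)
print(predict(model, "Require password for remote access"))
print(predict(model, "Set the default wallpaper"))
```

In line with the abstract's conclusion, a classifier like this would at most pre-sort settings for a security expert to review; it would not stand in for the expert's decision.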

Code & Models

Repositories

tum-i4/automated-identification-of-security-relevant-configuration-settings-using-nlp
Official
