Semantic Similarity-Based Clustering of Findings From Security Testing Tools
Phillip Schneider, Markus Voggenreiter, Abdullah Gulraiz, Florian Matthes

TL;DR
This paper explores using NLP techniques to cluster semantically similar security findings from testing tools, aiming to reduce manual effort and improve duplicate detection in security reports within DevOps practices.
Contribution
It introduces a novel web application for report annotation, provides a human-annotated corpus, and compares various semantic similarity methods for clustering security findings.
Findings
Semantic similarity techniques can effectively group duplicate security findings.
The annotated corpus supports future research in automated security report analysis.
Quantitative and qualitative evaluations demonstrate promising clustering performance.
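As a rough illustration of how duplicate findings might be grouped by textual similarity, the sketch below vectorizes finding descriptions with TF-IDF, compares them by cosine similarity, and merges any pair above a threshold into the same cluster. This is not the authors' method: the summary only says that several semantic similarity techniques were compared, so TF-IDF is a simple stand-in here. The function names, the sample findings, and the 0.5 threshold are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF so that terms present in every document keep a weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_findings(findings, threshold=0.5):
    """Group findings whose pairwise similarity reaches the threshold.

    Uses union-find, so clusters are the connected components of the
    "similar enough" graph. The threshold is an illustrative assumption.
    """
    vectors = tfidf_vectors(findings)
    parent = list(range(len(findings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(findings)):
        for j in range(i + 1, len(findings)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(findings)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical finding descriptions, not taken from the paper's corpus.
    sample = [
        "SQL injection in login form parameter username",
        "SQL injection vulnerability in login form via username parameter",
        "Outdated OpenSSL library version detected",
    ]
    for cluster in cluster_findings(sample):
        print([sample[i] for i in cluster])
```

A lexical measure like this only catches duplicates that share vocabulary. Findings that describe the same issue in different words would need an embedding-based similarity, which could be swapped in for `tfidf_vectors` and `cosine` without changing the clustering step.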
Abstract
Over the last years, software development in domains with high security demands transitioned from traditional methodologies to uniting modern approaches from software development and operations (DevOps). Key principles of DevOps gained more importance and are now applied to security aspects of software development, resulting in the automation of security-enhancing activities. In particular, it is common practice to use automated security testing tools that generate reports after inspecting a software artifact from multiple perspectives. However, this raises the challenge of generating duplicate security findings. To identify these duplicate findings manually, a security expert has to invest resources like time, effort, and knowledge. A partial automation of this process could reduce the analysis effort, encourage DevOps principles, and diminish the chance of human error. In this study,…
Taxonomy
Topics: Software Engineering Research · Information and Cyber Security · Software Engineering Techniques and Practices
