Generating Privacy Stories From Software Documentation
Wilder Baldwin, Shashank Chintakuntla, Shreyah Parajuli, Ali Pourghasemi, Ryan Shanz, Sepideh Ghanavati

TL;DR
This paper presents a novel method that uses large language models to extract privacy behaviors from software documents and generate privacy user stories, supporting privacy requirements development throughout the software lifecycle.
Contribution
It introduces a new approach combining chain-of-thought prompting, in-context learning, and LLMs to extract privacy behaviors and generate privacy requirements from software documentation.
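To make the combination of in-context learning and chain-of-thought prompting concrete, here is a minimal sketch of how such a prompt might be assembled. It is not the authors' prompt: the `Example` schema, the `build_prompt` function, the instruction wording, and the sample texts are all assumptions made for illustration, and the call to an actual LLM is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One worked demonstration for in-context learning (hypothetical schema)."""
    document: str   # excerpt of software documentation
    reasoning: str  # step-by-step reasoning shown to the model (chain of thought)
    story: str      # privacy user story derived from that reasoning


def build_prompt(examples: List[Example], document: str) -> str:
    """Assemble a few-shot, chain-of-thought prompt as plain text."""
    parts = [
        "Identify the privacy behaviors in the document, reasoning step by step, "
        "then write a privacy user story."
    ]
    # In-context learning: each demonstration shows input, reasoning, and output.
    for i, ex in enumerate(examples, start=1):
        parts.append(
            f"Example {i}\nDocument: {ex.document}\n"
            f"Reasoning: {ex.reasoning}\nUser story: {ex.story}"
        )
    # The target document ends with an open "Reasoning:" cue so the model
    # writes its reasoning before it writes the story.
    parts.append(f"Document: {document}\nReasoning:")
    return "\n\n".join(parts)


# Illustrative demonstration; the text below is invented, not from the paper.
demo = Example(
    document="The app reads the device location to show nearby stores.",
    reasoning="The app collects location data. The purpose is a store finder.",
    story="As a user, I want my location used only to find nearby stores, "
          "so that my movements are not tracked for other purposes.",
)
prompt = build_prompt([demo], "The service uploads the contact list to suggest friends.")
```

The resulting string would be sent to whichever model is under evaluation; ending on the reasoning cue is one common way to elicit intermediate steps before the final answer.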
Findings
Commonly used LLMs such as GPT-4o and Llama 3 identify privacy behaviors and generate privacy user stories with F1 scores exceeding 0.8.
Model performance can be enhanced through parameter tuning.
The approach aids in integrating privacy considerations early in software development.
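For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from raw counts; the function name and the counts in the usage line are hypothetical and are not results from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 given true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 behaviors correctly identified, 2 spurious, 1 missed.
print(round(f1_score(tp=8, fp=2, fn=1), 3))  # 0.842
```

With these made-up counts, precision is 0.8 and recall is about 0.889, so an F1 above 0.8 requires both to be reasonably high at once.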
Abstract
Research shows that analysts and developers consider privacy as a security concept or as an afterthought, which may lead to non-compliance and violation of users' privacy. Most current approaches, however, focus on extracting legal requirements from the regulations and evaluating the compliance of software and processes with them. In this paper, we develop a novel approach based on chain-of-thought (CoT) prompting, in-context learning (ICL), and Large Language Models (LLMs) to extract privacy behaviors from various software documents prior to and during software development, and then generate privacy requirements in the format of user stories. Our results show that the most commonly used LLMs, such as GPT-4o and Llama 3, can identify privacy behaviors and generate privacy user stories with F1 scores exceeding 0.8. We also show that the performance of these models could be improved through…
Taxonomy
Topics: Privacy, Security, and Data Protection · Information and Cyber Security · Software Engineering Research
