Using Language Models for Enhancing the Completeness of Natural-language Requirements
Dipeeka Luitel, Shabnam Hassani, Mehrdad Sabetzadeh

TL;DR
This paper explores using BERT, a language model, to detect missing information in natural-language requirements by predicting omitted content. It focuses on choosing how many predictions to generate per masked word and on filtering noise out of those predictions.
Contribution
It determines how many predictions per mask BERT should generate and introduces a machine learning filter that removes noise from those predictions, improving the accuracy of missing-content detection in requirements.
Findings
BERT effectively predicts missing terminology in requirements.
The filter significantly reduces noise in BERT's predictions.
The number of predictions per mask trades off discovering missing terminology against adding noise.
Abstract
[Context and motivation] Incompleteness in natural-language requirements is a challenging problem. [Question/problem] A common technique for detecting incompleteness in requirements is checking the requirements against external sources. With the emergence of language models such as BERT, an interesting question is whether language models are useful external sources for finding potential incompleteness in requirements. [Principal ideas/results] We mask words in requirements and have BERT's masked language model (MLM) generate contextualized predictions for filling the masked slots. We simulate incompleteness by withholding content from requirements and measure BERT's ability to predict terminology that is present in the withheld content but absent in the content disclosed to BERT. [Contribution] BERT can be configured to generate multiple predictions per mask. Our first contribution is…
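The evaluation idea in the abstract (withhold part of the requirements, mask words in the disclosed part, and check which predicted terms appear only in the withheld part) can be sketched as below. This is a minimal illustration, not the authors' implementation. The names `tokenize`, `masked_variants`, `evaluate` and `toy_predict` are hypothetical, every word is masked in turn as a simplifying assumption, and `toy_predict` is a fixed stand-in for BERT's masked language model so that the example runs without downloading a model.

```python
import re
from typing import Callable, Dict, Iterator, List, Set

MASK = "[MASK]"  # BERT's mask token; used here only as a placeholder string


def tokenize(text: str) -> List[str]:
    """Lowercase a sentence and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def masked_variants(sentence: str) -> Iterator[str]:
    """Yield one copy of the sentence per word, with that word masked.
    Masking every word is an assumption made for this sketch."""
    words = sentence.split()
    for i in range(len(words)):
        yield " ".join(words[:i] + [MASK] + words[i + 1:])


def evaluate(disclosed: List[str], withheld: List[str],
             predict: Callable[[str, int], List[str]],
             k: int) -> Dict[str, Set[str]]:
    """Simulate incompleteness: predictions are made on disclosed text only,
    then compared with terms that occur only in the withheld text."""
    disclosed_vocab = {t for s in disclosed for t in tokenize(s)}
    withheld_vocab = {t for s in withheld for t in tokenize(s)}
    target = withheld_vocab - disclosed_vocab  # terms the model should surface

    predicted: Set[str] = set()
    for sentence in disclosed:
        for masked in masked_variants(sentence):
            predicted.update(w.lower() for w in predict(masked, k))

    novel = predicted - disclosed_vocab  # predictions not already disclosed
    return {"target": target,
            "hits": novel & target,      # withheld terminology recovered
            "noise": novel - target}     # novel predictions found nowhere


def toy_predict(masked_sentence: str, k: int) -> List[str]:
    """Hypothetical stand-in for a masked language model: returns the first k
    entries of a fixed ranked list, ignoring the sentence."""
    ranked = ["password", "encrypt", "banana", "user"]
    return ranked[:k]


disclosed = ["The system shall authenticate the user"]
withheld = ["The system shall encrypt the password"]
for k in (1, 2, 3):
    result = evaluate(disclosed, withheld, toy_predict, k)
    print(k, sorted(result["hits"]), sorted(result["noise"]))
```

In this toy setup, raising `k` from 1 to 2 recovers a second withheld term, while raising it to 3 adds only a noise term. That is the trade-off that the choice of predictions per mask has to balance. A real predictor could be substituted for `toy_predict`, for example a fill-mask model that returns its top `k` candidates for the masked slot.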
Taxonomy
Topics: Software Engineering Research · Natural Language Processing Techniques · Topic Modeling
