User Reviews as a Source for Usability Requirements: A Precursor Study on Using Large Language Models
Cedric Wellhausen, Laura Reinhardt, Kurt Schneider

TL;DR
This study investigates whether large language models can effectively analyze user reviews for usability requirements, offering a quick, cost-effective alternative to human analysis with promising initial results.
Contribution
It introduces a dataset of user reviews labeled by humans and LLMs, develops a prompt for usability detection, and evaluates LLM performance compared to human raters.
Findings
LLMs can recognize usability as a non-functional requirement in reviews.
Performance of LLMs depends heavily on prompt quality.
A dataset of 300 labeled user reviews is provided.
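One common way to compare LLM labels with human labels on such a dataset is a chance-corrected agreement score such as Cohen's kappa. The summary above does not state which metric the authors used, so the sketch below is only an illustration under that assumption. The function name `cohens_kappa` and the toy label lists are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each rater labeled at random with their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label everywhere; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy data (made up): 1 = review mentions usability, 0 = it does not.
human_labels = [1, 1, 0, 0, 1, 0, 1, 0]
llm_labels = [1, 1, 0, 1, 1, 0, 0, 0]
print(cohens_kappa(human_labels, llm_labels))
```

A value near 1 would indicate that the LLM agrees with the human rater far more often than chance alone would predict, while a value near 0 would indicate agreement no better than chance.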
Abstract
User-centered approaches to requirements engineering generally lead to products that are better suited to end users. LLM4RE provides promising approaches to support the requirements elicitation process (e.g., classification of requirements). Previous approaches focus on machine learning (ML) or deep learning (DL), which require intensive training with a large amount of manually labeled data. LLMs, on the other hand, are pre-trained on large amounts of user-generated text data, enabling a user-centric workflow to analyze requirements. In this paper, we explore the possibility of exploiting the improved natural language understanding of LLMs, rather than strict ML classification, together with the mass extraction of user reviews, to analyze whether the performance of LLMs in understanding user reviews is comparable to the performance of human raters. This enables a quick…
