Towards Extracting Software Requirements from App Reviews using Seq2seq Framework
Aakash Sorathiya, Gouri Ginde

TL;DR
This paper introduces a Seq2seq-based framework for extracting software requirements from app reviews. It targets the informal language and irrelevant information that hamper existing methods, outperforming them on the larger of two evaluation datasets and matching them on the smaller one.
Contribution
The paper reformulates requirements extraction as a Named Entity Recognition (NER) task and addresses it with a Seq2seq model (BiLSTM encoder, LSTM decoder) enhanced with self-attention, GloVe embeddings, and a CRF model, improving accuracy on large review datasets.
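To make the NER reformulation concrete, here is a minimal sketch of how a requirement phrase inside a review could be encoded as per-token tags and recovered again. The BIO tagging scheme, the `REQ` label name, the helper functions, and the sample review are all assumptions made for illustration; the summary does not state which tagging scheme or label set the authors use.

```python
from typing import List, Tuple

# Hypothetical helpers: the BIO scheme and the "REQ" label are assumptions,
# not taken from the paper.

def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int]],
                 label: str = "REQ") -> List[str]:
    """Turn (start, end) token spans (end exclusive) into BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Recover (start, end) spans from a BIO tag sequence."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:      # close a span that directly precedes
                spans.append((start, i))
            start = i
        elif tag.startswith("I-") and start is not None:
            continue                   # still inside the current span
        else:
            if start is not None:      # an "O" (or stray "I-") ends the span
                spans.append((start, i))
            start = None
    if start is not None:              # span running to the end of the review
        spans.append((start, len(tags)))
    return spans

# Made-up informal review: only "add dark mode" is a requirement.
tokens = "pls add dark mode the app is gr8 otherwise".split()
tags = spans_to_bio(tokens, [(1, 4)])
extracted = [" ".join(tokens[s:e]) for s, e in bio_to_spans(tags)]
```

Under this framing, a sequence model only has to predict one tag per token, and everything tagged `O` (greetings, praise, off-topic remarks) is discarded as irrelevant.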
Findings
Outperformed state-of-the-art methods with an F1 score of 0.96 on the large dataset.
Achieved performance comparable to existing methods on the smaller dataset.
Effectively handled informal language and irrelevant information in reviews.
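For readers unfamiliar with the metric, the sketch below shows one common way an F1 score is computed for extraction tasks: exact-match comparison of predicted and gold spans. Whether the paper scores at the span level or the token level is not stated in this summary, so the matching rule, the function name, and the sample spans are assumptions.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive

def span_f1(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over extracted spans."""
    gold_set, pred_set = set(gold), set(pred)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up example: one of three predictions matches one of two gold spans.
gold = [(1, 4), (6, 9)]
pred = [(1, 4), (6, 8), (10, 12)]
precision, recall, f1 = span_f1(gold, pred)
```

Here precision is 1/3 and recall is 1/2, which gives an F1 of 0.4. Exact matching is strict: the prediction `(6, 8)` earns no credit even though it overlaps the gold span `(6, 9)`.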
Abstract
Mobile app reviews are a large-scale data source for software improvements. A key task in this context is effectively extracting requirements from app reviews to analyze the users' needs and support the software's evolution. Recent studies show that existing methods fail at this task since app reviews usually contain informal language, grammatical and spelling errors, and a large amount of irrelevant information that might not have direct practical value for developers. To address this, we propose a novel reformulation of requirements extraction as a Named Entity Recognition (NER) task based on the sequence-to-sequence (Seq2seq) generation approach. With this aim, we propose a Seq2seq framework, incorporating a BiLSTM encoder and an LSTM decoder, enhanced with a self-attention mechanism, GloVe embeddings, and a CRF model. We evaluated our framework on two datasets: a manually annotated…
