A Machine Learning Approach for Hierarchical Classification of Software Requirements
Manal Binkhonain, Liping Zhao

TL;DR
This paper introduces HC4RC, a novel hierarchical machine learning method that effectively addresses class imbalance and high-dimensional, low-sample-size issues in software requirements classification, outperforming traditional and deep learning approaches.
Contribution
The paper presents HC4RC, a new hierarchical ML approach that combines semantic feature selection and dataset decomposition to improve classification under challenging data conditions.
Findings
HC4RC effectively handles class imbalance and high-dimensional, low-sample-size (HDLSS) data.
HC4RC outperforms traditional statistical and deep learning models.
The approach is simple to implement and practical for real-world requirements engineering (RE) tasks.
Abstract
Context: Classification of software requirements into different categories is a critically important task in requirements engineering (RE). Developing machine learning (ML) approaches for requirements classification has attracted great interest in the RE community since the 2000s. Objective: This paper aims to address two related problems that have been challenging real-world applications of ML approaches: the problems of class imbalance and high dimensionality with low sample size data (HDLSS). These problems can greatly degrade the classification performance of ML methods. Method: The paper proposes HC4RC, a novel ML approach for multiclass classification of requirements. HC4RC solves the aforementioned problems through semantic-role-based feature selection, dataset decomposition and hierarchical classification. We experimentally compare the effectiveness of HC4RC with three closely…
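To make the idea of dataset decomposition plus hierarchical classification more concrete, here is a minimal sketch in Python. It is not the authors' implementation of HC4RC. The class labels (`F`, `SE`, `PE`, `US`), the two-level `HIERARCHY` grouping, the toy `WordFrequencyClassifier`, and the training sentences are all assumptions made for illustration, and the semantic-role-based feature selection step is not modelled at all. The sketch only shows the general pattern: a top-level classifier chooses a group of classes, then a classifier trained on that group's smaller subset chooses the final class.

```python
from collections import Counter, defaultdict

# Hypothetical two-level grouping of requirement classes (not from the paper).
HIERARCHY = {"functional": ["F"], "quality": ["SE", "PE", "US"]}


def tokens(text):
    """Lowercase whitespace tokenisation; a stand-in for real feature extraction."""
    return text.lower().split()


class WordFrequencyClassifier:
    """Toy classifier: scores a text by each label's normalised word frequencies."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokens(text))
        return self

    def predict(self, text):
        def score(label):
            counter = self.counts[label]
            total = sum(counter.values())
            # Dividing by the label's total word count keeps large classes
            # from winning purely because they have more training text.
            return sum(counter[t] / total for t in tokens(text))

        return max(sorted(self.counts), key=score)


def fit_hierarchy(texts, labels, hierarchy):
    """Train one top-level classifier and one classifier per multi-class group."""
    parent_of = {leaf: parent for parent, leaves in hierarchy.items() for leaf in leaves}
    # Level 1: relabel every requirement with its parent group.
    root = WordFrequencyClassifier().fit(texts, [parent_of[l] for l in labels])
    # Level 2: decompose the dataset into one smaller subset per group.
    children = {}
    for parent, leaves in hierarchy.items():
        if len(leaves) > 1:
            subset = [(t, l) for t, l in zip(texts, labels) if parent_of[l] == parent]
            sub_texts, sub_labels = zip(*subset)  # assumes every group has examples
            children[parent] = WordFrequencyClassifier().fit(sub_texts, sub_labels)
    return root, children


def predict_hierarchy(model, hierarchy, text):
    """Route a requirement through the top level, then its group's classifier."""
    root, children = model
    parent = root.predict(text)
    if parent in children:
        return children[parent].predict(text)
    return hierarchy[parent][0]  # single-class group: nothing left to decide


# Invented example requirements, for illustration only.
TRAIN = [
    ("the system shall export reports as pdf", "F"),
    ("the system shall allow users to create accounts", "F"),
    ("passwords must be encrypted", "SE"),
    ("pages must load within two seconds", "PE"),
    ("the interface must be easy to learn", "US"),
]

texts, labels = zip(*TRAIN)
model = fit_hierarchy(texts, labels, HIERARCHY)
security_label = predict_hierarchy(model, HIERARCHY, "all stored passwords must be encrypted")
functional_label = predict_hierarchy(model, HIERARCHY, "the system shall export invoices")
```

In this sketch each second-level classifier only sees the examples of its own group, so it faces fewer classes and a less skewed label distribution than a single flat multiclass model would. How HC4RC itself builds its hierarchy and selects features is described in the paper, not here.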
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Imbalanced Data Classification Techniques
