Multi-Label Requirements Classification with Large Taxonomies
Waleed Abdeen, Michael Unterkalmsteiner, Krzysztof Wnuk, Alexandros Chirtoglou, Christoph Schimanski, Heja Goli

TL;DR
This paper explores zero-shot learning for multi-label requirements classification with large taxonomies, evaluating classifier types, hierarchy effects, and taxonomy structure on performance, aiming to improve requirements traceability.
Contribution
It systematically investigates zero-shot learning for large taxonomy multi-label classification in requirements engineering, including a ground truth creation process and analysis of classification strategies.
Findings
Sentence-based classifiers have higher recall than word-based classifiers.
Hierarchical classification does not always improve performance.
Taxonomy size negatively correlates with recall in hierarchical classifiers.
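To make the setting behind these findings concrete, here is a minimal sketch of zero-shot multi-label classification: each taxonomy class is scored against the requirement text without any training, the best-scoring classes are kept as labels, and recall is measured against expert-assigned labels. Everything in it is an assumption for illustration only: the tiny taxonomy, the class descriptions, the example requirement, the `top_k` cutoff, and the bag-of-words cosine similarity, which merely stands in for whatever word-based or sentence-based representation a real classifier would use. It is not the classifier evaluated in the paper.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def zero_shot_labels(requirement, taxonomy, top_k=2):
    """Rank taxonomy classes by similarity; keep the top_k with a non-zero score."""
    req_vec = bag_of_words(requirement)
    scored = [(cosine(req_vec, bag_of_words(desc)), label)
              for label, desc in taxonomy.items()]
    scored.sort(reverse=True)
    return [label for score, label in scored[:top_k] if score > 0]


def recall(predicted, expected):
    """Share of the expected labels that appear among the predicted labels."""
    return len(set(predicted) & set(expected)) / len(expected) if expected else 0.0


# Hypothetical three-class taxonomy; real taxonomies in this setting have
# hundreds of classes, which is what makes supervised training costly.
taxonomy = {
    "door": "door opening in a wall",
    "fire_safety": "fire resistance and fire protection",
    "ventilation": "air ventilation system",
}
requirement = "The door shall provide fire resistance for 30 minutes"

predicted = zero_shot_labels(requirement, taxonomy, top_k=2)
print(predicted, recall(predicted, ["door", "fire_safety"]))
```

With more classes competing for the same `top_k` slots, a relevant class is more easily pushed out of the ranking, which gives one intuition for why taxonomy size can work against recall.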
Abstract
Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification. Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-shot learning to evaluate the feasibility of multi-label requirements classification with large taxonomies. We associated, together with domain experts from the industry, 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based…
Taxonomy
Topics: Text and Document Classification Technologies
