Heterogeneous Graph Neural Networks for Software Effort Estimation
Hung Phan, Ali Jannesari

TL;DR
This paper introduces HeteroSP, a heterogeneous graph neural network-based tool for estimating software story points from textual issues, outperforming existing models in accuracy and efficiency across various scenarios.
Contribution
HeteroSP is the first to model software issues as heterogeneous graphs for story point estimation, integrating text normalization, graph conversion, and GNN learning.
Findings
HeteroSP achieves lower MAE than baselines in multiple scenarios.
Heterogeneous GNNs outperform homogeneous models in this task.
The approach is significantly faster than existing methods.
Abstract
Software effort can be measured in story points [35]. Current approaches for automatically estimating story points apply pre-trained embedding models and deep learning for text regression, which requires expensive embedding models. We propose HeteroSP, a tool for estimating story points from the textual input of Agile software project issues. We select GPT2SP [12] and Deep-SE [8] as the baselines for comparison. First, from an analysis of the story point dataset [8], we conclude that software issues are a mixture of natural language sentences and quoted code snippets, and that they suffer from a large vocabulary. Second, we provide a module to normalize the input text, including the words and code tokens of the software issues. Third, we design an algorithm to convert an input software issue to a graph with different types of nodes and edges.…
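The idea of turning an issue into a graph with several node and edge types can be illustrated with a minimal sketch. The excerpt above does not specify the paper's normalization rules or its graph construction algorithm, so everything here is an assumption made for illustration: the backtick convention for quoted code, the two node types ("word" and "code"), the single "next" relation between adjacent tokens, and the function names tokenize and build_hetero_graph. It is not HeteroSP's implementation, and it omits the embedding and GNN learning steps.

```python
import re
from collections import defaultdict

# Assumption: quoted code snippets are wrapped in backticks.
CODE_SPAN = re.compile(r"`([^`]*)`")
# Assumption: a token is an identifier-like run of characters.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(issue_text):
    """Return (token, node_type) pairs in reading order (hypothetical helper)."""
    tokens = []
    pos = 0
    for match in CODE_SPAN.finditer(issue_text):
        # Text before the code span is treated as natural language
        # and lowercased as a simple form of normalization.
        for tok in TOKEN.findall(issue_text[pos:match.start()]):
            tokens.append((tok.lower(), "word"))
        # Code tokens keep their case, since identifiers are case-sensitive.
        for tok in TOKEN.findall(match.group(1)):
            tokens.append((tok, "code"))
        pos = match.end()
    # Remaining natural-language text after the last code span.
    for tok in TOKEN.findall(issue_text[pos:]):
        tokens.append((tok.lower(), "word"))
    return tokens


def build_hetero_graph(issue_text):
    """Build typed node sets and typed edge sets from adjacent tokens."""
    tokens = tokenize(issue_text)
    nodes = defaultdict(set)  # node_type -> set of tokens
    edges = defaultdict(set)  # (src_type, relation, dst_type) -> set of (src, dst)
    for tok, node_type in tokens:
        nodes[node_type].add(tok)
    # Connect each token to its successor; the edge type is determined
    # by the node types at both ends.
    for (src, src_type), (dst, dst_type) in zip(tokens, tokens[1:]):
        edges[(src_type, "next", dst_type)].add((src, dst))
    return dict(nodes), dict(edges)


if __name__ == "__main__":
    nodes, edges = build_hetero_graph("Fix crash when `parseConfig(path)` gets null")
    print(sorted(nodes["code"]))
    print(sorted(edges))
```

Keying edges by a (source type, relation, target type) triple mirrors how heterogeneous graph libraries commonly organize typed edges, which is why the sketch uses that layout even though it relies only on the standard library.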
Taxonomy
Methods: Graph Neural Network · fastText
