Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models
Gabriel Aracena, Kyle Luster, Fabio Santos, Igor Steinmacher, Marco A. Gerosa

TL;DR
This paper explores the use of large language models, particularly GPT-4o, for automated issue classification in software engineering, demonstrating high accuracy with limited data and outperforming other models like DeepSeek R1.
Contribution
The study introduces an LLM-based approach for issue classification, comparing multiple models and datasets, highlighting GPT-4o's superior performance and reduced data dependency.
Findings
Among the models compared, GPT-4o achieved the highest classification accuracy.
Fine-tuned GPT-4o reached an F1 score of 80.7%.
Increasing dataset size did not improve F1 scores.
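For readers unfamiliar with the metric behind these findings, the sketch below shows one common way to compute an F1 score for multi-class issue classification. It is a minimal illustration, not the authors' evaluation code: the label names (`bug`, `feature`, `question`), the sample data, and the choice of macro averaging are all assumptions, since this summary does not state which averaging the paper uses.

```python
from collections import Counter


def f1_per_label(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 (macro averaging is an assumption here)."""
    scores = f1_per_label(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical issue labels, for illustration only.
y_true = ["bug", "bug", "feature", "question", "feature", "bug"]
y_pred = ["bug", "feature", "feature", "question", "bug", "bug"]

print(f1_per_label(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy data, `bug` and `feature` are each confused once with the other, so their F1 scores fall below that of `question`, which is always predicted correctly. A reported figure such as 80.7% would correspond to a value of 0.807 on this scale.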
Abstract
Effective prioritization of issue reports in software engineering helps to optimize resource allocation and information recovery. However, manual issue classification is laborious and lacks scalability. As an alternative, many open source software (OSS) projects employ automated processes for this task, yet this method often relies on large datasets for adequate training. Traditionally, machine learning techniques have been used for issue classification. More recently, large language models (LLMs) have emerged as powerful tools for addressing a range of software engineering challenges, including code and test generation, mapping new requirements to legacy software endpoints, and conducting code reviews. The following research investigates an automated approach to issue classification based on LLMs. By leveraging the capabilities of such models, we aim to develop a robust system for…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Computational and Text Analysis Methods
