Evaluating Language Model Applications for Identifying Solution-Related Content in Issue Report Discussions
Antu Saha, Mehedi Sun, and Oscar Chaparro

TL;DR
This paper evaluates automated methods that use language models to identify solution-related content in issue discussions, which can reduce the manual effort of reading long discussions during software maintenance.
Contribution
It compares three applications of language models (embeddings, prompting, and fine-tuning) for classifying solution-related content in issue reports, and demonstrates their effectiveness.
Findings
Fine-tuned LLMs achieve the highest F1 score, 0.716.
Ensembling models improves the F1 score to 0.737.
Models trained on Mozilla data transfer well to other projects with minimal data.
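As a rough illustration of how an F1 score is computed and how an ensemble can outscore its members, here is a minimal sketch. It assumes binary labels (1 = solution-related, 0 = not) and a simple majority vote; this summary does not say how the paper's ensemble is built, so the voting rule, the function names, and the toy labels are all assumptions made for illustration, not the authors' method or data.

```python
from collections import Counter


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    Assumes an odd number of models and binary labels, so ties cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy labels invented for illustration (NOT from the paper's dataset).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical classifier outputs
model_b = [1, 1, 1, 0, 0, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 1, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", ensemble)]:
    print(name, round(f1_score(y_true, preds), 3))
```

In this toy setup each model makes its mistakes on different comments, so the vote corrects them; real classifiers tend to share errors, which is why ensemble gains are usually much smaller.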
Abstract
During issue resolution, software developers rely on issue reports to discuss solutions for defects, feature requests, and other changes. These discussions contain proposed solutions--from design changes to code implementations--as well as their evaluations. Locating solution-related content is essential for investigating reopened issues, addressing regressions, reusing solutions, and understanding code change rationale. Manually understanding long discussions to identify such content can be difficult and time-consuming. This paper automates solution identification using language models as supervised classifiers. We investigate three applications--embeddings, prompting, and fine-tuning--across three classifier types: traditional ML models (MLMs), pre-trained language models (PLMs), and large language models (LLMs). Using 356 Mozilla Firefox issues, we created a dataset to train and…
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Topic Modeling
