Predicting Issue Types with seBERT
Alexander Trautsch, Steffen Herbold

TL;DR
This paper introduces seBERT, a BERT-based transformer model trained from scratch on software engineering data. Fine-tuned for issue type prediction, it reaches an overall F1-score of 85.7%, an increase of 4.1% over the fastText baseline.
Contribution
The paper fine-tunes seBERT, a transformer model pre-trained from scratch on software engineering data, for the issue type prediction task of the NLBSE challenge and shows that it outperforms the fastText baseline.
Findings
seBERT achieves an overall F1-score of 85.7%, an increase of 4.1% over the fastText baseline.
seBERT outperforms the fastText baseline in both recall and precision.
The advantage over the baseline holds for all three issue types.
Abstract
Pre-trained transformer models are the current state of the art for natural language processing. seBERT is such a model: it is based on the BERT architecture but was trained from scratch on software engineering data. We fine-tuned this model for the NLBSE challenge for the task of issue type prediction. Our model dominates the fastText baseline for all three issue types in both recall and precision and achieves an overall F1-score of 85.7%, which is an increase of 4.1% over the baseline.
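The abstract reports per-type recall and precision alongside a single overall F1-score. The sketch below shows one common way such numbers are derived from predicted and true labels. It is an illustration under assumptions, not the authors' evaluation code: the label names (`bug`, `enhancement`, `question`), the toy data, and the choice of micro-averaging for the overall score are all hypothetical, since the text above does not name the issue types or the averaging scheme.

```python
def _counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def _f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def per_class_scores(y_true, y_pred, labels):
    """Precision, recall, and F1 for each label, computed one-vs-rest."""
    scores = {}
    for label in labels:
        tp, fp, fn = _counts(y_true, y_pred, label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[label] = {
            "precision": precision,
            "recall": recall,
            "f1": _f1(precision, recall),
        }
    return scores


def micro_f1(y_true, y_pred, labels):
    """Overall F1 from counts summed over all labels (micro-averaging).

    Assumption: the overall score is micro-averaged; the source does not say.
    """
    tp = fp = fn = 0
    for label in labels:
        c_tp, c_fp, c_fn = _counts(y_true, y_pred, label)
        tp, fp, fn = tp + c_tp, fp + c_fp, fn + c_fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return _f1(precision, recall)


if __name__ == "__main__":
    # Hypothetical label names and toy predictions, for illustration only.
    labels = ["bug", "enhancement", "question"]
    y_true = ["bug", "bug", "enhancement", "question", "enhancement", "bug"]
    y_pred = ["bug", "enhancement", "enhancement", "question", "bug", "bug"]

    for label, s in per_class_scores(y_true, y_pred, labels).items():
        print(label, s)
    print("overall (micro) F1:", micro_f1(y_true, y_pred, labels))
```

When every issue has exactly one true and one predicted label, the micro-averaged F1 equals plain accuracy, so a model that beats a baseline in both precision and recall for every type will also beat it on this overall score.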
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software System Performance and Reliability
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Weight Decay · Adam · Attention Dropout · Dense Connections · Dropout · Linear Warmup With Linear Decay
