Error Understanding in Program Code With LLM-DL for Multi-label Classification
Md Faizul Ibne Amin, Yutaka Watanobe, Md. Mostafizer Rahman, Daniel M. Muepu, and Md. Shahajada Mia

TL;DR
This paper explores the use of fine-tuned Large Language Models combined with deep learning architectures to classify multiple error types in programming code, aiming to improve automated error detection and feedback.
Contribution
It introduces a multi-label error classification framework leveraging various LLMs and DL models, demonstrating superior performance in classifying programming errors.
Findings
CodeT5+_GRU achieved the highest weighted F1-score of 0.8243.
Model ensemble improved multi-label classification accuracy.
Results support combining pretrained encoders with recurrent decoders for error understanding.
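For readers unfamiliar with the metric behind the headline number, the following is a minimal sketch of a support-weighted F1-score in the multi-label setting: F1 is computed per label, then averaged with weights proportional to how many samples truly carry each label. The summary does not state exactly how the paper computed its score, so treat this as one common definition rather than the authors' evaluation code. The function name and the toy data are illustrative assumptions.

```python
def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-label F1.

    y_true, y_pred: lists of 0/1 lists with shape (samples x labels).
    This is an illustrative helper, not code from the paper.
    """
    n_labels = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        support = tp + fn  # number of samples that truly have label j
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # define F1 as 0 when undefined
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy data (made up): 4 code samples, 3 error labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0]]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

On the toy data the per-label F1 values are 1.0, 0.5, and 0.0 with supports 3, 2, and 1, so frequent labels dominate the average. This is why a weighted F1 can look strong even when a rare error type is classified poorly.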
Abstract
Programming is a core skill in computer science and software engineering (SE), yet identifying and resolving code errors remains challenging for both novice and experienced developers. While Large Language Models (LLMs) have shown remarkable capabilities in natural language understanding and generation tasks, their potential in domain-specific, complex scenarios, such as multi-label classification (MLC) of programming errors, remains underexplored. To address this gap, this study proposes a multi-label error classification (MLEC) framework for source code that leverages fine-tuned LLMs, including CodeT5-base, GraphCodeBERT, CodeT5+, UniXcoder, RoBERTa, PLBART, and CoTexT. These LLMs are integrated with deep learning (DL) architectures such as GRU, LSTM, BiLSTM, and BiLSTM with an additive attention mechanism (BiLSTM-A) to capture both syntactic and semantic features from…
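To make the multi-label setting concrete, here is a minimal sketch of how a classifier's raw outputs can be turned into a set of error labels: each label gets its own sigmoid probability and is kept if it clears a threshold, so one program can receive several error types at once (unlike softmax, which picks exactly one). The label names, the 0.5 threshold, and the example logits are all hypothetical; the excerpt does not list the paper's error categories or its decision rule.

```python
import math

# Hypothetical label set; the paper's actual error types are not given here.
ERROR_LABELS = [
    "compile_error",
    "runtime_error",
    "wrong_answer",
    "time_limit_exceeded",
]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability meets the threshold.

    logits: one raw score per label, e.g. the output of a classification
    head placed on top of an encoder + recurrent layer (assumed setup).
    """
    probs = [sigmoid(z) for z in logits]
    return [name for name, p in zip(ERROR_LABELS, probs) if p >= threshold]


# Made-up logits for a single code submission.
example_logits = [2.1, -0.3, 0.8, -1.7]
predicted = predict_labels(example_logits)
print(predicted)
```

Because each label is decided independently, the result may contain zero, one, or several labels for the same submission.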
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Text and Document Classification Technologies
