An Evaluation of Programming Language Models' Performance on Software Defect Detection
Kailun Wang

TL;DR
This dissertation evaluates several language models, including BERT, on software defect detection at three levels (syntactical, algorithmic, and general), provides a reproducible tool chain, and reports that BERT outperformed the other models tested.
Contribution
The work evaluates language models for software defect detection at three levels (syntactical, algorithmic, and general) and offers a reproducible tool chain for future research.
Findings
BERT outperformed other models in defect detection tasks.
The study covers syntactical, algorithmic, and general defect levels.
A reproducible tool chain was developed for experiments.
Abstract
This dissertation presents an evaluation of several language models on software defect datasets. A language model (LM) "can provide word representation and probability indication of word sequences as the core component of an NLP system." Language models for source code are specialized for tasks in the software engineering field. While some of these models are taken directly from NLP, others incorporate structural information that is unique to source code. Software defects are defects in the source code that lead to unexpected behaviours and malfunctions at all levels. This study provides an original attempt to detect these defects at three different levels (syntactical, algorithmic and general). We also provide a tool chain that researchers can use to reproduce the experiments. We tested the models against several datasets and analysed the results. Our original…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Advanced Malware Detection Techniques
