An Evaluation of Programming Language Models' performance on Software Defect Detection

Kailun Wang

arXiv:1909.10309·cs.SE·September 24, 2019

TL;DR

This paper evaluates several language models, including BERT, on software defect detection at three levels (syntactical, algorithmic, and general). It provides a reproducible tool chain and reports that BERT outperformed the other models evaluated.

Contribution

It introduces a comprehensive evaluation of language models for software defect detection at multiple levels and offers a reproducible tool chain for future research.

Findings

01 BERT outperformed other models in defect detection tasks.
02 The study covers syntactical, algorithmic, and general defect levels.
03 A reproducible tool chain was developed for experiments.

Abstract

This dissertation presents an evaluation of several language models on software defect datasets. A language model (LM) "can provide word representation and probability indication of word sequences as the core component of an NLP system." Language models for source code are specialized for tasks in the software engineering field. While some are taken directly from NLP, others incorporate structural information that is unique to source code. Software defects are flaws in the source code that lead to unexpected behaviours and malfunctions at all levels. This study provides an original attempt to detect these defects at three different levels (syntactical, algorithmic and general). We also provide a tool chain that researchers can use to reproduce the experiments. We have tested the different models against different datasets and analysed the results. Our original…
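To make the quoted definition concrete, here is a minimal sketch of a language model as a "probability indication of word sequences", applied to tokenized code. It is a toy count-based bigram model with add-one smoothing, not the BERT-style models the paper evaluates and not the author's tool chain. The function names (`train_bigram`, `log_prob`), the `<s>`/`</s>` markers, and the tiny corpus are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count context tokens and bigrams over tokenized sequences (illustrative only)."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sequences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        unigrams.update(padded[:-1])             # every token that acts as a context
        bigrams.update(zip(padded, padded[1:]))  # adjacent token pairs
    vocab = {tok for seq in sequences for tok in seq} | {"<s>", "</s>"}
    return unigrams, bigrams, len(vocab)


def log_prob(tokens, model):
    """Log-probability of a token sequence under the add-one smoothed bigram model."""
    unigrams, bigrams, vocab_size = model
    padded = ["<s>"] + list(tokens) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one (Laplace) smoothing keeps unseen pairs from getting zero probability.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Hypothetical training corpus of already-tokenized statements.
corpus = [
    ["x", "=", "x", "+", "1", ";"],
    ["y", "=", "y", "+", "1", ";"],
    ["x", "=", "y", ";"],
]
model = train_bigram(corpus)

well_formed = ["y", "=", "x", "+", "1", ";"]
malformed = ["y", "=", "+", "x", "1", ";"]  # operator and operand swapped

print(log_prob(well_formed, model), log_prob(malformed, model))
```

Under these assumptions the malformed statement receives a lower log-probability than the well-formed one, which is the intuition behind using sequence likelihood as a signal for syntactical defects. Real detectors would rely on far larger corpora and learned representations.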

Code & Models

Repositories

hiroto-takatoshi/ProgLMBug
tfOfficial

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Advanced Malware Detection Techniques