Learning Off-By-One Mistakes: An Empirical Study

Hendrig Sellik; Onno van Paridon; Georgios Gousios; Maurício Aniche

arXiv:2102.12429·cs.SE·February 25, 2021

TL;DR

This study investigates whether deep learning models can detect boundary condition errors in software. The models reach 85% precision and 84% recall on a balanced synthetic dataset but show limited success on real-world and industrial code.

Contribution

The paper provides an empirical evaluation of deep learning for detecting boundary condition mistakes and highlights the challenges of applying it to real-world code.

Findings

01 High precision (85%) and recall (84%) on a balanced synthetic dataset

02 Lower detection performance on imbalanced and real-world datasets

03 Limited success when validating on industrial code, with no confirmed bugs
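
To illustrate why findings 01 and 02 can both hold for the same model, the following is a minimal sketch of how precision changes with class balance. It assumes, purely for illustration, that the model's recall and false positive rate stay fixed as buggy samples become rarer; the paper's actual numbers for the imbalanced dataset are not given in this summary, and the prevalence values and function names below are hypothetical.

```python
def false_positive_rate(precision, recall, prevalence):
    """False positive rate implied by a precision/recall pair
    when `prevalence` is the share of buggy samples."""
    true_pos = recall * prevalence
    false_pos = true_pos * (1.0 - precision) / precision
    return false_pos / (1.0 - prevalence)


def precision_at(recall, fpr, prevalence):
    """Precision at a given share of buggy samples, assuming the
    recall and false positive rate do not change (an assumption)."""
    true_pos = recall * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# 0.85 and 0.84 are the balanced-dataset figures reported above.
fpr = false_positive_rate(precision=0.85, recall=0.84, prevalence=0.5)

# Hypothetical prevalence values, chosen only to show the trend.
for prevalence in (0.5, 0.1, 0.01):
    print(prevalence, round(precision_at(0.84, fpr, prevalence), 3))
```

Under this assumption, precision falls as buggy conditions become rarer even though the classifier itself is unchanged, because false positives from the much larger correct class start to outnumber true positives.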

Abstract

Mistakes in binary conditions are a source of error in many software systems. They happen when developers use, e.g., < or > instead of <= or >=. These boundary mistakes are hard to find and impose manual, labor-intensive work for software developers. While previous research has been proposing solutions to identify errors in boundary conditions, the problem remains open. In this paper, we explore the effectiveness of deep learning models in learning and predicting mistakes in boundary conditions. We train different models on approximately 1.6M examples with faults in different boundary conditions. We achieve a precision of 85% and a recall of 84% on a balanced dataset, but lower numbers in an imbalanced dataset. We also perform tests on 41 real-world boundary condition bugs found from GitHub, where the model shows only a modest performance. Finally, we test the model on a large-scale…
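
As a concrete illustration of the kind of mistake the abstract describes, here is a minimal sketch of a boundary condition written with `<=` where `<` was intended. The function names are hypothetical and the example is not taken from the paper's dataset.

```python
def is_valid_index_buggy(i, length):
    # Boundary mistake: `<=` also accepts i == length,
    # which is one past the last valid position.
    return 0 <= i <= length


def is_valid_index(i, length):
    # Intended check: valid positions are 0 .. length - 1.
    return 0 <= i < length


items = ["a", "b", "c"]

# The two versions agree on every value except the boundary itself.
for i in range(-1, len(items) + 2):
    print(i, is_valid_index_buggy(i, len(items)), is_valid_index(i, len(items)))
```

The two checks disagree only at `i == len(items)`, so a test suite that never exercises that exact value will not expose the fault. This is one reason such mistakes are hard to find by hand.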

Code & Models

Repositories

hsellik/thesis (tf, official)
