Leveraging Data Characteristics for Bug Localization in Deep Learning Programs

Ruchira Manke; Mohammad Wardat; Foutse Khomh; Hridesh Rajan

arXiv:2412.05775 · cs.SE · December 10, 2024

TL;DR

This paper introduces Theia, a tool that detects and localizes structural bugs in deep learning programs by analyzing training data characteristics at the start of training. It localizes 57 of 75 bugs in real-world programs, compared with 17 for NeuraLint.

Contribution

Theia is the first tool to leverage training data characteristics for early bug detection in DL programs across Keras and PyTorch.

Findings

01. Theia localizes 57 out of 75 bugs in real-world DL programs.

02. Theia outperforms NeuraLint, which localizes 17 bugs.

03. Bug localization occurs at the beginning of training, saving time.

Abstract

Deep Learning (DL) is a class of machine learning algorithms used in a wide variety of applications. Like any software system, DL programs can have bugs, and several tools have been proposed to support bug localization in them. Most bugs caused by improper model structure, known as structural bugs, lead to inadequate performance during training, which makes it challenging for developers to identify the root cause and fix them. In this paper, we propose Theia, which detects and localizes structural bugs in DL programs. Unlike previous work, Theia considers the characteristics of the training dataset to automatically detect bugs in DL programs developed using two deep learning libraries, Keras and PyTorch. Since training DL models is a time-consuming process, Theia detects these bugs at…
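
To make the idea of a data-driven structural check concrete, the sketch below compares one dataset characteristic (the number of distinct labels) against a model's output layer before any training happens. It is an illustration only and not Theia's implementation: the `OutputLayerSpec` type, the `check_output_layer` function, and the three rules are hypothetical, and only the multi-class case is covered. The loss names are the standard Keras identifiers.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Keras loss identifiers commonly paired with a softmax output (assumed rule set).
MULTICLASS_LOSSES = {"categorical_crossentropy", "sparse_categorical_crossentropy"}


@dataclass
class OutputLayerSpec:
    """Hypothetical summary of a model's final layer and its loss function."""
    units: int
    activation: str
    loss: str


def check_output_layer(labels: Sequence[int], spec: OutputLayerSpec) -> List[str]:
    """Return mismatches between the label set and the output layer.

    Illustrative rules only; problems with two or fewer classes are not checked.
    """
    num_classes = len(set(labels))
    findings: List[str] = []
    if num_classes > 2:
        if spec.units != num_classes:
            findings.append(
                f"output layer has {spec.units} unit(s) but the labels "
                f"contain {num_classes} classes"
            )
        if spec.activation != "softmax":
            findings.append(
                f"activation '{spec.activation}' is unusual for "
                f"{num_classes}-class classification; expected 'softmax'"
            )
        if spec.loss not in MULTICLASS_LOSSES:
            findings.append(
                f"loss '{spec.loss}' does not match a multi-class problem"
            )
    return findings


# A ten-class label set paired with a binary-style output layer.
labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
buggy = OutputLayerSpec(units=1, activation="sigmoid", loss="binary_crossentropy")
for finding in check_output_layer(labels, buggy):
    print(finding)
```

Because the check needs only the labels and the model definition, it can run before the first training step, which matches the early-detection motivation stated in the abstract.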

Taxonomy

Topics: Machine Learning and Data Classification · Parallel Computing and Optimization Techniques