Improving Deep Learning Library Testing with Machine Learning

Facundo Molina; M M Abid Naziri; Feiran Qin; Alessandra Gorla; Marcelo d'Amorim

arXiv:2602.03755·cs.SE·February 4, 2026

TL;DR

This paper presents a machine learning approach that trains classifiers on tensor shapes to judge API input validity, improving bug detection in deep learning libraries.

Contribution

It introduces a novel ML-based method for API input validation using tensor shapes, enhancing bug-finding accuracy in DL libraries.

Findings

1. Classifiers achieve over 91% accuracy on unseen data.
2. Integration with ACETest doubles the bug detection pass rate.
3. Shape abstraction reduces problem complexity for ML training.

Abstract

Deep Learning (DL) libraries like TensorFlow and PyTorch simplify machine learning (ML) model development but are prone to bugs due to their complex design. Bug-finding techniques exist, but without precise API specifications, they produce many false alarms. Existing methods to mine API specifications lack accuracy. We explore using ML classifiers to determine input validity. We hypothesize that tensor shapes are a precise abstraction to encode concrete inputs and capture relationships of the data. Shape abstraction sharply reduces problem dimensionality, which facilitates ML training. Labeled data are obtained by observing runtime outcomes on a sample of inputs, and classifiers are trained on sets of labeled inputs to capture API constraints. Our evaluation, conducted over 183 APIs from TensorFlow and PyTorch, shows that the classifiers generalize well on unseen data…
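The pipeline the abstract describes (sample inputs, label each by its runtime outcome, train a classifier on shape features) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not taken from the paper: `fake_matmul` is a hypothetical stand-in for a real library API, the pairwise-equality encoding of dimensions is one possible shape abstraction, and the perceptron is a deliberately simple substitute for whatever classifiers the authors train.

```python
import random
from itertools import combinations

def fake_matmul(a_shape, b_shape):
    """Hypothetical stand-in for a DL library API: rejects incompatible 2-D shapes."""
    if a_shape[1] != b_shape[0]:
        raise ValueError("inner dimensions must match")
    return (a_shape[0], b_shape[1])

def label(a_shape, b_shape):
    """Label an input by its runtime outcome: +1 if the call succeeds, -1 if it raises."""
    try:
        fake_matmul(a_shape, b_shape)
        return 1
    except ValueError:
        return -1

def encode(a_shape, b_shape):
    """Assumed encoding: one 0/1 feature per pair of dimensions, set when they are equal."""
    dims = list(a_shape) + list(b_shape)
    return [1.0 if dims[i] == dims[j] else 0.0
            for i, j in combinations(range(len(dims)), 2)]

def sample_inputs(n, rng, max_dim=4):
    """Draw n random pairs of 2-D shapes with dimensions in 1..max_dim."""
    def shape():
        return (rng.randint(1, max_dim), rng.randint(1, max_dim))
    return [(shape(), shape()) for _ in range(n)]

def train_perceptron(samples, epochs=100):
    """Train a perceptron on (features, label) pairs; stop once an epoch has no mistakes."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified (or on the boundary): update
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias

def predict(model, a_shape, b_shape):
    """Predict +1 (valid) or -1 (invalid) for a pair of shapes."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, encode(a_shape, b_shape))) + bias
    return 1 if score > 0 else -1

def accuracy(model, inputs):
    """Fraction of inputs whose predicted validity matches the runtime outcome."""
    hits = sum(predict(model, a, b) == label(a, b) for a, b in inputs)
    return hits / len(inputs)

rng = random.Random(0)
train_inputs = sample_inputs(200, rng)
test_inputs = sample_inputs(100, rng)
model = train_perceptron([(encode(a, b), label(a, b)) for a, b in train_inputs])
print("train accuracy:", accuracy(model, train_inputs))
print("held-out accuracy:", accuracy(model, test_inputs))
```

In this toy setting the validity constraint depends on a single equality feature, so the labels are linearly separable and the perceptron fits the training set. A real API has richer constraints (ranks, dtypes, broadcasting), so the encoding and the model family would need to be chosen accordingly.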

Taxonomy

Topics: Software Testing and Debugging Techniques · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification