Differential testing for machine learning: an analysis for classification algorithms beyond deep learning
Steffen Herbold, Steffen Tunkel

TL;DR
This paper investigates differential testing for classification algorithms beyond deep learning. It finds large potential for popular algorithms, feasibility limited by the difficulty of matching configurations across frameworks, and significant deviations in test results between frameworks, with implications for library quality assessment.
Contribution
It provides the first analysis of differential testing for classification algorithms outside deep learning, highlighting its potential and limitations.
Findings
Large potential for popular algorithms identified
Feasibility limited due to configuration matching issues
Significant deviations observed in test results
Abstract
Context: Differential testing is a useful approach that uses different implementations of the same algorithms and compares the results for software testing. In recent years, this approach was successfully used for test campaigns of deep learning frameworks. Objective: There is little knowledge on the application of differential testing beyond deep learning. Within this article, we want to close this gap for classification algorithms. Method: We conduct a case study using Scikit-learn, Weka, Spark MLlib, and Caret in which we identify the potential of differential testing by considering which algorithms are available in multiple frameworks, the feasibility by identifying pairs of algorithms that should exhibit the same behavior, and the effectiveness by executing tests for the identified pairs and analyzing the deviations. Results: While we found a large potential for popular…
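The idea described in the abstract, running different implementations of the same algorithm on the same data and comparing their outputs, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two nearest-centroid classifiers are hypothetical stand-ins for two framework implementations, the synthetic data and all function names are invented here, and none of it reproduces the paper's actual test setup or the APIs of Scikit-learn, Weka, Spark MLlib, or Caret.

```python
import math
import random


def fit_centroids_a(X, y):
    """Hypothetical implementation A: sum per class, then divide by the count."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_a(centroids, X):
    """Implementation A: Euclidean distance, ties go to the smallest label."""
    labels = sorted(centroids)
    return [min(labels, key=lambda c: math.dist(row, centroids[c])) for row in X]


def fit_centroids_b(X, y):
    """Hypothetical implementation B: incremental (running) mean per class."""
    means, counts = {}, {}
    for row, label in zip(X, y):
        n = counts.get(label, 0) + 1
        counts[label] = n
        mean = means.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            mean[i] += (v - mean[i]) / n
    return means


def predict_b(centroids, X):
    """Implementation B: squared distance, explicit loop, same tie rule."""
    out = []
    for row in X:
        best_label, best_d = None, float("inf")
        for label in sorted(centroids):
            d = sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
            if d < best_d:
                best_label, best_d = label, d
        out.append(best_label)
    return out


def differential_test(predict_fns, X_test):
    """Return the indices of test instances on which the implementations disagree."""
    predictions = [fn(X_test) for fn in predict_fns]
    return [i for i, labels in enumerate(zip(*predictions)) if len(set(labels)) > 1]


def make_data(rng, n_per_class):
    """Two well-separated classes: class 0 in [0, 1]^2, class 1 in [9, 10]^2."""
    X, y = [], []
    for label, offset in ((0, 0.0), (1, 9.0)):
        for _ in range(n_per_class):
            X.append([offset + rng.random(), offset + rng.random()])
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(rng, 20)
X_test, _ = make_data(rng, 10)

model_a = fit_centroids_a(X_train, y_train)
model_b = fit_centroids_b(X_train, y_train)

deviations = differential_test(
    [lambda X: predict_a(model_a, X), lambda X: predict_b(model_b, X)],
    X_test,
)
print(f"{len(deviations)} of {len(X_test)} test instances deviate")
```

A non-empty result does not by itself show which implementation is wrong. It only flags instances that need inspection, and it presupposes that both implementations were configured to behave identically in the first place.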
Taxonomy
Topics: Machine Learning and Data Classification · Statistics Education and Methodologies · Software Engineering Research
Methods: Test
