The Potential Benefits of Filtering Versus Hyper-Parameter Optimization

Michael R. Smith; Tony Martinez; Christophe Giraud-Carrier

arXiv:1403.3342 · stat.ML · March 14, 2014 · 1 citation

TL;DR

This paper compares the potential benefits of data filtering and hyper-parameter optimization in improving model quality, finding that filtering has a greater potential impact.

Contribution

It provides an empirical comparison estimating the maximum potential benefits of filtering versus hyper-parameter tuning.

Findings

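01. Filtering has a greater potential effect than hyper-parameter optimization.

02. Both methods significantly improve the induced model.

03. The estimates are overly optimistic, but they approximate the maximum potential benefit of each method; by that measure, filtering yields the larger improvement.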

Abstract

The quality of a model induced by a learning algorithm depends on the quality of the training data and on the hyper-parameters supplied to the learning algorithm. Prior work has shown that improving the quality of the training data (i.e., by removing low-quality instances) or tuning the learning algorithm's hyper-parameters can significantly improve the quality of an induced model. A comparison of the two methods, however, is lacking. In this paper, we estimate and compare the potential benefits of filtering and hyper-parameter optimization. Estimating the potential benefit gives an overly optimistic estimate, but it also empirically demonstrates an approximation of the maximum potential benefit of each method. We find that, while both significantly improve the induced model, improving the quality of the training set has a greater potential effect than hyper-parameter optimization.
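
To make the two interventions in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' experimental protocol: the synthetic 1-D dataset, the 20% label-noise rate, the k-nearest-neighbour learner, the candidate values of k, and the helper names (`make_data`, `filter_misclassified`, `compare`) are all assumptions chosen for illustration. It also uses an ordinary validation split rather than the optimistic upper-bound estimate the abstract describes, so its numbers say nothing about the paper's results.

```python
import random

def make_data(n, noise, rng):
    """Synthetic 1-D binary task (an assumption): label is 1 when x > 0.5,
    and a fraction `noise` of the labels is flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (ties go to class 0)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > len(nearest))

def accuracy(train, test, k):
    """Fraction of test points whose label the k-NN model predicts correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def filter_misclassified(train, k=3):
    """Filtering: drop every instance that a leave-one-out k-NN vote
    over the remaining instances misclassifies."""
    kept = []
    for i, (x, y) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        if knn_predict(rest, x, k) == y:
            kept.append((x, y))
    return kept

CANDIDATE_KS = [1, 3, 5, 7, 9, 15]  # hypothetical hyper-parameter grid

def compare(seed=0):
    rng = random.Random(seed)
    train = make_data(200, noise=0.2, rng=rng)
    valid = make_data(100, noise=0.2, rng=rng)
    test = make_data(200, noise=0.0, rng=rng)  # clean labels for evaluation

    # Baseline: default hyper-parameter (k=1), unfiltered training set.
    baseline = accuracy(train, test, k=1)

    # Hyper-parameter optimization: pick k on the validation set.
    best_k = max(CANDIDATE_KS, key=lambda k: accuracy(train, valid, k))
    tuned = accuracy(train, test, best_k)

    # Filtering: keep the default k=1 but train on the filtered set.
    filtered_train = filter_misclassified(train)
    filtered = accuracy(filtered_train, test, k=1)

    return {"baseline": baseline, "tuned": tuned, "filtered": filtered,
            "best_k": best_k, "kept": len(filtered_train)}

if __name__ == "__main__":
    print(compare())
```

Calling `compare()` returns the test accuracy of the default model, the tuned model, and the model trained on filtered data, along with the selected k and the number of training instances that survived filtering. Which intervention helps more on this toy task depends on the seed and the noise rate.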

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Industrial Vision Systems and Defect Detection