Task Selection for AutoML System Evaluation
Jonathan Lorraine, Nihesh Anderson, Chansoo Lee, Quentin de Laroussilhe, Mehadi Hassen

TL;DR
This paper introduces a method that uses task descriptors to select relevant development tasks for AutoML system evaluation, improving the assessment of system changes when development and production tasks follow different distributions.
Contribution
We propose a descriptor-based filtering approach to better align development tasks with production tasks for more accurate AutoML system evaluation.
Findings
Filtering improves evaluation accuracy on holdout tasks
Descriptor-based selection enhances transferability of AutoML improvements
Method reduces discrepancy between development and production task assessments
Abstract
Our goal is to assess whether AutoML system changes (i.e., to the search space or hyperparameter optimization) will improve the final model's performance on production tasks. However, we cannot test the changes on production tasks. Instead, we only have access to limited descriptors about tasks that our AutoML system previously executed, such as the number of data points or features. We also have a set of development tasks to test changes, e.g., sampled from OpenML with no usage constraints. However, the development and production task distributions are different, leading us to pursue changes that improve only development and not production. This paper proposes a method to leverage descriptor information about AutoML production tasks to select a filtered subset of the most relevant development tasks. Empirical studies show that our filtering strategy improves the ability to assess AutoML system…
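The abstract does not spell out the filtering rule, so the following is only a minimal sketch of one way descriptor-based selection could look, not the authors' method. It assumes each task is described by two descriptors (number of rows and number of features), compares them on a log scale, and keeps the development tasks nearest to any production descriptor. The function names, the dictionary keys `n_rows` and `n_features`, the nearest-neighbor criterion, and all task values are hypothetical.

```python
import math


def log_descriptor(task):
    # Assumption: descriptors span orders of magnitude, so compare on a log scale.
    return (math.log10(task["n_rows"]), math.log10(task["n_features"]))


def select_dev_tasks(dev_tasks, prod_descriptors, k):
    """Return the k development tasks closest to any production descriptor.

    Hypothetical nearest-neighbor rule, used here purely for illustration.
    """
    prod_points = [log_descriptor(p) for p in prod_descriptors]

    def distance_to_production(task):
        point = log_descriptor(task)
        # Distance to the nearest production task in log-descriptor space.
        return min(math.dist(point, q) for q in prod_points)

    return sorted(dev_tasks, key=distance_to_production)[:k]


# Made-up development tasks (e.g., the kind one might sample from OpenML).
dev_tasks = [
    {"name": "tiny", "n_rows": 150, "n_features": 4},
    {"name": "medium", "n_rows": 50_000, "n_features": 30},
    {"name": "wide", "n_rows": 1_000, "n_features": 5_000},
    {"name": "large", "n_rows": 1_000_000, "n_features": 50},
]

# Made-up production descriptors: only sizes are known, not the data itself.
prod_descriptors = [
    {"n_rows": 80_000, "n_features": 25},
    {"n_rows": 2_000_000, "n_features": 40},
]

selected = select_dev_tasks(dev_tasks, prod_descriptors, k=2)
selected_names = [task["name"] for task in selected]
print(selected_names)
```

In this toy setup the two development tasks whose sizes resemble the production descriptors are kept, and a system change would then be evaluated only on that filtered subset. Other criteria (for example, density-ratio weighting or per-descriptor thresholds) would fit the same interface.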
Taxonomy
Topics: Machine Learning and Data Classification · Data Stream Mining Techniques · Text and Document Classification Technologies
Methods: Test
