Variability-Aware Machine Learning Model Selection: Feature Modeling, Instantiation, and Experimental Case Study

Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan

arXiv:2501.00532 · cs.SE · January 3, 2025

Open Access

TL;DR

This paper introduces a variability-aware approach for machine learning model selection that explicitly captures contextual factors and their dependencies, aiming to improve transparency, adaptability, and automation in the selection process.

Contribution

It proposes a formal, design-oriented method for ML model selection that considers variability and dependencies, supported by an experimental case study with Scikit-Learn heuristics.
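As a rough illustration of what capturing variability and dependencies could look like in code, the sketch below treats contextual factors as explicit features, checks cross-feature constraints, and then maps a valid configuration to a Scikit-Learn estimator name. It is not the paper's feature model or implementation: the `Context` class, the `validate` and `select_model` helpers, and the sample-size thresholds are all hypothetical, loosely modeled on Scikit-Learn's public "choosing the right estimator" guidance.

```python
from dataclasses import dataclass

TASKS = {"classification", "regression", "clustering"}


@dataclass(frozen=True)
class Context:
    """Hypothetical contextual factors (variation points) for selection."""
    task: str          # one of TASKS
    n_samples: int     # size of the training set
    labeled: bool = True


def validate(ctx: Context) -> list[str]:
    """Return the violated dependency constraints (empty list if valid)."""
    errors = []
    if ctx.task not in TASKS:
        errors.append(f"unknown task: {ctx.task}")
    # Dependency: supervised tasks require labeled data.
    if ctx.task in {"classification", "regression"} and not ctx.labeled:
        errors.append(f"{ctx.task} requires labeled data")
    # Assumed lower bound, taken from the cheat-sheet style heuristic.
    if ctx.n_samples <= 50:
        errors.append("too few samples (need more than 50)")
    return errors


def select_model(ctx: Context) -> str:
    """Map a valid context to an estimator name (illustrative rules only)."""
    errors = validate(ctx)
    if errors:
        raise ValueError("; ".join(errors))
    large = ctx.n_samples >= 100_000  # assumed threshold
    if ctx.task == "classification":
        return "SGDClassifier" if large else "LinearSVC"
    if ctx.task == "regression":
        # Simplified: ignores the sparse-feature branch of the heuristic.
        return "SGDRegressor" if large else "Ridge"
    # Clustering with a known number of clusters (assumed).
    return "MiniBatchKMeans" if ctx.n_samples >= 10_000 else "KMeans"
```

Keeping the constraints in `validate` separate from the selection rules means an invalid configuration is rejected with a stated reason rather than silently mapped to a model, which is one way to make the selection logic inspectable.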

Findings

01 The approach improves transparency and interpretability of model selection.

02 It outperforms traditional ad hoc selection methods in the case study.

03 The method supports automation and adaptability in ML workflows.

Abstract

The emergence of machine learning (ML) has led to a transformative shift in software techniques and guidelines for building software applications that support data analysis process activities such as data ingestion, modeling, and deployment. Specifically, this shift is impacting ML model selection, which is one of the key phases in this process. There have been several advances in model selection from the standpoint of core ML methods, including basic probability measures and resampling methods. However, from a software engineering perspective, this selection is still an ad hoc and informal process: it is not supported by a design approach and representation formalism that explicitly captures the selection process, and it cannot support the specification of existing model selection procedures. The selection adapts to a variety of contextual factors that affect it, such as…

Taxonomy

Topics: Machine Learning and Data Classification