LeanML: A Design Pattern To Slash Avoidable Wastes in Machine Learning Projects

Yves-Laurent Kom Samo

arXiv:2107.08066·cs.LG·August 13, 2021·1 cites

TL;DR

This paper applies lean methodology to machine learning, proposing a pattern that estimates the maximum achievable performance based on information theory, thereby reducing waste and risk in ML projects.

Contribution

It introduces a novel lean design pattern for ML that estimates optimal performance without training models, based on mutual information and data variability.

Findings

01. The pattern accurately predicts maximum performance metrics.

02. It reduces time and cost in ML project evaluation.

03. Its effectiveness is demonstrated on diverse datasets.

Abstract

We introduce the first application of the lean methodology to machine learning projects. Similar to lean startups and lean manufacturing, we argue that lean machine learning (LeanML) can drastically slash avoidable wastes in commercial machine learning projects, reduce the business risk in investing in machine learning capabilities and, in so doing, further democratize access to machine learning. The lean design pattern we propose in this paper is based on two realizations. First, it is possible to estimate the best performance one may achieve when predicting an outcome $y \in Y$ using a given set of explanatory variables $x \in X$, for a wide range of performance metrics, and without training any predictive model. Second, doing so is considerably easier, faster, and cheaper than learning the best predictive model. We derive formulae expressing the best $R^{2}$, MSE,…
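To make the idea of "best achievable performance without training a model" concrete, here is a minimal sketch in Python. It is not the paper's derivation: the abstract is truncated before its formulae are stated. The sketch assumes $y$ and $x$ are jointly Gaussian scalars, in which case the mutual information $I(y; x)$ in nats determines the best $R^{2}$ exactly as $1 - e^{-2 I(y; x)}$ and the best RMSE as $\sigma_y e^{-I(y; x)}$. The function names `gaussian_mi`, `best_r2_from_mi`, and `best_rmse_from_mi` are hypothetical and chosen for illustration only.

```python
import math


def gaussian_mi(rho: float) -> float:
    """Mutual information (nats) between two jointly Gaussian scalars
    with correlation rho. Assumption: Gaussian case only."""
    return -0.5 * math.log(1.0 - rho ** 2)


def best_r2_from_mi(mi_nats: float) -> float:
    """Best achievable R^2 implied by a mutual information value.
    Exact for jointly Gaussian variables; an assumption otherwise."""
    return 1.0 - math.exp(-2.0 * mi_nats)


def best_rmse_from_mi(mi_nats: float, std_y: float) -> float:
    """Best achievable RMSE given the standard deviation of y.
    Same Gaussian assumption as above."""
    return std_y * math.exp(-mi_nats)


if __name__ == "__main__":
    # Hypothetical inputs: correlation 0.8 between x and y, std(y) = 2.
    mi = gaussian_mi(0.8)
    print("mutual information (nats):", mi)
    print("best R^2:", best_r2_from_mi(mi))
    print("best RMSE:", best_rmse_from_mi(mi, std_y=2.0))
```

In practice the mutual information would have to be estimated from data rather than computed from a known correlation, and the quality of that estimate bounds the quality of the performance ceiling. The point of the sketch is only that, once $I(y; x)$ is available, the ceiling follows from arithmetic rather than from fitting a predictive model.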

Taxonomy

Topics: Advanced Statistical Process Monitoring · Big Data and Business Intelligence · Forecasting Techniques and Applications