API design for machine learning software: experiences from the scikit-learn project
Lars Buitinck (ILPS), Gilles Louppe, Mathieu Blondel, Fabian Pedregosa (INRIA Saclay - Ile de France), Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort (INRIA Saclay - Ile de France, LTCI), Jaques Grobler (INRIA Saclay - Ile de France)

TL;DR
This paper discusses the design choices behind the scikit-learn API, emphasizing simplicity, reusability, and accessibility to non-experts, and shares lessons from implementing it in Python and from the obstacles its users and developers have faced.
Contribution
It introduces a unified, simple API for machine learning components in scikit-learn, highlighting its advantages and implementation considerations within Python.
Findings
A unified API makes components easier to compose and reuse.
The design choices make the library more accessible to non-experts.
The paper analyzes obstacles faced by users and developers in the Python ecosystem.
Abstract
Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
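To make the idea of a shared interface concrete, here is a minimal sketch in plain Python. It does not use scikit-learn itself; it only mimics the fit/transform/predict convention that scikit-learn estimators follow. The class names (MeanCenterer, ThresholdClassifier, SimplePipeline) and the toy data are hypothetical and chosen for illustration only.

```python
from statistics import mean


class MeanCenterer:
    """Hypothetical transformer: learns the mean in fit, subtracts it in transform."""

    def fit(self, X, y=None):
        self.mean_ = mean(X)  # learned state, stored with a trailing underscore
        return self           # returning self allows call chaining

    def transform(self, X):
        return [x - self.mean_ for x in X]


class ThresholdClassifier:
    """Hypothetical predictor: places a cut point halfway between two class means."""

    def fit(self, X, y):
        class0 = [x for x, label in zip(X, y) if label == 0]
        class1 = [x for x, label in zip(X, y) if label == 1]
        self.threshold_ = (mean(class0) + mean(class1)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.threshold_ else 0 for x in X]


class SimplePipeline:
    """Hypothetical composite: chains transformers and ends with a predictor.

    Because it exposes fit and predict itself, it can be used anywhere a
    single predictor is expected.
    """

    def __init__(self, transformers, predictor):
        self.transformers = transformers
        self.predictor = predictor

    def fit(self, X, y):
        for t in self.transformers:
            X = t.fit(X, y).transform(X)
        self.predictor.fit(X, y)
        return self

    def predict(self, X):
        for t in self.transformers:
            X = t.transform(X)
        return self.predictor.predict(X)


# Toy one-dimensional data: small values are class 0, large values are class 1.
X_train = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
y_train = [0, 0, 0, 1, 1, 1]

model = SimplePipeline([MeanCenterer()], ThresholdClassifier()).fit(X_train, y_train)
predictions = model.predict([2.5, 7.5])
print(predictions)
```

The point of the sketch is that the pipeline never inspects what its parts are; it relies only on the method names they share. This is the kind of composition and reuse the abstract attributes to a uniform interface.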
Taxonomy
Topics: Scientific Computing and Data Management · Computational Physics and Python Applications · Machine Learning and Data Classification
