Enabling Reproducibility and Meta-learning Through a Lifelong Database of Experiments (LDE)
Jason Tsay, Andrea Bartezzaghi, Aleke Nolte, Cristiano Malossi

TL;DR
The paper introduces the Lifelong Database of Experiments (LDE), a system that automatically captures, stores, and enables analysis of AI experiment metadata to improve reproducibility and meta-learning.
Contribution
It presents a novel system that automates experiment metadata collection, supports reproducibility, and enhances meta-learning through data aggregation and analysis.
Findings
Performance metrics vary significantly across experiments.
Aggregating results improves meta-learning recommendation accuracy.
LDE successfully reproduces and analyzes existing meta-learning studies.
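To make the aggregation finding concrete, here is a minimal sketch of recommending a pipeline from pooled results rather than from a single run. It is an illustration only, not the paper's method: the function name `recommend_pipeline`, the `(pipeline_name, score)` input shape, and the choice of the mean as the aggregate are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def recommend_pipeline(results):
    """Return the pipeline with the highest mean score across all recorded runs.

    `results` is an iterable of (pipeline_name, score) pairs. This input shape
    is hypothetical and is not taken from the LDE system.
    """
    scores_by_pipeline = defaultdict(list)
    for pipeline_name, score in results:
        scores_by_pipeline[pipeline_name].append(score)
    if not scores_by_pipeline:
        raise ValueError("no results to aggregate")
    # Rank pipelines by their average score over every run seen so far.
    return max(scores_by_pipeline, key=lambda name: mean(scores_by_pipeline[name]))
```

Because metrics vary between runs, a recommendation based on one lucky run can differ from one based on the average over many runs, which is the motivation for pooling results before ranking.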
Abstract
Artificial Intelligence (AI) development is inherently iterative and experimental. Over the course of normal development, especially with the advent of automated AI, hundreds or thousands of experiments are generated and are often lost or never examined again. There is a lost opportunity to document these experiments and learn from them at scale, but the complexity of tracking and reproducing these experiments is often prohibitive to data scientists. We present the Lifelong Database of Experiments (LDE) that automatically extracts and stores linked metadata from experiment artifacts and provides features to reproduce these artifacts and perform meta-learning across them. We store context from multiple stages of the AI development lifecycle including datasets, pipelines, how each is configured, and training runs with information about their runtime environment. The standardized nature of…
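The linked metadata described in the abstract (datasets, pipelines, their configuration, and training runs with runtime environment information) could be modeled roughly as follows. This is a sketch under stated assumptions, not the LDE schema: every class, field, and method name here (`DatasetRecord`, `PipelineRecord`, `RunRecord`, `ExperimentStore`, `lineage`) is hypothetical.

```python
from dataclasses import dataclass, field
import platform
import sys


@dataclass
class DatasetRecord:
    dataset_id: str
    name: str


@dataclass
class PipelineRecord:
    pipeline_id: str
    dataset_id: str  # link back to the dataset this pipeline was configured for
    config: dict = field(default_factory=dict)


@dataclass
class RunRecord:
    run_id: str
    pipeline_id: str  # link back to the pipeline that produced this run
    metrics: dict = field(default_factory=dict)
    environment: dict = field(default_factory=dict)


def capture_environment() -> dict:
    """Collect a small, illustrative subset of runtime environment details."""
    return {
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }


class ExperimentStore:
    """In-memory stand-in for a database of linked experiment records (hypothetical)."""

    def __init__(self):
        self.datasets = {}
        self.pipelines = {}
        self.runs = {}

    def add_dataset(self, record: DatasetRecord) -> None:
        self.datasets[record.dataset_id] = record

    def add_pipeline(self, record: PipelineRecord) -> None:
        # Refuse records whose link points at an unknown dataset.
        if record.dataset_id not in self.datasets:
            raise KeyError(f"unknown dataset: {record.dataset_id}")
        self.pipelines[record.pipeline_id] = record

    def add_run(self, record: RunRecord) -> None:
        # Refuse records whose link points at an unknown pipeline.
        if record.pipeline_id not in self.pipelines:
            raise KeyError(f"unknown pipeline: {record.pipeline_id}")
        self.runs[record.run_id] = record

    def lineage(self, run_id: str):
        """Follow the links from a run back to its pipeline and dataset."""
        run = self.runs[run_id]
        pipeline = self.pipelines[run.pipeline_id]
        dataset = self.datasets[pipeline.dataset_id]
        return dataset, pipeline, run
```

Keeping explicit links between records is what lets a stored run be traced back to the exact pipeline configuration and dataset behind it, which is the kind of context a reproduction attempt would need.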
Taxonomy
Topics: Scientific Computing and Data Management · Machine Learning and Data Classification · Advanced Database Systems and Queries
