Generalizability of experimental studies

Federico Matteucci; Vadim Arzamasov; Jose Cribeiro-Ramallo; Marco Heyden; Konstantin Ntounas; Klemens Böhm

arXiv:2406.17374·cs.LG·December 5, 2025·2 cites

TL;DR

This paper formalizes the concept of generalizability in ML experiments, introduces a framework to quantify it, and provides a practical tool for researchers to evaluate how well their results extend beyond initial studies.

Contribution

The paper offers a formalization of generalizability for ML experimental studies, a framework for quantifying it, and a Python package for practical evaluation.

Findings

01 The framework gives insight into how many experiments a study needs for its results to generalize.

02 Generalizability is measured using rankings and the Maximum Mean Discrepancy.

03 The accompanying tool helps researchers assess how robust a study's results are.

Abstract

Experimental studies are a cornerstone of Machine Learning (ML) research. A common and often implicit assumption is that the study's results will generalize beyond the study itself, e.g., to new data. That is, repeating the same study under different conditions will likely yield similar results. Existing frameworks to measure generalizability, borrowed from the causal inference literature, cannot capture the complexity of the results and the goals of an ML study. The problem of measuring generalizability in the more general ML setting is thus still open, also due to the lack of a mathematical formalization of experimental studies. In this paper, we propose such a formalization, use it to develop a framework to quantify generalizability, and propose an instantiation based on rankings and the Maximum Mean Discrepancy. We show how our framework offers insights into the number of…
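To make the "rankings and Maximum Mean Discrepancy" idea concrete, here is a minimal sketch of how two samples of rankings could be compared with MMD. It is an illustration under stated assumptions, not the paper's method or the genexpy API: the choice of the Mallows kernel, the parameter `nu`, the biased estimator, and all function names are hypothetical and chosen for this example only. Each ranking is a tuple of ranks, one per compared alternative.

```python
from itertools import combinations
import math


def kendall_distance(r1, r2):
    """Count the item pairs that the two rank vectors order differently."""
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def mallows_kernel(r1, r2, nu=1.0):
    """Mallows kernel: exp(-nu * Kendall distance). `nu` is an assumed bandwidth."""
    return math.exp(-nu * kendall_distance(r1, r2))


def mmd_squared(xs, ys, kernel=mallows_kernel):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    kxx = sum(kernel(a, b) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(kernel(a, b) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(kernel(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy


# Toy data (made up): rankings of three alternatives from two sets of experiments.
sample_a = [(0, 1, 2), (0, 1, 2), (0, 2, 1)]
sample_b = [(2, 1, 0), (2, 0, 1), (2, 1, 0)]

same = mmd_squared(sample_a, sample_a)  # identical samples
diff = mmd_squared(sample_a, sample_b)  # samples that mostly disagree
print(f"MMD^2 (same sample): {same:.4f}")
print(f"MMD^2 (different samples): {diff:.4f}")
```

A small value suggests that the two sets of experiments produce similarly distributed rankings; a large value suggests the results differ between them. How such a quantity is turned into a generalizability statement is defined in the paper itself and is not reproduced here.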

Code & Models

Repositories

DrCohomology/genexpy (official)

Taxonomy

Topics: Fault Detection and Control Systems