A Topological-Framework to Improve Analysis of Machine Learning Model Performance
Henry Kvinge, Colby Wight, Sarah Akers, Scott Howland, Woongjo Choi, Xiaolong Ma, Luke Gosink, Elizabeth Jurrus, Keerti Kappagantula, Tegan H. Emerson

TL;DR
This paper introduces a topological framework that enhances the analysis of machine learning model performance by capturing both global and local behaviors across data subpopulations, addressing limitations of traditional summary statistics.
Contribution
It proposes a novel topological approach using presheaves to organize and analyze model performance across different data subpopulations, improving interpretability.
Findings
Provides a principled method for local performance analysis
Enables better understanding of model failures on subpopulations
Offers a new data structure for performance comparison
Abstract
As both machine learning models and the datasets on which they are evaluated have grown in size and complexity, the practice of using a few summary statistics to understand model performance has become increasingly problematic. This is particularly true in real-world scenarios where understanding model failure on certain subpopulations of the data is of critical importance. In this paper we propose a topological framework for evaluating machine learning models in which a dataset is treated as a "space" on which a model operates. This provides us with a principled way to organize information about model performance at both the global level (over the entire test set) and also the local level (on specific subpopulations). Finally, we describe presheaves, a topological data structure that offers a convenient way to store and analyze model performance across different subpopulations.
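To make the presheaf idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that a "section" over a subpopulation is just the per-example correctness of a model on that subpopulation, and that the restriction map simply forgets examples outside a smaller subpopulation. The names `Section` and `restrict`, and the toy labels and predictions, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Section:
    """Per-example correctness of a model on one subpopulation (illustrative)."""
    subpop: FrozenSet[int]    # indices of the test examples in this subpopulation
    correct: Dict[int, bool]  # example index -> was the prediction correct?

    def accuracy(self) -> float:
        # Summary statistic computed from the stored local data.
        if not self.correct:
            return float("nan")
        return sum(self.correct.values()) / len(self.correct)


def restrict(section: Section, smaller: FrozenSet[int]) -> Section:
    """Restriction map: keep only the data on a smaller subpopulation."""
    if not smaller <= section.subpop:
        raise ValueError("can only restrict to a subset of the subpopulation")
    return Section(smaller, {i: section.correct[i] for i in smaller})


# Toy data, made up for this example.
labels = [0, 1, 1, 0, 1, 0]
predictions = [0, 1, 0, 0, 0, 0]

# Global level: the entire test set.
whole = frozenset(range(len(labels)))
global_section = Section(whole, {i: labels[i] == predictions[i] for i in whole})

# Local level: the subpopulation of examples whose true label is 1.
positives = frozenset(i for i in whole if labels[i] == 1)
local_section = restrict(global_section, positives)

print(f"global accuracy: {global_section.accuracy():.3f}")
print(f"accuracy on positives: {local_section.accuracy():.3f}")
```

In this toy case the global accuracy looks acceptable while the accuracy on the positive subpopulation is much lower, which is the kind of local failure a single summary statistic hides. Restricting in two steps gives the same result as restricting directly, which is the compatibility property that makes this a presheaf-like structure.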
Taxonomy
Topics: Topological and Geometric Data Analysis · Cell Image Analysis Techniques · Clusterin in disease pathology
