The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity

Pedro Sandoval Segura, Julius Lauw, Daniel Bashir, Kinjal Shah, Sonia Sehra, Dominique Macias, and George Montanez

arXiv:1912.10597 · cs.LG · March 19, 2020

TL;DR

This paper introduces the Labeling Distribution Matrix (LDM) as a novel tool to estimate the capacity of supervised learning algorithms by analyzing their output diversity across datasets, providing insights into their memorization and responsiveness.

Contribution

The paper presents the LDM and Label Recorder as new methods for estimating algorithm capacity, offering a way to quantify flexibility and memorization in supervised learning models.

Findings

1. The LDM offers potentially valuable insight into the prediction behavior of algorithms, although the results are not conclusive.

2. Initial results with the Label Recorder are promising.

3. The LDM helps distinguish between memorization and generalization.

Abstract

Algorithm performance in supervised learning is a combination of memorization, generalization, and luck. By estimating how much information an algorithm can memorize from a dataset, we can set a lower bound on the amount of performance due to other factors such as generalization and luck. With this goal in mind, we introduce the Labeling Distribution Matrix (LDM) as a tool for estimating the capacity of learning algorithms. The method attempts to characterize the diversity of possible outputs by an algorithm for different training datasets, using this to measure algorithm flexibility and responsiveness to data. We test the method on several supervised learning algorithms, and find that while the results are not conclusive, the LDM does allow us to gain potentially valuable insight into the prediction behavior of algorithms. We also introduce the Label Recorder as an additional tool for…
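
The idea of characterizing output diversity across training datasets can be illustrated with a small sketch. This is not the paper's LDM construction or its capacity estimate; it is a simplified proxy under stated assumptions. The two toy learners, the scalar inputs, the holdout points, and the function name `labeling_diversity` are all hypothetical choices made for illustration. The sketch trains each learner on many random datasets, records the labeling it assigns to a fixed holdout set, and reports how many distinct labelings appear and the empirical entropy of their distribution.

```python
import math
import random
from collections import Counter


def majority_learner(train):
    """Toy inflexible learner: always predicts the most common training label."""
    top = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: top


def nearest_neighbor_learner(train):
    """Toy flexible learner: 1-nearest-neighbor on scalar inputs."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


def labeling_diversity(learner, holdout, n_datasets=200, n_train=8, seed=0):
    """Return (number of distinct holdout labelings, entropy in bits).

    Assumption: training sets are i.i.d. uniform scalars with random binary
    labels. This is an illustrative setup, not the paper's protocol.
    """
    rng = random.Random(seed)
    labelings = Counter()
    for _ in range(n_datasets):
        train = [(rng.random(), rng.randint(0, 1)) for _ in range(n_train)]
        model = learner(train)
        # The labeling is the tuple of predictions on the fixed holdout set.
        labelings[tuple(model(x) for x in holdout)] += 1
    total = sum(labelings.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in labelings.values())
    return len(labelings), entropy


holdout = [0.1, 0.3, 0.5, 0.7, 0.9]
maj_distinct, maj_entropy = labeling_diversity(majority_learner, holdout)
nn_distinct, nn_entropy = labeling_diversity(nearest_neighbor_learner, holdout)
print("majority:", maj_distinct, round(maj_entropy, 3))
print("1-NN:    ", nn_distinct, round(nn_entropy, 3))
```

In this setup the majority learner can produce at most two labelings (all zeros or all ones), while the nearest-neighbor learner responds to individual training points and spreads its outputs over many more labelings. A more spread-out labeling distribution is one rough signal of a learner's flexibility and responsiveness to data, which is the kind of behavior the abstract says the LDM is meant to measure.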

Taxonomy

Methods, Test