The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity
Pedro Sandoval Segura, Julius Lauw, Daniel Bashir, Kinjal Shah, Sonia Sehra, Dominique Macias, and George Montanez

TL;DR
This paper introduces the Labeling Distribution Matrix (LDM) as a novel tool to estimate the capacity of supervised learning algorithms by analyzing their output diversity across datasets, providing insights into their memorization and responsiveness.
Contribution
The paper presents the LDM and Label Recorder as new methods for estimating algorithm capacity, offering a way to quantify flexibility and memorization in supervised learning models.
Findings
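Tests on several supervised learning algorithms are not conclusive, but the LDM gives potentially valuable insight into their prediction behavior.
Initial results with the Label Recorder are promising.
Estimating how much an algorithm can memorize sets a lower bound on the share of performance due to generalization and luck.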
Abstract
Algorithm performance in supervised learning is a combination of memorization, generalization, and luck. By estimating how much information an algorithm can memorize from a dataset, we can set a lower bound on the amount of performance due to other factors such as generalization and luck. With this goal in mind, we introduce the Labeling Distribution Matrix (LDM) as a tool for estimating the capacity of learning algorithms. The method attempts to characterize the diversity of possible outputs by an algorithm for different training datasets, using this to measure algorithm flexibility and responsiveness to data. We test the method on several supervised learning algorithms, and find that while the results are not conclusive, the LDM does allow us to gain potentially valuable insight into the prediction behavior of algorithms. We also introduce the Label Recorder as an additional tool for…
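The idea of characterizing output diversity across training datasets can be illustrated with a small sketch. This is not the paper's construction: the abstract does not give the LDM's exact definition, so the version below assumes that each column of the matrix is the labeling an algorithm assigns to a fixed holdout set after training on one randomly labeled dataset, and it uses the Shannon entropy of the observed labelings as a stand-in diversity score. The two learners, the function names (`labeling_matrix`, `labeling_entropy`), and all parameters are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def majority_learner(train):
    """Low-flexibility learner: always predicts the most common training label."""
    counts = Counter(label for _, label in train)
    top = max(sorted(counts), key=lambda k: counts[k])  # ties go to the smaller label
    return lambda x: top


def nearest_neighbor_learner(train):
    """More flexible learner: 1-nearest-neighbor on scalar inputs."""
    def predict(x):
        return min(train, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def labeling_matrix(learner, holdout, n_datasets=200, n_train=8, seed=0):
    """Assumed stand-in for an LDM: one holdout labeling per training dataset.

    Training labels are drawn at random, so any variation in the output
    reflects how strongly the learner responds to (memorizes) its data.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_datasets):
        train = [(rng.random(), rng.randint(0, 1)) for _ in range(n_train)]
        model = learner(train)
        columns.append(tuple(model(x) for x in holdout))
    return columns


def labeling_entropy(columns):
    """Shannon entropy (in bits) of the empirical distribution over labelings."""
    counts = Counter(columns)
    total = len(columns)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


holdout = [0.1, 0.3, 0.5, 0.7, 0.9]
for name, learner in [("majority", majority_learner),
                      ("1-NN", nearest_neighbor_learner)]:
    cols = labeling_matrix(learner, holdout)
    print(name, round(labeling_entropy(cols), 3))
```

Under these assumptions, the majority learner can only produce two distinct labelings of the holdout set, so its score is capped at one bit, while the nearest-neighbor learner can produce many more and scores higher. A higher score here suggests greater responsiveness to the training data, which is the kind of behavior the LDM is meant to expose.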
