Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves
Jeroen M. Goedhart, Thomas Klausch, Mark A. van de Wiel

TL;DR
This paper introduces Learn2Evaluate, a novel framework using learning curves to reliably estimate test performance in high-dimensional data, providing graphical insights, performance extrapolation, and confidence bounds.
Contribution
The paper presents a new performance estimation framework, Learn2Evaluate, that improves reliability and interpretability of test performance estimates in high-dimensional settings.
Findings
Learn2Evaluate provides accurate performance estimates in simulations.
The framework offers a graphical overview of model performance.
It includes a theoretically justified lower confidence bound.
Abstract
In high-dimensional prediction settings, it remains challenging to reliably estimate test performance. To address this challenge, a novel performance estimation framework is presented. This framework, called Learn2Evaluate, is based on learning curves: it fits a smooth monotone curve depicting test performance as a function of the sample size. Learn2Evaluate has several advantages over commonly applied performance estimation methodologies. Firstly, a learning curve offers a graphical overview of a learner. This overview helps assess the potential benefit of adding training samples, and it provides a more complete comparison between learners than performance estimates at a fixed subsample size. Secondly, a learning curve facilitates estimating the performance at the total sample size rather than at a subsample size. Thirdly, Learn2Evaluate allows the computation of a…
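The curve-fitting idea in the abstract can be illustrated with a minimal sketch. It assumes an inverse power law, score(n) = a - b * n^(-c), as the smooth monotone curve; this is one common parametric choice for learning curves and is an assumption here, since the abstract does not name the curve family. The function names (`fit_inverse_power_law`, `predict`), the grid over c, and the input numbers are all hypothetical, and the scores are synthetic values generated from a known curve, not results from the paper.

```python
def fit_inverse_power_law(sizes, scores, c_grid=None):
    """Fit score(n) = a - b * n**(-c) with b >= 0 and c > 0.

    With b >= 0 the fitted curve is non-decreasing in n. For each candidate
    exponent c, a and b are obtained by ordinary least squares on the
    transformed predictor x = n**(-c); the c with the smallest squared
    error is kept. Requires at least two distinct subsample sizes.
    """
    if c_grid is None:
        c_grid = [k / 100 for k in range(5, 201, 5)]  # 0.05, 0.10, ..., 2.00
    m = len(sizes)
    y_mean = sum(scores) / m
    best = None
    for c in c_grid:
        x = [n ** (-c) for n in sizes]
        x_mean = sum(x) / m
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, scores))
        b = -sxy / sxx  # slope of score on x is -b
        if b < 0:
            continue  # would give a decreasing curve; skip this exponent
        a = y_mean + b * x_mean
        sse = sum((a - b * xi - yi) ** 2 for xi, yi in zip(x, scores))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("no monotone increasing fit found on the given grid")
    return best[1], best[2], best[3]


def predict(n, a, b, c):
    """Evaluate the fitted learning curve at sample size n."""
    return a - b * n ** (-c)


# Synthetic example: mean test scores at four subsample sizes, generated
# from a known curve so the fit can be checked. In practice each score
# would be an average over repeated train/test splits at that size.
subsample_sizes = [20, 40, 80, 160]
mean_scores = [0.9 - 1.0 * n ** -0.5 for n in subsample_sizes]

a, b, c = fit_inverse_power_law(subsample_sizes, mean_scores)
total_n = 200  # hypothetical total sample size
estimate_at_total_n = predict(total_n, a, b, c)
```

Evaluating the fitted curve at the total sample size, rather than at the largest subsample size, mirrors the second advantage described in the abstract. The sketch does not attempt the confidence bound mentioned in the summary.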
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Data Stream Mining Techniques
Methods: Test
