Exploring the Learning Difficulty of Data: Theory and Measure
Weiyao Zhu, Ou Wu, Fengguang Su, and Yingjun Deng

TL;DR
This paper provides a rigorous theoretical foundation for understanding and measuring learning difficulty in machine learning, proposing a formal definition, properties, and a practical measure that outperforms existing heuristics.
Contribution
It introduces a formal theoretical definition of learning difficulty based on the bias-variance trade-off, together with its properties and a practical measure, filling a gap left by existing heuristic approaches.
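As a rough illustration of the bias-variance view of per-sample difficulty, the sketch below estimates a squared-bias term and a variance term for each sample. It is a minimal sketch under stated assumptions, not the paper's measure: squared loss, bootstrap-resampled polynomial regressors, and the function name `per_sample_error_terms` are all hypothetical choices made here for illustration. The observed label is used as the target, so label noise is folded into the bias term.

```python
import numpy as np


def per_sample_error_terms(x_train, y_train, x_eval, y_eval,
                           degree=3, n_models=200, seed=0):
    """Estimate per-sample squared bias, variance, and expected squared error.

    Assumption (not from the paper): models are polynomial regressors fit on
    bootstrap resamples of the training set, and the loss is squared error.
    """
    rng = np.random.default_rng(seed)
    n = len(x_train)
    preds = np.empty((n_models, len(x_eval)))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)            # bootstrap resample
        coeffs = np.polyfit(x_train[idx], y_train[idx], deg=degree)
        preds[m] = np.polyval(coeffs, x_eval)       # predictions of model m

    mean_pred = preds.mean(axis=0)                  # average prediction per sample
    bias_sq = (mean_pred - y_eval) ** 2             # squared bias w.r.t. observed label
    variance = preds.var(axis=0)                    # spread of predictions across models
    expected_err = ((preds - y_eval) ** 2).mean(axis=0)
    return bias_sq, variance, expected_err


# Toy data: a smooth target with additive noise (illustrative only).
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 40)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.shape)

bias_sq, variance, expected_err = per_sample_error_terms(x, y, x, y)

# Under this sketch, samples with a larger expected error count as "harder".
hardest = np.argsort(expected_err)[::-1][:5]
print("indices of the five hardest samples:", hardest)
```

For squared loss the expected error of each sample equals the sum of its squared-bias and variance terms, which is why the two terms together can serve as a per-sample difficulty score in this toy setting.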
Findings
The proposed measure outperforms existing difficulty measures in experiments.
Classical weighting methods can be explained through the properties of the new difficulty measure.
Theoretical insights clarify the role of difficulty in weighting strategies.
Abstract
As learning difficulty is crucial for machine learning (e.g., difficulty-based weighting learning strategies), previous literature has proposed a number of learning difficulty measures. However, no comprehensive investigation of learning difficulty is available to date, so nearly all existing measures are heuristically defined without a rigorous theoretical foundation. In addition, there is no formal definition of easy and hard samples, even though they are crucial in many studies. This study attempts a pilot theoretical study of the learning difficulty of samples. First, a theoretical definition of learning difficulty is proposed on the basis of the bias-variance trade-off theory on generalization error. Theoretical definitions of easy and hard samples are established on the basis of the proposed definition. A practical measure of learning difficulty is given as…
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Grey System Theory Applications
