Theory of Curriculum Learning, with Convex Loss Functions
Daphna Weinshall, Dan Amir

TL;DR
This paper provides a theoretical analysis of curriculum learning using convex loss functions, showing how difficulty scores influence convergence rates in linear regression and binary classification.
Contribution
It introduces a formal definition of difficulty score and analyzes its impact on convergence in convex learning problems, bridging empirical heuristics with theory.
Findings
Expected convergence rate decreases monotonically with the ideal difficulty score
Convergence rate increases with the loss of the current hypothesis at a data point
Results reconcile curriculum learning and hard data mining heuristics
Abstract
Curriculum learning, the idea of teaching by gradually exposing the learner to examples in a meaningful order, from easy to hard, has long been investigated in the context of machine learning. Although methods based on this concept have been empirically shown to improve the performance of several learning algorithms, no theoretical analysis has been provided even for simple cases. To address this shortfall, we start by formulating an ideal definition of difficulty score: the loss of the optimal hypothesis at a given datapoint. We analyze the possible contribution of curriculum learning based on this score in two convex problems: linear regression, and binary classification by hinge loss minimization. We show that in both cases, the expected convergence rate decreases monotonically with the ideal difficulty score, in accordance with earlier empirical results. We also prove that when…
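The ideal difficulty score described in the abstract can be illustrated with a small numerical sketch for the linear regression case. This is not the authors' code: the data are synthetic, the optimal hypothesis `w_opt` is assumed to be known, the helper names (`squared_loss`, `mean_sgd_progress`, `make_group`) are hypothetical, and progress is measured as the decrease in squared distance to `w_opt` after one SGD step, which is an assumption of this sketch. To isolate the effect of difficulty, both groups share the same inputs and differ only in the size of the label residual.

```python
import numpy as np

def squared_loss(w, X, y):
    """Per-example squared loss of the linear hypothesis w."""
    return (X @ w - y) ** 2

def mean_sgd_progress(w, w_opt, X, y, lr):
    """Average decrease in squared distance to w_opt after a single SGD step,
    where the step is taken separately on each example (x_i, y_i)."""
    residual = X @ w - y                      # shape (n,)
    grads = 2.0 * residual[:, None] * X       # per-example gradients, shape (n, dim)
    w_next = w[None, :] - lr * grads          # one candidate next point per example
    before = np.sum((w - w_opt) ** 2)
    after = np.sum((w_next - w_opt) ** 2, axis=1)
    return float(np.mean(before - after))

def make_group(X, w_opt, residual_size):
    """Hypothetical helper: each input appears twice, with labels that miss the
    optimal prediction by +residual_size and -residual_size. The symmetric
    residuals keep w_opt the exact least-squares solution for the group."""
    X2 = np.vstack([X, X])
    signs = np.concatenate([np.ones(len(X)), -np.ones(len(X))])
    y2 = X2 @ w_opt + signs * residual_size
    return X2, y2

rng = np.random.default_rng(0)
dim, n = 5, 200
w_opt = rng.normal(size=dim)                  # assumed-known optimal hypothesis
X = rng.normal(size=(n, dim))

X_easy, y_easy = make_group(X, w_opt, residual_size=0.1)
X_hard, y_hard = make_group(X, w_opt, residual_size=1.0)

# Ideal difficulty score: loss of the optimal hypothesis at each data point.
difficulty_easy = squared_loss(w_opt, X_easy, y_easy)
difficulty_hard = squared_loss(w_opt, X_hard, y_hard)

w_current = np.zeros(dim)                     # arbitrary current hypothesis
lr = 0.01
progress_easy = mean_sgd_progress(w_current, w_opt, X_easy, y_easy, lr)
progress_hard = mean_sgd_progress(w_current, w_opt, X_hard, y_hard, lr)
print("mean progress on easy points:", progress_easy)
print("mean progress on hard points:", progress_hard)
```

In this controlled setup the low-difficulty group yields the larger average progress toward `w_opt`, which matches the direction of the first finding. In practice the optimal hypothesis is unknown, so the ideal score would have to be approximated.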
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
