A Mathematical Model for Curriculum Learning for Parities
Elisabetta Cornacchia, Elchanan Mossel

TL;DR
This paper presents a mathematical model showing that curriculum learning with two or more product distributions can significantly reduce the computational cost of training a neural network with SGD on k-parities, and that curriculum strategies using a bounded number of product distributions do not help for Hamming mixtures.
Contribution
The paper introduces a formal curriculum learning (CL) model for learning k-parities with a neural network trained by SGD and analyzes its advantages, providing mathematical justification for curriculum strategies in neural network training.
Findings
Curriculum learning reduces training cost for k-parities.
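Training on samples from two or more product distributions lowers the computational cost compared to learning under the uniform distribution.
Curriculum strategies involving a bounded number of product distributions are not beneficial for Hamming mixtures.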
Abstract
Curriculum learning (CL) - training using samples that are generated and presented in a meaningful order - was introduced in the machine learning context around a decade ago. While CL has been extensively used and analysed empirically, there has been very little mathematical justification for its advantages. We introduce a CL model for learning the class of k-parities on d bits of a binary string with a neural network trained by stochastic gradient descent (SGD). We show that a wise choice of training examples involving two or more product distributions significantly reduces the computational cost of learning this class of functions, compared to learning under the uniform distribution. Furthermore, we show that for another class of functions - namely the "Hamming mixtures" - CL strategies involving a bounded number of product distributions are not beneficial.
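To make the setting in the abstract concrete, the following is a minimal sketch (not the authors' code) of a k-parity on d bits and of a curriculum built from product distributions. It assumes the common ±1 encoding of bits. The names `parity`, `sample_product`, `curriculum`, and `label_correlation`, as well as the schedule values, are hypothetical and chosen only for illustration. The last function computes the exact correlation between the label and a single coordinate. That correlation is zero under the uniform distribution (for k > 1) but differs between coordinates inside and outside the support once the distribution is biased, which gives one intuition for why a biased phase can help.

```python
import random
from math import prod


def parity(x, support):
    """k-parity on +/-1 bits: product of the coordinates indexed by `support`."""
    return prod(x[i] for i in support)


def sample_product(d, p, rng):
    """Draw x in {+1, -1}^d with i.i.d. coordinates, P(x_i = +1) = p."""
    return [1 if rng.random() < p else -1 for _ in range(d)]


def curriculum(d, support, schedule, rng):
    """Yield labelled samples phase by phase.

    `schedule` is a list of (p, num_samples) pairs; this format is an
    assumption made for the sketch, not taken from the paper.
    """
    for p, n in schedule:
        for _ in range(n):
            x = sample_product(d, p, rng)
            yield x, parity(x, support)


def label_correlation(i, support, p):
    """Exact E[y * x_i] under the product distribution with bias p."""
    mu = 2 * p - 1  # mean of each +/-1 coordinate
    k = len(support)
    # For i in the support, x_i squares to 1, leaving k - 1 independent factors;
    # otherwise y * x_i is a product of k + 1 independent factors.
    return mu ** (k - 1) if i in support else mu ** (k + 1)


rng = random.Random(0)
support = (0, 1, 2)  # a 3-parity on d = 8 bits
# Hypothetical two-phase curriculum: a biased phase, then a uniform phase.
data = list(curriculum(8, support, [(0.9, 100), (0.5, 100)], rng))
```

With bias p = 0.75 and k = 3, `label_correlation` returns 0.25 for a coordinate in the support and 0.0625 for one outside it, while both vanish at p = 0.5.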
Taxonomy
Topics: Machine Learning and Algorithms · Neural Networks and Applications · Domain Adaptation and Few-Shot Learning
