Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning
Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Brian MacWhinney, Chris Dyer

TL;DR
This paper uses Bayesian optimization to learn curricula for word representation learning: it optimizes the order of training instances to improve performance on downstream tasks that use the learned representations as features.
Contribution
It proposes a curriculum learning approach in which a linear ranking model, with weights tuned by Bayesian optimization, orders the training instances. The learned orders improve over random orders and the natural corpus order.
Findings
Learning the curriculum improves downstream task performance.
Bayesian optimization effectively learns task-specific curricula.
The method outperforms random and natural ordering baselines.
Abstract
We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features. The curricula are modeled by a linear ranking function which is the scalar product of a learned weight vector and an engineered feature vector that characterizes the different aspects of the complexity of each instance in the training corpus. We show that learning the curriculum improves performance on a variety of downstream tasks over random orders and in comparison to the natural corpus order.
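The linear ranking function described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `order_by_curriculum`, the two-dimensional feature vectors, the weight values, and the choice to sort in ascending order of score are all assumptions made for illustration. In the full method the weight vector would be proposed by a Bayesian optimizer and evaluated on a downstream task. Here it is simply fixed by hand.

```python
import numpy as np

def order_by_curriculum(features, weights):
    """Return instance indices sorted by the linear score w . phi(x).

    features: (n_instances, n_features) array of complexity features
    weights:  (n_features,) weight vector (hand-picked here, learned in the paper)
    """
    scores = features @ weights
    # Ascending order is an assumption; a stable sort keeps the
    # original corpus order among instances with equal scores.
    return np.argsort(scores, kind="stable")

# Hypothetical complexity features for four training instances.
features = np.array([
    [0.2, 1.0],
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.3],
])
weights = np.array([1.0, -0.5])  # hypothetical weight vector

order = order_by_curriculum(features, weights)
print(order.tolist())
```

Each candidate weight vector induces one ordering of the corpus, so searching over weight vectors amounts to searching over curricula.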
