Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning

Yulia Tsvetkov; Manaal Faruqui; Wang Ling; Brian MacWhinney; Chris Dyer

arXiv:1605.03852·cs.CL·June 22, 2016

TL;DR

This paper uses Bayesian optimization to learn curricula (orderings of training instances) for word representation learning, and reports better downstream task performance than random orders or the natural corpus order.

Contribution

It proposes a curriculum learning approach in which a linear ranking model over instance-complexity features is tuned with Bayesian optimization, improving over random orders and the natural corpus order.

Findings

01 Learning the curriculum improves downstream task performance.

02 Bayesian optimization effectively learns task-specific curricula.

03 The method outperforms random and natural ordering baselines.

Abstract

We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features. The curricula are modeled by a linear ranking function which is the scalar product of a learned weight vector and an engineered feature vector that characterizes the different aspects of the complexity of each instance in the training corpus. We show that learning the curriculum improves performance on a variety of downstream tasks over random orders and in comparison to the natural corpus order.
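As a rough illustration of the linear ranking function described in the abstract, the sketch below scores each training instance as the scalar product of a weight vector and that instance's feature vector, then orders instances by score. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the two example features, the weight values, and the ascending sort direction are all hypothetical. The outer Bayesian optimization loop that would choose the weights is not shown.

```python
from typing import List, Sequence


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Scalar product of a weight vector and one instance's feature vector."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


def order_by_curriculum(
    weights: Sequence[float], feature_matrix: Sequence[Sequence[float]]
) -> List[int]:
    """Return instance indices sorted by score.

    Assumption: lower-scoring instances are presented first. The sort
    direction is an illustrative choice, not taken from the source.
    """
    scores = [score(weights, feats) for feats in feature_matrix]
    return sorted(range(len(feature_matrix)), key=lambda i: scores[i])


# Hypothetical complexity features per instance: [length, rare-word fraction].
features = [
    [12.0, 0.25],
    [4.0, 0.5],
    [8.0, 0.0],
]

# Hypothetical weights; in the paper's setting these would be proposed by
# the Bayesian optimizer and evaluated on a downstream task.
weights = [0.5, 2.0]

order = order_by_curriculum(weights, features)
print(order)  # indices of instances in curriculum order
```

In a full pipeline, each candidate weight vector would induce one ordering of the corpus, the word representations would be trained on that ordering, and the downstream task score would be fed back to the optimizer.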
