In-sample Curriculum Learning by Sequence Completion for Natural Language Generation

Qi Jia; Yizhu Liu; Haifeng Tang; Kenny Q. Zhu

arXiv:2211.11297 · cs.CL · May 24, 2023

TL;DR

This paper introduces an in-sample curriculum learning approach for natural language generation that progressively trains models from generating the last words to the entire sequence, improving performance across tasks.

Contribution

It proposes a task-agnostic in-sample curriculum learning method based on sequence completion, avoiding reliance on task-specific difficulty scoring.

Findings

01. Significant performance improvements over strong baselines

02. Effective generalization across multiple NLP tasks

03. Demonstrates the viability of in-sample curriculum learning

Abstract

Curriculum learning has shown promising improvements in multiple domains by training machine learning models from easy samples to hard ones. Previous works, which either design rules or train models to score difficulty, rely heavily on task-specific expertise and do not generalize. Inspired by the "easy-to-hard" intuition, we propose in-sample curriculum learning for natural language generation tasks. Our learning strategy first trains the model to generate only the last few words of the output, i.e., to do sequence completion, and gradually extends the target until the model generates the whole output sequence. Comprehensive experiments show that it generalizes well to different tasks and achieves significant improvements over strong baselines.
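
The sequence-completion idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names `completion_loss_mask` and `prefix_schedule`, the percentage-based schedule, and the choice to always keep at least one token in the loss are assumptions made for illustration. The sketch treats the leading part of each target as given context and computes the loss only on the remaining tail, then shrinks the given part stage by stage until the whole sequence is trained.

```python
from typing import List


def completion_loss_mask(target_len: int, prefix_pct: int) -> List[int]:
    """Return a 0/1 mask over target positions (1 = token counts toward the loss).

    The first `prefix_pct` percent of the target is treated as given context,
    so only the trailing tokens are trained. Hypothetical helper, not from the paper.
    """
    if not 0 <= prefix_pct <= 100:
        raise ValueError("prefix_pct must be between 0 and 100")
    cut = target_len * prefix_pct // 100
    # Assumption: always leave at least one token to learn from.
    cut = min(cut, max(target_len - 1, 0))
    return [0] * cut + [1] * (target_len - cut)


def prefix_schedule(start_pct: int, stride_pct: int) -> List[int]:
    """Curriculum stages: the given-prefix percentage shrinks to 0 (full sequence)."""
    if stride_pct <= 0:
        raise ValueError("stride_pct must be positive")
    return list(range(start_pct, 0, -stride_pct)) + [0]


if __name__ == "__main__":
    # Walk through the stages for a 5-token target.
    for pct in prefix_schedule(60, 20):
        print(pct, completion_loss_mask(5, pct))
```

In a training loop, the mask would multiply the per-token loss before averaging, so early stages update the model only on the final tokens and the last stage matches ordinary full-sequence training.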

Code & Models

Repositories

jiaqisjtu/insamplecurriculumlearning
jax · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications