Revisiting Self-Training for Neural Sequence Generation

Junxian He; Jiatao Gu; Jiajun Shen; Marc'Aurelio Ranzato

arXiv:1909.13788 · cs.LG · October 20, 2020 · 139 cites

TL;DR

This paper revisits self-training for neural sequence generation, demonstrating its effectiveness and introducing a noisy self-training method that leverages unlabeled data to significantly improve performance.

Contribution

The paper empirically shows that self-training benefits neural sequence generation tasks and proposes injecting noise into the input to make better use of unlabeled data.

Findings

01 Self-training improves neural sequence generation performance.

02 Dropout acts as a regularizer, aiding self-training.

03 Noisy self-training significantly boosts results on benchmarks.

Abstract

Self-training is one of the earliest and simplest semi-supervised methods. The key idea is to augment the original labeled dataset with unlabeled data paired with the model's prediction (i.e. the pseudo-parallel data). While self-training has been extensively studied on classification problems, in complex sequence generation tasks (e.g. machine translation) it is still unclear how self-training works due to the compositionality of the target space. In this work, we first empirically show that self-training is able to decently improve the supervised baseline on neural sequence generation tasks. Through careful examination of the performance gains, we find that the perturbation on the hidden states (i.e. dropout) is critical for self-training to benefit from the pseudo-parallel data, which acts as a regularizer and forces the model to yield close predictions for similar unlabeled inputs.…
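The data-augmentation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_predict` stands in for a trained sequence model, `drop_tokens` is one hypothetical form of input noise, and the choice to predict pseudo-targets from the clean input before perturbing the source side is an assumption made for this sketch. All function names are invented for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Tokens = List[str]
Pair = Tuple[Tokens, Tokens]  # (source tokens, target tokens)


def toy_predict(src: Tokens) -> Tokens:
    """Toy stand-in for a trained model: 'translates' by upper-casing."""
    return [tok.upper() for tok in src]


def drop_tokens(tokens: Sequence[str], drop_prob: float, rng: random.Random) -> Tokens:
    """Hypothetical input noise: delete each token with probability drop_prob."""
    if not tokens:
        return []
    kept = [tok for tok in tokens if rng.random() >= drop_prob]
    # Never return an empty source sentence; fall back to one random token.
    return kept if kept else [rng.choice(list(tokens))]


def build_pseudo_parallel(
    predict: Callable[[Tokens], Tokens],
    unlabeled: Sequence[Sequence[str]],
    noise: Optional[Callable[[Tokens], Tokens]] = None,
) -> List[Pair]:
    """Pair each unlabeled input with the model's own prediction."""
    pairs: List[Pair] = []
    for src in unlabeled:
        # Assumption: the pseudo-target is predicted from the clean input.
        pseudo_target = predict(list(src))
        # Assumption: noise, if any, is applied only to the source side.
        noisy_src = noise(list(src)) if noise is not None else list(src)
        pairs.append((noisy_src, pseudo_target))
    return pairs


def augment(
    labeled: Sequence[Pair],
    unlabeled: Sequence[Sequence[str]],
    predict: Callable[[Tokens], Tokens],
    noise: Optional[Callable[[Tokens], Tokens]] = None,
) -> List[Pair]:
    """Return the labeled data followed by the pseudo-parallel data."""
    return list(labeled) + build_pseudo_parallel(predict, unlabeled, noise)
```

In a full self-training loop, the model would be retrained on the output of `augment` and the process repeated; passing `noise=None` gives plain self-training, while passing a perturbation function gives a noisy variant.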

Code & Models

Repositories

jxhe/self-training-text-generation (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis