Improving Neural Language Modeling via Adversarial Training

Dilin Wang; Chengyue Gong; Qiang Liu

arXiv:1906.03805 · cs.LG · September 10, 2019 · 55 cites

Open Access · 1 Repo

TL;DR

This paper introduces an adversarial training method for neural language models that enhances robustness and reduces overfitting, leading to improved performance on language modeling and translation benchmarks.

Contribution

It presents a novel, efficient adversarial training technique that regularizes neural language models by adding optimal adversarial noise to output embeddings.

Findings

01 Improved on single-model state-of-the-art perplexity on Penn Treebank (PTB) and Wikitext-2.

02 Improved BLEU scores on the WMT14 English-German and IWSLT14 German-English translation tasks.

03 Showed theoretically that the adversarial mechanism encourages diversity among the embedding vectors, which helps make models more robust.

Abstract

Recently, substantial progress has been made in language modeling by using deep neural networks. However, in practice, large scale neural language models have been shown to be prone to overfitting. In this paper, we present a simple yet highly effective adversarial training mechanism for regularizing neural language models. The idea is to introduce adversarial noise to the output embedding layer while training the models. We show that the optimal adversarial noise yields a simple closed-form solution, thus allowing us to develop a simple and time efficient algorithm. Theoretically, we show that our adversarial mechanism effectively encourages the diversity of the embedding vectors, helping to increase the robustness of models. Empirically, we show that our method improves on the single model state-of-the-art results for language modeling on Penn Treebank (PTB) and Wikitext-2, achieving…
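
The closed-form noise mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on two assumptions that the truncated abstract does not spell out: the noise `delta` is added only to the target word's output embedding, and it is constrained to an L2 ball of radius `eps`. Under those assumptions, the perturbation that most reduces the target logit `(w + delta) · h` is `delta = -eps * h / ||h||`. The function names and the pure-Python list representation are hypothetical, chosen only for illustration.

```python
import math


def adversarial_target_logit(h, w_target, eps):
    """Target-word logit after the worst-case L2-bounded perturbation.

    Assumption (not taken from the source page): delta is added only to
    the target word's output embedding and satisfies ||delta||_2 <= eps.
    """
    h_norm = math.sqrt(sum(x * x for x in h))
    if h_norm == 0.0:
        # With a zero hidden state every delta gives the same logit.
        delta = [0.0] * len(h)
    else:
        # Closed form under the assumption above: delta* = -eps * h / ||h||.
        delta = [-eps * x / h_norm for x in h]
    logit = sum((w + d) * x for w, d, x in zip(w_target, delta, h))
    return logit, delta


def adversarial_nll(h, W, target, eps):
    """Negative log-likelihood of `target` under the perturbed softmax."""
    # Clean logits for every word in the vocabulary.
    logits = [sum(w * x for w, x in zip(row, h)) for row in W]
    # Only the target word's logit is perturbed.
    logits[target], _ = adversarial_target_logit(h, W[target], eps)
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]
```

For example, with `h = [3.0, 4.0]`, target embedding `[1.0, 0.0]`, and `eps = 0.1`, the target logit drops from 3.0 to 2.5, that is, by `eps * ||h|| = 0.5`. Because the noise has a closed form, no inner optimization loop is needed, which is consistent with the abstract's description of the algorithm as time efficient.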

Code & Models

Repositories

ChengyueGongR/advsoft (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Adversarial Robustness in Machine Learning · Topic Modeling