Unsupervised Natural Language Generation with Denoising Autoencoders

Markus Freitag; Scott Roy

arXiv:1804.07899·cs.CL·August 28, 2018

TL;DR

This paper presents an unsupervised approach to natural language generation based on denoising autoencoders. In at least one domain, the resulting system outperforms supervised approaches while training only on unlabeled text.

Contribution

It introduces an unsupervised method that treats structured data as a corrupted representation of the target sentence and uses a denoising autoencoder to reconstruct that sentence. In at least one domain, this outperforms supervised approaches.

Findings

01

Unsupervised NLG outperforms supervised approaches in at least one domain.

02

A denoising autoencoder can reconstruct correct sentences when given structured data as input.

03

Introducing noise into unlabeled training sentences yields a model that generalizes to structured data inputs.

Abstract

Generating text from structured data is important for various tasks such as question answering and dialog systems. We show that in at least one domain, without any supervision and only based on unlabeled text, we are able to build a Natural Language Generation (NLG) system with higher performance than supervised approaches. In our approach, we interpret the structured data as a corrupt representation of the desired output and use a denoising auto-encoder to reconstruct the sentence. We show how to introduce noise into training examples that do not contain structured data, and that the resulting denoising auto-encoder generalizes to generate correct sentences when given structured data.
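The idea of introducing noise into plain sentences can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not specify the noise function, so the choice here (randomly dropping tokens and shuffling the survivors) is an assumption, and the names `corrupt`, `make_training_pairs`, and `drop_prob` are hypothetical.

```python
import random


def corrupt(sentence, drop_prob=0.6, seed=None):
    """Return a noisy token list for one sentence.

    Assumed noise model (not taken from the paper's abstract):
    each token is dropped with probability drop_prob and the
    remaining tokens are shuffled.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    kept = [t for t in tokens if rng.random() >= drop_prob]
    # Keep at least one token so the model always has some input.
    if not kept and tokens:
        kept = [rng.choice(tokens)]
    rng.shuffle(kept)
    return kept


def make_training_pairs(sentences, drop_prob=0.6, seed=0):
    """Build (noisy_tokens, original_sentence) pairs from unlabeled text."""
    rng = random.Random(seed)
    pairs = []
    for sentence in sentences:
        noisy = corrupt(sentence, drop_prob, seed=rng.randrange(2**32))
        pairs.append((noisy, sentence))
    return pairs


if __name__ == "__main__":
    corpus = ["there is a cheap italian restaurant in the city centre"]
    for noisy, target in make_training_pairs(corpus):
        print(noisy, "->", target)
```

A sequence-to-sequence model trained on such pairs learns to map a bag of surviving tokens back to a full sentence. At inference time, the values of a structured record would be supplied in place of the noisy tokens, which is the generalization the abstract describes.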

Code & Models

Repositories

mcleonard/NLG_Autoencoder (PyTorch)