Learning to Simplify with Data Hopelessly Out of Alignment

Tadashi Nomoto

arXiv:2204.00741·cs.CL·April 5, 2022

TL;DR

This paper explores text simplification without a parallel corpus, combining Conjoined Twin Networks, Flip-Flop Auto-Encoders (FFA), and adversarial training (GAN). On a large Wikipedia-derived dataset, Twin Networks equipped with FFA and JS-GAN outperform the current best performing system.

Contribution

Introduces Conjoined Twin Networks and Flip-Flop Auto-Encoders (FFA) for text simplification without a parallel corpus, and compares Jensen-Shannon GAN (JS-GAN) with Wasserstein GAN.

Findings

1. JS-GAN outperforms Wasserstein GAN in this context

2. Twin Networks with FFA and JS-GAN outperform previous best systems

3. Supervision-free methods produce qualitatively different simplified sentences

Abstract

We consider whether it is possible to do text simplification without relying on a "parallel" corpus, one that is made up of sentence-by-sentence alignments of complex and ground truth simple sentences. To this end, we introduce a number of concepts, some new and some not, including what we call Conjoined Twin Networks, Flip-Flop Auto-Encoders (FFA) and Adversarial Networks (GAN). A comparison is made between Jensen-Shannon GAN (JS-GAN) and Wasserstein GAN, to see how they impact performance, with stronger results for the former. An experiment we conducted with a large dataset derived from Wikipedia showed that Twin Networks equipped with FFA and JS-GAN are solidly superior to the current best performing system. Furthermore, we discuss where we stand in relation to fully supervised methods in the past literature, and highlight with examples qualitative differences that exist among…
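For readers unfamiliar with the two adversarial objectives the abstract compares, the sketch below contrasts their textbook forms. It is not taken from the paper: the function names, the toy scores, and the choice to show only the discriminator (critic) side are assumptions made for illustration, and the paper's actual losses may differ.

```python
import math

def js_gan_discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator loss (the Jensen-Shannon formulation).

    d_real, d_fake: discriminator outputs in (0, 1), read as the
    probability that a sample is real. Hypothetical helper, not the
    paper's code.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def wasserstein_critic_loss(c_real, c_fake):
    """Wasserstein GAN critic loss.

    c_real, c_fake: unbounded critic scores. The Lipschitz constraint
    (weight clipping or a gradient penalty) is NOT enforced here.
    """
    return sum(c_fake) / len(c_fake) - sum(c_real) / len(c_real)

# Toy scores, invented for illustration only.
js_loss = js_gan_discriminator_loss([0.9, 0.8], [0.2, 0.1])
w_loss = wasserstein_critic_loss([1.5, 2.0], [-0.5, 0.0])

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
chance_loss = js_gan_discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

The JS-GAN discriminator works with bounded probabilities and log terms, whereas the Wasserstein critic compares mean raw scores, so the two provide differently shaped training signals to the generator.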

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling