Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment
Regina Barzilay, Lillian Lee

TL;DR
This paper introduces an unsupervised method for sentence-level paraphrasing that applies multiple-sequence alignment to unannotated comparable corpora; in the authors' evaluation, the system produces accurate paraphrases and outperforms baseline systems.
Contribution
It presents a novel unsupervised approach leveraging multiple-sequence alignment to learn paraphrasing patterns from unannotated data.
Findings
The system derives accurate paraphrases
It outperforms baseline systems
It learns from unannotated comparable corpora
Abstract
We address the text-to-text generation problem of sentence-level paraphrasing -- a phenomenon distinct from and more difficult than word- or phrase-level paraphrasing. Our approach applies multiple-sequence alignment to sentences gathered from unannotated comparable corpora: it learns a set of paraphrasing patterns represented by word lattice pairs and automatically determines how to apply these patterns to rewrite new sentences. The results of our evaluation experiments show that the system derives accurate paraphrases, outperforming baseline systems.
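The alignment idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it aligns only two sentences with a longest-common-subsequence alignment (the paper uses multiple-sequence alignment over many sentences and builds word lattices), and the function names `lcs_align` and `to_pattern`, the `SLOT` marker, and the example sentences are all assumptions made for illustration. It shows how shared words can form a backbone while differing regions become variable slots.

```python
from typing import List, Tuple


def lcs_align(a: List[str], b: List[str]) -> List[Tuple[str, str]]:
    """Align two token lists by longest common subsequence; a gap is ''."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from the start to recover aligned columns.
    pairs: List[Tuple[str, str]] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((a[i], b[j]))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            pairs.append((a[i], ""))
            i += 1
        else:
            pairs.append(("", b[j]))
            j += 1
    pairs.extend((tok, "") for tok in a[i:])
    pairs.extend(("", tok) for tok in b[j:])
    return pairs


def to_pattern(pairs: List[Tuple[str, str]]) -> List[str]:
    """Keep matching words; collapse each run of mismatches into one SLOT."""
    pattern: List[str] = []
    for x, y in pairs:
        if x == y:
            pattern.append(x)
        elif not pattern or pattern[-1] != "SLOT":
            pattern.append("SLOT")
    return pattern


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the paper's corpus.
    s1 = "a bomber blew himself up in Jerusalem".split()
    s2 = "a bomber blew himself up in Haifa".split()
    print(" ".join(to_pattern(lcs_align(s1, s2))))
```

A full system along the lines the abstract describes would additionally need to pair patterns learned from different corpora and decide which pattern applies to a new sentence; the sketch omits both steps.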
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
