Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment

Regina Barzilay; Lillian Lee

arXiv:cs/0304006·cs.CL·May 23, 2007·98 cites

TL;DR

This paper introduces an unsupervised method for sentence-level paraphrasing that applies multiple-sequence alignment to unannotated comparable corpora. In the authors' evaluation, the system derives accurate paraphrases and outperforms baseline systems.

Contribution

The paper presents an unsupervised approach that uses multiple-sequence alignment to learn paraphrasing patterns, represented as word lattice pairs, from unannotated comparable corpora.

Findings

01 System derives accurate paraphrases
02 Outperforms baseline systems
03 Effective on unannotated comparable corpora

Abstract

We address the text-to-text generation problem of sentence-level paraphrasing -- a phenomenon distinct from and more difficult than word- or phrase-level paraphrasing. Our approach applies multiple-sequence alignment to sentences gathered from unannotated comparable corpora: it learns a set of paraphrasing patterns represented by word lattice pairs and automatically determines how to apply these patterns to rewrite new sentences. The results of our evaluation experiments show that the system derives accurate paraphrases, outperforming baseline systems.
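
The alignment idea in the abstract can be illustrated with a much simpler pairwise case. The sketch below is not the authors' system: it aligns only two sentences, using Python's standard `difflib.SequenceMatcher` as a stand-in for multiple-sequence alignment, and the function names `align_pair` and `backbone` as well as the example sentences are hypothetical. Words shared by both sentences form a common backbone, and the spans between them are candidate interchangeable variants, which is roughly the kind of structure a word lattice encodes.

```python
from difflib import SequenceMatcher


def align_pair(sent_a, sent_b):
    """Align two whitespace-tokenized sentences word by word.

    Returns a list of (kind, tokens_a, tokens_b) columns, where kind is
    "shared" for identical spans and "variant" for spans that differ.
    This is a pairwise simplification, not multiple-sequence alignment.
    """
    a, b = sent_a.split(), sent_b.split()
    # autojunk=False disables difflib's heuristic for very long inputs.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    columns = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        kind = "shared" if tag == "equal" else "variant"
        columns.append((kind, a[i1:i2], b[j1:j2]))
    return columns


def backbone(columns):
    """Collect the words that both sentences have in common, in order."""
    return [tok for kind, toks_a, _ in columns if kind == "shared" for tok in toks_a]


if __name__ == "__main__":
    # Hypothetical example sentences describing the same event.
    cols = align_pair(
        "a bomber blew himself up in Jerusalem",
        "a bomber detonated explosives in Jerusalem",
    )
    for kind, toks_a, toks_b in cols:
        print(kind, toks_a, toks_b)
    print("backbone:", backbone(cols))
```

Running the script prints each aligned column and the shared backbone. A lattice-based approach as described in the abstract would generalize this to many sentences at once and to pairs of lattices drawn from comparable corpora; none of that is attempted here.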

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification