Principled Paraphrase Generation with Parallel Corpora
Aitor Ormazabal, Mikel Artetxe, Aitor Soroa, Gorka Labaka, Eneko Agirre

TL;DR
This paper introduces a novel, principled method for paraphrase generation that improves upon round-trip machine translation by using an information bottleneck approach, resulting in better quality and control over paraphrase diversity.
Contribution
It proposes an alternative similarity metric and a new training method with an adversarial term, enabling more effective and efficient paraphrase generation without pivot translations.
Findings
Outperforms round-trip MT in quality and efficiency
Provides adjustable control over fidelity-diversity trade-off
Achieves better results than round-trip MT in the authors' experiments
Abstract
Round-trip Machine Translation (MT) is a popular choice for paraphrase generation, which leverages readily available parallel corpora for supervision. In this paper, we formalize the implicit similarity function induced by this approach, and show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation. Based on these insights, we design an alternative similarity metric that mitigates this issue by requiring the entire translation distribution to match, and implement a relaxation of it through the Information Bottleneck method. Our approach incorporates an adversarial term into MT training in order to learn representations that encode as much information about the reference translation as possible, while keeping as little information about the input as possible. Paraphrases can be generated by decoding back to the source from this representation, without…
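To make the failure mode described in the abstract concrete, here is a minimal toy sketch. It assumes the round-trip score is obtained by marginalizing over pivot translations, i.e. sum over y of P(y | x) * P(x' | y). The two toy translation tables and their probabilities are invented for illustration, and total variation distance is used only as a simple stand-in for "the entire translation distribution must match"; it is not the metric or the Information Bottleneck relaxation from the paper.

```python
from typing import Dict

Dist = Dict[str, float]

# Hypothetical forward model P(y | x): English -> Spanish (toy numbers).
# "su casa" is ambiguous: it can mean "his house" or "her house".
FORWARD: Dict[str, Dist] = {
    "his house": {"su casa": 0.75, "la casa de él": 0.25},
    "her house": {"su casa": 0.75, "la casa de ella": 0.25},
}

# Hypothetical backward model P(x | y): Spanish -> English (toy numbers).
BACKWARD: Dict[str, Dist] = {
    "su casa": {"his house": 0.5, "her house": 0.5},
    "la casa de él": {"his house": 1.0},
    "la casa de ella": {"her house": 1.0},
}


def round_trip_score(x: str, x_p: str) -> float:
    """Probability of reaching x_p from x via any pivot translation y."""
    return sum(p_y * BACKWARD[y].get(x_p, 0.0) for y, p_y in FORWARD[x].items())


def total_variation(p: Dist, q: Dist) -> float:
    """Total variation distance between two distributions (0 means identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


if __name__ == "__main__":
    # The non-paraphrase pair gets substantial round-trip mass,
    # all of it through the single shared ambiguous pivot "su casa".
    print(round_trip_score("his house", "her house"))
    # Comparing the full translation distributions shows they differ.
    print(total_variation(FORWARD["his house"], FORWARD["her house"]))
```

In this toy setup the round-trip score for the pair is nonzero even though the two sentences are not paraphrases, while the comparison of whole translation distributions still registers a difference, which is the intuition behind requiring the full distribution to match.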
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
