Continuous Decomposition of Granularity for Neural Paraphrase Generation
Xiaodong Gu, Zhaowei Zhang, Sang-Woo Lee, Kang Min Yoo, Jung-Woo Ha

TL;DR
This paper introduces a novel continuous granularity decomposition method for neural paraphrase generation, enhancing Transformer models by explicitly modeling hierarchical sentence structures through a specialized attention mechanism.
Contribution
It proposes a granularity-aware attention mechanism that infers hierarchical sentence structure and encodes granularity into Transformer-based paraphrase models, achieving state-of-the-art results.
Findings
Outperforms baseline models on Quora and Twitter datasets
Effectively captures fine-grained hierarchical information
Achieves state-of-the-art results in paraphrase generation
Abstract
While Transformers have had significant success in paragraph generation, they treat sentences as linear sequences of tokens and often neglect their hierarchical information. Prior work has shown that decomposing the levels of granularity (e.g., word, phrase, or sentence) for input tokens has produced substantial improvements, suggesting the possibility of enhancing Transformers via more fine-grained modeling of granularity. In this work, we propose a continuous decomposition of granularity for neural paraphrase generation (C-DNPG). In order to efficiently incorporate granularity into sentence encoding, C-DNPG introduces a granularity-aware attention (GA-Attention) mechanism which extends the multi-head self-attention with: 1) a granularity head that automatically infers the hierarchical structure of a sentence by neurally estimating the granularity level of each input token; and 2) two…
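As a rough illustration of item 1) only, a granularity head can be pictured as a small function that assigns each input token a continuous level between 0 and 1. The sketch below is not the authors' implementation: the single linear projection, the sigmoid squashing, and the names `granularity_head`, `weights`, and `bias` are all assumptions made for illustration, and the toy embeddings are random.

```python
import math
import random


def granularity_head(token_vectors, weights, bias=0.0):
    """Map each token vector to a continuous granularity level in (0, 1).

    Assumption: one linear projection followed by a sigmoid. The abstract
    does not state the parameterization actually used by C-DNPG.
    """
    levels = []
    for vec in token_vectors:
        # Linear score for this token.
        score = sum(w * x for w, x in zip(weights, vec)) + bias
        # Squash the score into (0, 1) so it can act as a soft level.
        levels.append(1.0 / (1.0 + math.exp(-score)))
    return levels


# Toy input: 4 tokens with 3-dimensional embeddings (random, illustration only).
rng = random.Random(0)
tokens = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
weights = [rng.uniform(-1, 1) for _ in range(3)]
levels = granularity_head(tokens, weights)
```

In this reading, a value near one end of the range would mark a token as belonging to a finer unit (such as a word) and a value near the other end to a coarser unit (such as the sentence). Which end means what, and how the levels then feed into attention, is not specified in the text above.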
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
