Delexicalized Paraphrase Generation
Boya Yu, Konstantine Arkoudas, Wael Hamza

TL;DR
This paper introduces a neural model that generates delexicalized paraphrases. It encodes slot values with CNNs, uses pointers to locate slots in the output, and produces paraphrases that can augment training data for NLU tasks.
Contribution
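The paper proposes a neural paraphrasing model trained on delexicalized sentences, where each input is paired with a set of reference paraphrases that are weakly equivalent based on annotated slots and intents. Slot values are encoded with CNNs followed by pooling, and pointers locate slots in the output.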
Findings
Generated paraphrases lead to an additional 1.29% exact match on live utterances.
Empirical evaluation shows that the generated paraphrases are of high quality.
NLU tasks such as intent classification and named entity recognition benefit from data augmentation with the generated paraphrases.
Abstract
We present a neural model for paraphrasing and train it to generate delexicalized sentences. We achieve this by creating training data in which each input is paired with a number of reference paraphrases. These sets of reference paraphrases represent a weak type of semantic equivalence based on annotated slots and intents. To understand semantics from different types of slots, other than anonymizing slots, we apply convolutional neural networks (CNN) prior to pooling on slot values and use pointers to locate slots in the output. We show empirically that the generated paraphrases are of high quality, leading to an additional 1.29% exact match on live utterances. We also show that natural language understanding (NLU) tasks, such as intent classification and named entity recognition, can benefit from data augmentation using automatically generated paraphrases.
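The data construction described in the abstract can be illustrated with a minimal sketch. It assumes that delexicalization replaces each annotated slot value with a placeholder, and that utterances sharing the same intent and the same set of slot names form one set of reference paraphrases. The placeholder syntax (`{Song}`), the intent and slot names, and the helper functions `delexicalize`, `build_reference_sets`, and `training_pairs` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def delexicalize(tokens, slots):
    """Replace each annotated slot span with a placeholder such as {Artist}.

    tokens: list of str
    slots:  list of (start, end, slot_name) with end exclusive, non-overlapping
    """
    out = []
    i = 0
    for start, end, name in sorted(slots):
        out.extend(tokens[i:start])      # copy tokens before the slot
        out.append("{" + name + "}")     # anonymize the slot value
        i = end
    out.extend(tokens[i:])               # copy any trailing tokens
    return " ".join(out)


def build_reference_sets(examples):
    """Group delexicalized utterances by (intent, sorted slot names).

    Assumption: this grouping stands in for the weak semantic
    equivalence based on annotated slots and intents.
    """
    groups = defaultdict(set)
    for ex in examples:
        slot_names = tuple(sorted(name for _, _, name in ex["slots"]))
        key = (ex["intent"], slot_names)
        groups[key].add(delexicalize(ex["tokens"], ex["slots"]))
    return groups


def training_pairs(groups):
    """Pair each template with the other templates in its group."""
    pairs = []
    for templates in groups.values():
        for src in sorted(templates):
            refs = sorted(t for t in templates if t != src)
            if refs:  # skip templates that have no paraphrase partner
                pairs.append((src, refs))
    return pairs


# Hypothetical annotated utterances, for illustration only.
examples = [
    {"tokens": "play hello by adele".split(), "intent": "PlayMusic",
     "slots": [(1, 2, "Song"), (3, 4, "Artist")]},
    {"tokens": "put on hello from adele".split(), "intent": "PlayMusic",
     "slots": [(2, 3, "Song"), (4, 5, "Artist")]},
    {"tokens": "what is the weather in boston".split(), "intent": "GetWeather",
     "slots": [(5, 6, "City")]},
]

pairs = training_pairs(build_reference_sets(examples))
for src, refs in pairs:
    print(src, "->", refs)
```

In this sketch the two PlayMusic utterances become references for each other, while the single GetWeather utterance yields no training pair because its group contains no other template.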
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
