Teaching Syntax by Adversarial Distraction
Juho Kim, Christopher Malon, Asim Kadav

TL;DR
This paper introduces synthetic datasets, built by transforming natural entailment examples from SNLI and FEVER, that teach models to attend to syntax and word order. Popular entailment models ignore these cues unless they are retrained on such data.
Contribution
It presents new datasets designed to teach models aspects of grammar and word order, and shows that popular entailment models miss these syntactic distinctions without retraining, while only some of them learn the distinctions with retraining.
Findings
Popular entailment models ignore syntactic differences without retraining.
Retraining enables some models to better compare syntax.
Models still struggle with certain syntactic distinctions after retraining.
Abstract
Existing entailment datasets mainly pose problems which can be answered without attention to grammar or word order. Learning syntax requires comparing examples where different grammar and word order change the desired classification. We introduce several datasets based on synthetic transformations of natural entailment examples in SNLI or FEVER, to teach aspects of grammar and word order. We show that without retraining, popular entailment models are unaware that these syntactic differences change meaning. With retraining, some but not all popular entailment models can learn to compare the syntax properly.
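To make the idea of a synthetic, word-order-changing transformation concrete, here is a minimal Python sketch. It is not the authors' procedure: the helper names (`swap_arguments`, `bag_of_words`, `make_distraction_pair`), the example sentence, and the `not_entailment` label are all assumptions chosen for illustration. It shows why a model that ignores word order cannot separate the original hypothesis from the transformed one: both contain exactly the same words.

```python
from collections import Counter


def swap_arguments(sentence: str, first: str, second: str) -> str:
    """Swap the first occurrence of two phrases in a sentence.

    Hypothetical helper: assumes neither phrase is a substring of the other.
    """
    if first not in sentence or second not in sentence:
        raise ValueError("both phrases must occur in the sentence")
    placeholder = "\0"  # temporary marker that does not occur in normal text
    swapped = sentence.replace(first, placeholder, 1)
    swapped = swapped.replace(second, first, 1)
    return swapped.replace(placeholder, second, 1)


def bag_of_words(sentence: str) -> Counter:
    """Order-insensitive view of a sentence: word counts only."""
    return Counter(sentence.lower().split())


def make_distraction_pair(premise: str, first: str, second: str) -> list[dict]:
    """Build two examples sharing a premise and differing only in word order.

    The label names are illustrative assumptions, not the paper's label set.
    """
    return [
        {"premise": premise, "hypothesis": premise, "label": "entailment"},
        {
            "premise": premise,
            "hypothesis": swap_arguments(premise, first, second),
            "label": "not_entailment",
        },
    ]


pair = make_distraction_pair("a dog chases a cat", "dog", "cat")
original, distractor = pair[0]["hypothesis"], pair[1]["hypothesis"]

# Same words, different order: an order-blind model sees identical inputs
# even though the desired labels differ.
same_bag = bag_of_words(original) == bag_of_words(distractor)
```

Here `same_bag` is true while the two labels differ, so any classifier that relies only on word overlap is forced to give both examples the same answer. Training on pairs of this kind is one way to penalize that shortcut.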
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
