Synthetic Dataset for Evaluating Complex Compositional Knowledge for Natural Language Inference
Sushma Anand Akoju, Robert Vacareanu, Haris Riaz, Eduardo Blanco, Mihai Surdeanu

TL;DR
This paper introduces SICCK, a synthetic dataset designed to evaluate NLI models' understanding of complex compositional logic, revealing significant challenges in current models' ability to handle negation and quantifiers.
Contribution
The paper presents a novel synthetic dataset, SICCK, specifically crafted to test NLI models' grasp of complex compositional knowledge and logic-based modifications.
Findings
Zero-shot NLI models perform poorly on modified sentences.
Fine-tuning does not significantly improve model performance on negation and quantifiers.
Models struggle with understanding complex logical modifications in natural language.
Abstract
We introduce a synthetic dataset called Sentences Involving Complex Compositional Knowledge (SICCK) and a novel analysis that investigates the performance of Natural Language Inference (NLI) models to understand compositionality in logic. We produce 1,304 sentence pairs by modifying 15 examples from the SICK dataset (Marelli et al., 2014). To this end, we modify the original texts using a set of phrases - modifiers that correspond to universal quantifiers, existential quantifiers, negation, and other concept modifiers in Natural Logic (NL) (MacCartney, 2009). We use these phrases to modify the subject, verb, and object parts of the premise and hypothesis. Lastly, we annotate these modified texts with the corresponding entailment labels following NL rules. We conduct a preliminary verification of how well the change in the structural and semantic composition is captured by neural NLI…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
