RobustLR: Evaluating Robustness to Logical Perturbation in Deductive Reasoning
Soumya Sanyal, Zeyi Liao, Xiang Ren

TL;DR
RobustLR is a suite of evaluation datasets that tests whether deductive reasoning models such as RoBERTa and T5 stay consistent under minimal logical edits and standard logical equivalence conditions. The models are not robust to these perturbations, and they find logical negation and disjunction especially hard to learn.
Contribution
The paper presents RobustLR, a new suite of evaluation datasets for assessing the logical robustness of language models. The datasets apply minimal logical edits and standard logical equivalence conditions to rulebases, exposing weaknesses in how the models handle logical operators.
Findings
Models trained in prior works are not robust to the proposed logical perturbations.
Models find logical negation and disjunction operators especially hard to learn.
Model performance is inconsistent across the different perturbations in RobustLR.
Abstract
Transformers have been shown to be able to perform deductive reasoning on a logical rulebase containing rules and statements written in English natural language. While the progress is promising, it is currently unclear if these models indeed perform logical reasoning by understanding the underlying logical semantics in the language. To this end, we propose RobustLR, a suite of evaluation datasets that evaluate the robustness of these models to minimal logical edits in rulebases and some standard logical equivalence conditions. In our experiments with RoBERTa and T5, we find that the models trained in prior works do not perform consistently on the different perturbations in RobustLR, thus showing that the models are not robust to the proposed logical perturbations. Further, we find that the models find it especially hard to learn logical negation and disjunction operators. Overall, using…
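To make the two kinds of perturbation in the abstract concrete, the following is a minimal sketch in propositional logic. It is not the RobustLR construction itself: the rule "if A then B", the function names, and the choice of the contrapositive as the equivalence condition are all illustrative assumptions. The sketch compares truth tables to show that an equivalence-preserving rewrite leaves the meaning of a rule unchanged, while a minimal edit (negating the consequent) changes it.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def truth_table(formula, n_vars=2):
    """Evaluate a formula on every assignment, in a fixed order."""
    return [formula(*values) for values in product([False, True], repeat=n_vars)]


def equivalent(f, g, n_vars=2):
    """Two formulas are logically equivalent if their truth tables match."""
    return truth_table(f, n_vars) == truth_table(g, n_vars)


# Hypothetical original rule: "If A then B".
def original(a, b):
    return implies(a, b)


# Equivalence-preserving rewrite (contrapositive): "If not B then not A".
def contrapositive(a, b):
    return implies(not b, not a)


# Minimal logical edit that changes the meaning: "If A then not B".
def negated_consequent(a, b):
    return implies(a, not b)


print(equivalent(original, contrapositive))      # same meaning
print(equivalent(original, negated_consequent))  # different meaning
```

A robust reasoner would be expected to give the same answer for the original rule and its contrapositive, and to be able to change its answer when the consequent is negated.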
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research
Methods: Attention Is All You Need · Balanced Selection · Linear Layer · Layer Normalization · Byte Pair Encoding · Weight Decay · Linear Warmup With Linear Decay · Dense Connections · Dropout · Inverse Square Root Schedule
