Are LLMs classical or nonmonotonic reasoners? Lessons from generics
Alina Leidinger, Robert van Rooij, Ekaterina Shutova

TL;DR
This paper investigates whether large language models can perform nonmonotonic reasoning. While the models mimic some human reasoning patterns, their beliefs about generics are unstable: judgments shift when supporting examples or unrelated information are added.
Contribution
The study evaluates seven state-of-the-art LLMs on one abstract and one commonsense nonmonotonic reasoning task involving generics and exceptions, highlighting their limitations in maintaining stable beliefs.
Findings
LLMs exhibit some human-like nonmonotonic reasoning patterns
They fail to maintain stable beliefs about generics when supporting examples or unrelated information are added
Consistent reasoning remains challenging for LLMs
Abstract
Recent scholarship on reasoning in LLMs has supplied evidence of impressive performance and flexible adaptation to machine generated or human feedback. Nonmonotonic reasoning, crucial to human cognition for navigating the real world, remains a challenging, yet understudied task. In this work, we study nonmonotonic reasoning capabilities of seven state-of-the-art LLMs in one abstract and one commonsense reasoning task featuring generics, such as 'Birds fly', and exceptions, 'Penguins don't fly' (see Fig. 1). While LLMs exhibit reasoning patterns in accordance with human nonmonotonic reasoning abilities, they fail to maintain stable beliefs on truth conditions of generics at the addition of supporting examples ('Owls fly') or unrelated information ('Lions have manes'). Our findings highlight pitfalls in attributing human reasoning behaviours to LLMs, as well as assessing general…
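The belief-stability probe described in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: the prompt wording, the names `build_prompt`, `stability`, and `toy_model`, and the stand-in model are all hypothetical, chosen only to show the idea of comparing a model's judgment of a generic with and without an added statement. Only the example sentences come from the abstract.

```python
from typing import Callable, Dict, Optional


def build_prompt(generic: str, context: Optional[str] = None) -> str:
    """Ask whether a generic is true, optionally preceded by one extra statement.

    The template is an assumption made for illustration only.
    """
    prefix = f"{context}. " if context else ""
    return f"{prefix}Is the following statement true or false? '{generic}'. Answer:"


def stability(
    answer_fn: Callable[[str], str],
    generic: str,
    contexts: Dict[str, str],
) -> Dict[str, bool]:
    """For each added statement, report whether the answer matches the baseline."""
    baseline = answer_fn(build_prompt(generic))
    return {
        kind: answer_fn(build_prompt(generic, ctx)) == baseline
        for kind, ctx in contexts.items()
    }


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: it changes its answer whenever
    any statement precedes the question, i.e. it is maximally unstable."""
    return "true" if prompt.startswith("Is the following") else "false"


# Example statements taken from the abstract.
contexts = {
    "exception": "Penguins don't fly",
    "supporting": "Owls fly",
    "unrelated": "Lions have manes",
}

result = stability(toy_model, "Birds fly", contexts)
for kind, stable in result.items():
    print(f"{kind}: {'stable' if stable else 'changed'}")
```

In practice `toy_model` would be replaced by a call to an actual model, and the comparison would be aggregated over many generics; both are left out here to keep the sketch self-contained.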
Taxonomy
Topics: Semantic Web and Ontologies
