Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
Siyuan Wang, Zhongyu Wei, Yejin Choi, Xiang Ren

TL;DR
This paper probes the reasoning capabilities of large language models by constructing an inferential rule base, exposes their limitations in rule understanding, and proposes an inference engine that improves logical reasoning on downstream tasks.
Contribution
The paper introduces ULogic, a comprehensive inferential rule base for testing and enhancing LLMs' logical reasoning, and demonstrates its effectiveness in improving reasoning tasks.
Findings
LLMs show significant gaps in understanding complex and compositional rules.
The inference engine can generate accurate, complex, and abstract conclusions.
Using the rule base improves LLMs' performance on reasoning tasks.
Abstract
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs' logic understanding compared to human performance, especially on compositional and structurally complex rules, with certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and for enhancing downstream reasoning. Through a multi-judger evaluation, our inference engine proves effective in generating accurate, complex and abstract…
Taxonomy
Topics: Legal Education and Practice Innovations · Artificial Intelligence in Law · Law, Economics, and Judicial Systems
