Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs

Siyuan Wang; Zhongyu Wei; Yejin Choi; Xiang Ren

arXiv:2402.11442 · cs.CL · June 24, 2024 · 1 citation

TL;DR

This paper probes how well large language models understand inferential rules by constructing a logic rule base, ULogic. The analysis exposes gaps relative to human performance, and the authors distill the rules into an inference engine that improves logical reasoning on downstream tasks.

Contribution

The paper introduces ULogic, an inferential rule base of primitive and compositional rules across five domains, for testing and enhancing LLMs' logical reasoning, and shows that it improves performance on downstream reasoning tasks.

Findings

01. LLMs show significant gaps in understanding complex and compositional rules.

02. The inference engine can generate accurate, complex, and abstract conclusions.

03. Using the rule base improves LLMs' performance on reasoning tasks.

Abstract

Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs' logic understanding compared to human performance, especially on compositional and structurally complex rules, with certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and for enhancing downstream reasoning. Through a multi-judger evaluation, our inference engine proves effective in generating accurate, complex and abstract…
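
To make the distinction between primitive and compositional rules concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not ULogic's actual rule format or the authors' generation procedure: the `Rule` class, the `compose` helper, and the predicate strings are all hypothetical. It treats a rule as a set of premises with one conclusion, and builds a compositional rule by chaining two primitive rules where the first rule's conclusion is one of the second rule's premises.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Rule:
    """Hypothetical if-then rule: all premises together imply the conclusion."""
    premises: FrozenSet[str]
    conclusion: str


def compose(first: Rule, second: Rule) -> Rule:
    """Chain two rules into one compositional rule.

    The conclusion of `first` must appear among the premises of `second`;
    it is replaced there by the premises of `first`.
    """
    if first.conclusion not in second.premises:
        raise ValueError("rules do not chain: conclusion is not a premise of the second rule")
    remaining = second.premises - {first.conclusion}
    return Rule(premises=first.premises | remaining, conclusion=second.conclusion)


# Two primitive rules (example predicates are invented for illustration).
allergy = Rule(
    premises=frozenset({"Allergic(Person X, Food Y)"}),
    conclusion="CannotEat(Person X, Food Y)",
)
dish = Rule(
    premises=frozenset({"CannotEat(Person X, Food Y)", "Contains(Dish Z, Food Y)"}),
    conclusion="CannotEat(Person X, Dish Z)",
)

# The compositional rule skips the intermediate fact entirely.
composed = compose(allergy, dish)
print(sorted(composed.premises), "=>", composed.conclusion)
```

A rule of this chained kind has more premises and a longer implicit derivation than either primitive rule, which is one plausible reading of why compositional rules would be harder for a model to verify.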

Code & Models

Repositories

siyuanwangw/ulogic (PyTorch, official)

Taxonomy

Topics: Legal Education and Practice Innovations · Artificial Intelligence in Law · Law, Economics, and Judicial Systems