Towards Logically Sound Natural Language Reasoning with Logic-Enhanced Language Model Agents

Agnieszka Mensfelt; Kostas Stathis; Vince Trencsenyi

arXiv:2408.16081 · cs.AI · May 30, 2025

TL;DR

This paper introduces LELMA, a framework that combines language models with formal logic to improve the logical soundness and correctness of natural language reasoning in open-ended tasks.

Contribution

The paper proposes a novel framework integrating LLMs with formal logic for validation and refinement, addressing logical errors in natural language reasoning.

Findings

1. LELMA detects errors in natural language reasoning with high accuracy.

2. Self-refinement improves reasoning correctness, most notably for GPT-4o.

3. Challenges remain in autoformalization accuracy and in evaluating ambiguous tasks.

Abstract

Large language models (LLMs) are increasingly explored as general-purpose reasoners, particularly in agentic contexts. However, their outputs remain prone to mathematical and logical errors. This is especially challenging in open-ended tasks, where unstructured outputs lack explicit ground truth and may contain subtle inconsistencies. To address this issue, we propose Logic-Enhanced Language Model Agents (LELMA), a framework that integrates LLMs with formal logic to enable validation and refinement of natural language reasoning. LELMA comprises three components: an LLM-Reasoner, an LLM-Translator, and a Solver, and employs autoformalization to translate reasoning into logic representations, which are then used to assess logical validity. Using game-theoretic scenarios such as the Prisoner's Dilemma as testbeds, we highlight the limitations of both less capable (Gemini 1.0 Pro) and…
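The validate-and-refine loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the payoff values, the sentence template, the regex-based translator, and every function name below (`solver_check`, `stub_reasoner`, `stub_translator`, `validate_and_refine`) are assumptions made for illustration. The two LLM components are replaced by deterministic stubs, and the Solver is a plain Python check against a Prisoner's Dilemma payoff table.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical Prisoner's Dilemma payoffs as (my payoff, their payoff).
# C = cooperate, D = defect. Values are illustrative, not from the paper.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# A formalized claim: "if I play `mine` and they play `theirs`, I get `payoff`".
Claim = Tuple[str, str, int]


def solver_check(claims: List[Claim]) -> List[str]:
    """Stand-in for the Solver: one feedback message per claim that contradicts PAYOFFS."""
    errors = []
    for mine, theirs, payoff in claims:
        actual = PAYOFFS[(mine, theirs)][0]
        if payoff != actual:
            errors.append(f"payoff({mine},{theirs}) is {actual}, not {payoff}")
    return errors


def stub_translator(reasoning: str) -> List[Claim]:
    """Stand-in for the LLM-Translator: extract claims from a fixed sentence template."""
    pattern = r"If I play ([CD]) and they play ([CD]), I get (\d+)"
    return [(m, t, int(p)) for m, t, p in re.findall(pattern, reasoning)]


def stub_reasoner(prompt: str, feedback: List[str]) -> str:
    """Stand-in for the LLM-Reasoner: wrong on the first try, corrected once given feedback."""
    temptation = 5 if feedback else 4  # the first attempt misstates the payoff for (D, C)
    return (f"If I play D and they play C, I get {temptation}. "
            "If I play C and they play C, I get 3.")


def validate_and_refine(
    reasoner: Callable[[str, List[str]], str],
    translator: Callable[[str], List[Claim]],
    prompt: str,
    max_attempts: int = 3,
) -> Tuple[str, int, bool]:
    """Generate reasoning, formalize it, check it, and retry with solver feedback."""
    feedback: List[str] = []
    reasoning = ""
    for attempt in range(1, max_attempts + 1):
        reasoning = reasoner(prompt, feedback)
        feedback = solver_check(translator(reasoning))
        if not feedback:  # no contradictions found: accept this reasoning
            return reasoning, attempt, True
    return reasoning, max_attempts, False


reasoning, attempts, ok = validate_and_refine(
    stub_reasoner, stub_translator, "Should I cooperate or defect?"
)
print(attempts, ok)
print(reasoning)
```

In this sketch the first attempt fails validation, the solver's message is passed back to the reasoner, and the second attempt is accepted. A real system would replace the stubs with model calls and the payoff lookup with a formal logic solver; note also that a regex translator cannot mis-formalize in the way an LLM can, which is the autoformalization risk the findings mention.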

Code & Models

Repositories

dicelab-rhul/lelma (official)

Taxonomy

Topics: Topic Modeling · Multi-Agent Systems and Negotiation · Business Process Modeling and Analysis

Methods: Attention Is All You Need · Linear Layer · Adam · Layer Normalization · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection · Multi-Head Attention · Byte Pair Encoding · Absolute Position Encodings