VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

Vikash Singh; Darion Cassel; Nathaniel Weir; Nick Feng; Sam Bayless

arXiv:2601.20055·cs.CL·May 5, 2026

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

Vikash Singh, Darion Cassel, Nathaniel Weir, Nick Feng, Sam Bayless

PDF

TL;DR

VERGE is a neurosymbolic framework that enhances LLM reasoning accuracy by combining formal verification, semantic routing, and iterative refinement to ensure logical correctness in high-stakes applications.

Contribution

It introduces a novel multi-model consensus, semantic routing, and precise error localization techniques to improve the logical reliability of LLM outputs.

Findings

01

Achieved an 18.7% performance uplift on reasoning benchmarks.

02

Successfully classifies claims by logical status and refines answers iteratively.

03

Combines LLMs with SMT solvers for verification-guided reasoning.

Abstract

Despite the syntactic fluency of Large Language Models (LLMs), ensuring their logical correctness in high-stakes domains remains a fundamental challenge. We present a neurosymbolic framework that combines LLMs with SMT solvers to produce verification-guided answers through iterative refinement. Our approach decomposes LLM outputs into atomic claims, autoformalizes them into first-order logic, and verifies their logical consistency using automated theorem proving. We introduce three key innovations: (1) multi-model consensus via formal semantic equivalence checking to ensure logic-level alignment between candidates, eliminating the syntactic bias of surface-form metrics, (2) semantic routing that directs different claim types to appropriate verification strategies: symbolic solvers for logical claims and LLM ensembles for commonsense reasoning, and (3) precise logical error localization…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.