Distributional Energy-Based Models for Uncertainty-Aware Structured LLM Reasoning

Shireen Kudukkil Manchingal; Abhey Kalia; Fernanda Gonçalves; Shebin Rawther

arXiv:2605.18871 · cs.LG · May 20, 2026

TL;DR

This paper introduces a distributional energy-based verification method for structured LLM outputs, improving accuracy and constraint adherence across multiple benchmarks by combining learned quality scoring with analytical constraints.

Contribution

The paper presents a decomposed energy function that pairs a heterogeneous ensemble quality scorer with deterministic analytical constraint penalties. The resulting verifier, orchestrating a pool of open generators, outperforms single-shot Qwen-72B.

Findings

01 Outperforms single-shot Qwen-72B on all benchmarks

02 Reduces constraint violations by 53% on TravelPlanner

03 Achieves 93.9% accuracy on GSM8K without prior math training

Abstract

When Large Language Models produce structured outputs such as travel plans, code solutions, or multi-step proofs, individual reasoning steps may appear correct while the output as a whole violates budgets, fails test cases, or contradicts earlier deductions. We propose a decomposed energy function that combines a learned quality scorer with deterministic analytical constraint penalties for verifying structured LLM outputs. The quality scorer is a heterogeneous ensemble of low-rank adapters on a single frozen encoder (3% trainable parameters); the ensemble mean ranks candidates while the standard deviation quantifies epistemic uncertainty, driving a two-pass inference loop that triggers targeted regeneration or abstention. Across five benchmarks (GSM8K, MuSR, TravelPlanner, TACO, Knights & Knaves), our 149M-parameter verifier orchestrating a pool of 7-26B open generators outperforms…
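The verification loop described in the abstract can be illustrated with a minimal Python sketch. The abstract is truncated and does not give the exact form of the energy, so several things here are assumptions rather than the authors' method: the additive combination of negated ensemble mean and weighted penalty, the use of population standard deviation, the single threshold `tau`, and all names (`score`, `select`, `regenerate`, `weight`). The scorers stand in for the adapter ensemble and are plain callables.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Optional, Sequence


@dataclass
class Scored:
    candidate: str
    energy: float       # lower is better
    uncertainty: float  # standard deviation across ensemble members


def score(candidate: str,
          scorers: Sequence[Callable[[str], float]],
          penalty: Callable[[str], float],
          weight: float = 1.0) -> Scored:
    """Assumed energy: -(ensemble mean quality) + weight * constraint penalty."""
    qualities = [s(candidate) for s in scorers]
    energy = -mean(qualities) + weight * penalty(candidate)
    return Scored(candidate, energy, pstdev(qualities))


def best_of(candidates, scorers, penalty) -> Scored:
    # Rank by energy; the lowest-energy candidate wins.
    return min((score(c, scorers, penalty) for c in candidates),
               key=lambda s: s.energy)


def select(candidates: Sequence[str],
           scorers: Sequence[Callable[[str], float]],
           penalty: Callable[[str], float],
           regenerate: Optional[Callable[[str], Sequence[str]]] = None,
           tau: float = 0.2) -> Optional[str]:
    """Two-pass selection: accept, regenerate once, or abstain (None)."""
    first = best_of(candidates, scorers, penalty)
    if first.uncertainty <= tau:
        return first.candidate          # ensemble agrees: accept
    if regenerate is None:
        return None                     # no second pass available: abstain
    second = best_of(regenerate(first.candidate), scorers, penalty)
    return second.candidate if second.uncertainty <= tau else None


# Toy usage with hypothetical scorers and a hypothetical budget penalty.
scorers = [lambda c: len(c) / 10, lambda c: len(c) / 10 + 0.1]
penalty = lambda c: 5.0 if "over" in c else 0.0
chosen = select(["plan over budget long", "plan ok"], scorers, penalty)
```

In this toy run the longer plan scores higher on quality but its constraint penalty dominates, so the shorter, constraint-satisfying plan is chosen. When the scorers disagree by more than `tau` and no regeneration function is supplied, `select` returns `None` to signal abstention.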
