Semantic Faithfulness and Entropy Production Measures to Tame Your LLM Demons and Manage Hallucinations
Igor Halperin

TL;DR
This paper introduces two unsupervised, information-theoretic metrics for evaluating the faithfulness of LLM outputs and managing hallucinations, based on thermodynamics and entropy concepts, demonstrated on SEC 10-K summarization.
Contribution
It proposes novel unsupervised metrics for LLM faithfulness and hallucination control using thermodynamics and information theory, modeling LLMs as bipartite information engines.
Findings
The semantic faithfulness metric correlates with reduced hallucinations.
High faithfulness scores are associated with low entropy production.
The framework is demonstrated on LLM summarization of SEC 10-K filings.
Abstract
Evaluating faithfulness of Large Language Models (LLMs) to a given task is a complex challenge. We propose two new unsupervised metrics for faithfulness evaluation using insights from information theory and thermodynamics. Our approach treats an LLM as a bipartite information engine where hidden layers act as a Maxwell demon controlling transformations of context $C$ into answer $A$ via prompt $Q$. We model Question-Context-Answer (QCA) triplets as probability distributions over shared topics. Topic transformations from $C$ to $Q$ and $A$ are modeled as transition matrices $\mathbf{Q}$ and $\mathbf{A}$ encoding the query goal and actual result, respectively. Our semantic faithfulness (SF) metric quantifies faithfulness for any given QCA triplet by the Kullback-Leibler (KL) divergence between these matrices. Both matrices are inferred simultaneously via convex optimization of this KL…
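The abstract describes comparing two topic transition matrices by a KL divergence. A minimal sketch of such a comparison is given here, under stated assumptions: the function name `weighted_kl`, the row-wise form of the divergence, the weighting of rows by the context topic distribution, the direction of the KL (answer relative to query), and the example numbers are all illustrative choices, not taken from the paper. The paper infers both matrices jointly by convex optimization; this sketch skips that step and simply takes the matrices as given.

```python
import math

def weighted_kl(A, Q, weights, eps=1e-12):
    """Row-wise KL divergence between two row-stochastic matrices.

    A, Q    : lists of rows; row i is a distribution over target topics
              given source topic i (hypothetical stand-ins for the
              "actual result" and "query goal" matrices).
    weights : distribution over source topics (assumed here to be the
              context topic distribution).
    eps     : small constant guarding against log(0).
    """
    total = 0.0
    for w, a_row, q_row in zip(weights, A, Q):
        # KL(a_row || q_row); terms with a == 0 contribute nothing.
        row_kl = sum(
            a * (math.log(a + eps) - math.log(q + eps))
            for a, q in zip(a_row, q_row)
            if a > 0
        )
        total += w * row_kl
    return total

# Illustrative two-topic example (made-up numbers).
context_topics = [0.5, 0.5]
A_matrix = [[0.9, 0.1], [0.2, 0.8]]   # what the answer actually did
Q_matrix = [[0.5, 0.5], [0.5, 0.5]]   # what the question asked for

divergence = weighted_kl(A_matrix, Q_matrix, context_topics)
same = weighted_kl(A_matrix, A_matrix, context_topics)
```

In this sketch a divergence of zero means the two matrices agree exactly, and larger values mean the answer's topic transformation departs further from the one the question implies. How the paper maps the divergence to a final SF score is not stated in the truncated abstract, so it is not reproduced here.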
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
