From Logic to Language: A Trust Index for Problem Solving with LLMs
Tehseen Rug, Felix Böhmer, Tessa Pfattheicher

TL;DR
This paper proposes a unified framework and a trust index to evaluate problem-solving in LLMs, distinguishing formal and natural language solutions and capturing their quality nuances.
Contribution
It introduces a novel trust index and statistical quality dimensions to assess the diverse problem-solving capabilities of LLMs and to contrast them with classical methods.
Findings
The trust index Q differentiates formal correctness from natural language adequacy.
Bi-semantic entropy measures robustness and diversity in LLM answers.
Emotional valence quantifies subjective solution valuation.
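This summary does not define bi-semantic entropy, so the sketch below is not the paper's measure. It only illustrates the general idea of using entropy to gauge answer diversity: plain Shannon entropy over sampled answers that have already been grouped by meaning. The function name `answer_entropy` and the labels `"A"` and `"B"` are hypothetical, and the grouping step itself is assumed to happen elsewhere.

```python
import math
from collections import Counter


def answer_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels.

    Each label stands for a group of answers judged to mean the same thing.
    This is a generic illustration, NOT the paper's bi-semantic entropy.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one answer label")
    # Sum p * log2(1/p) over the observed groups.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical samples: four answers that all agree, and four that split evenly.
consistent = ["A", "A", "A", "A"]
split = ["A", "A", "B", "B"]

print(answer_entropy(consistent))
print(answer_entropy(split))
```

Under this generic measure, answers that all fall into one group give the lowest possible entropy, and answers spread evenly over more groups give higher values, which is one way to read "robustness and diversity" as a number.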
Abstract
Classical computation, grounded in formal, logical systems, has been the engine of technological progress for decades, excelling at problems that can be described with unambiguous rules. This paradigm, however, leaves a vast ocean of human problems (those characterized by ambiguity, dynamic environments, and subjective context) largely untouched. The advent of Large Language Models (LLMs) represents a fundamental shift, enabling computational systems to engage with this previously inaccessible domain using natural language. This paper introduces a unified framework to understand and contrast these problem-solving paradigms. We define and delineate the problem spaces addressable by formal languages versus natural language. While solutions to the former problem class can be evaluated using binary quality measures, the latter requires a much more nuanced definition of approximate…
Taxonomy
Topics: Semantic Web and Ontologies
