Eliminating Hallucination-Induced Errors in LLM Code Generation with Functional Clustering
Chaitanya Ravuri, Saman Amarasinghe

TL;DR
This paper introduces functional clustering, a black-box method that significantly reduces hallucination-induced errors in LLM code generation by clustering candidate outputs based on I/O behavior and using empirical confidence estimates.
Contribution
The paper presents a novel functional clustering wrapper that eliminates nearly all hallucination errors in LLM code generation while providing a tunable confidence score, applicable to closed-source models.
Findings
Reduces the error rate of returned answers from ~65% to 2% on LiveCodeBench
Preserves baseline pass@1 on solvable tasks
Achieves 0% error at a conservative threshold while still answering 15.6% of prompts
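The findings above combine two quantities: coverage (the share of prompts that receive an answer) and the error rate among the answers actually returned. The sketch below shows one way to compute both for a given confidence threshold. The function name `coverage_and_error` and the `toy_results` values are hypothetical and purely illustrative; they are not data or code from the paper.

```python
from typing import List, Optional, Tuple


def coverage_and_error(
    results: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, Optional[float]]:
    """Return (coverage, error rate among answered prompts).

    Each entry in `results` is (confidence, is_correct) for one prompt.
    A prompt is answered only if its confidence meets the threshold.
    """
    answered = [correct for conf, correct in results if conf >= threshold]
    coverage = len(answered) / len(results)
    if not answered:
        # Nothing was answered, so the error rate is undefined.
        return coverage, None
    error = sum(1 for correct in answered if not correct) / len(answered)
    return coverage, error


# Made-up toy values for illustration only (NOT results from the paper).
toy_results = [
    (0.95, True), (0.9, True), (0.6, True), (0.55, False),
    (0.3, False), (0.2, True), (0.1, False), (0.1, False),
]

for t in (0.0, 0.5, 0.9):
    print(t, coverage_and_error(toy_results, t))
```

In this toy data, raising the threshold lowers coverage and also lowers the error rate among returned answers, which is the kind of trade-off the findings describe.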
Abstract
Modern code-generation LLMs can already solve a large fraction of programming problems, yet they still hallucinate subtle bugs that make their outputs unsafe for autonomous deployment. We present functional clustering, a black-box wrapper that eliminates nearly all hallucination-induced errors while providing a tunable confidence score. The wrapper samples many candidate programs, executes each on a self-generated test suite, and clusters candidates whose I/O behavior is identical; the empirical mass of the largest cluster serves as an exact confidence estimate. A single scalar threshold on this estimate lets users trade coverage for reliability with exponential guarantees. On LiveCodeBench our verifier preserves baseline pass@1 on solvable tasks yet slashes the error rate of returned answers from ~65% to 2%, and drives it to 0% at a conservative threshold while still answering 15.6% of…
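The sample, execute, cluster, and threshold steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidates are hard-coded Python callables standing in for LLM samples, the test inputs are fixed rather than self-generated, and the names `behavior_signature` and `functional_clustering` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Optional, Sequence, Tuple

Program = Callable[[int], Hashable]


def behavior_signature(program: Program, test_inputs: Sequence[int]) -> tuple:
    """Run a candidate on every test input and record its I/O behavior."""
    outputs = []
    for x in test_inputs:
        try:
            outputs.append(("ok", program(x)))
        except Exception as exc:
            # A crash is also observable behavior; record the exception type.
            outputs.append(("error", type(exc).__name__))
    return tuple(outputs)


def functional_clustering(
    candidates: List[Program], test_inputs: Sequence[int], threshold: float
) -> Tuple[Optional[Program], float]:
    """Group candidates with identical behavior; answer only if confident."""
    clusters = defaultdict(list)
    for program in candidates:
        clusters[behavior_signature(program, test_inputs)].append(program)
    largest = max(clusters.values(), key=len)
    # Confidence = empirical mass of the largest cluster.
    confidence = len(largest) / len(candidates)
    # Abstain (return None) when confidence is below the threshold.
    answer = largest[0] if confidence >= threshold else None
    return answer, confidence


# Stand-ins for sampled programs (assumed task: square an integer).
candidates = [
    lambda x: x * x,
    lambda x: x ** 2,
    lambda x: abs(x) * abs(x),
    lambda x: x * 2,  # stand-in for a subtly wrong sample
]
test_inputs = [-3, 0, 1, 2, 5]

answer, confidence = functional_clustering(candidates, test_inputs, threshold=0.7)
print(confidence, answer is not None)
```

Three of the four stand-in candidates behave identically on the test inputs, so the largest cluster has mass 0.75. A threshold of 0.7 returns a representative of that cluster, while a stricter threshold such as 0.9 would make the wrapper abstain.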
Taxonomy
Topics: VLSI and Analog Circuit Testing · Image Processing Techniques and Applications · Cryptography and Residue Arithmetic
