Semantic Voting: Execution-Grounded Consensus for LLM Code Generation

Shan Jiang; Zijian Yi; Chenguang Zhu

arXiv:2605.08680·cs.SE·May 12, 2026

TL;DR

This paper compares selection methods for LLM code generation across 18 configurations and finds that execution-based selectors consistently outperform output-pattern majority voting. The quality of the inputs used for execution and the model's thinking level both affect how well selection works.

Contribution

The paper introduces SemanticVote, a method that clusters candidates by their execution fingerprints on LLM-generated inputs, and analyzes 18 configurations across models, thinking levels, and benchmarks.
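The idea of clustering candidates by execution fingerprint can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `fingerprint` and `semantic_vote`, the use of `repr` to compare outputs, the treatment of exceptions as part of the fingerprint, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, Sequence


def fingerprint(candidate: Callable[[Any], Any], inputs: Sequence[Any]) -> tuple:
    """Run one candidate on every input and record its observable behavior.

    Assumption: outputs are compared via repr(), and an exception is
    recorded by its type name rather than discarding the candidate.
    """
    observed = []
    for x in inputs:
        try:
            observed.append(("ok", repr(candidate(x))))
        except Exception as exc:
            observed.append(("error", type(exc).__name__))
    return tuple(observed)


def semantic_vote(candidates: Sequence[Callable[[Any], Any]],
                  inputs: Sequence[Any]) -> int:
    """Return the index of a candidate from the largest behavior cluster."""
    clusters = defaultdict(list)
    for idx, cand in enumerate(candidates):
        clusters[fingerprint(cand, inputs)].append(idx)
    # Largest cluster wins; ties go to the cluster whose first member
    # appears earliest (a hypothetical tie-breaking rule).
    best = max(clusters.values(), key=lambda members: (len(members), -members[0]))
    return best[0]


# Three candidates double their input; one squares it.
candidates = [lambda x: x * 2, lambda x: x + x, lambda x: x ** 2, lambda x: 2 * x]
chosen = semantic_vote(candidates, inputs=[0, 1, 2, 3])
```

Here the three doubling functions differ textually but share one fingerprint, so they form the largest cluster and `chosen` is the index of its first member. A purely textual vote would treat all four candidates as distinct.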

Findings

01

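The best execution-based selector exceeds output-pattern majority voting by 19-52 percentage points on every configuration, and every execution-based selector exceeds it by at least 18 points.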

02

Input quality significantly impacts selection effectiveness, with sketch-based inputs outperforming direct LLM generation.

03

Deeper thinking improves majority voting but not execution-based methods, which are more sensitive to candidate diversity.

Abstract

LLM code-generation pipelines often sample multiple candidates and select one final answer without access to a complete oracle. Existing pipelines mix textual voting, ranking, and execution-based agreement, but the relative contribution of each component remains unclear. We study 18 configurations across different models, thinking levels, and benchmarks, comparing output-pattern majority voting, weighted voting, MBR-Exec, and SemanticVote, a method that clusters candidates by execution fingerprints on LLM-generated inputs. Three findings emerge. (1) The best execution-based selector exceeds output-pattern majority voting by 19-52 percentage points on every configuration, with every execution-based selector exceeding it by at least 18 points. (2) Once candidates are executed on diverse inputs, aggregation rule has limited effect: SemanticVote, weighted voting, and MBR-Exec are…
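To make the notion of an aggregation rule over executed candidates concrete, the sketch below shows a minimum-Bayes-risk style selector that scores each candidate by how often its outputs agree with those of the other candidates. The excerpt does not specify the loss used by MBR-Exec in this study, so the per-input agreement count, the function name `mbr_select`, and the tie-breaking toward the earliest candidate are assumptions, not the paper's definition.

```python
from typing import Sequence


def mbr_select(outputs: Sequence[Sequence[str]]) -> int:
    """Pick the candidate whose outputs agree most with all the others.

    outputs[i][k] is the recorded output of candidate i on input k.
    Assumption: agreement is counted per input, so partially matching
    candidates still contribute to each other's score.
    """
    def agreement(a: Sequence[str], b: Sequence[str]) -> int:
        return sum(x == y for x, y in zip(a, b))

    n = len(outputs)
    scores = [
        sum(agreement(outputs[i], outputs[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    # max() returns the first index with the top score, so ties go to
    # the earliest candidate (a hypothetical tie-breaking rule).
    return max(range(n), key=lambda i: scores[i])


recorded = [
    ["1", "2", "3"],  # candidate 0
    ["1", "2", "4"],  # candidate 1: differs on the last input
    ["1", "2", "3"],  # candidate 2: same behavior as candidate 0
    ["9", "9", "9"],  # candidate 3: an outlier
]
selected = mbr_select(recorded)
```

On this toy data the agreement-based rule and an exact-fingerprint cluster vote pick the same candidate, which illustrates how different aggregation rules can coincide once candidates are compared by their execution behavior.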
