Majority Voting for Code Generation
Tim Launer, Jonas Hübotter, Marco Bagatella, Ido Hakimi, Andreas Krause

TL;DR
This paper introduces Functional Majority Voting (FMV), a method that uses runtime execution signatures on test inputs to select a representative solution from multiple code generations, improving performance without a large compute overhead.
Contribution
The paper presents FMV as a test-time inference strategy for code generation and extends functional consensus to label-free Test-Time Reinforcement Learning, where it serves as the aggregation strategy.
Findings
As a test-time inference strategy, FMV substantially boosts code generation performance on LiveCodeBench without a large compute overhead.
Using functional consensus as the aggregation strategy for label-free Test-Time Reinforcement Learning increases pass@1 on holdout tasks.
No evidence of self-improvement beyond the base model's performance ceiling was observed.
Abstract
We investigate Functional Majority Voting (FMV), a method based on functional consensus for code generation with Large Language Models, which identifies a representative solution from multiple generations using their runtime execution signatures on test inputs. We find that FMV is an effective test-time inference strategy, substantially boosting performance on LiveCodeBench without a large compute overhead. Furthermore, we extend the utility of functional consensus and apply it as an aggregation strategy for label-free Test-Time Reinforcement Learning. We demonstrate that this increases pass@1 on holdout tasks, but find no evidence of self-improvement beyond the base model's performance ceiling.
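The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each candidate is an in-process Python callable taking one argument, that a signature is the tuple of outputs (or exception types) over the test inputs, and that ties are broken in favor of the earliest candidate. The names `execution_signature` and `functional_majority_vote` are hypothetical, and a real system would run untrusted generated code in a sandbox with timeouts.

```python
from collections import Counter
from typing import Any, Callable, Sequence, Tuple

Signature = Tuple[Tuple[str, str], ...]


def execution_signature(program: Callable[[Any], Any],
                        test_inputs: Sequence[Any]) -> Signature:
    """Run one candidate on every test input and record what happened.

    Assumption: outputs are compared by their repr, and a crash is
    recorded by exception type so that failing programs still get a signature.
    """
    outcomes = []
    for x in test_inputs:
        try:
            outcomes.append(("ok", repr(program(x))))
        except Exception as exc:
            outcomes.append(("error", type(exc).__name__))
    return tuple(outcomes)


def functional_majority_vote(candidates: Sequence[Callable[[Any], Any]],
                             test_inputs: Sequence[Any]) -> int:
    """Return the index of a candidate from the largest behavioral cluster.

    Candidates with identical signatures are treated as functionally
    equivalent on the given inputs; the first member of the most common
    signature is returned as the representative.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    signatures = [execution_signature(c, test_inputs) for c in candidates]
    counts = Counter(signatures)
    # max() returns the first maximal key in insertion order, so ties
    # resolve to the signature that appeared earliest.
    majority_signature = max(counts, key=counts.get)
    return signatures.index(majority_signature)
```

Note that the vote needs only test inputs, not expected outputs: candidates are grouped by how they behave, and no ground-truth labels are consulted.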
