DuET: Dual Execution for Test Output Prediction with Generated Code and Pseudocode
Hojae Han, Jaejin Kim, Seung-won Hwang, Yu Jin Kim, Moontae Lee

TL;DR
DuET is a dual-execution framework that combines direct execution of generated code with LLM-simulated pseudocode execution to improve test output prediction, achieving state-of-the-art results on LiveCodeBench.
Contribution
The paper introduces a dual-execution approach that grounds predictions in both generated code and pseudocode, making test output prediction more reliable.
Findings
DuET improves Pass@1 by 13.6 percentage points on LiveCodeBench.
Combining code execution and pseudocode reasoning offsets the main weakness of each: code errors in direct execution and hallucination in pseudocode reasoning.
Dual execution with functional majority voting outperforms either method used alone.
Abstract
This work addresses test output prediction, a key challenge in test case generation. To make LLM-predicted outputs more reliable, prior approaches first generate code and ground predictions on it. One grounding strategy is to execute the generated code directly, but even minor errors in that code can cause failures. To address this, we introduce LLM-based pseudocode execution, which grounds prediction on more error-resilient pseudocode and simulates its execution via LLM reasoning. We further propose DuET, a dual-execution framework that combines both approaches through functional majority voting. Our analysis shows that the two approaches are complementary: each offsets the other's limitation, namely code errors in direct execution and hallucination in pseudocode reasoning. On LiveCodeBench, DuET achieves state-of-the-art performance, improving Pass@1 by 13.6 pp.
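The idea of combining the two execution paths can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `solve` entry point, and the `simulate_pseudocode` callback (a stand-in for an LLM call) are all hypothetical, and plain majority voting over output strings is used here as a simplification of the functional majority voting the abstract mentions.

```python
from collections import Counter
from typing import Callable, List, Optional


def run_generated_code(source: str, test_input: str,
                       entry_point: str = "solve") -> Optional[str]:
    """Directly execute one generated program (hypothetical helper).

    Returns the output as a string, or None if the program fails,
    which mirrors how minor code errors break direct execution.
    """
    namespace: dict = {}
    try:
        # Illustration only: real systems should sandbox untrusted code.
        exec(source, namespace)
        return str(namespace[entry_point](test_input))
    except Exception:
        return None


def dual_execution_vote(
    code_candidates: List[str],
    pseudocode_candidates: List[str],
    test_input: str,
    simulate_pseudocode: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Pool outputs from both execution paths and return the most common one.

    `simulate_pseudocode` is an assumed callback standing in for an LLM
    that reasons through pseudocode on the given input.
    """
    votes: Counter = Counter()
    for source in code_candidates:
        output = run_generated_code(source, test_input)
        if output is not None:
            votes[output] += 1
    for pseudocode in pseudocode_candidates:
        output = simulate_pseudocode(pseudocode, test_input)
        if output is not None:
            votes[output] += 1
    if not votes:
        return None
    # Ties are broken in favor of the output that was seen first.
    return votes.most_common(1)[0][0]
```

In this sketch a program that fails to run simply casts no vote, and a hallucinated output from the simulated path is outvoted whenever the remaining candidates agree, which is the complementarity the abstract describes.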
