Large Language Models as Code Executors: An Exploratory Study
Chenyang Lyu, Lecheng Yan, Rui Xing, Wenxi Li, Younes Samih, Tianbo Ji, Longyue Wang

TL;DR
This study explores the novel use of Large Language Models as code executors, evaluating their accuracy across various models and introducing an iterative prompting technique to improve performance.
Contribution
It is the first comprehensive examination of LLMs as code executors and proposes an iterative prompting method to enhance their accuracy.
Findings
The o1 model achieved over 90% accuracy in code execution
Iterative Instruction Prompting improved accuracy by up to 19.46%
Different models showed varying levels of effectiveness in code execution
Abstract
The capabilities of Large Language Models (LLMs) have significantly evolved, extending from natural language processing to complex tasks like code understanding and generation. We expand the scope of LLMs' capabilities to a broader context, using LLMs to execute code snippets to obtain the output. This paper pioneers the exploration of LLMs as code executors, where code snippets are directly fed to the models for execution, and outputs are returned. We are the first to comprehensively examine this feasibility across various LLMs, including OpenAI's o1, GPT-4o, GPT-3.5, DeepSeek, and Qwen-Coder. Notably, the o1 model achieved over 90% accuracy in code execution, while others demonstrated lower accuracy levels. Furthermore, we introduce an Iterative Instruction Prompting (IIP) technique that processes code snippets line by line, enhancing the accuracy of weaker models by an average of…
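The line-by-line idea behind Iterative Instruction Prompting can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the helper names (`iterative_prompts`, `execute_with_model`, `exact_match_accuracy`), and the `model` callable are all hypothetical, and exact-match scoring against real interpreter output is an assumed evaluation choice. Splitting on physical lines is also a simplification that ignores multi-line statements.

```python
import contextlib
import io
from typing import Callable, List


def reference_output(snippet: str) -> str:
    """Run the snippet with the real interpreter to get ground-truth stdout.

    Uses exec, so it is only suitable for trusted snippets.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(snippet, {})
    return buffer.getvalue().strip()


def iterative_prompts(snippet: str) -> List[str]:
    """Build one prompt per non-empty line, plus a final request for the output.

    The prompt text is an assumption made for illustration only.
    """
    lines = [ln for ln in snippet.splitlines() if ln.strip()]
    prompts = []
    for i, line in enumerate(lines, start=1):
        context = "\n".join(lines[:i])
        prompts.append(
            f"Code so far:\n{context}\n\n"
            f"Step {i}: describe the program state after executing `{line}`."
        )
    prompts.append("Now give only the final printed output of the full snippet.")
    return prompts


def execute_with_model(snippet: str, model: Callable[[List[str]], str]) -> str:
    """Feed the prompts to a (hypothetical) model one at a time.

    `model` receives the running conversation history and returns a reply;
    the reply to the last prompt is taken as the predicted output.
    """
    history: List[str] = []
    answer = ""
    for prompt in iterative_prompts(snippet):
        history.append(prompt)
        answer = model(history)
        history.append(answer)
    return answer.strip()


def exact_match_accuracy(snippets: List[str], model: Callable[[List[str]], str]) -> float:
    """Fraction of snippets whose predicted output equals the real output."""
    hits = sum(execute_with_model(s, model) == reference_output(s) for s in snippets)
    return hits / len(snippets)
```

In practice `model` would wrap a call to whichever LLM is being evaluated; here any function that maps a list of messages to a string will do, which keeps the sketch independent of a particular provider's API.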
Taxonomy
Topics: Software Engineering Research
Methods: Linear Layer · Residual Connection · Weight Decay · Attention Is All You Need · Cosine Annealing · Dropout · Byte Pair Encoding · Softmax
