Can Large Language Models Simulate Symbolic Execution Output Like KLEE?

Rong Feng; Vanisha Gupta; Vivek Patel; Viroopaksh Reddy Ernampati; Suman Saha

arXiv:2511.08530·cs.SE·November 12, 2025



TL;DR

This paper investigates whether GPT-4o can simulate KLEE's symbolic execution output for C programs, with a focus on identifying the most constrained execution path. The motivation is to reduce the computational cost of symbolic execution.

Contribution

The paper demonstrates both the potential and the limitations of large language models in approximating symbolic execution, a novel exploration in program analysis.

Findings

01. GPT-4o achieved about 20% accuracy in output prediction.

02. The model could partially identify the most constrained path.

03. Results highlight current LLM capabilities and challenges in symbolic execution simulation.

Abstract

Symbolic execution helps check programs by exploring different paths based on symbolic inputs. Tools like KLEE are commonly used because they can automatically detect bugs and generate test cases. One of KLEE's biggest issues, however, is how slow it becomes when programs have many branching paths: it often becomes too resource-heavy to run on large or complex code. In this project, we wanted to see whether a large language model like GPT-4o could simulate the kinds of outputs that KLEE generates. The idea was to explore whether LLMs could one day replace parts of symbolic execution to save time and resources. One specific goal was to have GPT-4o identify the most constrained path in a program, that is, the execution path with the most symbolic conditions. These paths are especially important because they often represent edge cases that are harder to test and more likely to contain deep bugs.…
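To make the notion of a "most constrained path" concrete, here is a minimal Python sketch. It is not the paper's method and not how KLEE works internally. It models a toy program as a tree of branch conditions, enumerates every path, and picks the path that accumulates the most conditions. The names `Branch`, `Leaf`, `enumerate_paths`, `most_constrained`, and `PROGRAM` are hypothetical and exist only for this illustration. The sketch also assumes every path is feasible, whereas a real symbolic executor would ask a constraint solver to discard unsatisfiable paths.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union


@dataclass
class Branch:
    """An if/else on a symbolic condition (hypothetical toy model)."""
    cond: str
    then: "Node"
    orelse: "Node"


@dataclass
class Leaf:
    """A program exit point, identified by a label."""
    label: str


Node = Union[Branch, Leaf]


def enumerate_paths(node: Node, constraints: Tuple[str, ...] = ()) -> Iterator[Tuple[str, List[str]]]:
    """Yield (exit label, path constraints) for every path through the tree.

    Assumption: all paths are treated as feasible; no solver is consulted.
    """
    if isinstance(node, Leaf):
        yield node.label, list(constraints)
        return
    # Taking the true side adds the condition; the false side adds its negation.
    yield from enumerate_paths(node.then, constraints + (node.cond,))
    yield from enumerate_paths(node.orelse, constraints + (f"!({node.cond})",))


def most_constrained(node: Node) -> Tuple[str, List[str]]:
    """Return the path with the most conditions (first one found on ties)."""
    return max(enumerate_paths(node), key=lambda path: len(path[1]))


# Toy program, roughly:
#   if (x > 0) { if (y > 10) { if (x + y == 42) deep; else mid; } else shallow; }
#   else negative;
PROGRAM = Branch(
    "x > 0",
    Branch(
        "y > 10",
        Branch("x + y == 42", Leaf("deep"), Leaf("mid")),
        Leaf("shallow"),
    ),
    Leaf("negative"),
)

if __name__ == "__main__":
    label, conds = most_constrained(PROGRAM)
    print(label, conds)
```

In this toy, the `deep` and `mid` exits both carry three conditions, and the sketch reports the first of them. The number of paths grows with every added branch, which is the path-explosion cost the abstract describes for KLEE.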


Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Web Application Security Vulnerabilities