Assessing Large Language Models in Comprehending and Verifying Concurrent Programs across Memory Models
Ridhi Jain, Rahul Purandare

TL;DR
This paper evaluates how well large language models detect concurrency issues and verify program correctness across different memory models, highlighting their strengths under sequential consistency and their limitations in relaxed-memory scenarios.
Contribution
It assesses how well leading LLMs analyze concurrent programs under both sequentially consistent and relaxed memory models (TSO and PSO); the relaxed-memory setting is the novel focus.
Findings
GPT-4 and similar models effectively identify concurrency issues under sequential consistency.
All models struggle to verify correctness under relaxed memory models.
Models show limitations in understanding memory ordering constraints in complex scenarios.
Abstract
As concurrent programming becomes increasingly prevalent, effectively identifying and addressing concurrency issues such as data races and deadlocks is critical. This study evaluates the performance of several leading large language models (LLMs), including GPT-3.5-turbo, GPT-4, GPT-4o, GPT-4o-mini, and Mistral-AI's Large2, in understanding and analyzing concurrency issues within software programs. Given that relaxed memory models, such as Total Store Order (TSO) and Partial Store Order (PSO), are widely implemented and adapted in modern systems, supported even by commodity architectures like ARM and x86, our evaluation focuses not only on sequentially consistent memory models but also on these relaxed memory models. Specifically, we assess two main aspects: the models' capacity to detect concurrency problems under a sequentially consistent memory model and their ability to verify the…
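To make the distinction between sequential consistency and a relaxed model such as TSO concrete, the sketch below enumerates every schedule of the classic store-buffering litmus test. It is an illustrative toy, not the paper's benchmark or method: the names `PROGRAM`, `outcomes`, and `replace_at` are hypothetical, and TSO is approximated here simply as one FIFO store buffer per thread with forwarding to that thread's own loads.

```python
# Store-buffering litmus test (illustrative; not taken from the paper).
# Each thread stores 1 to one variable, then loads the other into a register.
PROGRAM = (
    (("store", "x", 1), ("load", "y", "r0")),
    (("store", "y", 1), ("load", "x", "r1")),
)


def replace_at(items, index, value):
    """Return a copy of the tuple `items` with position `index` set to `value`."""
    return items[:index] + (value,) + items[index + 1:]


def outcomes(program, buffered):
    """Enumerate every reachable final register valuation.

    buffered=False: stores reach memory immediately (sequential consistency).
    buffered=True:  stores enter a per-thread FIFO buffer and are flushed to
                    memory at an arbitrary later point (a simplified TSO).
    """
    results = set()

    def step(pcs, memory, buffers, regs):
        progressed = False
        for t, thread in enumerate(program):
            # Choice 1: thread t executes its next instruction.
            if pcs[t] < len(thread):
                progressed = True
                kind, var, arg = thread[pcs[t]]
                new_pcs = replace_at(pcs, t, pcs[t] + 1)
                new_memory, new_buffers, new_regs = dict(memory), buffers, dict(regs)
                if kind == "store":
                    if buffered:
                        new_buffers = replace_at(buffers, t, buffers[t] + ((var, arg),))
                    else:
                        new_memory[var] = arg
                else:
                    # A load sees the thread's own newest buffered store first,
                    # and falls back to shared memory otherwise.
                    pending = [value for name, value in buffers[t] if name == var]
                    new_regs[arg] = pending[-1] if pending else memory[var]
                step(new_pcs, new_memory, new_buffers, new_regs)
            # Choice 2: flush the oldest buffered store of thread t to memory.
            if buffers[t]:
                progressed = True
                (var, value), rest = buffers[t][0], buffers[t][1:]
                new_memory = dict(memory)
                new_memory[var] = value
                step(pcs, new_memory, replace_at(buffers, t, rest), regs)
        if not progressed:
            # All instructions executed and all buffers drained: a final state.
            results.add(tuple(sorted(regs.items())))

    threads = len(program)
    step((0,) * threads, {"x": 0, "y": 0}, ((),) * threads, {})
    return results


sc = outcomes(PROGRAM, buffered=False)
tso = outcomes(PROGRAM, buffered=True)
print("SC outcomes: ", sorted(sc))
print("TSO-only:    ", sorted(tso - sc))  # the r0 == 0 and r1 == 0 outcome
```

Under sequential consistency at least one load must observe the other thread's store, so both registers reading 0 is unreachable. With store buffers, both stores can still be pending when both loads read memory, which admits that extra outcome. Deciding whether such an outcome is reachable is the kind of reasoning a verifier has to carry out for a relaxed model.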
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Software Testing and Debugging Techniques · Software Engineering Research
