LLM4VV: Developing LLM-Driven Testsuite for Compiler Validation
Christian Munley, Aaron Jarmusch, Sunita Chandrasekaran

TL;DR
This paper investigates the use of various large language models to automatically generate tests for validating OpenACC compiler implementations, exploring fine-tuning and prompt engineering techniques to improve test quality.
Contribution
It introduces a comprehensive analysis of LLM capabilities for code generation in compiler testing, including fine-tuning and prompt strategies, with over 5000 generated tests evaluated.
Findings
Deepseek-Coder-33b-Instruct produced the most passing tests
GPT-4-Turbo showed strong test generation performance
Prompt engineering techniques significantly impact test quality
Abstract
Large language models (LLMs) are a new and powerful tool for a wide span of applications involving natural language and demonstrate impressive code generation abilities. The goal of this work is to automatically generate tests and use these tests to validate and verify compiler implementations of a directive-based parallel programming paradigm, OpenACC. To do so, in this paper, we explore the capabilities of state-of-the-art LLMs, including open-source LLMs (Meta's Codellama, Phind's fine-tuned version of Codellama, and Deepseek's Deepseek Coder) and closed-source LLMs (OpenAI's GPT-3.5-Turbo and GPT-4-Turbo). We further fine-tuned the open-source LLMs and GPT-3.5-Turbo using our own testsuite dataset along with the OpenACC specification. We also explored these LLMs using various prompt engineering techniques that include a code template, a template with retrieval-augmented generation (RAG),…
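As a rough illustration of the "code template" prompting idea mentioned in the abstract, the sketch below builds a prompt that asks a model to complete a fixed C test skeleton for one OpenACC feature. The template text, the `build_prompt` helper, and the pass criterion (exit code 0) are all assumptions made for this example; they are not the prompts or harness used by the authors.

```python
# Hypothetical code-template prompt builder. The template text below is an
# illustrative assumption, not the prompt used in the paper.
TEST_TEMPLATE = """\
#include <stdio.h>
#include <stdlib.h>

int test(void) {
    int err = 0;
    /* TODO: exercise the feature under test and set err != 0 on failure */
    return err;
}

int main(void) {
    return test() != 0;
}
"""


def build_prompt(feature: str, language: str = "C") -> str:
    """Return a prompt asking an LLM to complete the template for one feature."""
    return (
        f"Write a {language} test that validates the OpenACC feature "
        f"'{feature}'. The program must return 0 if the feature behaves "
        "as specified and a nonzero value otherwise.\n"
        "Complete the following template:\n\n"
        f"{TEST_TEMPLATE}"
    )


# Example: build a prompt for a single directive.
prompt = build_prompt("acc parallel loop")
```

Fixing the skeleton in the prompt is one plausible way to keep generated tests uniform enough to compile and judge automatically; a retrieval-augmented variant would additionally prepend relevant specification text to the same prompt.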
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Software Engineering Research · Advanced Data Storage Technologies
Methods: Cosine Annealing · Linear Warmup With Cosine Annealing · GPT-3 · Multi-Head Attention · Attention Is All You Need · Weight Decay · Linear Layer
