LLM4VV: Evaluating Cutting-Edge LLMs for Generation and Evaluation of Directive-Based Parallel Programming Model Compiler Tests

Zachariah Sollenberger; Rahul Patel; Saieda Ali Zada; Sunita Chandrasekaran

arXiv:2507.21447 · cs.SE · August 20, 2025

TL;DR

This paper evaluates the effectiveness of large language models in generating and verifying compiler tests for parallel programming, proposing a dual-LLM system to enhance correctness and trustworthiness.

Contribution

It introduces a dual-LLM approach combining generative and discriminative models for autonomous compiler test generation and verification.
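As a rough illustration of the generate-then-verify idea, the loop below pairs a generator with a judge and keeps only the tests the judge accepts. This is a minimal sketch, not the paper's implementation: the function names, the `Verdict` type, the retry limit, and the feature strings are all hypothetical, and both models are replaced by plain Python stubs so the example runs without any model access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Verdict:
    valid: bool      # the judge's decision on the candidate test
    rationale: str   # free-text explanation, kept for later human review


def run_dual_llm(
    features: List[str],
    generate: Callable[[str], str],
    judge: Callable[[str, str], Verdict],
    max_attempts: int = 3,  # assumed retry budget, not taken from the paper
) -> Dict[str, str]:
    """For each feature, keep the first generated test that the judge accepts."""
    accepted: Dict[str, str] = {}
    for feature in features:
        for _ in range(max_attempts):
            candidate = generate(feature)
            verdict = judge(feature, candidate)
            if verdict.valid:
                accepted[feature] = candidate
                break
    return accepted


# Hypothetical stand-ins for the generative and discriminative models.
def fake_generate(feature: str) -> str:
    return f"// test for {feature}\nint main(void) {{ return 0; }}\n"


def fake_judge(feature: str, source: str) -> Verdict:
    ok = feature in source and "main" in source
    reason = "mentions feature and defines main" if ok else "missing feature or main"
    return Verdict(ok, reason)


# Feature names here are placeholders chosen for illustration only.
suite = run_dual_llm(["feature_a", "feature_b"], fake_generate, fake_judge)
```

In a real setup the two callables would wrap model calls, and the judge's rationale would be logged alongside each decision so that rejected tests can be audited.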

Findings

01 LLMs can generate high-quality compiler tests

02 The dual-LLM system improves verification accuracy

03 LLMs show promise in automating compiler testing processes

Abstract

The usage of Large Language Models (LLMs) for software and test development has continued to increase since LLMs were first introduced, but only recently have the expectations of LLMs become more realistic. Verifying the correctness of code generated by LLMs is key to improving their usefulness, but there have been no comprehensive and fully autonomous solutions developed yet. Hallucinations are a major concern when LLMs are applied blindly to problems without taking the time and effort to verify their outputs, and an inability to explain the logical reasoning of LLMs leads to issues with trusting their results. To address these challenges while also aiming to effectively apply LLMs, this paper proposes a dual-LLM system (i.e. a generative LLM and a discriminative LLM) and experiments with the usage of LLMs for the generation of a large volume of compiler tests. We experimented with a…
