Position Paper: Programming Language Techniques for Bridging LLM Code Generation Semantic Gaps

Yalong Du; Chaozheng Wang; Huaijin Wang

arXiv:2507.09135 · cs.SE · July 15, 2025

TL;DR

This paper advocates for integrating programming language techniques with large language models to improve the semantic accuracy, reliability, and trustworthiness of automatically generated code.

Contribution

The paper proposes a structured approach that combines programming language (PL) techniques with LLMs to address semantic gaps and improve the correctness and interpretability of generated code.

Findings

01. Structured program representations improve code clarity.

02. Formal guarantees increase trustworthiness of generated code.

03. Verification mechanisms reduce semantic errors.

Abstract

Large Language Models have demonstrated remarkable capabilities in automated code generation, yet their statistical nature and black-box characteristics create significant semantic gaps manifested through syntax errors, semantic hallucinations, and reliability concerns. This position paper argues that principled integration of Programming Language (PL) techniques is essential for bridging these gaps. Through structured program representations, formal correctness guarantees, and robust verification mechanisms, PL techniques can elevate LLM-generated code from statistical pattern matching to truly reliable and trustworthy levels. This integration is crucial for developing systems that generate code that is not only functionally correct but also interpretable, verifiable, and ultimately trustworthy.
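To make the idea of a verification mechanism over a structured program representation concrete, here is a minimal sketch, not taken from the paper. It assumes the generated code is Python and that the caller knows which function the model was asked to produce. The helper name `check_generated_code` and its parameters are hypothetical, chosen only for illustration. The sketch parses the model's output into an abstract syntax tree and rejects it if parsing fails or the requested function is missing.

```python
import ast
from typing import List


def check_generated_code(source: str, required_function: str) -> List[str]:
    """Return a list of problems found in generated source (empty if none).

    Hypothetical helper for illustration only: a syntax gate plus one
    structural check on the parsed AST.
    """
    try:
        # Parsing turns raw text into a structured representation (the AST).
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error at line {exc.lineno}: {exc.msg}"]

    # Collect the names of all function definitions anywhere in the tree.
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    problems = []
    if required_function not in defined:
        problems.append(f"missing function: {required_function}")
    return problems


if __name__ == "__main__":
    candidate = "def add(a, b):\n    return a + b\n"
    print(check_generated_code(candidate, "add"))
```

A check like this catches only syntax errors and one structural omission. It says nothing about semantic correctness, which is where the type checking, testing, and formal verification the abstract alludes to would come in.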
