DeCon: Detecting Incorrect Assertions via Postconditions Generated by a Large Language Model

Hao Yu; Tianyu Chen; Jiaming Huang; Zongyang Li; Dezhi Ran; Xinyu Wang; Ying Li; Assaf Marron; David Harel; Yuan Xie; Tao Xie

arXiv:2501.02901·cs.SE·January 7, 2025

Open Access

TL;DR

DeCon detects incorrect assertions in test cases generated by large language models. It checks each assertion against LLM-generated postconditions, using a small set of I/O examples, and detects over 64% of incorrect assertions.

Contribution

DeCon introduces a method that uses LLM-generated postconditions and a sample of I/O examples to identify incorrect assertions in LLM-generated test cases.

Findings

01. Detects over 64% of incorrect assertions

02. Improves code generation effectiveness by 4% in terms of Pass@1

03. Maintains high fault-finding ability after incorrect assertions are filtered out
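For readers unfamiliar with the metric, Pass@1 is usually reported through the pass@k estimator that is common in code-generation evaluations. The sketch below shows that estimator for a single problem and its mean over a benchmark. Whether the paper computes Pass@1 in exactly this way is an assumption, and the function names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of generated solutions, c: how many of them pass all tests,
    k: sampling budget. For k == 1 this reduces to c / n.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset must contain at least one passing solution.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

With this definition, a problem where 3 of 10 generated solutions pass contributes roughly 0.3 to Pass@1.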

Abstract

Recently, given the docstring for the target problem and the target function signature, large language models (LLMs) have been used not only to generate source code, but also to generate test cases, consisting of test inputs and assertions (e.g., in the form of checking an actual output against the expected output). However, as shown by our empirical study on assertions generated by four LLMs for the HumanEval benchmark, over 62% of the generated assertions are incorrect (i.e., failed on the ground-truth problem solution). To detect incorrect assertions (given the docstring and the target function signature along with a sample of example inputs and outputs), in this paper, we propose a new approach named DeCon to effectively detect incorrect assertions via LLM-generated postconditions for the target problem (a postcondition is a predicate that must always be true just after the…
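The detection idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the example problem (return the sorted distinct elements of a list), the hand-written predicates standing in for LLM-generated postconditions, and the helper names `filter_postconditions` and `is_suspect` are all hypothetical. It also assumes that the I/O examples are used to discard postconditions that reject a known-correct pair, which the truncated abstract does not spell out.

```python
from typing import Any, Callable, List, Tuple

# A postcondition is a predicate over (input, output) that should hold
# for every correct output of the target function.
Postcondition = Callable[[Any, Any], bool]
IOPair = Tuple[Any, Any]

# Hand-written stand-ins for LLM-generated postconditions (assumption).
# The last one is deliberately wrong to show the filtering step.
candidates: List[Postcondition] = [
    lambda xs, out: out == sorted(out),         # output is sorted
    lambda xs, out: len(set(out)) == len(out),  # output has no duplicates
    lambda xs, out: set(out) == set(xs),        # same elements as the input
    lambda xs, out: len(out) == len(xs),        # wrong: ignores deduplication
]

# A small sample of known-correct input/output examples.
io_examples: List[IOPair] = [([3, 1, 3], [1, 3]), ([2, 2], [2])]


def holds(post: Postcondition, inp: Any, out: Any) -> bool:
    """Evaluate a postcondition, treating a crash as a violation."""
    try:
        return bool(post(inp, out))
    except Exception:
        return False


def filter_postconditions(
    posts: List[Postcondition], examples: List[IOPair]
) -> List[Postcondition]:
    """Keep only postconditions that accept every known-correct example."""
    return [p for p in posts if all(holds(p, i, o) for i, o in examples)]


def is_suspect(assertion: IOPair, posts: List[Postcondition]) -> bool:
    """Flag an assertion whose expected output violates any postcondition."""
    inp, expected = assertion
    return not all(holds(p, inp, expected) for p in posts)


kept = filter_postconditions(candidates, io_examples)

# Each generated assertion is modeled as (test input, expected output).
generated_assertions: List[IOPair] = [
    ([5, 4, 5], [4, 5]),     # consistent with all kept postconditions
    ([5, 4, 5], [5, 4]),     # expected output is not sorted
    ([1, 1, 2], [1, 1, 2]),  # expected output contains duplicates
]
flags = [is_suspect(a, kept) for a in generated_assertions]
```

In this sketch an assertion is flagged without running any solution code: only its expected output is checked against the surviving postconditions. A flagged assertion is therefore a candidate for removal, not a proven error, since the postconditions themselves may be imperfect.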

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research