Zero-Shot Detection of LLM-Generated Code via Approximated Task Conditioning

Maor Ashkenazi; Ofir Brenner; Tal Furman Shohet; Eran Treister

arXiv:2506.06069·cs.CL·June 9, 2025

Open Access

TL;DR

This paper introduces a zero-shot method that detects LLM-generated code by approximating the task behind a code snippet and evaluating the code conditioned on that approximation. The method outperforms previous approaches and works across multiple programming languages without access to the original generator or prompts.

Contribution

The paper proposes a novel zero-shot detection technique that approximates task conditioning to identify LLM-generated code, achieving state-of-the-art results across various benchmarks and languages.

Findings

01. State-of-the-art detection accuracy achieved

02. Method generalizes across Python, C++, Java

03. Does not require access to original prompts or generator

Abstract

Detecting Large Language Model (LLM)-generated code is a growing challenge with implications for security, intellectual property, and academic integrity. We investigate the role of conditional probability distributions in improving zero-shot LLM-generated code detection, when considering both the code and the corresponding task prompt that generated it. Our key insight is that when evaluating the probability distribution of code tokens using an LLM, there is little difference between LLM-generated and human-written code. However, conditioning on the task reveals notable differences. This contrasts with natural language text, where differences exist even in the unconditional distributions. Leveraging this, we propose a novel zero-shot detection approach that approximates the original task used to generate a given code snippet and then evaluates token-level entropy under the approximated…
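
The abstract's central quantity, token-level entropy with and without task conditioning, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `token_entropy` and `mean_token_entropy` are hypothetical, and the next-token distributions are made-up numbers standing in for what a scoring LLM would output. The sketch does not reproduce the paper's task-approximation step or its detection score, which the truncated abstract does not spell out.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_token_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average per-token entropy over all token positions of a snippet."""
    if not distributions:
        raise ValueError("need at least one token position")
    return sum(token_entropy(d) for d in distributions) / len(distributions)


# Toy next-token distributions over a 4-token vocabulary, one list per
# token position. These numbers are invented for illustration only; in
# practice they would come from an LLM scoring the code with and without
# a task description placed in front of it.
unconditional = [[0.4, 0.3, 0.2, 0.1], [0.25, 0.25, 0.25, 0.25]]
task_conditioned = [[0.9, 0.05, 0.03, 0.02], [0.7, 0.1, 0.1, 0.1]]

h_uncond = mean_token_entropy(unconditional)
h_cond = mean_token_entropy(task_conditioned)
print(f"unconditional: {h_uncond:.3f} nats, task-conditioned: {h_cond:.3f} nats")
```

In this toy setup the task-conditioned distributions are sharper, so their mean entropy is lower. How such a statistic is turned into a detection decision is left open here, since the text above does not state it.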

Taxonomy

Topics: Software Engineering Research · Advanced Malware Detection Techniques · Authorship Attribution and Profiling