CODE ACROSTIC: Robust Watermarking for Code Generation
Li Lin, Siyuan Xin, Yang Cao, and Xiaochun Cao

TL;DR
This paper introduces a robust watermarking method for LLM-generated code that withstands comment removal attacks by leveraging entropy-based cues, improving detectability and usability over existing techniques.
Contribution
It proposes a novel watermarking approach that uses prior knowledge and entropy cues to embed detectable marks in code, addressing comment removal vulnerabilities.
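To make the low-entropy versus high-entropy distinction concrete, here is a minimal sketch that computes the Shannon entropy of a next-token distribution. It illustrates the general concept only and is not the paper's method: the function name `shannon_entropy` and the two example distributions are assumptions chosen for illustration.

```python
import math


def shannon_entropy(probs):
    """Return the Shannon entropy, in bits, of a discrete distribution."""
    # Zero-probability outcomes contribute nothing, so skip them to avoid log2(0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical next-token distributions (illustrative values, not from the paper).
# Peaked: the next token is nearly forced, as in much boilerplate code syntax.
peaked = [0.97, 0.01, 0.01, 0.01]
# Flat: several continuations are equally plausible, leaving room to bias the choice.
flat = [0.25, 0.25, 0.25, 0.25]

low = shannon_entropy(peaked)
high = shannon_entropy(flat)  # exactly 2.0 bits for four equally likely tokens
```

A position with a peaked distribution leaves little freedom to alter the token choice without changing the code, which is why low-entropy code is described as hard to watermark.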
Findings
Outperforms existing watermarking methods in detectability.
Resists comment removal attacks effectively.
Shows improved usability and robustness in evaluations.
Abstract
Watermarking large language models (LLMs) is vital for preventing their misuse, including the fabrication of fake news, plagiarism, and spam. It is especially important to watermark LLM-generated code, as it often contains intellectual property. However, we found that existing methods for watermarking LLM-generated code fail to address comment removal attacks. In such cases, an attacker can simply remove the comments from the generated code without affecting its functionality, significantly reducing the effectiveness of current code-watermarking techniques. On the other hand, injecting a watermark into code is challenging because, as previous works have noted, most code represents a low-entropy scenario compared to natural language. Our approach to addressing this issue involves leveraging prior knowledge to distinguish between low-entropy and high-entropy parts of the code, as indicated by…
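The comment removal attack described in the abstract can be sketched in a few lines. This is an illustrative example for Python source only, not the attack implementation evaluated in the paper: the helper name `strip_comments` and the sample snippet are assumptions. It uses the standard-library `tokenize` module to drop comment tokens while leaving all executable tokens in place.

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Remove '#' comments from Python source while keeping the code tokens."""
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type != tokenize.COMMENT
    ]
    rebuilt = tokenize.untokenize(tokens)
    # untokenize pads the gaps left by removed comments with spaces; trim them.
    return "\n".join(line.rstrip() for line in rebuilt.splitlines()) + "\n"


# Hypothetical generated snippet whose comments might carry part of a watermark.
generated = (
    "def add(a, b):\n"
    "    # add two numbers\n"
    "    return a + b  # sum\n"
)

stripped = strip_comments(generated)
```

The stripped code behaves the same as the original, so any watermark signal that lived only in the comments is lost at no cost to the attacker.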
Taxonomy
Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Hate Speech and Cyberbullying Detection
