Towards Code Watermarking with Dual-Channel Transformations
Borui Yang, Wei Li, Liyao Xiang, Bo Li

TL;DR
This paper introduces SrcMarker, a novel source code watermarking system that uses dual-channel transformations and learning-based techniques to embed ownership identifiers into code without affecting its functionality or readability.
Contribution
The paper presents a new watermarking approach that combines AST-based transformations with learning modules, enabling language-agnostic, unobtrusive, and end-to-end trainable source code watermarking.
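To make the idea of hiding an ID bitstring in a semantics-preserving AST transformation concrete, here is a minimal toy sketch. It is not SrcMarker's method: it uses one hand-written rule instead of learned modules, it works only on Python via the standard `ast` module, and the names `embed` and `extract` are hypothetical. It assumes Python 3.9+ (for `ast.unparse`) and that the compared operands are built-in numbers, so that `a < b` and `b > a` are interchangeable.

```python
import ast

# Illustrative assumption: each simple comparison carries one bit.
# The "less-than" spelling means 0, the mirrored "greater-than" spelling means 1.
MIRROR = {ast.Lt: ast.Gt, ast.Gt: ast.Lt, ast.LtE: ast.GtE, ast.GtE: ast.LtE}
BIT_OF = {ast.Lt: 0, ast.LtE: 0, ast.Gt: 1, ast.GtE: 1}


def _carriers(tree):
    """Return, in a fixed traversal order, the comparisons that can carry a bit.

    Only single-operator comparisons between plain names or constants are used,
    so swapping the operands cannot reorder side effects.
    """
    simple = (ast.Name, ast.Constant)
    return [
        node for node in ast.walk(tree)
        if isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and type(node.ops[0]) in MIRROR
        and isinstance(node.left, simple)
        and isinstance(node.comparators[0], simple)
    ]


def embed(source, bits):
    """Rewrite comparisons in `source` so their orientation spells out `bits`."""
    tree = ast.parse(source)
    carriers = _carriers(tree)
    if len(bits) > len(carriers):
        raise ValueError("not enough carrier sites for the bitstring")
    for node, bit in zip(carriers, bits):
        if BIT_OF[type(node.ops[0])] != bit:
            # Mirror the comparison: a < b becomes b > a (same truth value).
            node.left, node.comparators[0] = node.comparators[0], node.left
            node.ops[0] = MIRROR[type(node.ops[0])]()
    return ast.unparse(tree)


def extract(source, n_bits):
    """Read the first `n_bits` bits back from the comparison orientations."""
    carriers = _carriers(ast.parse(source))
    return [BIT_OF[type(node.ops[0])] for node in carriers[:n_bits]]


original = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""

marked = embed(original, [1, 0])
print(marked)
print(extract(marked, 2))
```

In this sketch the watermarked function behaves like the original for numeric inputs, and the two bits are recovered purely from how the comparisons are written. A rule this simple is easy to spot and undo, which is the kind of limitation that motivates combining transformations with learned components.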
Findings
SrcMarker outperforms existing watermarking methods in various metrics.
The system effectively preserves code semantics and readability.
It demonstrates robustness across multiple programming languages.
Abstract
The expansion of the open source community and the rise of large language models have raised ethical and security concerns on the distribution of source code, such as misconduct on copyrighted code, distributions without proper licenses, or misuse of the code for malicious purposes. Hence it is important to track the ownership of source code, in which watermarking is a major technique. Yet, drastically different from natural languages, source code watermarking requires far stricter and more complicated rules to ensure the readability as well as the functionality of the source code. Hence we introduce SrcMarker, a watermarking system to unobtrusively encode ID bitstrings into source code, without affecting the usage and semantics of the code. To this end, SrcMarker performs transformations on an AST-based intermediate representation that enables unified transformations across different…
Taxonomy
Topics: Advanced Malware Detection Techniques · Internet Traffic Analysis and Secure E-voting · Network Security and Intrusion Detection
