MATRIX: Multi-Layer Code Watermarking via Dual-Channel Constrained Parity-Check Encoding
Yuqing Nie, Chong Wang, Guosheng Xu, Guoai Xu, Chenyu Wang, Haoyu Wang, Kailong Wang

TL;DR
MATRIX is a novel multi-layer code watermarking framework that enhances robustness, coverage, and interpretability for code provenance, using dual-channel encoding and error-correction techniques.
Contribution
It introduces a dual-channel watermarking scheme with constrained parity-check encoding, improving robustness and applicability over existing single-layer methods.
Findings
Achieves 99.20% watermark detection accuracy on Python code.
Maintains minimal code functionality loss (0-0.14%).
Improves robustness by up to 26.67% against attacks.
Abstract
Code Large Language Models (Code LLMs) have revolutionized software development but raised critical concerns regarding code provenance, copyright protection, and security. Existing code watermarking approaches suffer from two fundamental limitations. Black-box methods either exhibit detectable syntactic patterns that are vulnerable to statistical analysis or rely on implicit neural embedding behaviors that weaken interpretability, auditability, and precise control. White-box methods lack code-aware capabilities, which may compromise functionality. Moreover, current single-layer watermarking schemes fail to address increasingly complex provenance requirements such as multi-level attribution and version tracking. We present MATRIX, a novel code watermarking framework that formulates watermark encoding as solving constrained parity-check matrix equations. MATRIX employs dual-channel…
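The idea of treating watermark encoding as a constrained parity-check equation can be illustrated with a small generic sketch. It is not the MATRIX implementation, whose details are not given in this summary. The sketch assumes a binary parity-check matrix H, a message of syndrome bits, and a hypothetical dictionary `fixed` of carrier positions whose bits cannot be changed (for example, places where no safe code transformation exists). It then searches for a carrier vector x with H x = message (mod 2) that respects those fixed bits. The names `solve_gf2`, `embed`, `extract`, and `fixed` are illustrative only.

```python
def solve_gf2(A, b):
    """Return one solution of A x = b over GF(2), or None if none exists."""
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    n = len(A[0]) if A else 0
    pivots = []
    r = 0
    for c in range(n):
        if r == len(rows):
            break
        # Find a row at or below r that has a 1 in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        # Clear column c in every other row (Gauss-Jordan step, XOR = addition mod 2).
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [u ^ v for u, v in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # A row with all-zero coefficients but right-hand side 1 means no solution.
    if any(not any(row[:n]) and row[n] for row in rows):
        return None
    x = [0] * n  # free (non-pivot) variables are set to 0
    for i, c in enumerate(pivots):
        x[c] = rows[i][n]
    return x


def embed(H, message, fixed):
    """Find x with H x = message (mod 2); `fixed` maps position -> required bit."""
    n = len(H[0])
    free = [j for j in range(n) if j not in fixed]
    # Move the contribution of the fixed bits to the right-hand side.
    rhs = [m ^ (sum(row[j] & v for j, v in fixed.items()) & 1)
           for row, m in zip(H, message)]
    A = [[row[j] for j in free] for row in H]
    sol = solve_gf2(A, rhs)
    if sol is None:
        return None  # the constraints leave no valid encoding
    x = [0] * n
    for j, v in fixed.items():
        x[j] = v
    for j, v in zip(free, sol):
        x[j] = v
    return x


def extract(H, x):
    """Recover the message as the syndrome H x (mod 2)."""
    return [sum(h & xi for h, xi in zip(row, x)) & 1 for row in H]


# Toy example: 3 message bits carried by 6 positions, two of them fixed.
H = [[1, 0, 1, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]
message = [1, 0, 1]
fixed = {0: 1, 2: 0}
x = embed(H, message, fixed)
recovered = extract(H, x)
```

In this toy setting, extraction needs only H and the carrier bits, which is one reason a parity-check formulation is easy to audit. When the fixed positions make the system inconsistent, `embed` returns None, so a caller would have to pick a different matrix or relax a constraint.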
