On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks

Zesen Liu; Tianshuo Cong; Xinlei He; Qi Li

arXiv:2407.04794·cs.CR·December 2, 2024

Open Access

TL;DR

This paper systematically evaluates the robustness of various watermarking schemes for machine-generated texts against different attacks, revealing vulnerabilities and emphasizing the need for more resilient solutions.

Contribution

It categorizes watermarking schemes and attacks, conducts extensive experiments, and provides insights into their robustness and imperceptibility, highlighting areas for improvement.

Findings

01 Post-text attacks are more effective than pre-text attacks.

02 Pre-text watermarks are more imperceptible and maintain text quality.

03 Current watermarking schemes are vulnerable to combined attacks.

Abstract

Large Language Models (LLMs) excel in various applications, including text generation and complex tasks. However, the misuse of LLMs raises concerns about the authenticity and ethical implications of the content they produce, such as deepfake news, academic fraud, and copyright infringement. Watermarking techniques, which embed identifiable markers in machine-generated text, offer a promising solution to these issues by allowing for content verification and origin tracing. Unfortunately, the robustness of current LLM watermarking schemes under potential watermark removal attacks has not been comprehensively explored. In this paper, to fill this gap, we first systematically comb the mainstream watermarking schemes and removal attacks on machine-generated texts, and then we categorize them into pre-text (before text generation) and post-text (after text generation) classes so that we…

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Network Security and Intrusion Detection