Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy
Yu Fu, Deyi Xiong, Yue Dong

TL;DR
This paper identifies challenges in watermarking for AI detection in conditional text generation and proposes a semantic-aware watermarking method that improves performance across multiple models and tasks.
Contribution
Introduces a semantic-aware watermarking algorithm that enhances text generation quality while maintaining effective AI detection capabilities.
Findings
Substantial improvement over existing watermarking for BART and Flan-T5 models
Gains shown on summarization and data-to-text generation tasks
Maintains detection accuracy with minimal impact on perplexity
Abstract
To mitigate potential risks associated with language models, recent AI detection research proposes incorporating watermarks into machine-generated text through random vocabulary restrictions and utilizing this information for detection. While these watermarks only induce a slight deterioration in perplexity, our empirical investigation reveals a significant detriment to the performance of conditional text generation. To address this issue, we introduce a simple yet effective semantic-aware watermarking algorithm that considers the characteristics of conditional text generation and the input context. Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models, including BART and Flan-T5, in tasks such as summarization and data-to-text generation while maintaining detection ability.
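The "random vocabulary restrictions" mentioned in the abstract can be illustrated with a minimal Python sketch in the style of green-list watermarks. It is not the paper's semantic-aware algorithm, which the abstract does not specify. The function names, the hashing scheme, and the parameters `gamma` (fraction of the vocabulary that is favored), `delta` (logit bonus), and `key` are illustrative assumptions.

```python
import hashlib
import math
import random


def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5, key: int = 42) -> set[int]:
    """Pseudo-randomly pick a 'green' subset of the vocabulary, seeded by the previous token.

    The seeding scheme (SHA-256 of a secret key and the previous token id) is an
    assumption made for this sketch.
    """
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))


def watermark_logits(logits: list[float], prev_token: int, delta: float = 2.0,
                     gamma: float = 0.5, key: int = 42) -> list[float]:
    """Add a bonus `delta` to the logits of green tokens, nudging generation toward them."""
    green = green_list(prev_token, len(logits), gamma, key)
    return [x + delta if i in green else x for i, x in enumerate(logits)]


def detect_z_score(tokens: list[int], vocab_size: int, gamma: float = 0.5, key: int = 42) -> float:
    """Count how many tokens fall in the green list of their predecessor and return a z-score.

    Under the null hypothesis (no watermark), each token is green with probability gamma.
    """
    n = len(tokens) - 1  # number of (previous, current) token pairs
    hits = sum(tok in green_list(prev, vocab_size, gamma, key)
               for prev, tok in zip(tokens, tokens[1:]))
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

A detector would flag text whose z-score exceeds a chosen threshold. Because the green list is chosen without regard to the input, a token needed to stay faithful to the source may be penalized. This is one plausible reading of why such restrictions hurt conditional generation more than perplexity alone suggests.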
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Softmax · Layer Normalization · Dense Connections · Dropout · Adam
