Semantic Differentiation for Tackling Challenges in Watermarking Low-Entropy Constrained Generation Outputs

Nghia T. Le; Alan Ritter; Kartik Goyal

arXiv:2601.11629·cs.CR·January 21, 2026

Open Access · 4 Reviews

TL;DR

This paper introduces SeqMark, a sequence-level watermarking algorithm that uses semantic differentiation to improve watermark detection in low-entropy, constrained language model outputs, addressing limitations of existing token-level methods.

Contribution

SeqMark is a novel sequence-level watermarking approach that mitigates region collapse and balances output quality, detectability, and imperceptibility in constrained generation tasks.

Findings

1. Up to 28% increase in watermark detection F1 score.
2. Improved output quality in constrained tasks like translation and summarization.
3. Effectiveness demonstrated across multiple generation tasks.

Abstract

We demonstrate that while the current approaches for language model watermarking are effective for open-ended generation, they are inadequate at watermarking LM outputs for constrained generation tasks with low-entropy output spaces. Therefore, we devise SeqMark, a sequence-level watermarking algorithm with semantic differentiation that balances the output quality, watermark detectability, and imperceptibility. It improves on the shortcomings of the prevalent token-level watermarking algorithms that cause under-utilization of the sequence-level entropy available for constrained generation tasks. Moreover, we identify and improve upon a different failure mode we term region collapse, associated with prior sequence-level watermarking algorithms. This occurs because the pseudorandom partitioning of semantic space for watermarking in these approaches causes all high-probability outputs to…

Peer Reviews

Decision·ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. The paper effectively identifies and tackles the poor watermark performance under low-entropy constrained generation by analyzing sequence entropy in the continuous embedding space. It also introduces the novel region collapse phenomenon and proposes a principled design to mitigate it.
2. The use of LSH partitioning on a transformed high-quality subspace is conceptually elegant and ensures the watermark remains semantically hidden while maintaining output naturalness.
3. Written perspective i

Weaknesses

1. Mean-centering does not fundamentally alter angular relationships in the embedding space; it merely shifts the cluster mean. Thus, the claim that LSH over the transformed subspace eliminates collapse is not theoretically substantiated. The increase in region entropy could be stochastic, not a structural guarantee.
2. The mean-centering transformation is purely heuristic. The paper provides no theoretical justification or citation for why subtracting the sample mean should effectively decorrel

Reviewer 02 · Rating 4 · Confidence 3

Strengths

The paper addresses the hard and important task of watermarking LLM outputs in tasks with low-entropy generations. The paper provides intuitive explanations of the failure mode of existing semantic watermarking methods. Empirical results demonstrate the effectiveness of the proposed method.

Weaknesses

1. The paper would benefit from providing an algorithm with detailed descriptions of the full procedure. The proposed method uses locality-sensitive hashing (LSH) to partition the embedding space, as some previous work does, but it would still be beneficial to explain how the partition works. The rejection-sampling generation and detection procedures for red/green-list-type methods are relatively standard, but it would still be good to provide the detailed procedure, especially considering the g
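For readers unfamiliar with the "relatively standard" detection step the reviewer mentions, the following is a minimal sketch of a one-proportion z-test over green-region counts, as commonly used in red/green-list watermarking. It is an assumption that SeqMark's detector takes this form; the function names, the green ratio `gamma`, and the threshold `z_threshold` are hypothetical and chosen only for illustration.

```python
import math


def green_fraction_z_score(num_green: int, num_total: int, gamma: float) -> float:
    """One-proportion z-score: how far the observed green count sits above
    the count expected by chance when a fraction `gamma` of regions is green."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    expected = gamma * num_total
    std = math.sqrt(num_total * gamma * (1.0 - gamma))
    return (num_green - expected) / std


def is_watermarked(num_green: int, num_total: int,
                   gamma: float = 0.5, z_threshold: float = 4.0) -> bool:
    """Flag a text as watermarked when the z-score exceeds the threshold.
    Both defaults are illustrative assumptions, not values from the paper."""
    return green_fraction_z_score(num_green, num_total, gamma) > z_threshold


if __name__ == "__main__":
    # 75 of 100 units (tokens or sentences) landed in green regions.
    print(green_fraction_z_score(75, 100, 0.5))
    print(is_watermarked(75, 100), is_watermarked(55, 100))
```

In a sequence-level scheme the counted units would be sentences or whole outputs rather than tokens, which is one reason short, low-entropy outputs give the test little evidence to work with.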

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- The identification of "region collapse" is a novel and insightful critique of existing semantic watermarking methods. The empirical evidence in Table 1 (showing low region entropy for baselines) strongly supports this claim.
- The presentation is good and clear.

Weaknesses

1. The core premise of SeqMark is to "isolate the high-quality output subspace". However, the paper implements this by "sampling *n* high likelihood responses... under low temperature". The authors incorrectly and fatally equate **high probability** with **high quality**. This assumption is flawed. The method only ensures that *high-probability* samples are spread across partitions, not necessarily *high-quality* ones. This fails to solve the paper's stated problem: ensuring *high-quality* opti

Reviewer 04 · Rating 4 · Confidence 4

Strengths

1. The “region collapse” concept is well-motivated and contrasted with LSH/k-means behavior in SemStamp/k-SemStamp, including a helpful schematic and an explanation of why constrained tasks exacerbate it.
2. The mean-centering transform is elegant, cheap, and shown to spread high-quality candidates across partitions, increasing “region entropy” and reducing cosine similarity among candidates.
3. Evaluations cover sentence translation, paragraph translation, summarization, and open-ended generat
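The interaction between mean-centering, LSH partitions, and "region entropy" noted in the strengths above can be illustrated on synthetic data. The sketch below is not the authors' implementation: it assumes random-hyperplane LSH, uses Gaussian vectors in place of real sentence embeddings, and takes region entropy to be the Shannon entropy of the region-assignment histogram. All names and parameter values are hypothetical.

```python
import numpy as np


def lsh_region_ids(embeddings: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Assign each embedding to a region: one bit per hyperplane
    (which side of the plane it falls on), packed into an integer id."""
    bits = (embeddings @ hyperplanes.T) > 0
    weights = 1 << np.arange(hyperplanes.shape[0])
    return bits.astype(np.int64) @ weights


def region_entropy(region_ids: np.ndarray) -> float:
    """Shannon entropy (in bits) of the empirical region distribution."""
    _, counts = np.unique(region_ids, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


rng = np.random.default_rng(0)
dim, n_candidates, n_bits = 64, 200, 3

# Synthetic stand-in for candidates in a constrained task: a tight cluster
# that sits far from the origin, so the vectors are nearly parallel.
center = 5.0 * rng.normal(size=dim)
candidates = center + 0.1 * rng.normal(size=(n_candidates, dim))
hyperplanes = rng.normal(size=(n_bits, dim))

raw_entropy = region_entropy(lsh_region_ids(candidates, hyperplanes))
centered = candidates - candidates.mean(axis=0)  # mean-centering transform
centered_entropy = region_entropy(lsh_region_ids(centered, hyperplanes))

print(f"region entropy, raw:      {raw_entropy:.3f} bits")
print(f"region entropy, centered: {centered_entropy:.3f} bits")
```

With `n_bits = 3` the maximum possible region entropy is 3 bits. On this toy cluster the raw candidates tend to fall into very few regions, while the centered candidates spread across most of them. Whether the same holds for real encoder embeddings is exactly what the reviews debate.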

Weaknesses

1. Results are mainly point metrics (P/R/F1). ROC/PR curves, confidence intervals, and explicit operating thresholds across tasks would give a clearer picture of false-positive trade-offs, especially for human vs. non-watermarked-LM negatives.
2. This paper focuses on entropy and semantic watermarking. However, the discussion of related work is limited, and more related work should be discussed.
3. How sensitive is SeqMark to the encoder family (domain/language mismatch), and to alternative
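As an illustration of the confidence intervals requested in weakness 1, the following sketch computes a percentile-bootstrap interval around a detection F1 score. It assumes binary labels (1 = watermarked) and binary detector decisions. The function names, the resample count, and the toy data are hypothetical and do not come from the paper.

```python
import numpy as np


def f1_score(labels: np.ndarray, preds: np.ndarray) -> float:
    """F1 for the positive class (1 = watermarked)."""
    tp = np.sum((labels == 1) & (preds == 1))
    fp = np.sum((labels == 0) & (preds == 1))
    fn = np.sum((labels == 1) & (preds == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0


def bootstrap_f1_ci(labels, preds, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample examples with replacement, recompute F1,
    and report the point estimate with the (alpha/2, 1 - alpha/2) quantiles."""
    labels, preds = np.asarray(labels), np.asarray(preds)
    rng = np.random.default_rng(seed)
    n = len(labels)
    scores = np.empty(n_resamples)
    for i in range(n_resamples):
        idx = rng.integers(0, n, size=n)
        scores[i] = f1_score(labels[idx], preds[idx])
    low, high = np.quantile(scores, [alpha / 2, 1 - alpha / 2])
    return f1_score(labels, preds), float(low), float(high)


if __name__ == "__main__":
    # Toy data: 4 watermarked and 4 non-watermarked texts.
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 1, 0, 0, 0]
    point, low, high = bootstrap_f1_ci(labels, preds)
    print(f"F1 = {point:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

On evaluation sets of realistic size the interval is much narrower than on this eight-example toy set; the same resampling loop can be reused for precision and recall.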

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Advanced Malware Detection Techniques