General Framework for Reversible Data Hiding in Texts Based on Masked Language Modeling

Xiaoyan Zheng; Yurun Fang; Hanzhou Wu

arXiv:2206.10112·cs.CR·August 4, 2022

Open Access

TL;DR

This paper introduces a framework for reversible data hiding in texts based on masked language models. Both the original text and the secret data can be perfectly recovered from the marked text, which remains fluent and secure, and no models need to be shared.

Contribution

It presents a novel reversible data hiding method leveraging masked language modeling, reducing side information and enhancing security compared to prior approaches.

Findings

01 Secret information is embedded and extracted successfully.

02 Marked texts maintain high fluency and semantic quality.

03 No shared language model is needed, which reduces side information.

Abstract

With the fast development of natural language processing, recent advances in information hiding focus on covertly embedding secret information into texts. These algorithms either modify a given cover text or directly generate a text containing secret information. However, they are not reversible, meaning that the original text not carrying secret information cannot be perfectly recovered unless a large amount of side information is shared in advance. To tackle this problem, in this paper, we propose a general framework to embed secret information into a given cover text, for which the embedded information and the original cover text can be perfectly retrieved from the marked text. The main idea of the proposed method is to use a masked language model to generate such a marked text that the cover text can be reconstructed by collecting the words of some positions and the words of the other…
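The reversibility idea in the abstract can be illustrated with a toy sketch. It is not the paper's algorithm: the abstract is truncated, so everything here is an assumption made for illustration. The function `rank_candidates` is a hypothetical stand-in for a masked language model (a hash-based ranking over a four-word vocabulary). The sketch also assumes that filler words are inserted between every pair of adjacent cover words and that each filler carries two bits. Sender and receiver call the same ranking function, which is a simplification of the sketch and not a claim about the paper.

```python
import hashlib

# Hypothetical stand-in for a masked language model: given the words to the
# left and right of a masked slot, return a deterministic candidate ranking.
VOCAB = ["very", "quite", "really", "rather"]  # 4 candidates -> 2 bits per slot

def rank_candidates(left, right):
    def score(word):
        return hashlib.sha256(f"{left}|{word}|{right}".encode()).hexdigest()
    return sorted(VOCAB, key=score)

def embed(cover, bits):
    """Insert one filler word between adjacent cover words; each filler
    encodes two payload bits as its index in the candidate ranking."""
    slots = len(cover) - 1
    if len(bits) > 2 * slots:
        raise ValueError("payload too long for this cover text")
    bits = bits.ljust(2 * slots, "0")  # pad unused capacity with zeros
    marked = []
    for i, word in enumerate(cover):
        marked.append(word)
        if i < slots:
            index = int(bits[2 * i:2 * i + 2], 2)
            marked.append(rank_candidates(word, cover[i + 1])[index])
    return marked

def extract(marked):
    """Recover the cover text from the even positions and the payload bits
    from the odd positions."""
    cover = marked[0::2]
    bits = ""
    for i, filler in enumerate(marked[1::2]):
        index = rank_candidates(cover[i], cover[i + 1]).index(filler)
        bits += format(index, "02b")
    return cover, bits

cover = "the cat sat down".split()
payload = "011011"
marked = embed(cover, payload)
recovered_cover, recovered_bits = extract(marked)
print(" ".join(marked))
print(recovered_cover == cover, recovered_bits == payload)
```

Because the cover words sit untouched at known positions, the receiver recovers them by selection alone, and the context needed to re-rank the candidates is available without any side information about the original text. A real masked language model would replace the hash ranking so that the inserted words read fluently, which this toy makes no attempt to do.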

Taxonomy

Topics: Advanced Steganography and Watermarking Techniques · Privacy-Preserving Technologies in Data · Face recognition and analysis