Linguistic Steganalysis via LLMs: Two Modes for Efficient Detection of Strongly Concealed Stego

Yifan Tang; Yihao Wang; Ru Zhang; Jianyi Liu

arXiv:2406.04218·cs.CL·June 24, 2024

Open Access

TL;DR

This paper introduces LSGC, a novel linguistic steganalysis method using large language models with two modes—generation and classification—to effectively detect strongly concealed stegos, especially those generated by LLMs, achieving state-of-the-art results.

Contribution

The paper proposes a dual-mode LSGC framework that leverages LLMs for efficient detection of concealed stegos, significantly improving accuracy and reducing training time.

Findings

01. LSGC achieves state-of-the-art detection performance.

02. Classification mode reduces training time while maintaining high accuracy.

03. Generation mode effectively explains whether texts are stegos.

Abstract

To detect stego (steganographic text) in complex scenarios, linguistic steganalysis (LS) methods with various motivations have been proposed and have achieved excellent performance. However, with the development of generative steganography, some stegos are strongly concealed; especially since the emergence of LLM-based steganography, existing LS methods detect them poorly or not at all. We designed a novel LS method with two modes, called LSGC. In the generation mode, we created an LS-task "description" and used the generation ability of an LLM to explain whether the texts to be detected are stegos. On this basis, we rethought the principles of LS and LLMs and proposed the classification mode. In this mode, LSGC removes the LS-task "description" and uses "causalLM" LLMs to extract steganographic features. The LS features can be extracted in only one pass of the model, and a linear layer with initialization…
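
The classification mode described in the abstract (one pass through a causal model, then a linear layer) can be illustrated with a toy sketch. This is not the authors' implementation: `ToyCausalEncoder` is a hypothetical stand-in for a real causal LLM, `PAD`, `last_token_feature`, and `LinearHead` are illustrative names, pooling at the last non-padding token is an assumption, and the Gaussian weight initialization is arbitrary because the abstract is cut off before it specifies the initialization.

```python
import math
import random

PAD = 0  # hypothetical padding token id (right-padding assumed)


class ToyCausalEncoder:
    """Stand-in for a causal LM: the state at position t depends only on tokens 0..t."""

    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.emb = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                    for _ in range(vocab_size)]

    def hidden_states(self, token_ids):
        states, running = [], [0.0] * self.dim
        for t, tok in enumerate(token_ids, start=1):
            running = [r + e for r, e in zip(running, self.emb[tok])]
            states.append([r / t for r in running])  # causal running mean
        return states


def last_token_feature(encoder, token_ids):
    """Run the encoder once and keep the state at the last non-padding position."""
    states = encoder.hidden_states(token_ids)
    last = max(i for i, tok in enumerate(token_ids) if tok != PAD)
    return states[last]


class LinearHead:
    """Single linear layer plus sigmoid, mapping a feature to P(stego)."""

    def __init__(self, dim, seed=1):
        rng = random.Random(seed)
        # Arbitrary small Gaussian init; the source does not state its scheme here.
        self.w = [rng.gauss(0.0, 0.02) for _ in range(dim)]
        self.b = 0.0

    def prob_stego(self, feature):
        z = sum(w * x for w, x in zip(self.w, feature)) + self.b
        return 1.0 / (1.0 + math.exp(-z))


encoder = ToyCausalEncoder(vocab_size=10, dim=4)
head = LinearHead(dim=4)
feature = last_token_feature(encoder, [3, 5, 2, PAD, PAD])
p = head.prob_stego(feature)  # untrained, so the value is not meaningful yet
```

Because the toy encoder is causal, right-padding does not change the pooled feature, which is why a single forward pass per text is enough to feed the linear layer. In a real setting the encoder would be a pretrained causal LLM and the head would be trained on labeled cover and stego texts.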

Taxonomy

Topics: Internet Traffic Analysis and Secure E-voting · Hate Speech and Cyberbullying Detection · Legal Language and Interpretation

Methods: Linear Layer