Linguistic Steganalysis via LLMs: Two Modes for Efficient Detection of Strongly Concealed Stego
Yifan Tang, Yihao Wang, Ru Zhang, Jianyi Liu

TL;DR
This paper introduces LSGC, a linguistic steganalysis method that uses large language models in two modes, generation and classification, to detect strongly concealed stegos, especially those generated by LLMs. The authors report state-of-the-art results.
Contribution
The paper proposes a dual-mode LSGC framework that leverages LLMs for efficient detection of concealed stegos, significantly improving accuracy and reducing training time.
Findings
LSGC achieves state-of-the-art detection performance.
Classification mode reduces training time while maintaining high accuracy.
Generation mode effectively explains whether texts are stegos.
Abstract
To detect stego (steganographic text) in complex scenarios, linguistic steganalysis (LS) methods with various motivations have been proposed and have achieved excellent performance. However, with the development of generative steganography, some stegos are strongly concealed. Especially since the emergence of LLM-based steganography, existing LS methods detect them poorly or not at all. We designed a novel LS method with two modes, called LSGC. In the generation mode, we created an LS-task "description" and used the generation ability of the LLM to explain whether the texts to be detected are stegos. On this basis, we rethought the principles of LS and LLMs and proposed the classification mode. In this mode, LSGC deletes the LS-task "description" and uses "causalLM" LLMs to extract steganographic features. The LS features can be extracted in only one pass of the model, and a linear layer with initialization…
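The data flow of the classification mode described above (one pass through a causal model, then a linear layer over the extracted features) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `toy_causal_hidden_states` is a hypothetical stand-in for a real LLM, pooling by the final token position is one plausible choice that the abstract does not confirm, and the random head weights are untrained placeholders rather than the initialization the paper refers to.

```python
import math
import random

LABELS = ("cover", "stego")
DIM = 8  # hypothetical feature width; a real LLM's hidden size is far larger


def toy_causal_hidden_states(token_ids, dim=DIM):
    """Hypothetical stand-in for a causal LM forward pass.

    Returns one hidden vector per token. Position t depends only on
    tokens 0..t, mimicking left-to-right (causal) context.
    """
    states, running = [], [0.0] * dim
    for t, tok in enumerate(token_ids):
        rng = random.Random(tok)  # deterministic per-token "embedding"
        emb = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        # Running mean over the prefix: a crude proxy for causal mixing.
        running = [(r * t + e) / (t + 1) for r, e in zip(running, emb)]
        states.append(list(running))
    return states


def linear(features, weights, bias):
    """Linear layer: logits[k] = sum_j weights[k][j] * features[j] + bias[k]."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(token_ids, weights, bias):
    """Single pass -> pooled feature -> linear head -> label probabilities."""
    hidden = toy_causal_hidden_states(token_ids, dim=len(weights[0]))
    features = hidden[-1]  # assumption: pool by taking the final position
    probs = softmax(linear(features, weights, bias))
    return dict(zip(LABELS, probs))


# Untrained placeholder head, used here only to check shapes and data flow.
_rng = random.Random(0)
W = [[_rng.gauss(0.0, 0.1) for _ in range(DIM)] for _ in LABELS]
B = [0.0, 0.0]

if __name__ == "__main__":
    print(classify([101, 7, 42, 9], W, B))
```

In a causal model the final position is the only one whose state reflects the whole input, which is why a single pass can suffice for feature extraction, whereas the generation mode must decode an explanation token by token.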
Taxonomy
Topics: Internet Traffic Analysis and Secure E-voting · Hate Speech and Cyberbullying Detection · Legal Language and Interpretation
Methods: Linear Layer
