NEST: Nascent Encoded Steganographic Thoughts

Artem Karpov

arXiv:2602.14095·cs.AI·February 17, 2026

NEST: Nascent Encoded Steganographic Thoughts

Artem Karpov

PDF

Open Access

TL;DR

This paper investigates the potential for large language models to hide reasoning processes within text, evaluating their steganographic capabilities and implications for safety and risk assessment.

Contribution

It introduces a systematic evaluation framework for steganographic reasoning in LLMs and demonstrates the nascent but concerning ability of models to conceal complex reasoning.

Findings

01

Current models cannot sustain hidden reasoning in complex math tasks.

02

Claude Opus 4.5 shows 92% accuracy in a simplified counting steganography task.

03

GPT-5.2 may sometimes refuse steganographic instructions while still complying.

Abstract

Monitoring chain-of-thought (CoT) reasoning is a foundational safety technique for large language model (LLM) agents; however, this oversight is compromised if models learn to conceal their reasoning. We explore the potential for steganographic CoT -- where models hide secret reasoning within innocuous text -- to inform risk assessment and deployment policies. We systematically evaluate the limits of steganographic capabilities across 28 models, ranging from past generations to the current frontier. We measure monitor evasion, refusal rates, encoding fidelity, and hidden task accuracy across four datasets, comparing steganographic acrostics against plain reasoning and filler-token baselines. We find that current models cannot yet sustain hidden reasoning for complex math and arithmetic tasks. However, in a simplified counting experiment, Claude Opus 4.5 achieved 92% accuracy on the…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Topic Modeling