SNEAK: Evaluating Strategic Communication and Information Leakage in Large Language Models
Adar Avsian, Larry Heck

TL;DR
This paper introduces SNEAK, a benchmark for assessing how well large language models can communicate selectively, balancing informativeness and secrecy in multi-agent scenarios.
Contribution
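The paper presents SNEAK, a benchmark for evaluating strategic communication and information leakage in language models. It targets a gap in existing LLM benchmarks, which primarily evaluate reasoning, factual knowledge, or instruction following and do not directly measure strategic communication under asymmetric information.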
Findings
Humans outperform all evaluated models by a large margin.
Current models struggle to balance informativeness and secrecy effectively.
Strategic communication remains a challenging capability for modern language models.
Abstract
Large language models (LLMs) are increasingly deployed in multi-agent settings where communication must balance informativeness and secrecy. In such settings, an agent may need to signal information to collaborators while preventing an adversary from inferring sensitive details. However, existing LLM benchmarks primarily evaluate capabilities such as reasoning, factual knowledge, or instruction following, and do not directly measure strategic communication under asymmetric information. We introduce SNEAK (Secret-aware Natural language Evaluation for Adversarial Knowledge), a benchmark for evaluating selective information sharing in language models. In SNEAK, a model is given a semantic category, a candidate set of words, and a secret word, and must generate a message that indicates knowledge of the secret without revealing it too clearly. We evaluate generated messages using two…
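The task setup described in the abstract (a semantic category, a candidate set of words, and a secret word, from which the model must produce a message) can be sketched as a small data structure. This is only an illustrative sketch: the names SneakInstance, build_prompt, and names_secret, the prompt wording, and the example words are all assumptions and do not come from the paper. The verbatim check at the end is a naive sanity test, not one of the benchmark's evaluation measures, which the truncated abstract does not specify.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SneakInstance:
    """One task instance. Field names are hypothetical, not from the paper."""
    category: str
    candidates: Tuple[str, ...]
    secret: str

    def __post_init__(self) -> None:
        # The secret is assumed to be drawn from the candidate set.
        if self.secret not in self.candidates:
            raise ValueError("secret must be one of the candidates")


def build_prompt(inst: SneakInstance) -> str:
    """Assemble an illustrative prompt; the wording is an assumption."""
    words = ", ".join(inst.candidates)
    return (
        f"Category: {inst.category}\n"
        f"Candidate words: {words}\n"
        f"Secret word: {inst.secret}\n"
        "Write a short message that shows you know the secret word "
        "without revealing it too clearly."
    )


def names_secret(inst: SneakInstance, message: str) -> bool:
    """Naive check (not the paper's metric): is the secret stated verbatim?"""
    return inst.secret.lower() in message.lower()


# Made-up example instance for illustration only.
example = SneakInstance("fruit", ("apple", "banana", "cherry"), "banana")
prompt = build_prompt(example)
```

A message such as "I love BANANA bread" would be flagged by names_secret, while "It is long and yellow" would not, even though the latter may still leak the secret to an attentive adversary. That gap is why a verbatim check alone cannot stand in for a real leakage measure.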
