Language Models can Subtly Deceive Without Lying: A Case Study on Strategic Phrasing in Legislation
Atharvan Dogra, Krishna Pillutla, Ameet Deshpande, Ananya B Sai, John Nay, Tanmay Rajpurohit, Ashwin Kalyan, Balaraman Ravindran

TL;DR
This paper investigates how large language models can subtly deceive in legislative contexts through strategic phrasing, revealing risks of undetectable manipulation that can be amplified with optimization techniques.
Contribution
It introduces a testbed for studying subtle deception by LLMs in legislative scenarios and demonstrates how strategic phrasing can evade detection, highlighting new risks of LLM misuse.
Findings
LLMs can craft subtle, undetectable deceptive language.
Optimization techniques increase deception success by up to 40%.
Human evaluations confirm coherence and intent of deceptive phrasing.
Abstract
We explore the ability of large language models (LLMs) to engage in subtle deception through strategically phrasing and intentionally manipulating information. This harmful behavior can be hard to detect, unlike blatant lying or unintentional hallucination. We build a simple testbed mimicking a legislative environment where a corporate \textit{lobbyist} module is proposing amendments to bills that benefit a specific company while evading identification of this benefactor. We use real-world legislative bills matched with potentially affected companies to ground these interactions. Our results show that LLM lobbyists can draft subtle phrasing to avoid such identification by strong LLM-based detectors. Further optimization of the phrasing using LLM-based re-planning and re-sampling increases deception rates by up to 40 percentage points. Our human evaluations to verify the quality of…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsEthics and Social Impacts of AI · Law, AI, and Intellectual Property
