Managing Escalation in Off-the-Shelf Large Language Models

Sebastian Elbaum; Jonathan Panter

arXiv:2508.01056·cs.ET·August 6, 2025

TL;DR

This paper demonstrates two simple, non-technical interventions that control the escalation tendencies of off-the-shelf large language models, and argues that these tendencies can be managed in national security applications rather than treated as a reason to avoid the models.

Contribution

It introduces two simple, non-technical interventions that substantially reduce escalation by large language models in the experimental wargame of a recent study, supporting the case for managing, rather than restricting, their use in national security.

Findings

01. The interventions substantially reduce escalation by LLMs throughout a wargame simulation.

02. Escalation tendencies in off-the-shelf LLMs can be controlled with simple, non-technical measures.

03. Calls to restrict the use of LLMs in national security applications are premature.

Abstract

U.S. national security customers have begun to utilize large language models, including enterprise versions of "off-the-shelf" models (e.g., ChatGPT) familiar to the public. This uptake will likely accelerate. However, recent studies suggest that off-the-shelf large language models frequently suggest escalatory actions when prompted with geopolitical or strategic scenarios. We demonstrate two simple, non-technical interventions to control these tendencies. Introducing these interventions into the experimental wargame design of a recent study, we substantially reduce escalation throughout the game. Calls to restrict the use of large language models in national security applications are thus premature. The U.S. government is already, and will continue, employing large language models for scenario planning and suggesting courses of action. Rather than warning against such applications,…
