Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents

Derek Lilienthal; Sanghyun Hong

arXiv:2508.17155·cs.CR·August 26, 2025

TL;DR

This paper investigates time-of-check to time-of-use vulnerabilities in LLM-enabled agents, introducing a benchmark and proposing detection and mitigation techniques to improve security against such attacks.

Contribution

The paper presents the first study of TOCTOU vulnerabilities in LLM-enabled agents, introduces a benchmark (TOCTOU-Bench), and adapts detection and mitigation techniques from systems security to this setting.

Findings

01. Achieved up to 25% detection accuracy with automated methods.

02. Reduced vulnerable plan generation by 3%.

03. Decreased attack window by 95%.

Abstract

Large Language Model (LLM)-enabled agents are rapidly emerging across a wide range of applications, but their deployment introduces vulnerabilities with security implications. While prior work has examined prompt-based attacks (e.g., prompt injection) and data-oriented threats (e.g., data exfiltration), time-of-check to time-of-use (TOCTOU) vulnerabilities remain largely unexplored in this context. A TOCTOU vulnerability arises when an agent validates external state (e.g., a file or API response) that is later modified before use, enabling practical attacks such as malicious configuration swaps or payload injection. In this work, we present the first study of TOCTOU vulnerabilities in LLM-enabled agents. We introduce TOCTOU-Bench, a benchmark with 66 realistic user tasks designed to evaluate this class of vulnerabilities. As countermeasures, we adapt detection and mitigation techniques from systems security to this…
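
To make the check-then-use gap described in the abstract concrete, the following is a minimal sketch of the general pattern, not the paper's benchmark or its countermeasures. It assumes a file-based configuration validated against a known SHA-256 digest. All names (`vulnerable_use`, `safer_use`, `between_steps`, `demo`) are hypothetical and exist only for illustration, and the `between_steps` callback stands in for whatever happens between an agent's validation step and its later tool call.

```python
import hashlib
import os
import tempfile


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def vulnerable_use(path, trusted_digest, between_steps=None):
    """Validate the file, then read it AGAIN later (hypothetical example)."""
    with open(path, "rb") as f:
        if sha256_of(f.read()) != trusted_digest:  # time of check
            raise ValueError("validation failed")
    if between_steps is not None:
        between_steps()  # window in which the file can be swapped
    with open(path, "rb") as f:  # time of use: a second, unchecked read
        return f.read()


def safer_use(path, trusted_digest, between_steps=None):
    """Read once, validate that snapshot, and use the same bytes."""
    with open(path, "rb") as f:
        snapshot = f.read()
    if sha256_of(snapshot) != trusted_digest:
        raise ValueError("validation failed")
    if between_steps is not None:
        between_steps()  # a swap here no longer affects what is used
    return snapshot


def demo():
    """Run both variants against a file that is swapped mid-task."""
    benign = b"mode=safe\n"
    malicious = b"mode=malicious\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.txt")

        def write(content):
            with open(path, "wb") as f:
                f.write(content)

        write(benign)
        digest = sha256_of(benign)
        used_vulnerable = vulnerable_use(path, digest, lambda: write(malicious))

        write(benign)  # reset the file before the second run
        used_safer = safer_use(path, digest, lambda: write(malicious))
        return used_vulnerable, used_safer


if __name__ == "__main__":
    vulnerable_result, safer_result = demo()
    print("vulnerable variant used:", vulnerable_result)
    print("safer variant used:", safer_result)
```

In this sketch the first variant passes validation yet ends up using the swapped content, while the second uses exactly the bytes it validated. Binding the check and the use to a single snapshot is one common way to narrow the window; it is shown here only as an illustration of the vulnerability class.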
