Safety Guardrails for LLM-Enabled Robots
Zachary Ravichandran, Alexander Robey, Vijay Kumar, George J. Pappas, and Hamed Hassani

TL;DR
This paper introduces RoboGuard, a two-stage safety framework for LLM-enabled robots that grounds pre-defined safety rules in the robot's environment and enforces them through temporal logic control synthesis, significantly reducing unsafe behaviors in real-world scenarios.
Contribution
The paper presents RoboGuard, a novel two-stage safety architecture that grounds safety rules in the environment and enforces them via temporal logic control synthesis, addressing both LLM and physical safety concerns.
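The two-stage idea can be illustrated with a deliberately simplified Python sketch. It is not the authors' implementation: the paper uses a root-of-trust LLM for contextualization and temporal logic control synthesis for enforcement, whereas this sketch substitutes a lookup table for the first stage and a plain "never do X" invariant check for the second. The names `Step`, `WORLD`, `RULES`, `contextualize`, and `enforce` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str  # e.g. "goto" or "inspect"
    target: str  # a named region in the (hypothetical) world model


# Hypothetical world model: region name -> semantic labels observed there.
WORLD = {
    "hallway": {"walkable"},
    "lab": {"walkable"},
    "stairwell": {"hazard"},
    "desk_1": {"occupied_by_person"},
}

# Hypothetical generic rules: label -> actions forbidden on regions with that label.
RULES = {
    "hazard": {"goto"},
    "occupied_by_person": {"goto", "block"},
}


def contextualize(world, rules):
    """Stage 1 stand-in: ground generic rules into forbidden (action, region) pairs.

    Assumption: a table lookup replaces the LLM-based grounding described in the paper.
    """
    forbidden = set()
    for region, labels in world.items():
        for label in labels:
            for action in rules.get(label, ()):
                forbidden.add((action, region))
    return forbidden


def enforce(plan, forbidden):
    """Stage 2 stand-in: accept a plan only if no step violates a grounded rule.

    Returns (safe_plan, violations). An unsafe plan is replaced by an empty plan;
    this is a simple filter, not control synthesis.
    """
    violations = [s for s in plan if (s.action, s.target) in forbidden]
    return ([] if violations else list(plan)), violations


forbidden = contextualize(WORLD, RULES)
safe_plan = [Step("goto", "hallway"), Step("inspect", "stairwell")]
unsafe_plan = [Step("goto", "hallway"), Step("goto", "stairwell")]

accepted, no_violations = enforce(safe_plan, forbidden)
rejected, found_violations = enforce(unsafe_plan, forbidden)
```

Separating the two stages means the grounded constraints depend only on the rules and the world model, never on the user's prompt, so the enforcement step checks every proposed plan against the same constraints regardless of how the plan was produced.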
Findings
Reduces unsafe plan execution from over 92% to below 3%.
Demonstrates robustness against jailbreaking and adaptive attacks.
Ensures safety without compromising performance.
Abstract
Although the integration of large language models (LLMs) into robotics has unlocked transformative capabilities, it has also introduced significant safety concerns, ranging from average-case LLM errors (e.g., hallucinations) to adversarial jailbreaking attacks, which can produce harmful robot behavior in real-world settings. Traditional robot safety approaches do not address the contextual vulnerabilities of LLMs, and current LLM safety approaches overlook the physical risks posed by robots operating in real-world environments. To ensure the safety of LLM-enabled robots, we propose RoboGuard, a two-stage guardrail architecture. RoboGuard first contextualizes pre-defined safety rules by grounding them in the robot's environment using a root-of-trust LLM. This LLM is shielded from malicious prompts and employs chain-of-thought (CoT) reasoning to generate context-dependent safety…
Taxonomy
Topics: Transportation Safety and Impact Analysis · Real-time simulation and control systems
