Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-based Decision-Making Systems

Ruochen Jiao; Shaoyuan Xie; Justin Yue; Takami Sato; Lixu Wang; Yixuan Wang; Qi Alfred Chen; Qi Zhu

arXiv:2405.20774 · cs.CR · May 1, 2025 · 1 citation

TL;DR

This paper introduces a comprehensive framework for backdoor attacks on embodied LLM-based decision systems, revealing significant security vulnerabilities and demonstrating highly effective attack methods across multiple models and tasks.

Contribution

The paper systematically explores attack surfaces and trigger mechanisms, and proposes three backdoor attack mechanisms (word injection, scenario manipulation, and knowledge injection) that expose vulnerabilities in embodied AI systems.

Findings

1. Word injection and knowledge injection attacks achieve nearly 100% success rates.

2. Scenario manipulation attacks achieve success rates above 65%, reaching up to 90%.

3. The attacks are resilient against existing defenses.

Abstract

Large Language Models (LLMs) have shown significant promise in real-world decision-making tasks for embodied artificial intelligence, especially when fine-tuned to leverage their inherent common sense and reasoning abilities while being tailored to specific applications. However, this fine-tuning process introduces considerable safety and security vulnerabilities, especially in safety-critical cyber-physical systems. In this work, we propose the first comprehensive framework for Backdoor Attacks against LLM-based Decision-making systems (BALD) in embodied AI, systematically exploring the attack surfaces and trigger mechanisms. Specifically, we propose three distinct attack mechanisms: word injection, scenario manipulation, and knowledge injection, targeting various components in the LLM-based decision-making pipeline. We perform extensive experiments on representative LLMs (GPT-3.5,…

Videos

Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems · SlidesLive

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Information and Cyber Security