Rethinking the Intermediate Features in Adversarial Attacks: Misleading Robotic Models via Adversarial Distillation

Ke Zhao (1), Huayang Huang (1), Miao Li (1), Yu Wu (1) ((1) Wuhan University)

arXiv:2411.15222·cs.LG·November 26, 2024

Open Access

TL;DR

This paper introduces a novel adversarial prompt attack for language-conditioned robotic models, using continuous action optimization and intermediate feature analysis to improve attack success and transferability across tasks and models.

Contribution

The paper proposes a new adversarial attack method that leverages continuous action representations and intermediate feature gradients to expose robustness weaknesses in language-conditioned robotic models.

Findings

01 The attack outperforms existing methods on 13 manipulation tasks.

02 Adversarial prefixes effectively induce unintended robot actions.

03 The method demonstrates strong transferability across model variants.

Abstract

Language-conditioned robotic learning has significantly enhanced robot adaptability by enabling a single model to execute diverse tasks in response to verbal commands. Despite these advancements, security vulnerabilities within this domain remain largely unexplored. This paper addresses this gap by proposing a novel adversarial prompt attack tailored to language-conditioned robotic models. Our approach involves crafting a universal adversarial prefix that induces the model to perform unintended actions when added to any original prompt. We demonstrate that existing adversarial techniques exhibit limited effectiveness when directly transferred to the robotic domain due to the inherent robustness of discretized robotic action spaces. To overcome this challenge, we propose to optimize adversarial prefixes based on continuous action representations, circumventing the discretization process.…
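The abstract attributes the weakness of existing attacks to the robustness of discretized action spaces. The toy sketch below illustrates one way to read that claim: a small shift in a continuous action value can land in the same discrete bin, so an objective defined on the discrete output sees no change while the continuous value has moved. The bin count, the action range, the function name `discretize`, and the numeric values are all assumptions chosen for illustration and are not taken from the paper.

```python
# Hypothetical toy example: uniform discretization of a continuous action in [-1, 1].
# The bin count (256) and the range are illustrative assumptions, not values from the paper.

def discretize(action: float, num_bins: int = 256,
               low: float = -1.0, high: float = 1.0) -> int:
    """Map a continuous action value to the index of its uniform bin."""
    clipped = min(max(action, low), high)
    width = (high - low) / num_bins
    # The upper boundary would index one past the last bin, so clamp it.
    return min(int((clipped - low) / width), num_bins - 1)

clean = 0.5020      # continuous action for the original prompt (made-up value)
perturbed = 0.5050  # continuous action after a small adversarial shift (made-up value)

# The continuous representation registers the shift...
continuous_shift = abs(perturbed - clean)

# ...but both values fall into the same bin, so the discrete action is unchanged.
same_bin = discretize(clean) == discretize(perturbed)
```

In this sketch `same_bin` is true while `continuous_shift` is nonzero, which is the kind of signal an objective on continuous action representations could use and a discrete one would miss.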

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Anomaly Detection Techniques and Applications