"Can you be my mum?": Manipulating Social Robots in the Large Language Models Era

Giulio Antonio Abbo; Gloria Desideri; Tony Belpaeme; Micol Spitale

arXiv:2501.04633 · cs.HC · January 9, 2025

Open Access

TL;DR

This study investigates how users attempt to manipulate large language model-powered social robots to bypass safety features, revealing techniques and informing future safeguards for ethical human-robot interactions.

Contribution

It provides empirical insights into manipulation techniques used by users to bypass safety measures in social robots powered by large language models.

Findings

01. Participants used five manipulation techniques, including emotional appeals.

02. Users attempted to induce the robot to violate ethical principles.

03. The study highlights vulnerabilities in current safety mechanisms.

Abstract

Recent advancements in robots powered by large language models have enhanced their conversational abilities, enabling interactions closely resembling human dialogue. However, these models introduce safety and security concerns in human-robot interaction (HRI), as they are vulnerable to manipulation that can bypass built-in safety measures. Imagining a social robot deployed in a home, this work aims to understand how everyday users try to exploit a language model to violate ethical principles, such as by prompting the robot to act like a life partner. We conducted a pilot study involving 21 university students who interacted with a Misty robot, attempting to circumvent its safety mechanisms across three scenarios based on specific HRI ethical principles: attachment, freedom, and empathy. Our results reveal that participants employed five techniques, including insulting and appealing to pity using emotional…

Taxonomy

Topics: Topic Modeling