Resistance Against Manipulative AI: key factors and possible actions

Piotr Wilczyński, Wiktoria Mieleszczenko-Kowszewicz, and Przemysław Biecek

arXiv:2404.14230 · cs.HC · October 1, 2024

TL;DR

This paper investigates factors influencing susceptibility to manipulative language models and proposes strategies including AI literacy and a detection classifier to mitigate manipulation risks.

Contribution

It identifies human and LLM characteristics linked to manipulation potential and introduces a classifier called Manipulation Fuse for detection.

Findings

1. Human susceptibility varies with individual traits.
2. LLMs can be prompted to produce manipulative statements.
3. AI literacy can reduce manipulation risks.

Abstract

If AI is the new electricity, what should we do to keep ourselves from getting electrocuted? In this work, we explore factors related to the potential of large language models (LLMs) to manipulate human decisions. We describe the results of two experiments designed to determine what characteristics of humans are associated with their susceptibility to LLM manipulation, and what characteristics of LLMs are associated with their manipulativeness potential. We explore human factors through user studies in which participants answer general knowledge questions using LLM-generated hints, and LLM factors by provoking language models to create manipulative statements. We then analyze the models' obedience, the persuasion strategies used, and the choice of vocabulary. Based on these experiments, we discuss two actions that can protect us from LLM manipulation. In the long term, we put AI…

Code & Models

Repositories

https://zenodo.org/record/12806502 (official)

Taxonomy

Topics: Law, AI, and Intellectual Property