Persuasion and Safety in the Era of Generative AI
Haein Kong

TL;DR
This paper investigates the persuasive capabilities of large language models, developing a taxonomy and dataset to distinguish ethical persuasion from manipulation, thereby contributing to AI safety and ethical guidelines in generative AI.
Contribution
It introduces a taxonomy of persuasive techniques, creates a human-annotated dataset, and evaluates LLMs' ability to differentiate between ethical persuasion and manipulation.
Findings
Developed a comprehensive taxonomy of persuasive techniques
Created a dataset annotated by humans for training and evaluation
Evaluated LLMs' ability to distinguish between persuasion and manipulation
Abstract
As large language models (LLMs) achieve advanced persuasive capabilities, concerns about their potential risks have grown. The EU AI Act prohibits AI systems that use manipulative or deceptive techniques to undermine informed decision-making, highlighting the need to distinguish between rational persuasion, which engages reason, and manipulation, which exploits cognitive biases. My dissertation addresses the lack of empirical studies in this area by developing a taxonomy of persuasive techniques, creating a human-annotated dataset, and evaluating LLMs' ability to distinguish rational persuasion from manipulation. This work contributes to AI safety by providing resources to mitigate the risks of persuasive AI and fostering discussions on ethical persuasion in the age of generative AI.
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Misinformation and Its Impacts
