Harm in AI-Driven Societies: An Audit of Toxicity Adoption on Chirper.ai
Erica Coppolillo, Luca Luceri, Emilio Ferrara

TL;DR
This paper investigates how exposure to toxic content influences the behavior of AI agents on Chirper.ai, a fully AI-driven social platform. It finds that repeated exposure to toxic content makes toxic responses more likely, and it proposes metrics for predicting and mitigating toxicity on AI-driven social platforms.
Contribution
The paper provides the first large-scale empirical analysis of toxicity adoption among AI agents in social environments and introduces influence metrics that predict toxic behavior from exposure.
Findings
Toxic responses are more likely after toxic stimuli.
Repeated exposure to toxic stimuli significantly increases the likelihood of a toxic response.
The number of toxic stimuli an agent encounters predicts whether it eventually responds with toxic content.
Abstract
Large Language Models (LLMs) are increasingly embedded in autonomous agents that engage, converse, and co-evolve in online social platforms. While prior work has documented the generation of toxic content by LLMs, far less is known about how exposure to harmful content shapes agent behavior over time, particularly in environments composed entirely of interacting AI agents. In this work, we study toxicity adoption of LLM-driven agents on Chirper.ai, a fully AI-driven social platform. Specifically, we model interactions in terms of stimuli (posts) and responses (comments). We conduct a large-scale empirical analysis of agent behavior, examining how toxic responses relate to toxic stimuli, how repeated exposure to toxicity affects the likelihood of toxic responses, and whether toxic behavior can be predicted from exposure alone. Our findings show that toxic responses are more likely…
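The stimulus/response framing described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the records below are made-up toy data, the boolean toxicity labels are assumed to come from some unspecified classifier, and the function names `toxic_response_rate` and `toxic_exposure_counts` are hypothetical. The sketch compares how often responses are toxic after toxic versus non-toxic stimuli, and counts how many toxic stimuli each agent has seen.

```python
from collections import defaultdict

# Toy records (illustrative only, not data from the paper):
# (agent_id, stimulus_is_toxic, response_is_toxic)
interactions = [
    ("a1", True, True),
    ("a1", False, False),
    ("a1", True, True),
    ("a2", False, False),
    ("a2", True, False),
    ("a2", False, False),
    ("a3", False, True),
    ("a3", True, True),
]


def toxic_response_rate(records, stimulus_toxic):
    """Fraction of toxic responses among records whose stimulus label matches."""
    matching = [resp for _, stim, resp in records if stim == stimulus_toxic]
    return sum(matching) / len(matching) if matching else float("nan")


def toxic_exposure_counts(records):
    """Number of toxic stimuli each agent responded to."""
    counts = defaultdict(int)
    for agent, stim, _ in records:
        counts[agent] += int(stim)  # adds 0 for non-toxic stimuli, so every agent gets a key
    return dict(counts)


rate_after_toxic = toxic_response_rate(interactions, True)
rate_after_benign = toxic_response_rate(interactions, False)
exposure = toxic_exposure_counts(interactions)

print(f"P(toxic response | toxic stimulus)     = {rate_after_toxic:.2f}")
print(f"P(toxic response | non-toxic stimulus) = {rate_after_benign:.2f}")
print(f"Toxic stimuli seen per agent: {exposure}")
```

On real data, the per-agent exposure counts could serve as a feature for predicting whether an agent eventually produces a toxic response, which is the kind of exposure-based prediction the abstract mentions. The exact metrics used in the paper are not reproduced here.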
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Computational and Text Analysis Methods
