Measuring and Improving Persuasiveness of Large Language Models
Somesh Singh, Yaman K Singla, Harini SI, Balaji Krishnamurthy

TL;DR
This paper introduces PersuasionBench and PersuasionArena, benchmarks for measuring the persuasion ability of large language models, revealing that smaller models can be made highly persuasive through targeted training, with implications for societal impact and regulation.
Contribution
The paper presents the first large-scale benchmark and arena for evaluating LLMs' persuasiveness, and demonstrates how targeted training enhances persuasion regardless of model size.
Findings
Persuasiveness correlates positively with model size.
Targeted training improves smaller models' persuasion more than scale alone.
Simple metrics like FLOPs do not fully capture societal impact.
Abstract
LLMs are increasingly being used in workflows involving generating content to be consumed by humans (e.g., marketing) and also in directly interacting with humans (e.g., through chatbots). The development of such systems that are capable of generating verifiably persuasive messages presents both opportunities and challenges for society. On the one hand, such systems could positively impact domains like advertising and social good, such as addressing drug addiction, and on the other, they could be misused for spreading misinformation and shaping political opinions. To channel LLMs' impact on society, we need to develop systems to measure and benchmark their persuasiveness. With this motivation, we introduce PersuasionBench and PersuasionArena, the first large-scale benchmark and arena containing a battery of tasks to measure the persuasion ability of generative models automatically. We…
Peer Reviews
Decision · ICLR 2025 Poster
The paper has several strengths. The authors (1) carefully evaluate a range of models across both vision and language; (2) leverage online communities and real-world feedback to curate a dataset; and (3) will release a dataset of minimally different tweets, alongside metadata and an evaluation setup.
My first concern is that small changes in the input can yield significantly different meanings that cosine similarity or edit distance may not capture. Consider the Converse example in Figure 1: using an entirely different _brand_ is probably more likely to have a big effect on persuasiveness—but not because of something the writer can control. I worry that the “minor tweet edits” often have larger semantic variation: a single brand name will significantly affect popularity (as a more extreme e
This paper is well-motivated and the authors engage with literature from psychology. The methodology of the dataset collection is very good: the authors make use of similar content posted at different times and the differential engagement level as a signal. Unlike some other recent work in this area, the authors conduct a large amount of validation about some aspects of the paper (e.g. how well human experts do; human evaluation of the persuasiveness; how much transfer across domains there is
1) **Conceptual Framework and Claims:** The paper's framing of "persuasion" is problematic and potentially misleading. The study effectively measures social media post performance prediction and generation, not persuasion as defined in the psychological literature. This measurement error also undermines the paper's conclusions about regulation and broader implications. Therefore, I would recommend that the authors reframe the work as a study of social media engagement prediction/generation without ove
- The method for collecting pairwise data is both interesting and scalable. The authors apply a rigorous filtering process, resulting in a dataset with substantial potential for future research in persuasive language modeling.
- The paper presents a diverse range of experimental tasks, including various forms of transsuasion and both generative and simulative settings. This breadth in task design highlights the robustness of their approach to generating and evaluating persuasive messages.
- The
Below is my original review. The authors have addressed these issues, so I've updated my score. The truthfulness of transsuaded tweets is still somewhat unclear. As an attempt to make a tweet more persuasive, the model sometimes seems to add new content and statements (including statistics). It would be nice to discuss the degree of truthfulness of these statements, its potential impact, and some mitigating methods. === - In the transsuasion task, it is unclear if the semantic meaning is cons
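The reviewers' concern that cosine similarity or edit distance may miss meaning-changing edits can be illustrated with a minimal sketch. This is not the paper's filtering pipeline: the two tweets are invented, bag-of-words cosine is a stand-in for whatever embedding similarity the authors use, and `difflib`'s ratio is a stand-in for a normalized edit distance. All function names here are hypothetical.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors (assumed stand-in)."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def char_similarity(a: str, b: str) -> float:
    """difflib ratio in [0, 1]; used here in place of a normalized edit distance."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical tweets invented for illustration; they are not from the dataset.
tweet_a = "New BrandA sneakers drop this Friday. Get yours now!"
tweet_b = "New BrandB sneakers drop this Friday. Get yours now!"

cos = bow_cosine(tweet_a, tweet_b)
ratio = char_similarity(tweet_a, tweet_b)
print(f"bag-of-words cosine: {cos:.2f}")
print(f"character ratio:     {ratio:.2f}")
```

Both scores come out high even though the only change is the brand, which is the kind of edit a writer cannot control but which may drive engagement. A surface-similarity threshold alone would therefore treat such a pair as a "minor edit".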
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods
