ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind
Peixuan Han, Zijia Liu, Jiaxuan You

TL;DR
ToMAP introduces a novel opponent-aware training method for large language models, enhancing their persuasion capabilities by incorporating Theory of Mind modules, leading to more effective and diverse arguments in dialogue.
Contribution
The paper presents ToMAP, a new approach that integrates Theory of Mind modules into LLM persuaders, significantly improving their opponent modeling and persuasion effectiveness.
Findings
ToMAP outperforms much larger baselines such as GPT-4o, with a relative gain of 39.4% on persuasion tasks.
ToMAP generates more diverse and effective arguments with complex reasoning.
ToMAP is suitable for long conversations and employs logical, opponent-aware strategies.
Abstract
Large language models (LLMs) have shown promising potential in persuasion, but existing works on training LLM persuaders are still preliminary. Notably, while humans are skilled in modeling their opponent's thoughts and opinions proactively and dynamically, current LLMs struggle with such Theory of Mind (ToM) reasoning, resulting in limited diversity and opponent awareness. To address this limitation, we introduce Theory of Mind Augmented Persuader (ToMAP), a novel approach for building more flexible persuader agents by incorporating two theory of mind modules that enhance the persuader's awareness and analysis of the opponent's mental state. Specifically, we begin by prompting the persuader to consider possible objections to the target central claim, and then use a text encoder paired with a trained MLP classifier to predict the opponent's current stance on these counterclaims. Our…
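The stance-prediction step described in the abstract (a text encoder followed by an MLP classifier that scores the opponent's stance on each counterclaim) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `embed` is a hypothetical hashed bag-of-words stand-in for a real text encoder, `StanceMLP` is a hypothetical one-hidden-layer network with random, untrained weights, and reading the output as "probability that the opponent agrees with the counterclaim" is an assumption about the label. The example dialogue and counterclaims are made up.

```python
import math
import random
import zlib

EMBED_DIM = 32  # hypothetical embedding size, chosen only for this sketch


def embed(text, dim=EMBED_DIM):
    """Placeholder encoder (assumption): hashed bag-of-words, L2-normalised.

    A real system would call a pretrained text encoder here instead.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


class StanceMLP:
    """Hypothetical one-hidden-layer MLP; weights are random, not trained."""

    def __init__(self, in_dim, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
        self.b2 = 0.0

    def predict_proba(self, features):
        # ReLU hidden layer followed by a sigmoid output unit.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(self.w1, self.b1)]
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


def predict_stances(dialogue, counterclaims, clf):
    """Score each counterclaim against the opponent's dialogue so far."""
    context = embed(dialogue)
    # Input features: dialogue embedding concatenated with claim embedding.
    return {claim: clf.predict_proba(context + embed(claim))
            for claim in counterclaims}


if __name__ == "__main__":
    clf = StanceMLP(in_dim=2 * EMBED_DIM)
    scores = predict_stances(
        "I think homework helps students practice.",
        ["Homework causes stress.", "Homework takes away family time."],
        clf,
    )
    for claim, score in scores.items():
        print(f"{score:.3f}  {claim}")
```

Because the weights are untrained, the printed scores carry no meaning; the sketch only shows the data flow from dialogue and counterclaim text to a per-counterclaim stance score that a persuader could condition on.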
Peer Reviews
Decision: Submitted to ICLR 2026
The paper is overall well-written and clear. The proposed framework makes intuitive sense. I appreciate the authors’ relatively thorough ablation studies and the human-validation of the LLM judge model.
The results are inconsistent overall and somewhat unconvincing. 1) Looking at the results in Table 1, whether or not the proposed framework works very much seems to depend on the persuadee model: the RL setting outperforms ToMAP when LLaMa3.1 is the persuadee, and SFT seems to outperform both RL and ToMAP when Phi-4 is the persuadee model. The simple averaging of scores in the "Avg." column obscures these crucial inconsistencies and paints an overly optimistic picture. Additionally, agg
1. The reinforcement learning setup is carefully defined, with a detailed reward formulation, ToM module integration, and a transparent training pipeline, which can be a good step for social LLMs. 2. Experiments span multiple datasets, persuasive contexts, and opponent models, showing generalizability. 3. The strategy taxonomy and qualitative examples give a decent idea of how ToMAP produces more persuasive arguments.
1. Though ablations exist, it remains partially unclear to me whether the improvements stem from the ToM features themselves or from increased contextual conditioning capacity. 2. The critic is noisy and prone to reward hacking; the RM is an LLM, and, as the authors themselves discuss, recent work shows that LLMs are not adept at it. 3. To claim that RL+ToM drives the improvement, the confounder of ToM needs to be removed to see whether it is RL (i.e. supervision) or RL+ToM that is responsible
1. Adding ToM into persuasion is something I haven’t seen much of, and I think it’s really innovative. It’s a fresh approach that feels like it could be the next step in making LLMs more human-like in how they reason. 2. The experiments are thorough and convincing. They test ToMAP against a range of baselines and show how much better it performs. The ablation study showed that both ToM modules are important. 3. It could be very useful in real-world scenarios and has excellent application prospect
1. There is too little human evaluation. Section 5.1 only had human judges evaluate 50 conversations, then stated that LLM and human judgments were aligned, and that was it. 50 samples is far too few, and only the Qwen-7B persuadee was tested. 2. The long conversation experiment has problems. The paper mentions long conversations but skips details on how to avoid repetition within them. Figure 5 says that RL plateaus and even declines after 3 turns, while ToMAP keeps increasing. But you trained it using 3 turns
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · AI in Service Interactions
