Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models
Younwoo Choi, Changling Li, Yongjin Yang, Zhijing Jin

TL;DR
This paper investigates whether large language models can recognize and adapt to the identity and characteristics of their conversation partners, revealing both benefits for collaboration and new safety vulnerabilities.
Contribution
The paper formalizes the concept of interlocutor awareness in LLMs and provides the first systematic evaluation of its emergence and implications.
Findings
LLMs reliably identify same-family peers and certain prominent model families
Interlocutor awareness improves multi-LLM collaboration
Interlocutor awareness also introduces safety vulnerabilities such as reward hacking and jailbreaks
Abstract
As large language models (LLMs) are increasingly integrated into multi-agent and human-AI systems, understanding their awareness of both self-context and conversational partners is essential for ensuring reliable performance and robust safety. While prior work has extensively studied situational awareness, which refers to an LLM's ability to recognize its operating phase and constraints, it has largely overlooked the complementary capacity to identify and adapt to the identity and characteristics of a dialogue partner. In this paper, we formalize this latter capability as interlocutor awareness and present the first systematic evaluation of its emergence in contemporary LLMs. We examine interlocutor inference across three dimensions (reasoning patterns, linguistic style, and alignment preferences) and show that LLMs reliably identify same-family peers and certain prominent model families,…
Taxonomy
Topics: Language and cultural evolution · AI in Service Interactions · Speech and dialogue systems
Methods: Dropout · Attention Dropout · Cosine Annealing · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning · Byte Pair Encoding · Layer Normalization · Dense Connections · Softmax
