One Model, All Roles: Multi-Turn, Multi-Agent Self-Play Reinforcement Learning for Conversational Social Intelligence

Bowen Jiang; Taiwei Shi; Ryo Kamoi; Yuan Yuan; Camillo J. Taylor; Longqi Yang; Pei Zhou; Sihao Chen

arXiv:2602.03109 · cs.CL · February 4, 2026

Open Access

TL;DR

This paper presents OMAR, a reinforcement learning framework enabling a single AI model to develop social intelligence through multi-turn, multi-agent self-play, capturing complex social norms and behaviors in conversations.

Contribution

The paper introduces OMAR, a multi-agent self-play reinforcement learning approach that trains a single model to play every role in a conversation and, through that interaction, to learn complex social behaviors.
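The "one model, all roles" idea can be illustrated with a minimal sketch. The code below is not the authors' implementation: the `Policy` type, the round-robin speaker order, the role prompts, and `toy_policy` are all hypothetical stand-ins chosen for illustration. It shows only the structural point that one shared policy produces every participant's turn, conditioned on that participant's role prompt and the dialogue so far.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a policy maps (role prompt, dialogue history) to an utterance.
Turn = Tuple[str, str]  # (speaker name, utterance)
Policy = Callable[[str, List[Turn]], str]


def self_play_episode(policy: Policy,
                      role_prompts: Dict[str, str],
                      max_turns: int) -> List[Turn]:
    """Roll out one conversation in which a single policy speaks for every role.

    Assumption: speakers alternate in round-robin order; a real environment
    may schedule turns differently.
    """
    history: List[Turn] = []
    roles = list(role_prompts)
    for turn_index in range(max_turns):
        speaker = roles[turn_index % len(roles)]
        # The same policy is called for every speaker; only the role prompt changes.
        utterance = policy(role_prompts[speaker], list(history))
        history.append((speaker, utterance))
    return history


def toy_policy(role_prompt: str, history: List[Turn]) -> str:
    """Stand-in for a language model: echoes the role prompt and the turn count."""
    return f"[{role_prompt}] reply #{len(history)}"


episode = self_play_episode(
    toy_policy,
    {"buyer": "You want a low price.", "seller": "You want a high price."},
    max_turns=4,
)
for speaker, utterance in episode:
    print(f"{speaker}: {utterance}")
```

In an actual training setup, `toy_policy` would be replaced by sampling from the language model being trained, and every turn in the episode would contribute to that same model's update.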

Findings

01. Trained models develop emergent empathy, persuasion, and compromise-seeking skills.

02. Social norms are learned directly from self-play interaction, without human supervision.

03. Emergent social intelligence appears even in competitive scenarios.

Abstract

This paper introduces OMAR: One Model, All Roles, a reinforcement learning framework that enables AI to develop social intelligence through multi-turn, multi-agent conversational self-play. Unlike traditional paradigms that rely on static, single-turn optimizations, OMAR allows a single model to role-play all participants in a conversation simultaneously, learning to achieve long-term goals and complex social norms directly from dynamic social interaction. To ensure training stability across long dialogues, we implement a hierarchical advantage estimation that calculates turn-level and token-level advantages. Evaluations in the SOTOPIA social environment and Werewolf strategy games show that our trained models develop fine-grained, emergent social intelligence, such as empathy, persuasion, and compromise seeking, demonstrating the effectiveness of learning collaboration even under…
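The abstract states only that the hierarchical advantage estimation computes turn-level and token-level advantages; it does not give the formulas. The sketch below is one plausible reading, not the paper's method. It assumes (1) several rollouts are sampled for the same scenario, (2) each turn has a scalar reward, (3) the turn-level advantage is that reward normalized across rollouts at the same turn index, and (4) the token-level advantage is the turn's advantage copied to every token generated in that turn. The function names and the reward values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def turn_level_advantages(turn_rewards: List[List[float]],
                          eps: float = 1e-8) -> List[List[float]]:
    """Normalize rewards across rollouts, separately for each turn index.

    turn_rewards[g][t] is the reward of turn t in sampled rollout g.
    Assumption: every rollout has the same number of turns.
    """
    num_turns = len(turn_rewards[0])
    advantages = [[0.0] * num_turns for _ in turn_rewards]
    for t in range(num_turns):
        column = [rollout[t] for rollout in turn_rewards]
        mu, sigma = mean(column), pstdev(column)
        for g, reward in enumerate(column):
            # eps keeps the division finite when all rollouts tie at this turn.
            advantages[g][t] = (reward - mu) / (sigma + eps)
    return advantages


def token_level_advantages(turn_advantages: List[float],
                           tokens_per_turn: List[int]) -> List[float]:
    """Copy each turn's advantage to every token generated in that turn."""
    per_token: List[float] = []
    for advantage, n_tokens in zip(turn_advantages, tokens_per_turn):
        per_token.extend([advantage] * n_tokens)
    return per_token


# Three hypothetical rollouts of a two-turn dialogue.
rewards = [[1.0, 0.0], [0.0, 0.0], [1.0, 2.0]]
turn_adv = turn_level_advantages(rewards)
# First rollout: turn 0 produced 3 tokens, turn 1 produced 2 tokens.
token_adv = token_level_advantages(turn_adv[0], tokens_per_turn=[3, 2])
print(turn_adv)
print(token_adv)
```

Normalizing per turn index rather than per whole dialogue keeps the credit signal local to each turn, which is one way a long conversation could avoid having a single final reward dominate every token. Whether OMAR does exactly this is not stated in the text above.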

Taxonomy

Topics: AI in Service Interactions · Social Robot Interaction and HRI · Emotion and Mood Recognition