TL;DR
SAVOIR is a Shapley-based framework for training social language agents. It improves credit assignment in multi-turn dialogues by valuing each utterance's strategic potential and distributing reward fairly across utterances.
Contribution
The paper presents a cooperative game theory approach to social RL that combines prospective (expected-utility) valuation with Shapley values, achieving state-of-the-art results on the SOTOPIA benchmark.
Findings
SAVOIR outperforms existing methods on the SOTOPIA benchmark.
A 7B model with SAVOIR matches or exceeds proprietary models like GPT-4o.
Large reasoning models underperform in social intelligence tasks.
Abstract
Social intelligence, the ability to navigate complex interpersonal interactions, presents a fundamental challenge for language agents. Training such agents via reinforcement learning requires solving the credit assignment problem: determining how individual utterances contribute to multi-turn dialogue outcomes. Existing approaches directly employ language models to distribute episode-level rewards, yielding attributions that are retrospective and lack theoretical grounding. We propose SAVOIR (ShApley Value fOr SocIal RL), a novel principled framework grounded in cooperative game theory. Our approach combines two complementary principles: expected utility shifts evaluation from retrospective attribution to prospective valuation, capturing an utterance's strategic potential for enabling favorable future trajectories; Shapley values ensure fair credit distribution with axiomatic guarantees…
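To make the Shapley-value idea in the abstract concrete, here is a minimal sketch of exact Shapley credit assignment over a handful of utterances. It is an illustration only, not SAVOIR's implementation: the function `shapley_values`, the three-utterance dialogue, and the scoring rule `toy_value` are all hypothetical, and the toy value function stands in for whatever utility estimate the paper actually uses.

```python
from itertools import permutations
from typing import Callable, FrozenSet, List


def shapley_values(
    n_players: int, value: Callable[[FrozenSet[int]], float]
) -> List[float]:
    """Exact Shapley values: average each player's marginal contribution
    over every ordering in which players can join the coalition."""
    totals = [0.0] * n_players
    orderings = list(permutations(range(n_players)))
    for order in orderings:
        coalition: FrozenSet[int] = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return [total / len(orderings) for total in totals]


def toy_value(coalition: FrozenSet[int]) -> float:
    """Hypothetical dialogue utility (an assumption for illustration).

    Utterance 0 is worth 1.0 on its own; utterances 1 and 2 are worth
    2.0 together but nothing separately.
    """
    score = 0.0
    if 0 in coalition:
        score += 1.0
    if 1 in coalition and 2 in coalition:
        score += 2.0
    return score


credits = shapley_values(3, toy_value)
print(credits)  # [1.0, 1.0, 1.0]
```

The credits sum to the value of the full dialogue (3.0), and utterances 1 and 2, which are interchangeable in the toy utility, receive equal credit. These are the efficiency and symmetry axioms behind the "fair credit distribution" claim. Exact enumeration costs factorial time in the number of utterances, so it is only practical for very short dialogues.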
