Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata

TL;DR
This paper investigates whether large language models can autonomously improve negotiation strategies through self-play, reflection, and AI feedback, demonstrating potential for minimal human intervention in developing strong AI agents.
Contribution
It introduces a framework where LLMs improve negotiation skills via iterative self-play and feedback, highlighting the conditions under which models can learn and adapt.
Findings
Some models can improve deal prices through AI feedback
Model performance varies by role and model strength
Stronger models improve over rounds but risk breaking deals
Abstract
We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing. We are interested in this question because if LLMs were able to improve each other, it would imply the possibility of creating strong AI agents with minimal human intervention. We ask two LLMs to negotiate with each other, playing the roles of a buyer and a seller, respectively. They aim to reach a deal with the buyer targeting a lower price and the seller a higher one. A third language model, playing the critic, provides feedback to a player to improve the player's negotiation strategies. We let the two agents play multiple rounds, using previous negotiation history and AI feedback as in-context demonstrations to improve the model's negotiation strategy iteratively. We use different LLMs (GPT and Claude) for different roles and use…
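The loop described in the abstract (two players negotiate, a critic comments, and the history plus feedback is fed back as in-context demonstrations) can be sketched as below. This is a minimal illustration, not the authors' implementation: every name (`Player`, `play_game`, `improve`, the `stub_*` functions) is hypothetical, the stubs stand in for real LLM calls, and the turn order, the deal-detection rule, and the prices are assumptions made only so the example runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: an agent maps a list of context messages to its next utterance.
Agent = Callable[[List[str]], str]


@dataclass
class Player:
    role: str                      # "buyer" or "seller"
    respond: Agent                 # stand-in for an LLM call
    memory: List[str] = field(default_factory=list)  # past dialogues and critic feedback


def play_game(buyer: Player, seller: Player, max_turns: int = 6) -> List[str]:
    """Alternate seller/buyer turns until someone says 'deal' or turns run out."""
    dialogue: List[str] = []
    for turn in range(max_turns):
        speaker = seller if turn % 2 == 0 else buyer  # assumption: seller opens
        utterance = speaker.respond(speaker.memory + dialogue)
        dialogue.append(f"{speaker.role}: {utterance}")
        if "deal" in utterance.lower():               # assumption: naive deal detection
            break
    return dialogue


def improve(player: Player, opponent: Player,
            critic: Callable[[List[str]], str], rounds: int = 3) -> List[List[str]]:
    """Play several rounds; after each, add the dialogue and critic feedback to the player's context."""
    histories: List[List[str]] = []
    for _ in range(rounds):
        buyer, seller = (player, opponent) if player.role == "buyer" else (opponent, player)
        dialogue = play_game(buyer, seller)
        feedback = critic(dialogue)
        player.memory.extend(dialogue)               # previous negotiation history
        player.memory.append(f"critic: {feedback}")  # AI feedback as in-context demonstration
        histories.append(dialogue)
    return histories


# Deterministic stubs so the sketch runs without any model access.
def stub_seller(context: List[str]) -> str:
    n_feedback = sum(m.startswith("critic:") for m in context)
    return f"I can offer ${15 + n_feedback}."


def stub_buyer(context: List[str]) -> str:
    return "Deal."


def stub_critic(dialogue: List[str]) -> str:
    return "Ask for a higher opening price."


seller = Player("seller", stub_seller)
buyer = Player("buyer", stub_buyer)
histories = improve(seller, buyer, stub_critic, rounds=3)
for h in histories:
    print(h[0])
```

Only the player being improved accumulates memory here; the opponent sees just the current dialogue, which mirrors the abstract's description of the critic giving feedback to one player.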
Taxonomy
Topics: Topic Modeling
