Training Language Models for Bilateral Trade with Private Information
Dirk Bergemann, Soheil Ghili, Xinyang Hu, Chuanhao Li, Zhuoran Yang

TL;DR
This paper develops a structured negotiation environment for large language models to evaluate and improve their bilateral trade strategies, combining benchmark experiments and reinforcement learning training.
Contribution
It introduces a novel environment for LLM negotiation, analyzes effective bargaining strategies, and demonstrates training methods to enhance surplus and deal rates.
Findings
Effective strategies involve price discrimination through sequential offers.
Aggressive anchoring and patience correlate with a higher surplus share and deal rate.
Supervised fine-tuning and reinforcement learning are used to train open-weight models, with the aim of improving surplus and deal rates.
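
To illustrate the first finding, here is a minimal sketch of how a descending sequence of offers separates buyers by valuation. It is not taken from the paper: the function name, the price path, and the assumption of a myopic buyer who accepts the first offer at or below their value are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def first_accepted_price(
    value: float, price_path: Sequence[float]
) -> Optional[Tuple[int, float]]:
    """Return (round, price) of the first offer a myopic buyer accepts.

    Assumption (not from the paper): the buyer accepts the first offer
    that does not exceed their private value, and never waits for a
    better one. Returns None if no offer is acceptable.
    """
    for round_index, price in enumerate(price_path):
        if price <= value:
            return round_index, price
    return None


# Hypothetical descending offers from the seller.
price_path = [90, 70, 50]

# Buyers with different private values end up paying different prices,
# which is the sense in which sequential offers price-discriminate.
for buyer_value in (95, 75, 55, 40):
    print(buyer_value, first_accepted_price(buyer_value, price_path))
```

Under these assumptions, high-value buyers trade early at high prices, lower-value buyers trade later at lower prices, and a buyer whose value is below every offer does not trade. A strategic buyer who anticipates price cuts may wait, which this sketch deliberately ignores.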
Abstract
Bilateral bargaining under incomplete information provides a controlled testbed for evaluating large language model (LLM) agent capabilities. Bilateral trade demands individual rationality, strategic surplus maximization, and cooperation to realize gains from trade. We develop a structured bargaining environment where LLMs negotiate via tool calls within an event-driven simulator, separating binding offers from natural-language messages to enable automated evaluation. The environment serves two purposes: as a benchmark for frontier models and as a training environment for open-weight models via reinforcement learning. In benchmark experiments, a round-robin tournament among five frontier models (15,000 negotiations) reveals that effective strategies implement price discrimination through sequential offers. Aggressive anchoring, calibrated concession, and temporal patience correlate…
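
The quantities the abstract refers to (individual rationality, surplus, and deal rate) can be made concrete with the standard bilateral-trade accounting. The sketch below is an illustration under stated assumptions, not the paper's simulator: `Outcome`, `settle`, `individually_rational`, and `seller_share` are hypothetical names, and the numbers are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    buyer_surplus: float
    seller_surplus: float
    deal: bool


def settle(buyer_value: float, seller_cost: float,
           price: Optional[float]) -> Outcome:
    """Compute each side's surplus; price=None means no agreement."""
    if price is None:
        # No trade: both parties keep their outside option of zero.
        return Outcome(0.0, 0.0, False)
    return Outcome(buyer_value - price, price - seller_cost, True)


def individually_rational(outcome: Outcome) -> bool:
    """Neither side ends up worse off than with no trade."""
    return outcome.buyer_surplus >= 0 and outcome.seller_surplus >= 0


def seller_share(outcome: Outcome) -> Optional[float]:
    """Seller's fraction of realized gains from trade, if any exist."""
    total = outcome.buyer_surplus + outcome.seller_surplus
    if not outcome.deal or total <= 0:
        return None
    return outcome.seller_surplus / total


# Hypothetical example: buyer values the good at 100, seller's cost is 40.
result = settle(buyer_value=100, seller_cost=40, price=85)
print(result, individually_rational(result), seller_share(result))
```

Only the agreed price enters this accounting. That mirrors the abstract's separation of binding offers from natural-language messages, which is what allows outcomes to be scored automatically.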
