Magentic Marketplace: An Open-Source Environment for Studying Agentic Markets
Gagan Bansal, Wenyue Hua, Zezhou Huang, Adam Fourney, Amanda Swearngin, Will Epperson, Tyler Payne, Jake M. Hofman, Brendan Lucier, Chinmay Singh, Markus Mobius, Akshay Nambi, Archana Yadav, Kevin Gao, David M. Rothschild, Aleksandrs Slivkins, Daniel G. Goldstein

TL;DR
This paper introduces Magentic-Marketplace, a simulated environment for studying agentic markets involving large language model agents, revealing insights into market dynamics, behaviors, and the impact of search mechanisms.
Contribution
It presents a novel open-source environment for analyzing complex agent interactions in realistic markets, addressing limitations of prior constrained studies.
Findings
Frontier models approach optimal welfare under ideal search conditions
Performance drops significantly as market scale increases
Severe first-proposal bias favors response speed over quality
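A minimal sketch of how the first-proposal finding above could be quantified, assuming each consumer-side decision is logged with the arrival order of the proposals it received. The `Proposal` record, its `quality` score, the two helper functions, and the sample data are all hypothetical illustrations, not part of the Magentic-Marketplace codebase or the paper's results.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proposal:
    arrival_rank: int  # 0 means this proposal reached the consumer-side agent first
    quality: float     # hypothetical quality score, higher is better


# One decision = (proposals received, index of the proposal that was accepted)
Decision = Tuple[List[Proposal], int]


def first_proposal_rate(decisions: List[Decision]) -> float:
    """Fraction of decisions where the accepted proposal was the first to arrive."""
    if not decisions:
        raise ValueError("need at least one decision")
    hits = sum(1 for props, chosen in decisions if props[chosen].arrival_rank == 0)
    return hits / len(decisions)


def best_quality_rate(decisions: List[Decision]) -> float:
    """Fraction of decisions where the accepted proposal had the highest quality."""
    if not decisions:
        raise ValueError("need at least one decision")
    hits = sum(
        1
        for props, chosen in decisions
        if props[chosen].quality == max(p.quality for p in props)
    )
    return hits / len(decisions)


# Made-up sample log: four decisions, two proposals each.
sample: List[Decision] = [
    ([Proposal(0, 0.4), Proposal(1, 0.9)], 0),
    ([Proposal(0, 0.8), Proposal(1, 0.5)], 0),
    ([Proposal(0, 0.3), Proposal(1, 0.7)], 1),
    ([Proposal(0, 0.2), Proposal(1, 0.6)], 0),
]

print(first_proposal_rate(sample))
print(best_quality_rate(sample))
```

A first-proposal rate well above the best-quality rate would be consistent with agents rewarding response speed over quality, although a log like this cannot by itself separate the two when the first proposal also happens to be the best.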
Abstract
As LLM agents advance, they are increasingly mediating economic decisions, ranging from product discovery to transactions, on behalf of users. Such applications promise benefits but also raise many questions about agent accountability and value for users. Addressing these questions requires understanding how agents behave in realistic market conditions. However, previous research has largely evaluated agents in constrained settings, such as single-task marketplaces (e.g., negotiation) or structured two-agent interactions. Real-world markets are fundamentally different: they require agents to handle diverse economic activities and coordinate within large, dynamic ecosystems where multiple agents with opaque behaviors may engage in open-ended dialogues. To bridge this gap, we investigate two-sided agentic marketplaces where Assistant agents represent consumers and Service agents represent…
Peer Reviews
Decision·Submitted to ICLR 2026
1. An important and timely effort to build a simulated environment for agentic marketplaces.
2. The empirical findings have implications for model/agent builders and users.
1. The model selection is somewhat confusing. GPT-5 was used in one experiment but not in the others, and Claude-series models are not included at all. Adding more models would be helpful.
2. It would also be useful to see whether model capabilities scale with parameter count.
3. The paper misses several key references: https://arxiv.org/abs/2506.00073 https://arxiv.org/pdf/2509.01063
S1. The proposed system is designed for two-sided markets.
S2. Biases in agent behavior and resistance to manipulation are investigated.
S3. The scale of simulation is up to 100 consumers and 300 restaurants.
S4. Multiple LLMs are tested in the experiments.
W1. The presentation needs to be improved. First, the simulation design (Sec. 3) involves many high-level concepts, making it hard to understand. Second, the types of agents are confusing. For example, Figure 1 shows customer agents and business agents, while Figure 2 shows an assistant agent and a service agent. Third, based on the description of the proposed environment, it is hard to infer what is going to be evaluated in the experiments, obscuring the objectives of this study.
W2. The simu
* Ambitious setup combining natural language, market dynamics, and agent reasoning.
* Models the full market lifecycle (search to dialogue to transaction to evaluation), unlike prior simulations.
* Clear motivation for testing emergent economic and ethical behaviors in LLMs.
* The experiments are limited to a single, highly synthetic restaurant domain, which weakens claims of generality.
* Results are mostly descriptive. There is little causal analysis or statistical depth.
* The link between linguistic interaction and market efficiency remains underexplored.
* No clear measure of whether agents reason economically or merely mimic patterns.
Videos
Magentic Marketplace: Testing societies of agents at scale · YouTube
Microsoft Research Forum | Season 2, Episode 3 · YouTube
