Learning from Synthetic Labs: Language Models as Auction Participants

Anand Shah; Kehang Zhu; Yanchen Jiang; Jeffrey G. Wang; Arif K. Dayi; John J. Horton; David C. Parkes

arXiv:2507.09083 · cs.GT · July 15, 2025

3 Reviews

TL;DR

This paper demonstrates that large language models can simulate auction behaviors consistent with economic theory and human patterns, providing a cost-effective framework for auction research and design using LLMs as proxies.

Contribution

It introduces a novel synthetic data-generating process using LLMs for auction studies, showing their ability to replicate classic auction behaviors and serve as flexible, low-cost experimental proxies.

Findings

01. LLMs with chain-of-thought reasoning align with experimental auction results
02. LLMs exhibit risk-averse behavior similar to humans in auctions
03. Proper prompting improves LLM predictions towards theoretical models

Abstract

This paper investigates the behavior of simulated AI agents (large language models, or LLMs) in auctions, introducing a novel synthetic data-generating process to help facilitate the study and design of auctions. We find that LLMs -- when endowed with chain-of-thought reasoning capacity -- agree with the experimental literature in auctions across a variety of classic auction formats. In particular, we find that LLM bidders produce results consistent with risk-averse human bidders; that they perform closer to theoretical predictions in obviously strategy-proof auctions; and that they succumb to the winner's curse in common value settings. On prompting, we find that LLMs are not very sensitive to naive changes in prompts (e.g., language, currency) but can improve dramatically towards theoretical predictions with the right mental model (i.e., the language of Nash deviations). We run…
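The theoretical benchmarks the abstract refers to can be made concrete with a small sketch. The code below is not the authors' framework; it is a minimal illustration under textbook assumptions (private values drawn i.i.d. from Uniform[0, 1], symmetric bidders, CRRA utility u(x) = x^r), and every function name and parameter is hypothetical. It shows the risk-neutral and risk-averse equilibrium bids in a first-price auction, the dominant strategy in a second-price auction, and a simulation of the winner's curse when bidders in a common value auction naively bid their own noisy signal.

```python
import random


def first_price_bid(value, n_bidders, crra=1.0):
    """Symmetric equilibrium bid in a first-price sealed-bid auction.

    Assumes i.i.d. Uniform[0, 1] private values and CRRA utility
    u(x) = x ** crra. crra=1.0 is risk neutral; crra < 1 is risk averse
    and yields bids above the risk-neutral benchmark.
    """
    return (n_bidders - 1) / (n_bidders - 1 + crra) * value


def second_price_bid(value):
    """Bidding one's own value is a dominant strategy in a second-price auction."""
    return value


def naive_common_value_profit(n_bidders=4, noise=0.2, rounds=20000, seed=0):
    """Average winner profit when every bidder naively bids their own signal.

    Assumed setup (illustrative only): a common value v ~ Uniform[0, 1],
    each signal is v plus Uniform[-noise, noise] noise, and the highest
    bidder wins and pays their own bid (first-price rule).
    """
    rng = random.Random(seed)
    total_profit = 0.0
    for _ in range(rounds):
        common_value = rng.uniform(0.0, 1.0)
        signals = [common_value + rng.uniform(-noise, noise)
                   for _ in range(n_bidders)]
        winning_bid = max(signals)  # the most optimistic signal wins
        total_profit += common_value - winning_bid
    return total_profit / rounds


if __name__ == "__main__":
    print("risk-neutral first-price bid:", first_price_bid(0.8, n_bidders=4))
    print("risk-averse first-price bid: ", first_price_bid(0.8, n_bidders=4, crra=0.5))
    print("second-price bid:            ", second_price_bid(0.8))
    print("naive common-value profit:   ", naive_common_value_profit())
```

Under these assumptions, a risk-averse bidder shades their bid less than a risk-neutral one, which is the direction of the deviation the abstract attributes to both human and LLM bidders. The simulated average profit in the common value case is negative because the winner is, by construction, the bidder with the most optimistic signal.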

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 2 · Confidence 4

Strengths

This paper demonstrates at scale how LLMs can mimic human behavior (or approach theoretical results) in thousands of auction scenarios. I appreciate authors looking into LLM simulations, as it has immense potential in simulating real-world experiments, feeding into the direction of AI-driven science. Overall, the paper is well written.

Weaknesses

- I completely fail to understand the technical novelty of this paper. Considering the ICLR audience, I do think the paper needs to explicitly demonstrate why the proposed framework works as presented.
- While the finding is exciting (still a bit limited in terms of applicability), it is hard to assess what alternative approach would have worked to achieve similar performance. The work is plagued by a dire lack of reasonable baselines.
- In fact, as far as I remember, LLMs' ability to perform

Reviewer 02 · Rating 0 · Confidence 4

Strengths

The question of whether LLM agents can substitute for human subjects in economic mechanism experiments is timely and interesting.

Weaknesses

1. Poor clarity and factual accuracy: Important auction acronyms are not clearly defined in the Introduction; figures use very small fonts; and at least one factual claim appears misleading (e.g., attributing inspiration to prior work purportedly using GPT-4o where, to the best of my knowledge, Horton23 used GPT-3).
2. Several headline comparisons rely on different granularity and thresholds than the classic studies (e.g., different numbers of bidders, coarser bid grids, broader “match-to-value”

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The paper offers a rigorous and timely analysis at the intersection of AI, economic theory, and human behavior. It provides valuable insights into how closely LLMs align with theoretical predictions and experimental findings with human participants, highlighting their potential as tools for behavioral economics simulations. In fact, regardless of the degree of alignment with human behavior, I believe that understanding LLM behavior in complex decision-making is crucial, as such models are inc

Weaknesses

1. The paper’s literature review is somewhat underdeveloped, making it difficult to situate its contribution within the broader research landscape. Strengthening this section would help clarify its novelty and relevance. In particular, it would benefit from engaging with adjacent lines of work, including:
   - studies using LLMs to simulate strategic behavior beyond auction settings [1,2,3];
   - research on predicting and interpreting human decision-making with LLMs [4,5];
   - alternative approaches to

Taxonomy

Methods: Dropout · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization · Dense Connections · Softmax · Transformer · GPT-4