Algorithmic Collusion in Dynamic Pricing with Deep Reinforcement Learning
Shidi Deng, Maximilian Schiffer, Martin Bichler

TL;DR
This paper investigates the potential for collusive pricing behavior emerging from deep reinforcement learning algorithms in competitive markets, revealing that algorithm choice and market conditions influence collusion severity.
Contribution
The paper extends prior research, which focused on Tabular Q-learning, by also analyzing off- and on-policy deep reinforcement learning algorithms, giving a more nuanced picture of algorithmic collusion in dynamic pricing.
Findings
Tabular Q-learning (TQL) shows higher collusion and price dispersion than Deep Reinforcement Learning (DRL) algorithms.
Collusion severity varies with market environment and algorithm type.
Proximal Policy Optimization is less prone to collusive outcomes.
Abstract
Nowadays, a significant share of the Business-to-Consumer sector is based on online platforms like Amazon and Alibaba and uses Artificial Intelligence for pricing strategies. This has sparked debate on whether pricing algorithms may tacitly collude to set supra-competitive prices without being explicitly designed to do so. Our study addresses these concerns by examining the risk of collusion when Reinforcement Learning algorithms are used to decide on pricing strategies in competitive markets. Prior research in this field focused on Tabular Q-learning (TQL) and led to opposing views on whether learning-based algorithms can lead to supra-competitive prices. Our work contributes to this ongoing discussion by providing a more nuanced numerical study that goes beyond TQL by additionally capturing off- and on-policy Deep Reinforcement Learning (DRL) algorithms. We study multiple Bertrand…
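To make the setting concrete, here is a minimal sketch of two Tabular Q-learning agents repeatedly setting prices in a Bertrand duopoly. It is an illustration only, not the authors' implementation: the logit demand form, the price grid, the marginal cost, the exploration schedule, and all names (`logit_demand`, `run_tql`, and their parameters) are assumptions chosen for this example.

```python
import numpy as np


def logit_demand(prices, quality=2.0, outside=0.0, mu=0.25):
    """Market shares under logit demand (an assumed demand form)."""
    weights = np.exp((quality - prices) / mu)
    return weights / (weights.sum() + np.exp(outside / mu))


def run_tql(n_prices=15, episodes=20000, alpha=0.15, gamma=0.95,
            cost=1.0, decay=1e-3, seed=0):
    """Two independent tabular Q-learners pricing against each other.

    The state is the pair of price indices chosen in the previous period.
    All numeric defaults are hypothetical, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(1.0, 2.5, n_prices)  # hypothetical price grid
    # q[i, s_0, s_1, a]: value for agent i of action a in state (s_0, s_1)
    q = np.zeros((2, n_prices, n_prices, n_prices))
    state = rng.integers(n_prices, size=2)

    for t in range(episodes):
        eps = np.exp(-decay * t)  # exploration rate decays over time
        actions = np.empty(2, dtype=int)
        for i in range(2):
            if rng.random() < eps:
                actions[i] = rng.integers(n_prices)  # explore
            else:
                actions[i] = int(np.argmax(q[i, state[0], state[1]]))  # exploit

        prices = grid[actions]
        profits = (prices - cost) * logit_demand(prices)

        # Standard Q-learning update; the next state is the current action pair.
        for i in range(2):
            target = profits[i] + gamma * q[i, actions[0], actions[1]].max()
            old = q[i, state[0], state[1], actions[i]]
            q[i, state[0], state[1], actions[i]] = old + alpha * (target - old)

        state = actions.copy()

    return grid, q, grid[state]


if __name__ == "__main__":
    grid, q, final_prices = run_tql()
    print("final prices:", final_prices)
```

Whether the final prices settle above the competitive level depends heavily on the assumed parameters, so this sketch should not be read as reproducing the paper's results.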
Taxonomy
Topics: Stochastic processes and financial applications · Auction Theory and Applications · Complex Systems and Time Series Analysis
Methods: Sparse Evolutionary Training · Q-Learning
