No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand
Mengzi Amy Guo, Donghao Ying, Javad Lavaei, Zuo-Jun Max Shen

TL;DR
This paper introduces an online learning algorithm for dynamic price competition between two firms under logit demand with reference effects. The algorithm converges to a stationary Nash equilibrium without assuming strong monotonicity or variational stability, at a proven rate of O(1/t).
Contribution
The paper proposes the online projected gradient ascent (OPGA) algorithm, which learns a stable equilibrium in a competitive market with reference effects even though the game lacks strong monotonicity and variational stability.
Findings
OPGA converges to the stationary Nash equilibrium under diminishing step-sizes.
The convergence rate of OPGA is established as O(1/t).
The algorithm achieves no-regret learning and market stability.
Abstract
This work is dedicated to the algorithm design in a competitive framework, with the primary goal of learning a stable equilibrium. We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lacks information about its competitor. The demand follows the multinomial logit (MNL) choice model, which depends on the consumers' observed price and their reference price, and consecutive periods in the repeated games are connected by reference price updates. We use the notion of stationary Nash equilibrium (SNE), defined as the fixed point of the equilibrium pricing policy for the single-period game, to simultaneously capture the long-run market equilibrium and stability. We propose the online projected gradient ascent algorithm (OPGA), where the firms adjust prices using the first-order derivatives of their log-revenues that can be…
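The price-adjustment idea described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the text above: the linear utility form a_i - b_i p_i + c_i (r_i - p_i), the exponential-smoothing reference update with memory parameter alpha, the step-size 1/t, the price bounds, and all names (mnl_demand, opga, p_lo, p_hi). Under that assumed utility, the derivative of firm i's log-revenue with respect to its own price is 1/p_i - (b_i + c_i)(1 - d_i), which uses only the firm's own price, market share, and sensitivities.

```python
import math


def mnl_demand(prices, refs, a, b, c):
    """Market shares for two firms under an MNL model with an outside option.

    Assumed utility (illustrative): u_i = a_i - b_i * p_i + c_i * (r_i - p_i).
    """
    utils = [a[i] - b[i] * prices[i] + c[i] * (refs[i] - prices[i])
             for i in range(2)]
    weights = [math.exp(u) for u in utils]
    denom = 1.0 + sum(weights)  # the 1.0 is the no-purchase option
    return [w / denom for w in weights]


def opga(T, p0, r0, a, b, c, alpha=0.9, p_lo=0.1, p_hi=10.0):
    """Projected gradient ascent on each firm's log-revenue (a sketch).

    alpha, p_lo, p_hi, and the 1/t step-size are hypothetical choices.
    Returns the final prices and reference prices after T periods.
    """
    p, r = list(p0), list(r0)
    for t in range(1, T + 1):
        d = mnl_demand(p, r, a, b, c)
        eta = 1.0 / t  # diminishing step-size
        # d/dp_i log(p_i * d_i) under the assumed utility form
        grad = [1.0 / p[i] - (b[i] + c[i]) * (1.0 - d[i]) for i in range(2)]
        # gradient step followed by projection onto [p_lo, p_hi]
        new_p = [min(p_hi, max(p_lo, p[i] + eta * grad[i])) for i in range(2)]
        # reference price moves toward the price consumers just observed
        r = [alpha * r[i] + (1.0 - alpha) * p[i] for i in range(2)]
        p = new_p
    return p, r


if __name__ == "__main__":
    prices, refs = opga(2000, [2.0, 2.0], [2.0, 2.0],
                        a=[1.0, 1.0], b=[1.0, 1.0], c=[0.5, 0.5])
    print("final prices:", prices)
    print("final reference prices:", refs)
```

With these illustrative parameters, setting r = p in the first-order condition of the assumed model gives p = 1 for both firms (market share 1/3 each), so the printed prices and reference prices can be compared against that value as a sanity check.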
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Game Theory and Applications
