Phase Transitions in Learning and Earning under Price Protection Guarantee
Qing Feng, Ruihao Zhu, Stefanus Jasin

TL;DR
This paper investigates how price protection guarantees influence the design and regret bounds of online dynamic pricing algorithms, revealing phase transitions in learning performance based on the protection period length.
Contribution
The paper establishes a novel regret lower bound and proposes LEAP, an algorithm whose regret matches that bound up to logarithmic factors, exposing phase transitions as the length of the price protection period varies.
Findings
Optimal regret is \(\tilde{\Theta}(\sqrt{T} + \min\{M, T^{2/3}\})\)
LEAP algorithm matches lower bounds up to logarithmic factors
Regret exhibits phase transitions depending on M, the length of the price protection period
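To make the phase transitions concrete, here is a minimal Python sketch that evaluates the rate \(\sqrt{T} + \min\{M, T^{2/3}\}\) while ignoring constants and logarithmic factors. The function names `regret_rate` and `regime` are hypothetical and not from the paper. The three regimes follow directly from the formula: \(\sqrt{T}\) dominates when \(M \le \sqrt{T}\), \(M\) dominates when \(\sqrt{T} < M < T^{2/3}\), and \(T^{2/3}\) dominates once \(M \ge T^{2/3}\).

```python
import math


def regret_rate(T: int, M: int) -> float:
    """Order of the optimal regret, ignoring constants and log factors."""
    return math.sqrt(T) + min(M, T ** (2 / 3))


def regime(T: int, M: int) -> str:
    """Name the term that dominates sqrt(T) + min(M, T^(2/3))."""
    if M <= math.sqrt(T):
        return "sqrt(T)"   # short protection period: same order as without protection
    if M < T ** (2 / 3):
        return "M"         # intermediate period: rate grows linearly in M
    return "T^(2/3)"       # long period: rate stops growing with M


if __name__ == "__main__":
    T = 10**6
    for M in (100, 5_000, 50_000):
        print(M, regime(T, M), round(regret_rate(T, M)))
```

With T fixed, sweeping M across the two thresholds \(\sqrt{T}\) and \(T^{2/3}\) shows the rate staying flat, then growing with M, then flattening again.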
Abstract
Motivated by the prevalence of the "price protection guarantee", which allows a customer who purchased a product in the past to receive a refund from the seller during the so-called price protection period (typically defined as a certain time window after the purchase date) in case the seller decides to lower the price, we study the impact of such a policy on the design of online learning algorithms for data-driven dynamic pricing with initially unknown customer demand. We consider a setting where a firm sells a product over a horizon of \(T\) time steps. For this setting, we characterize how the value of \(M\), the length of the price protection period, can affect the optimal regret of the learning process. We show that the optimal regret is \(\tilde{\Theta}(\sqrt{T} + \min\{M, T^{2/3}\})\) by first establishing a fundamental impossible regime with novel regret lower bound instances. Then, we propose…
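The refund mechanism described in the abstract can be illustrated with a small Python sketch. It rests on an assumption that the summary does not confirm: a unit bought at time t is refunded the gap between its purchase price and the lowest price posted during the following M time steps, if that gap is positive. The paper's exact refund rule may differ, and the helper name `total_refund` is hypothetical.

```python
from typing import Sequence


def total_refund(prices: Sequence[float], sold: Sequence[int], M: int) -> float:
    """Total refund owed under an M-step price protection window.

    Assumption (not taken from the paper): each unit sold at time t is
    refunded max(0, prices[t] - min(prices[t+1 : t+1+M])).
    """
    refund = 0.0
    for t, (price, units) in enumerate(zip(prices, sold)):
        window = prices[t + 1 : t + 1 + M]  # prices posted during the protection period
        if window:
            refund += units * max(0.0, price - min(window))
    return refund


if __name__ == "__main__":
    prices = [10.0, 8.0, 9.0, 7.0]
    sold = [1, 1, 1, 1]
    for M in (0, 1, 2):
        print(M, total_refund(prices, sold, M))
```

Under this assumed rule, a longer window exposes more past sales to every later price cut, which is one way to see why lowering prices during learning becomes costlier as M grows.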
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Game Theory and Applications
