Learning to Price with Reference Effects
Abbas Kazerouni, Benjamin Van Roy

TL;DR
This paper introduces a reinforcement learning approach based on Thompson sampling for setting prices in markets where consumer demand depends on both current and past prices, trading off immediate revenue against information gathering.
Contribution
It formulates the dynamic pricing problem with reference effects as a reinforcement learning task and provides a tractable solution with theoretical regret guarantees.
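To make the idea concrete, here is a minimal sketch of Thompson sampling for pricing with a reference effect. It is not the paper's algorithm. Everything specific in it is an assumption made for illustration: the linear demand model, the exponentially smoothed reference price, the finite set of candidate parameter vectors, the known noise level, and all function names. It also picks prices myopically (best for the current period only), whereas the abstract stresses planning price sequences, so treat it as a simplified illustration of posterior sampling only.

```python
import math
import random

# Hypothetical demand model (an assumption, not taken from the paper):
#   demand = a - b * price + c * (reference - price) + Gaussian noise
# theta = (a, b, c) is unknown to the seller and must be learned.

def expected_demand(theta, price, reference):
    a, b, c = theta
    return a - b * price + c * (reference - price)

def next_reference(reference, price, alpha=0.8):
    # Assumed reference dynamics: exponential smoothing of past prices.
    return alpha * reference + (1 - alpha) * price

def myopic_price(theta, reference, lo=0.0, hi=10.0):
    # Maximizes price * expected_demand for the current period only
    # (first-order condition of a concave quadratic; needs b + c > 0).
    a, b, c = theta
    p = (a + c * reference) / (2 * (b + c))
    return min(max(p, lo), hi)

def update_log_weights(log_w, hypotheses, price, reference, demand, sigma=1.0):
    # Bayes update: add the Gaussian log-likelihood of the observed demand
    # under each candidate parameter vector.
    return [
        lw - 0.5 * ((demand - expected_demand(th, price, reference)) / sigma) ** 2
        for lw, th in zip(log_w, hypotheses)
    ]

def sample_hypothesis(log_w, hypotheses, rng):
    # Thompson sampling step: draw one candidate from the current posterior.
    m = max(log_w)
    weights = [math.exp(lw - m) for lw in log_w]
    return rng.choices(hypotheses, weights=weights, k=1)[0]

def run(true_theta, hypotheses, periods=200, seed=0):
    rng = random.Random(seed)
    log_w = [0.0] * len(hypotheses)  # uniform prior over candidates
    reference, revenue = 5.0, 0.0
    for _ in range(periods):
        theta = sample_hypothesis(log_w, hypotheses, rng)
        price = myopic_price(theta, reference)
        demand = expected_demand(true_theta, price, reference) + rng.gauss(0, 1.0)
        log_w = update_log_weights(log_w, hypotheses, price, reference, demand)
        revenue += price * demand
        reference = next_reference(reference, price)
    return log_w, revenue
```

Calling `run((10, 1, 1), [(10, 1, 1), (8, 1.5, 0.5), (12, 0.5, 2)])` returns the final log posterior weights and the cumulative revenue. Because demand depends on the reference price, each price also shifts future demand; a planning-based variant would account for that when choosing the price, which this sketch deliberately omits.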
Findings
The proposed method trades off revenue from the current product against probing prices to gain information for future benefit.
Simulation results demonstrate improved revenue performance.
The theoretical regret guarantees describe how performance improves as more data is gathered.
Abstract
As a firm varies the price of a product, consumers exhibit reference effects, making purchase decisions based not only on the prevailing price but also the product's price history. We consider the problem of learning such behavioral patterns as a monopolist releases, markets, and prices products. This context calls for pricing decisions that intelligently trade off between maximizing revenue generated by a current product and probing to gain information for future benefit. Due to dependence on price history, realized demand can reflect delayed consequences of earlier pricing decisions. As such, inference entails attribution of outcomes to prior decisions and effective exploration requires planning price sequences that yield informative future outcomes. Despite the considerable complexity of this problem, we offer a tractable systematic approach. In particular, we frame the problem as…
