Nonparametric Learning and Earning with One-Point Feedback under Nonstationarity
Xiangyu Yang, Feng Xu, Jian-Qiang Hu, Jiaqiao Hu

TL;DR
This paper proposes a nonparametric, adaptive pricing method that learns from a single revenue observation per period and adapts to gradual or abrupt market changes, achieving near-optimal revenue in dynamic environments.
Contribution
It introduces a novel framework combining revenue-based gradient updates with restarting and meta-learning to handle nonstationary demand without parametric assumptions.
Findings
The method achieves sublinear regret relative to an oracle benchmark.
Adaptive restarting improves learning in changing environments.
Simulation results demonstrate effectiveness on synthetic and real data.
Abstract
Firms increasingly rely on dynamic pricing to respond to evolving customer demand, yet in many applications they observe only the revenue generated by a single posted price in each period. At the same time, market conditions may shift gradually or abruptly due to changes in customer preferences, competition, or external shocks. These features create two intertwined challenges: learning the revenue-demand relationship from limited feedback and adapting pricing decisions to a changing environment. We study how a seller can learn and earn effectively under these constraints, without assuming a specific parametric form for demand. We develop a learning framework that updates prices using revenue-based gradient approximations constructed from one observation per period. To address environmental changes, we incorporate a restarting mechanism that periodically refreshes the learning process…
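To make the idea of a one-observation gradient approximation with restarts concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a scalar price, a standard one-point (randomly perturbed) gradient estimate, and a fixed restart period that resets the step-size schedule. The function name `one_point_pricing`, its parameters (`step`, `delta`, `restart_every`), and the toy demand model `toy_revenue` are all hypothetical choices made for illustration.

```python
import math
import random


def one_point_pricing(revenue_fn, horizon, p_min, p_max,
                      step=0.05, delta=0.1, restart_every=200, seed=0):
    """Post one perturbed price per period; update using only its revenue."""
    rng = random.Random(seed)
    center = 0.5 * (p_min + p_max)   # current price estimate (assumed start)
    k = 0                            # periods since the last restart
    posted, earned = [], []
    for t in range(horizon):
        if t > 0 and t % restart_every == 0:
            k = 0                    # assumed restart rule: reset the step-size schedule
        u = rng.choice((-1.0, 1.0))  # random perturbation direction
        price = center + delta * u   # the single price posted this period
        revenue = revenue_fn(t, price)
        # One-point gradient estimate built from a single revenue observation.
        grad_est = revenue * u / delta
        eta = step / math.sqrt(k + 1)
        # Gradient ascent step, kept inside the shrunken interval so that
        # the next perturbed price stays within [p_min, p_max].
        center = min(p_max - delta, max(p_min + delta, center + eta * grad_est))
        k += 1
        posted.append(price)
        earned.append(revenue)
    return posted, earned


def toy_revenue(t, price):
    # Hypothetical linear demand whose intercept drops abruptly at t = 500.
    intercept = 10.0 if t < 500 else 6.0
    return price * max(0.0, intercept - price)


posted, earned = one_point_pricing(toy_revenue, horizon=1000, p_min=0.0, p_max=10.0)
```

One-point estimates of this kind are unbiased only for a smoothed version of the revenue function and have high variance, so the sketch illustrates the mechanics rather than any performance guarantee. Resetting the step-size schedule at each restart lets the iterate move quickly again after a change in demand, which is one simple way to read "refreshing the learning process".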
