From Best Responses to Learning: Investment Efficiency in Dynamic Environment
Ce Li, Qianfan Zhang, Weiqiang Zheng

TL;DR
This paper investigates how welfare guarantees in dynamic environments are maintained when investors use online learning algorithms instead of best responses, bridging mechanism design with online learning theory.
Contribution
It extends welfare analysis from static to dynamic settings, providing tight bounds for approximation ratios under learning-based strategies.
Findings
The approximation ratio from the static setting is preserved against the best-in-hindsight benchmark.
Tight bounds are established for time-varying benchmarks.
Robust welfare guarantees are achievable with learning-based strategies.
Abstract
We study the welfare of a mechanism in a dynamic environment where a learning investor can make a costly investment to change her value. In many real-world problems, the common assumption that the investor always makes the best responses, i.e., choosing her utility-maximizing investment option, is unrealistic due to incomplete information in a dynamically evolving environment. To address this, we consider an investor who uses a no-regret online learning algorithm to adaptively select investments through repeated interactions with the environment. We analyze how the welfare guarantees of approximation allocation algorithms extend from static to dynamic settings when the investor learns rather than best-responds, by studying the approximation ratio for optimal welfare as a measurement of an algorithm's performance against different benchmarks in the dynamic learning environment. First, we…
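The abstract does not say which no-regret algorithm the investor runs. As an illustration only, the sketch below uses Hedge (multiplicative weights), a standard no-regret algorithm, and is not the authors' method. It assumes full-information feedback (the investor sees the utility of every investment option after each round) and utilities scaled to [0, 1]. The environment, the number of options, and all names (`hedge`, `utilities`, `eta`) are hypothetical.

```python
import math
import random


def hedge(utilities, eta):
    """Run Hedge (multiplicative weights) with full-information feedback.

    utilities[t][i] is the utility, in [0, 1], of investment option i in round t.
    Returns (learner's expected total utility, total utility of the best fixed option).
    """
    n = len(utilities[0])
    log_w = [0.0] * n  # log-weights, kept in log space for numerical stability
    learner_total = 0.0
    for u in utilities:
        m = max(log_w)
        w = [math.exp(x - m) for x in log_w]
        z = sum(w)
        p = [x / z for x in w]  # mixed strategy over investment options
        learner_total += sum(pi * ui for pi, ui in zip(p, u))
        # Options that paid more this round gain weight exponentially.
        log_w = [x + eta * ui for x, ui in zip(log_w, u)]
    best_fixed = max(sum(u[i] for u in utilities) for i in range(n))
    return learner_total, best_fixed


T, n = 2000, 3
rng = random.Random(0)
# Hypothetical environment: option 1 pays slightly more on average.
utilities = [
    [rng.random() * (0.9 if i == 1 else 0.6) for i in range(n)]
    for _ in range(T)
]
eta = math.sqrt(8 * math.log(n) / T)  # standard learning-rate choice for Hedge
learner, best = hedge(utilities, eta)
regret = best - learner  # regret against the best-in-hindsight fixed option
print(f"regret = {regret:.2f}, bound = {math.sqrt(T * math.log(n) / 2):.2f}")
```

With this learning rate, the regret of Hedge against the best fixed option in hindsight is at most sqrt(T ln(n) / 2), so the average regret per round vanishes as T grows. That is the sense of "no-regret" in the abstract.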
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Auction Theory and Applications
