Dynamic Decision-Making under Model Misspecification
Xinyu Dai

TL;DR
This paper explores how dynamic decision algorithms perform under model misspecification, showing that they can still achieve exponential convergence to a pseudo-truth set and maintain robust regret performance despite misspecification.
Contribution
It extends the concept of pseudo-truth parameters to dynamic decision problems and characterizes conditions for posterior convergence under misspecification.
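For orientation, the static pseudo-truth notion from White (1982) can be written as a Kullback-Leibler minimizer. The second display is only an illustrative assumption about how the idea might carry over when the data depend on the actions played; the symbols $p_0$, $p_\theta$, $\pi$, and $\Theta^{*}(\pi)$ are hypothetical notation and are not taken from the paper.

```latex
% Static case (White, 1982): parameters closest in KL divergence
% to the true data-generating density p_0.
\Theta^{*} \;=\; \arg\min_{\theta \in \Theta}
  \mathrm{KL}\!\left(p_0 \,\|\, p_\theta\right)
  \;=\; \arg\min_{\theta \in \Theta}
  \mathbb{E}_{Y \sim p_0}\!\left[\log \frac{p_0(Y)}{p_\theta(Y)}\right]

% Illustrative assumption for the dynamic case: the minimizer depends on
% the distribution \pi over actions a that generated the observations.
\Theta^{*}(\pi) \;=\; \arg\min_{\theta \in \Theta}
  \sum_{a} \pi(a)\,
  \mathrm{KL}\!\left(p_0(\cdot \mid a) \,\|\, p_\theta(\cdot \mid a)\right)
```

Because the minimizer may not be unique, and may change with the actions taken, it is natural to speak of a set rather than a single parameter.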
Findings
Posterior convergence to pseudo-truth set under mild conditions
MAP estimates fail to converge under misspecification
Average regret remains robust despite model misspecification
Abstract
In this study, I investigate the dynamic decision problem with a finite parameter space when the functional form of conditional expected rewards is misspecified. Traditional algorithms, such as Thompson Sampling, guarantee neither an rate of posterior parameter concentration nor an rate of average regret. However, under mild conditions, it is still possible to achieve an exponential convergence rate of the parameter to a pseudo-truth set, an extension of the pseudo-truth parameter concept introduced by White (1982). I further characterize the necessary conditions for the convergence of the expected posterior within this pseudo-truth set. Simulations demonstrate that while the maximum a posteriori (MAP) estimate of the parameters fails to converge under misspecification, the algorithm's average regret remains relatively robust compared to the correctly specified case. These…
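The setting described in the abstract can be illustrated with a minimal sketch of Thompson Sampling over a finite parameter set when the assumed reward function is misspecified. Everything here is an assumption made for illustration and is not the paper's simulation design: the function name `thompson_sampling_finite`, the Gaussian noise model, the arm features `xs`, the square-root true mean, and the linear working model.

```python
import math
import random


def thompson_sampling_finite(true_mean, model_mean, thetas, n_arms,
                             horizon, sigma=1.0, seed=0):
    """Thompson Sampling over a finite parameter set (illustrative sketch).

    true_mean(arm): actual expected reward, unknown to the learner.
    model_mean(theta, arm): the learner's (possibly misspecified) model.
    Returns (posterior over thetas, average regret over the horizon).
    """
    rng = random.Random(seed)
    log_post = [0.0] * len(thetas)  # uniform prior, kept in log space
    best = max(true_mean(a) for a in range(n_arms))
    regret = 0.0
    for _ in range(horizon):
        # Sample a parameter from the current posterior.
        m = max(log_post)
        weights = [math.exp(lp - m) for lp in log_post]
        k = rng.choices(range(len(thetas)), weights=weights)[0]
        # Play the arm that looks best under the sampled parameter.
        arm = max(range(n_arms), key=lambda a: model_mean(thetas[k], a))
        reward = true_mean(arm) + rng.gauss(0.0, sigma)
        regret += best - true_mean(arm)
        # Bayesian update with an assumed Gaussian likelihood.
        for i, th in enumerate(thetas):
            log_post[i] -= (reward - model_mean(th, arm)) ** 2 / (2 * sigma ** 2)
    m = max(log_post)
    w = [math.exp(lp - m) for lp in log_post]
    z = sum(w)
    return [x / z for x in w], regret / horizon


# Hypothetical example: the true mean is sqrt(x), the working model is linear.
xs = [0.0, 0.5, 1.0]
thetas = [-0.5, 0.0, 0.5, 1.0]


def true_mean(arm):
    return math.sqrt(xs[arm])


def model_mean(theta, arm):
    return theta * xs[arm]


posterior, avg_regret = thompson_sampling_finite(
    true_mean, model_mean, thetas, n_arms=len(xs), horizon=500)
map_theta = thetas[posterior.index(max(posterior))]  # MAP estimate
```

In this toy example no value of `theta` reproduces the true mean at every arm, so the posterior can only settle on whichever parameters best fit the rewards of the arms actually played. Tracking `map_theta` and `avg_regret` across seeds and horizons is one way to compare parameter behavior with regret behavior.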
Taxonomy
Topics: Multi-Agent Systems and Negotiation
