Finite-Time 4-Expert Prediction Problem
Erhan Bayraktar, Ibrahim Ekren, Xin Zhang

TL;DR
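This paper explicitly solves the nonlinear PDE that arises as the continuous limit of the finite-horizon expert prediction problem with four experts. It shows that the strategies conjectured in arXiv:1409.3040G form an asymptotic Nash equilibrium and proves the "Finite vs Geometric regret" conjecture for four experts.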
Contribution
The paper gives an explicit solution to the PDE for four experts, uses the regularity of that solution to confirm that the conjectured strategies form an asymptotic Nash equilibrium, and proves the "Finite vs Geometric regret" conjecture in this case.
Findings
The solution to the PDE is twice continuously differentiable.
The strategies conjectured in arXiv:1409.3040G form an asymptotic Nash equilibrium.
The "Finite vs Geometric regret" conjecture holds for four experts.
Abstract
We explicitly solve the nonlinear PDE that is the continuous limit of the dynamic programming of the \emph{expert prediction problem} in the finite-horizon setting with $N=4$ experts. The \emph{expert prediction problem} is formulated as a zero-sum game between a player and an adversary. By showing that the solution is $\mathcal{C}^2$, we are able to show that the strategies conjectured in arXiv:1409.3040G form an asymptotic Nash equilibrium. We also prove the "Finite vs Geometric regret" conjecture proposed in arXiv:1409.3040G for $N=4$, and show that this conjecture in fact follows from the conjecture that the comb strategies are optimal.
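To make the game in the abstract concrete, here is a minimal sketch of how regret is accumulated in the discrete-time expert prediction game. In each round the player picks a probability vector over the experts, the adversary reveals a gain in {0, 1} for each expert, and the regret at the horizon is the best expert's cumulative gain minus the player's expected cumulative gain. The exponential-weights player, the learning rate `eta`, and the function names are illustrative assumptions; they are not the strategies analyzed in the paper.

```python
import math


def exp_weights_player(cum_gains, eta):
    """Hypothetical player: softmax of eta * cumulative gains (an assumption)."""
    top = max(cum_gains)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(eta * (g - top)) for g in cum_gains]
    total = sum(weights)
    return [w / total for w in weights]


def regret(gain_rounds, eta=0.5):
    """Regret of the player against a fixed sequence of adversary gain vectors.

    gain_rounds: list of per-round gain vectors, one entry in {0, 1} per expert.
    Returns max_i G_i(T) minus the player's expected cumulative gain.
    """
    n_experts = len(gain_rounds[0])
    cum_gains = [0.0] * n_experts
    player_gain = 0.0
    for gains in gain_rounds:
        # The player commits to a distribution before seeing this round's gains.
        probs = exp_weights_player(cum_gains, eta)
        player_gain += sum(p * g for p, g in zip(probs, gains))
        cum_gains = [c + g for c, g in zip(cum_gains, gains)]
    return max(cum_gains) - player_gain


if __name__ == "__main__":
    # Four experts, three rounds; the adversary's gain vectors are made up.
    rounds = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 0, 0]]
    print(regret(rounds))
```

The adversary here is just a fixed list of gain vectors. In the game from the abstract the adversary chooses gains adaptively to maximize this quantity while the player chooses distributions to minimize it.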
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms
