Optimal amortized regret in every interval
Rina Panigrahy, Preyas Popat

TL;DR
This paper introduces a randomized algorithm that guarantees near-optimal regret bounds in any interval for sequence prediction, with practical evaluation on stock data.
Contribution
It presents a randomized algorithm that achieves amortized $O(\sqrt{x})$ regret in any interval of length $x$, for every partition of the sequence into intervals.
Findings
Achieves amortized $O(\sqrt{x})$ regret in any interval of length $x$
Constant factor in regret bound estimated around 2.1 for sequences up to length 2000
Effective in high-frequency stock data prediction
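One way to read the amortized guarantee is sketched here. It is an interpretation of the summary, not the paper's exact theorem statement. The symbols $I_j$, $x_j$, $R(I_j)$ and $c$ are notation assumed for this illustration. Split a sequence into consecutive intervals $I_1, \dots, I_k$ of lengths $x_1, \dots, x_k$, and let $R(I_j)$ be the regret inside $I_j$ against the best constant expert for that interval. The amortized bound then says that, for every such partition,

$$
\mathbb{E}\left[\sum_{j=1}^{k} R(I_j)\right] \le c \sum_{j=1}^{k} \sqrt{x_j},
$$

where the expectation is over the algorithm's randomness and $c$ is a constant whose value depends on how payoffs are scaled. A single interval may exceed $c\sqrt{x_j}$, but the total over the whole partition may not.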
Abstract
Consider the classical problem of predicting the next bit in a sequence of bits. A standard performance measure is regret (loss in payoff) with respect to a set of experts. For example, if we measure performance with respect to two constant experts, one that always predicts 0 and another that always predicts 1, it is well known that one can get regret $O(\sqrt{T})$ with respect to the best expert by using, say, the weighted majority algorithm. But this algorithm does not provide a performance guarantee in any interval. There are other algorithms that ensure regret $O(\sqrt{x \log T})$ in any interval of length $x$. In this paper we show a randomized algorithm that, in an amortized sense, gets a regret of $O(\sqrt{x})$ for any interval when the sequence is partitioned into intervals arbitrarily. We empirically estimated the constant in the $O()$ for $T$ up to 2000 and found it to be…
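To make the gap between whole-sequence regret and interval regret concrete, here is a minimal Python sketch of the baseline the abstract mentions. It is not the paper's algorithm. It assumes a Hedge-style exponentially weighted forecaster (a randomized variant of weighted majority), 0/1 loss, and a standard learning-rate choice. The function names `hedge_expected_losses` and `interval_regret` are hypothetical. Expected losses are computed in closed form, so the run is deterministic.

```python
import math

def hedge_expected_losses(bits, eta=None):
    """Expected 0/1 loss per step of a Hedge-style forecaster.

    Experts: index 0 always predicts 0, index 1 always predicts 1.
    The learning rate below is an assumed textbook choice, not from the paper.
    """
    T = len(bits)
    if eta is None:
        eta = math.sqrt(8 * math.log(2) / max(T, 1))
    w = [0.5, 0.5]  # normalized weights of the two constant experts
    losses = []
    for b in bits:
        p1 = w[1]  # probability that the forecaster predicts 1
        losses.append(1.0 - p1 if b == 1 else p1)  # chance of a mistake
        w[1 - b] *= math.exp(-eta)  # penalize the expert that was wrong
        total = w[0] + w[1]
        w = [w[0] / total, w[1] / total]  # renormalize to avoid underflow
    return losses

def interval_regret(bits, losses, start, end):
    """Regret on bits[start:end] against the best constant expert there."""
    segment = bits[start:end]
    ones = sum(segment)
    # "always 0" errs on every 1; "always 1" errs on every 0
    best_expert_loss = min(ones, len(segment) - ones)
    return sum(losses[start:end]) - best_expert_loss

# A sequence whose best constant expert changes halfway through.
bits = [0] * 500 + [1] * 500
losses = hedge_expected_losses(bits)
print("regret on whole sequence:", interval_regret(bits, losses, 0, 1000))
print("regret on second half:  ", interval_regret(bits, losses, 500, 1000))
```

On this input the whole-sequence regret stays small, while the regret measured on the second half alone is far larger than $\sqrt{500}$. After 500 zeros the forecaster needs a long run of ones before the "always 1" expert regains weight. This is the failure in intervals that the abstract attributes to weighted majority.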
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms
