Online Dynamic Programming
Holakou Rahmanian, Manfred K. Warmuth, S.V.N. Vishwanathan

TL;DR
This paper introduces a unified online learning framework for combinatorial problems whose offline version is solvable by dynamic programming. It encodes solutions as multipaths and generalizes existing online learning algorithms to them, yielding efficient algorithms for several classical problems.
Contribution
It develops a general method for online learning in dynamic programming problems, extending algorithms like Hedge to complex combinatorial structures called multipaths.
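For orientation, the basic Hedge update can be sketched over an explicitly enumerated set of solutions. This is only an illustration of the starting point that the paper generalizes: the helper names (`hedge_update`, `set_loss`), the toy 2-set instance, and the learning rate are assumptions made here, and the brute-force enumeration is exactly what a multipath-based method is meant to avoid.

```python
import math
from itertools import combinations

def hedge_update(weights, losses, eta):
    """One Hedge step: scale each weight by exp(-eta * loss), then renormalize."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]

def set_loss(subset, item_losses):
    """Loss of a set is linear in its components: the sum of its item losses."""
    return sum(item_losses[i] for i in subset)

# Hypothetical toy instance: all 2-sets over 3 items, treated as explicit experts.
experts = list(combinations(range(3), 2))
weights = [1.0 / len(experts)] * len(experts)

# One round of (made-up) per-item losses, followed by one Hedge update.
item_losses = [0.9, 0.1, 0.5]
losses = [set_loss(s, item_losses) for s in experts]
weights = hedge_update(weights, losses, eta=1.0)
best = experts[max(range(len(experts)), key=lambda i: weights[i])]
```

The number of explicit experts grows combinatorially with the problem size, which is why a compact representation of the solution set matters.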
Findings
Unified framework for online learning in DP problems
Efficient algorithms for multiple classical combinatorial problems
New faster sampling technique for multipath distributions
Abstract
We propose a general method for combinatorial online learning problems whose offline optimization problem can be solved efficiently via a dynamic programming algorithm defined by an arbitrary min-sum recurrence. Examples include online learning of Binary Search Trees, Matrix-Chain Multiplications, k-sets, Knapsacks, Rod Cuttings, and Weighted Interval Schedulings. For each of these problems we use the underlying graph of subproblems (called a multi-DAG) for defining a representation of the solutions of the dynamic programming problem by encoding them as a generalized version of paths (called multipaths). These multipaths encode each solution as a series of successive decisions or components over which the loss is linear. We then show that the dynamic programming algorithm for each problem leads to online algorithms for learning multipaths in the underlying multi-DAG. The algorithms…
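As a concrete instance of a min-sum recurrence, the textbook optimal binary search tree DP can be sketched as follows. This is a minimal illustration rather than the paper's algorithm: the function names, the half-open interval convention, and the `(i, j, r)` component encoding are assumptions chosen here. Each returned component records one decision (root `r` for the key interval `[i, j)`), and the tree's loss is the sum of the interval weights over those components, which is the sense in which the loss is linear in the components.

```python
from functools import lru_cache

def optimal_bst(freq):
    """Min-sum DP: cost(i, j) = min over roots r of cost(i, r) + cost(r + 1, j) + weight(i, j)."""
    n = len(freq)
    prefix = [0]
    for f in freq:
        prefix.append(prefix[-1] + f)

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Keys i..j-1 (half-open interval); an empty interval costs nothing.
        if i >= j:
            return 0, ()
        weight = prefix[j] - prefix[i]  # every key in [i, j) sits one level deeper
        best_cost, best_components = None, None
        for r in range(i, j):
            left_cost, left_comp = solve(i, r)
            right_cost, right_comp = solve(r + 1, j)
            cost = left_cost + right_cost + weight
            if best_cost is None or cost < best_cost:
                best_cost = cost
                best_components = ((i, j, r),) + left_comp + right_comp
        return best_cost, best_components

    return solve(0, n)

def component_loss(freq, components):
    """Loss as a sum over the chosen components (linear in the components)."""
    return sum(sum(freq[i:j]) for i, j, _ in components)

# Hypothetical access frequencies for three keys.
freq = (34, 8, 50)
cost, components = optimal_bst(freq)
```

The recursion branches into two subproblems at every decision, so a solution is a tree of subproblems rather than a single path through the subproblem graph.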
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
