Approximate Dynamic Programming based on High Dimensional Model Representation
Miroslav Pištěk

TL;DR
This paper presents an HDMR-based algorithm for approximate dynamic programming that significantly reduces memory usage and enables fast minimization of Bellman equations, demonstrated on a multi-armed bandit problem.
Contribution
Introduces an implicit HDMR approach for Bellman equations that improves efficiency and scalability in dynamic programming.
Findings
Reduces memory demands of dynamic programming algorithms.
Enables fast approximate minimization via eigenvalue decomposition.
Successfully applied to the N-armed bandit problem.
Abstract
This article introduces an algorithm for implicit High Dimensional Model Representation (HDMR) of the Bellman equation. This approximation technique considerably reduces the memory demands of the algorithm. Moreover, we show that HDMR enables fast approximate minimization, which is essential for evaluating the Bellman function. In each time step, the problem of parametrized HDMR minimization is relaxed into a set of trust region problems, all sharing the same matrix. By finding its eigenvalue decomposition, we efficiently obtain estimates of all minima. Their full-domain representation is avoided by HDMR, and the same approach is then used recursively in the next time step. An illustrative example of the N-armed bandit problem is included. We believe that the newly established connection between approximate HDMR minimization and the trust region problem may also benefit many other applications.
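The abstract's central computational idea, that many trust region problems sharing one matrix can be solved from a single eigenvalue decomposition, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `solve_trust_region_batch`, the problem form (minimize 0.5 xᵀAx + bᵀx subject to ‖x‖ ≤ radius), and the bisection on the multiplier are assumptions made for illustration. How the matrix and linear terms arise from the HDMR of the Bellman equation is not shown, and the degenerate "hard case" of the trust region problem is not handled.

```python
import numpy as np

def solve_trust_region_batch(A, B, radius, iters=100):
    """Hypothetical helper: for every column b of B, minimize
    0.5 * x^T A x + b^T x subject to ||x|| <= radius.
    A is symmetric (possibly indefinite) and shared by all problems."""
    lam, Q = np.linalg.eigh(A)      # one decomposition, reused for every b
    C = Q.T @ B                     # linear terms expressed in the eigenbasis
    X = np.empty_like(C, dtype=float)
    for j in range(C.shape[1]):
        c = C[:, j]

        def norm_at(mu):
            # Norm of the candidate solution -(A + mu I)^{-1} b
            return np.linalg.norm(c / (lam + mu))

        if lam[0] > 0 and norm_at(0.0) <= radius:
            mu = 0.0                # unconstrained minimizer lies inside the ball
        else:
            # Boundary solution: find mu > max(0, -lam_min) with norm_at(mu) = radius.
            lo = max(0.0, -lam[0])
            hi = lo + 1.0
            while norm_at(hi) > radius:
                hi = lo + 2.0 * (hi - lo)
            for _ in range(iters):  # bisection; norm_at is decreasing in mu
                mid = 0.5 * (lo + hi)
                if norm_at(mid) > radius:
                    lo = mid
                else:
                    hi = mid
            mu = hi                 # feasible end of the bracket
        X[:, j] = -c / (lam + mu)
    return Q @ X                    # back to the original coordinates

# Usage: three linear terms, one shared indefinite matrix.
A = np.array([[2.0, 0.5], [0.5, -1.0]])
B = np.array([[1.0, -0.3, 0.2], [0.4, 0.8, -1.5]])
X = solve_trust_region_batch(A, B, radius=1.0)
```

The decomposition cost is paid once, after which each additional linear term requires only a one-dimensional root search, which is what makes sharing the matrix attractive when many parametrized minimizations are needed per time step.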
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Advanced Control Systems Optimization
