Max-Plus Matching Pursuit for Deterministic Markov Decision Processes

Francis Bach (SIERRA)

arXiv:1906.08524 · cs.LG · June 21, 2019 · 5 citations

TL;DR

This paper introduces a matching pursuit method, built on max-plus algebra, for approximating value functions in deterministic MDPs. Value iteration is replaced by a lower-dimensional iteration over a dictionary of value functions, and the dictionary can be chosen adaptively to target high-dimensional, factored state spaces.

Contribution

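The paper develops a max-plus algebra framework for approximate value iteration in deterministic MDPs, together with adaptive basis methods inspired by matching pursuit from signal processing. The adaptive methods currently lack theoretical guarantees but perform well empirically on low-dimensional control problems.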

Findings

01

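With a fixed finite basis, the complexity of approximating the optimal value function depends on covering numbers of the state space, not directly on the number of states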

02

Adaptive bases are proposed to break the curse of dimensionality in factored state spaces, so far without theoretical guarantees

03

Empirical results show effectiveness on simple deterministic MDPs

Abstract

We consider deterministic Markov decision processes (MDPs) and apply max-plus algebra tools to approximate the value iteration algorithm by a smaller-dimensional iteration based on a representation on dictionaries of value functions. The setup naturally leads to novel theoretical results which are simply formulated due to the max-plus algebra structure. For example, when considering a fixed (non-adaptive) finite basis, the computational complexity of approximating the optimal value function is not directly related to the number of states, but to notions of covering numbers of the state space. In order to break the curse of dimensionality in factored state spaces, we consider adaptive bases that can adapt to particular problems, leading to an algorithm similar to matching pursuit from signal processing. They currently come with no theoretical guarantees but work well empirically on simple…
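To make the abstract's vocabulary concrete, here is a minimal sketch of the ingredients it names, written in plain Python. It is an illustration under stated assumptions, not the paper's algorithm: the three-state MDP, the distance-based dictionary, the summed-gap selection criterion, and all function names are hypothetical. It relies only on two standard facts: value iteration for a deterministic MDP is a max-plus matrix-vector product, and the best lower approximation of a function v by a max-plus combination of atoms w_k uses coefficients alpha_k = min_s (v(s) - w_k(s)).

```python
NEG_INF = float("-inf")  # max-plus "zero": marks a missing transition


def bellman_maxplus(R, v, gamma):
    """One value-iteration step: (T v)[s] = max_t R[s][t] + gamma * v[t]."""
    n = len(v)
    return [max(R[s][t] + gamma * v[t] for t in range(n)) for s in range(n)]


def value_iteration(R, gamma, n_iters=200):
    """Plain value iteration over all states, started from zero."""
    v = [0.0] * len(R)
    for _ in range(n_iters):
        v = bellman_maxplus(R, v, gamma)
    return v


def maxplus_project(atoms, v):
    """Largest max-plus combination of the atoms lying below v.

    alpha[k] = min_s (v[s] - atoms[k][s]) is the largest shift keeping
    alpha[k] + atoms[k] <= v; the approximation is their pointwise max.
    """
    n = len(v)
    alpha = [min(v[s] - w[s] for s in range(n)) for w in atoms]
    approx = [max(a + w[s] for a, w in zip(alpha, atoms)) for s in range(n)]
    return alpha, approx


def greedy_select(candidates, v, n_atoms):
    """Hypothetical matching-pursuit-style loop (assumed criterion):
    repeatedly add the candidate that most reduces the summed gap
    between v and its max-plus lower approximation."""
    chosen = []
    for _ in range(n_atoms):
        best = None
        for k, w in enumerate(candidates):
            if k in chosen:
                continue
            trial = [candidates[j] for j in chosen] + [w]
            _, approx = maxplus_project(trial, v)
            gap = sum(v[s] - approx[s] for s in range(len(v)))
            if best is None or gap < best[0]:
                best = (gap, k)
        chosen.append(best[1])
    return chosen


# Toy deterministic MDP (assumed): states 0 -> 1 -> 2, with a rewarded
# self-loop at state 2. R[s][t] is the reward for moving from s to t.
R = [
    [0.0, 0.0, NEG_INF],
    [NEG_INF, 0.0, 0.0],
    [NEG_INF, NEG_INF, 1.0],
]
gamma = 0.5
v_star = value_iteration(R, gamma)

# Assumed dictionary: atoms w_k(s) = -|s - k|, one per state.
atoms = [[-abs(s - k) for s in range(3)] for k in range(3)]
chosen = greedy_select(atoms, v_star, n_atoms=2)
alpha, v_hat = maxplus_project([atoms[k] for k in chosen], v_star)
print(v_star, chosen, v_hat)
```

The sketch projects an already computed value function to keep the example short. The abstract describes something stronger, namely running the iteration itself in the smaller dictionary representation, which this code does not attempt.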

Taxonomy

Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Simulation Techniques and Applications