Practical Linear Value-approximation Techniques for First-order MDPs
Scott Sanner, Craig Boutilier

TL;DR
This paper advances linear value-approximation methods for first-order MDPs by extending to policy iteration, automating basis function generation, and decomposing complex problems, with empirical validation on logistics benchmarks.
Contribution
The paper introduces a first-order approximate policy iteration framework, automates basis function generation, and proposes techniques for decomposing first-order MDPs with universally quantified rewards into tractable subproblems.
Findings
Enhanced value function approximation quality.
Effective problem decomposition for intractable domains.
Empirical validation on logistics planning benchmarks.
Abstract
Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to determine suitable weights. This approach offers the advantage that it does not require simplification of the first-order value function, and allows one to solve FOMDPs independent of a specific domain instantiation. In this paper, we address several questions to enhance the applicability of this work: (1) Can we extend the first-order ALP framework to approximate policy iteration to address performance deficiencies of previous approaches? (2) Can we automatically generate basis functions and evaluate their impact on value function quality? (3) How can we decompose intractable problems with universally quantified rewards into tractable subproblems? We…
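As a rough illustration of the linear representation the abstract describes, the standard ALP formulation for a ground (propositional) MDP can be written as follows. The notation here ($w_i$ for weights, $b_i$ for basis functions, $\alpha$ for state-relevance weights, $R$, $P$, $\gamma$) is assumed for this sketch and is not taken from the paper; the first-order setting replaces ground states with first-order abstractions, so this shows only the general idea, not the authors' exact formulation.

```latex
% Linear value function over k basis functions (assumed notation)
V_w(s) = \sum_{i=1}^{k} w_i \, b_i(s)

% Approximate linear program: choose weights w to minimize the
% weighted value subject to the Bellman inequality for every
% state-action pair
\begin{aligned}
\min_{w} \quad & \sum_{s} \alpha(s) \, V_w(s) \\
\text{s.t.} \quad & V_w(s) \;\ge\; R(s,a) + \gamma \sum_{s'} P(s' \mid s, a) \, V_w(s')
  \qquad \forall s, a
\end{aligned}
```

Because $V_w$ is linear in $w$, both the objective and every constraint are linear in the weights, which is what makes a linear programming solver applicable.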
Taxonomy
Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Machine Learning and Algorithms
