Exploiting First-Order Regression in Inductive Policy Selection
Charles Gretton, Sylvie Thiebaux

TL;DR
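This paper combines classical first-order regression with inductive learning to compute optimal generalised policies for relational MDPs: regression generates a domain-specific hypotheses language, which focuses the inductive solver on relevant concepts and avoids the heavier reasoning of symbolic dynamic programming.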
Contribution
It proposes automatically generating a hypotheses language via first-order regression to enhance inductive policy learning in relational MDPs.
Findings
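Symbolic and inductive techniques are integrated: first-order regression produces the hypotheses language that the inductive solver takes as input.
The approach avoids the more complex reasoning required by pure symbolic dynamic programming.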
Improved policy quality for relational domains.
Abstract
We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies for small instances. Our idea is to use reasoning and in particular classical first-order regression to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver's attention on concepts that are specifically relevant…
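To make the idea of generating a hypotheses language by regression more concrete, here is a minimal sketch. It is not the authors' method: it substitutes ground, STRIPS-style regression for the first-order regression the abstract refers to, and the toy delivery domain, the `Action` type, and the function names `regress` and `hypothesis_language` are all hypothetical. It only illustrates how repeatedly regressing a goal condition through the domain's actions yields a small set of conditions that an inductive learner could use as candidate features.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Set

Condition = FrozenSet[str]  # a conjunction of ground atoms (simplifying assumption)


class Action(NamedTuple):
    # Hypothetical STRIPS-style action: preconditions, add list, delete list.
    name: str
    pre: Condition
    add: Condition
    delete: Condition


def regress(cond: Condition, act: Action) -> Optional[Condition]:
    """Condition that must hold before `act` so that `cond` holds afterwards.

    Returns None if `act` deletes part of `cond` or contributes nothing to it.
    """
    if cond & act.delete:
        return None  # the action destroys part of the condition
    if not (cond & act.add):
        return None  # the action is irrelevant to the condition
    return (cond - act.add) | act.pre


def hypothesis_language(goal: Condition, actions: List[Action], depth: int) -> Set[Condition]:
    """Collect every condition reachable from `goal` by at most `depth` regression steps."""
    language: Set[Condition] = {goal}
    frontier: Set[Condition] = {goal}
    for _ in range(depth):
        next_frontier: Set[Condition] = set()
        for cond in frontier:
            for act in actions:
                result = regress(cond, act)
                if result is not None and result not in language:
                    language.add(result)
                    next_frontier.add(result)
        frontier = next_frontier
    return language


# Toy delivery domain (hypothetical, for illustration only).
unload = Action("unload", frozenset({"in_truck", "truck_at_dest"}),
                frozenset({"pkg_at_dest"}), frozenset({"in_truck"}))
drive = Action("drive", frozenset({"truck_at_src"}),
               frozenset({"truck_at_dest"}), frozenset({"truck_at_src"}))
load = Action("load", frozenset({"pkg_at_src", "truck_at_src"}),
              frozenset({"in_truck"}), frozenset({"pkg_at_src"}))

goal: Condition = frozenset({"pkg_at_dest"})
language = hypothesis_language(goal, [unload, drive, load], depth=2)
```

Each element of `language` is a conjunction that could serve as a test in a learned policy or value-function representation. In the relational setting described in the abstract, the regressed formulas would be first-order rather than ground, so they would generalise across problem instances; this sketch does not capture that.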
Taxonomy
Topics: Machine Learning and Algorithms · Formal Methods in Verification · Logic, Reasoning, and Knowledge
