Inductive Policy Selection for First-Order MDPs
Sung Wook Yoon, Alan Fern, Robert Givan

TL;DR
This paper introduces a method for selecting policies for large, stochastic first-order MDPs. The policies are ensembles of decision lists learned from small problem instances, and they generalize well as the number of objects increases.
Contribution
It extends the work of Martin and Geffner (2000) to stochastic domains, ensemble learning, and a wider variety of problems, providing a scalable approach to policy induction in large first-order MDPs.
Findings
Successfully induces policies for complex stochastic first-order MDPs
Policies generalize well as the number of objects grows
Extends prior work to broader problem classes
Abstract
We select policies for large Markov Decision Processes (MDPs) with compact first-order representations. We find policies that generalize well as the number of objects in the domain grows, potentially without bound. Existing dynamic-programming approaches based on flat, propositional, or first-order representations either are impractical here or do not naturally scale as the number of objects grows without bound. We implement and evaluate an alternative approach that induces first-order policies using training data constructed by solving small problem instances using PGraphplan (Blum & Langford, 1999). Our policies are represented as ensembles of decision lists, using a taxonomic concept language. This approach extends the work of Martin and Geffner (2000) to stochastic domains, ensemble learning, and a wider variety of problems. Empirically, we find "good" policies for several…
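The policy representation described in the abstract, an ensemble of decision lists, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy blocks-world facts and actions, the hand-written rules `putdown_held` and `pickup_clear` (the paper learns its rules and expresses them in a taxonomic concept language), and the plain majority vote used to combine ensemble members. What the sketch does show is why such policies can carry over to larger problems: each rule refers to classes of objects rather than to named objects.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Fact = Tuple[str, ...]      # a ground fact, e.g. ("holding", "a")
Action = Tuple[str, ...]    # a ground action, e.g. ("putdown", "a")
State = frozenset           # a state is a frozenset of facts
# A rule (or policy) returns one of the candidate actions, or None if it does not apply.
Rule = Callable[[State, Sequence[Action]], Optional[Action]]


def decision_list_policy(rules: List[Rule]) -> Rule:
    """Build a policy that returns the choice of the first rule that applies."""
    def policy(state, actions):
        for rule in rules:
            choice = rule(state, actions)
            if choice is not None:
                return choice
        return None
    return policy


def ensemble_policy(members: List[Rule]) -> Rule:
    """Combine policies by majority vote (an assumed combination scheme)."""
    def policy(state, actions):
        votes = Counter()
        for member in members:
            choice = member(state, actions)
            if choice is not None:
                votes[choice] += 1
        if not votes:
            return None
        return votes.most_common(1)[0][0]
    return policy


# Hypothetical, hand-written rules. Neither mentions a specific block,
# so both apply unchanged however many blocks the problem contains.
def putdown_held(state, actions):
    held = {fact[1] for fact in state if fact[0] == "holding"}
    for action in actions:
        if action[0] == "putdown" and action[1] in held:
            return action
    return None


def pickup_clear(state, actions):
    clear = {fact[1] for fact in state if fact[0] == "clear"}
    for action in actions:
        if action[0] == "pickup" and action[1] in clear:
            return action
    return None
```

A decision list such as `decision_list_policy([putdown_held, pickup_clear])` tries its rules in order, and `ensemble_policy` wraps several such lists so that the most frequently chosen action wins.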
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Reinforcement Learning in Robotics · AI-based Problem Solving and Planning
