Pure Planning to Pure Policies and In Between with a Recursive Tree Planner
A. Norman Redlich

TL;DR
This paper introduces a recursive tree planner (RTP) that seamlessly integrates pure planning and policy-based methods, enabling improved performance and zero-shot transfer across diverse tasks through hierarchical generalized actions and iterative policy learning.
Contribution
The RTP framework unifies pure planning and policy execution, incorporating learned policies and generalized actions at all hierarchy levels for enhanced transferability and performance.
Findings
RTP outperforms traditional planners on Box2D and MuJoCo tasks.
Incorporating learned policies improves planning efficiency.
RTP demonstrates effective zero-shot transfer across different problem classes.
Abstract
A recursive tree planner (RTP) is designed to function as a pure planner without policies at one extreme and to run a pure greedy policy at the other. In between, the RTP exploits policies to improve both planning performance and zero-shot transfer from one class of planning problem to another. Policies are learned through imitation of the planner. The planner then uses these policies, which in turn lets it teach better policies, in a virtuous cycle. To improve planning performance and zero-shot transfer, the RTP incorporates previously learned tasks as generalized actions (GA) at any level of its hierarchy, and it can refine those GA by adding primitive actions at any level as well. For search, the RTP uses a generalized Dijkstra algorithm [Dijkstra 1959] that tries the greedy policy first, then searches over near-greedy paths, and then moves farther from the greedy path as necessary. The RTP can return multiple sub-goals from lower…
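The "greedy first, then near-greedy" search order described in the abstract can be illustrated with a small sketch. This is not the paper's RTP. It is a plain Dijkstra search in which, as an assumption made for illustration, each action's cost is its rank under a hand-written greedy policy (0 for the greedy choice, 1 for the next best, and so on). The grid world, the Manhattan-distance policy, and every function name below are hypothetical.

```python
import heapq
from itertools import count

# Hypothetical toy problem: a 4x2 grid with one wall between start and goal.
START, GOAL = (0, 0), (3, 0)
WALLS = {(2, 0)}
MOVES = {"right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1)}


def successors(state):
    """Yield (action, next_state) for every legal move from `state`."""
    x, y = state
    for name, (dx, dy) in MOVES.items():
        nxt = (x + dx, y + dy)
        if nxt not in WALLS and 0 <= nxt[0] <= 3 and 0 <= nxt[1] <= 1:
            yield name, nxt


def distance(state):
    """Manhattan distance to the goal; stands in for a learned policy score."""
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


def policy_rank(state, action):
    """Rank of `action` under the stand-in greedy policy (0 = greedy choice)."""
    ranked = sorted(successors(state), key=lambda pair: distance(pair[1]))
    return [name for name, _ in ranked].index(action)


def policy_guided_search(start, is_goal, max_expansions=10_000):
    """Dijkstra over cumulative deviation from the greedy policy.

    A path that always follows the greedy action costs 0 and is explored
    first; paths with more or larger deviations are explored later.
    """
    tie = count()  # tie-breaker so the heap never compares states or paths
    frontier = [(0, next(tie), start, [])]
    best = {start: 0}
    expansions = 0
    while frontier and expansions < max_expansions:
        cost, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, cost
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry
        expansions += 1
        for action, nxt in successors(state):
            new_cost = cost + policy_rank(state, action)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, next(tie), nxt, path + [action]))
    return None, None


path, cost = policy_guided_search(START, lambda s: s == GOAL)
print(path, cost)
```

In this toy problem the pure greedy policy gets stuck at the wall, so the search has to accept one non-greedy step; the returned cost counts how far the plan strays from the policy. With a perfect policy the same search would reduce to a single greedy rollout, and with an uninformative policy it would behave like an ordinary uniform search, which mirrors the two extremes the abstract describes.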
Taxonomy
Topics: Formal Methods in Verification · Model-Driven Software Engineering Techniques
Methods: Genetic Algorithms
