Symbolic Generalization for On-line Planning
Zhengzhu Feng, Eric A. Hansen, Shlomo Zilberstein

TL;DR
This paper introduces sRTDP, a symbolic extension of RTDP that uses symbolic model-checking techniques to generalize experience across groups of states, reducing both the computation time of on-line planning and the number of real-world interactions it needs.
Contribution
The paper presents sRTDP, a novel symbolic on-line planning algorithm that enhances RTDP with model-checking for state generalization, improving efficiency and reducing interaction costs.
Findings
sRTDP accelerates planning in terms of CPU time.
sRTDP reduces the number of environment interactions needed.
Two heuristic approaches to dynamic state grouping significantly accelerate planning, in both CPU time and the number of interaction steps.
Abstract
Symbolic representations have been used successfully in off-line planning algorithms for Markov decision processes. We show that they can also improve the performance of on-line planners. In addition to reducing computation time, symbolic generalization can reduce the amount of costly real-world interactions required for convergence. We introduce Symbolic Real-Time Dynamic Programming (or sRTDP), an extension of RTDP. After each step of on-line interaction with an environment, sRTDP uses symbolic model-checking techniques to generalize its experience by updating a group of states rather than a single state. We examine two heuristic approaches to dynamic grouping of states and show that they accelerate the planning process significantly in terms of both CPU time and the number of steps of interaction with the environment.
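The difference between updating a single state and updating a group of states can be illustrated with a minimal sketch. This is not the authors' implementation: it uses explicit Python dictionaries rather than symbolic model-checking data structures, the five-state chain is a toy problem invented for illustration, and the grouping rule `same_value_group` (states that currently share the visited state's value estimate) is a hypothetical stand-in for the paper's grouping heuristics, which the abstract does not specify.

```python
from typing import Dict, List, Tuple

# Toy stochastic shortest-path problem (an assumption for illustration):
# states 0..4 on a line, goal at 4, unit cost per step, deterministic moves.
STATES = [0, 1, 2, 3, 4]
GOAL = 4
ACTIONS = [-1, +1]
COST = 1.0


def transition(s: int, a: int) -> List[Tuple[float, int]]:
    """Return (probability, next_state) pairs; moves are clamped to the chain."""
    return [(1.0, min(max(s + a, 0), GOAL))]


def bellman_backup(s: int, V: Dict[int, float]) -> float:
    """One-step lookahead: min over actions of cost plus expected next value."""
    if s == GOAL:
        return 0.0
    return min(
        COST + sum(p * V[s2] for p, s2 in transition(s, a))
        for a in ACTIONS
    )


def single_state_update(s: int, V: Dict[int, float]) -> None:
    """RTDP-style update: back up only the state that was just visited."""
    V[s] = bellman_backup(s, V)


def same_value_group(s: int, V: Dict[int, float]) -> List[int]:
    """Hypothetical grouping rule: states sharing s's current value estimate."""
    return [g for g in V if V[g] == V[s]]


def group_update(s: int, V: Dict[int, float]) -> List[int]:
    """Back up every state in the group of s, all against the old values."""
    group = same_value_group(s, V)
    new_values = {g: bellman_backup(g, V) for g in group}
    V.update(new_values)
    return group


# One step of experience at state 0, starting from an all-zero value function.
V_single = {s: 0.0 for s in STATES}
single_state_update(0, V_single)

V_group = {s: 0.0 for s in STATES}
updated = group_update(0, V_group)

print(V_single)  # {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
print(V_group)   # {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 0.0}
```

In this toy setting a single step of experience improves the estimate for one state under the single-state update, but for every non-goal state under the group update. A symbolic planner would represent the group and the value function compactly instead of enumerating states as this sketch does.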
Taxonomy
Topics: Formal Methods in Verification · Reinforcement Learning in Robotics · Software Testing and Debugging Techniques
