Learning Abstract World Model for Value-preserving Planning with Options
Rafael Rodriguez-Sanchez, George Konidaris

TL;DR
This paper introduces a method for learning abstract world models that enable value-preserving planning with options, improving decision-making efficiency in complex, sensor-rich environments by operating at higher abstraction levels.
Contribution
The paper presents an approach for learning abstract MDPs from sensorimotor experience such that planning with temporally-extended actions (options) in the abstract model incurs only bounded value loss in the original MDP.
Findings
Learning an abstract model improves the sample efficiency of planning.
Planning in the learned abstract MDP reaches the goal in goal-based navigation tasks.
Policies planned in the abstract MDP have bounded value loss in the original MDP.
Abstract
General-purpose agents require fine-grained controls and rich sensory inputs to perform a wide range of tasks. However, this complexity often leads to intractable decision-making. Traditionally, agents are provided with task-specific action and observation spaces to mitigate this challenge, but this reduces autonomy. Instead, agents must be capable of building state-action spaces at the correct abstraction level from their sensorimotor experiences. We leverage the structure of a given set of temporally-extended actions to learn abstract Markov decision processes (MDPs) that operate at a higher level of temporal and state granularity. We characterize state abstractions necessary to ensure that planning with these skills, by simulating trajectories in the abstract MDP, results in policies with bounded value loss in the original MDP. We evaluate our approach in goal-based navigation…
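To make "planning with options in an abstract MDP" concrete, the following is a minimal sketch of value iteration over a tiny, hand-written abstract model. It is not the authors' method: the three abstract states, the two options, the fixed option duration, and the function name `smdp_value_iteration` are all illustrative assumptions, and the transition and reward tables are written by hand, not learned from sensorimotor data.

```python
import numpy as np


def smdp_value_iteration(P, R, tau, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Value iteration where each action is a temporally-extended option.

    P[o, z, z2]: probability that option o, started in abstract state z,
                 terminates in abstract state z2 (hypothetical model).
    R[o, z]:     expected discounted reward accrued while running o from z.
    tau[o, z]:   assumed fixed duration (in primitive steps) of o from z.
    """
    n_options, n_states, _ = P.shape
    V = np.zeros(n_states)
    discount = gamma ** tau  # per-option discount, shape (n_options, n_states)
    for _ in range(max_iters):
        # P @ V contracts the next-state axis, giving shape (n_options, n_states).
        Q = R + discount * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=0)


# Toy abstract MDP (assumed): states 0 = start, 1 = middle, 2 = goal (absorbing).
# Option 0 = "stay", option 1 = "advance" toward the goal.
P = np.array([
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],  # stay
    [[0, 1, 0], [0, 0, 1], [0, 0, 1]],  # advance
], dtype=float)
R = np.array([
    [0.0, 0.0, 0.0],  # staying yields no reward
    [0.0, 1.0, 0.0],  # advancing from the middle state reaches the goal
])
tau = np.full((2, 3), 3.0)  # every option is assumed to last 3 primitive steps

V, policy = smdp_value_iteration(P, R, tau, gamma=0.9)
print(V, policy)
```

Because each option spans several primitive steps, the discount applied to the successor value is `gamma ** tau` instead of `gamma`. This is the sense in which such a planner works at a coarser temporal scale than the original MDP.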
Taxonomy
Topics: Complex Systems and Decision Making
Methods: Sparse Evolutionary Training
