RAMP: Hybrid DRL for Online Learning of Numeric Action Models
Yarin Benyamin, Argaman Mordoch, Shahaf S. Shperberg, Roni Stern

TL;DR
RAMP is an online hybrid approach that combines deep reinforcement learning with numeric planning to learn numeric action models from interactions with the environment.
Contribution
The paper presents an integrated framework that learns numeric action models online, driven by a positive feedback loop between RL and planning.
Findings
RAMP outperforms PPO in solvability and plan quality on standard domains.
The framework effectively combines RL and planning for numeric domains.
Numeric PDDLGym facilitates conversion of numeric problems to Gym environments.
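To make the idea of exposing a numeric planning problem through a Gym-style interface concrete, here is a minimal sketch. It does not use or reproduce the Numeric PDDLGym API; the class `NumericDriveEnv`, its fluents (`fuel`, `distance`), and the single `drive` action are hypothetical, and only the `reset`/`step` return shapes follow the Gymnasium convention.

```python
class NumericDriveEnv:
    """Hypothetical Gym-style wrapper around a tiny numeric planning problem.

    State: numeric fluents 'fuel' and 'distance'.
    Action 'drive': precondition fuel >= 1, effects fuel -= 1, distance -= 1.
    Goal: distance <= 0.
    """

    def __init__(self, fuel=5.0, distance=3.0):
        self.init = {"fuel": fuel, "distance": distance}
        self.state = dict(self.init)

    def reset(self):
        # Gymnasium convention: return (observation, info).
        self.state = dict(self.init)
        return dict(self.state), {}

    def step(self, action):
        s = self.state
        # An action whose precondition fails leaves the state unchanged.
        applicable = action == "drive" and s["fuel"] >= 1.0
        if applicable:
            s["fuel"] -= 1.0
            s["distance"] -= 1.0
        terminated = s["distance"] <= 0.0
        reward = 1.0 if terminated else 0.0
        # Gymnasium convention: (obs, reward, terminated, truncated, info).
        return dict(s), reward, terminated, False, {"applicable": applicable}
```

An RL agent would interact with such an environment only through `reset` and `step`, while a planner would need the preconditions and effects that are hidden inside `step`.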
Abstract
Automated planning algorithms require an action model specifying the preconditions and effects of each action, but obtaining such a model is often hard. Learning action models from observations is feasible, but existing algorithms for numeric domains are offline, requiring expert traces as input. We propose the Reinforcement learning, Action Model learning, and Planning (RAMP) strategy for learning numeric planning action models online via interactions with the environment. RAMP simultaneously trains a Deep Reinforcement Learning (DRL) policy, learns a numeric action model from past interactions, and uses that model to plan future actions when possible. These components form a positive feedback loop: the RL policy gathers data to refine the action model, while the planner generates plans to continue training the RL policy. To facilitate this integration of RL and numeric planning, we…
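The feedback loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-variable environment, the constant-delta model learner, the breadth-first planner, and the random explorer standing in for the DRL policy are all assumptions made for illustration, as are the names `env_step`, `learn_model`, `plan`, and `ramp_like_loop`.

```python
import random
from collections import deque

# Hidden true numeric effects of each action (toy environment, assumed).
ACTIONS = {"inc": 1, "dec": -1, "jump": 3}

def env_step(x, action):
    """Ground-truth transition, unknown to the learner."""
    return x + ACTIONS[action]

def learn_model(transitions):
    """Learn one constant numeric effect per action from observed transitions."""
    return {a: nxt - cur for cur, a, nxt in transitions}

def plan(model, start, goal, max_depth=20):
    """Breadth-first search over the learned model; returns actions or None."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        x, path = frontier.popleft()
        if x == goal:
            return path
        if len(path) >= max_depth:
            continue
        for a, delta in model.items():
            nxt = x + delta
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [a]))
    return None

def ramp_like_loop(start=0, goal=7, episodes=5, horizon=10, seed=0):
    rng = random.Random(seed)
    transitions = []    # data for the action-model learner
    policy_buffer = []  # plan-derived (state, action) pairs; stand-in for DRL training data
    solved = []         # (used_plan, reached_goal) per episode
    for _ in range(episodes):
        model = learn_model(transitions)
        actions = plan(model, start, goal) if model else None
        used_plan = actions is not None
        if not used_plan:
            # No usable plan yet: explore (random stand-in for the DRL policy).
            actions = [rng.choice(list(ACTIONS)) for _ in range(horizon)]
        x = start
        for a in actions:
            nxt = env_step(x, a)
            transitions.append((x, a, nxt))
            if used_plan:
                policy_buffer.append((x, a))
            x = nxt
            if x == goal:
                break
        solved.append((used_plan, x == goal))
    return solved, policy_buffer
```

In this sketch, exploration supplies the transitions from which the model is learned, and once the model supports a plan, the executed plan supplies trajectories that a real system could use to keep training the policy.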
