Acting upon Imagination: when to trust imagined trajectories in model based reinforcement learning
Adrian Remonda, Eduardo Veas, Granit Luzhnica

TL;DR
This paper introduces uncertainty estimation techniques for model-based reinforcement learning to determine when imagined trajectories can be trusted, reducing unnecessary re-planning and computational costs while maintaining performance.
Contribution
It proposes novel uncertainty estimation methods for online evaluation of imagined trajectories in MBRL, improving efficiency without sacrificing accuracy.
Findings
Significant reduction in computational costs.
Effective avoidance of unnecessary trajectory re-planning.
Maintained performance with fewer re-plans.
Abstract
Model-based reinforcement learning (MBRL) aims to learn model(s) of the environment dynamics that can predict the outcomes of an agent's actions. Forward application of the model yields so-called imagined trajectories (sequences of action and predicted state-reward), which are used to optimize the set of candidate actions that maximize expected reward. The outcome, an ideal imagined trajectory or plan, is imperfect, and MBRL typically relies on model predictive control (MPC) to overcome this by continuously re-planning from scratch, thus incurring major computational cost and increasing complexity in tasks with longer receding horizons. We propose uncertainty estimation methods for online evaluation of imagined trajectories to assess whether further planned actions can be trusted to deliver acceptable reward. These methods include comparing the error after performing the last action with the standard expected…
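The re-planning trade-off described in the abstract can be illustrated with a toy sketch. It is not the authors' method: the abstract is truncated before it fully states the proposed criteria, so the example below substitutes a plain fixed threshold on the one-step prediction error. The 1-D environment, the biased model, the greedy planner, and every name (`true_step`, `model_step`, `plan`, `run`, `threshold`) are hypothetical and chosen only for illustration.

```python
import random


def true_step(state, action, rng):
    # Hypothetical real environment: actions are only 90% effective,
    # plus a small bounded disturbance.
    return state + 0.9 * action + rng.uniform(-0.001, 0.001)


def model_step(state, action):
    # Hypothetical learned model: slightly wrong (assumes 100% effectiveness).
    return state + 1.0 * action


def plan(state, horizon, goal=0.0):
    # Toy planner: roll the model forward, greedily moving toward the goal.
    # Returns the planned actions and the imagined (predicted) states.
    actions, predicted = [], []
    s = state
    for _ in range(horizon):
        a = max(-1.0, min(1.0, goal - s))  # clip action to [-1, 1]
        s = model_step(s, a)
        actions.append(a)
        predicted.append(s)
    return actions, predicted


def run(steps=30, horizon=5, threshold=0.05, seed=0):
    # Execute planned actions and re-plan only when the imagined state
    # disagrees with the observed state by more than `threshold`,
    # or when the current plan is exhausted.
    rng = random.Random(seed)
    state = 3.0
    actions, predicted = plan(state, horizon)
    idx = 0
    plans = 1  # count the initial plan
    for _ in range(steps):
        state = true_step(state, actions[idx], rng)
        error = abs(state - predicted[idx])
        idx += 1
        if error > threshold or idx >= len(actions):
            actions, predicted = plan(state, horizon)
            idx = 0
            plans += 1
    return state, plans


if __name__ == "__main__":
    # A negative threshold forces a re-plan after every step (MPC-style).
    print("always re-plan:", run(threshold=-1.0))
    print("trust the plan:", run(threshold=0.05))
```

In this toy setting a negative threshold reproduces re-planning at every step, while a positive threshold keeps executing the imagined trajectory as long as observations stay close to the predictions, so the planner is called less often. How the threshold should be set, and which error statistic to compare against, is exactly what the paper's uncertainty estimation methods address; the fixed value here is an arbitrary assumption.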
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Control Systems Optimization
