Generalizing LTL Instructions via Future Dependent Options
Duo Xu, Faramarz Fekri

TL;DR
This paper introduces a multi-task reinforcement learning method that learns future-dependent options and multi-step value functions to improve generalization and efficiency in LTL-based control tasks, enabling zero-shot adaptation to new instructions.
Contribution
It proposes a novel off-policy algorithm that learns options based on future subgoals and trains a multi-step value function conditioned on subgoal sequences for better generalization and optimality.
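As a rough illustration of the two ideas named above, the sketch below shows (1) how an option could be conditioned on both the current subgoal and the one that follows it, and (2) how a value over a subgoal sequence could accumulate discounted rewards across option segments. This is not the authors' algorithm: the function names, the representation of subgoals as strings, and the assumption that each segment's reward is already summed within the segment are hypothetical choices made only for illustration.

```python
from typing import List, Optional, Tuple


def future_dependent_conditions(
    subgoals: List[str],
) -> List[Tuple[str, Optional[str]]]:
    """Pair each subgoal with its successor (None for the final subgoal).

    Hypothetical helper: an option conditioned on such a pair can take the
    upcoming subgoal into account while pursuing the current one.
    """
    pairs: List[Tuple[str, Optional[str]]] = []
    for i, current in enumerate(subgoals):
        nxt = subgoals[i + 1] if i + 1 < len(subgoals) else None
        pairs.append((current, nxt))
    return pairs


def multi_step_return(
    segment_rewards: List[float],
    segment_lengths: List[int],
    gamma: float,
) -> float:
    """Discounted return over consecutive option segments.

    Assumption: segment_rewards[i] is the reward already accumulated inside
    segment i; it is discounted by the number of steps elapsed before it.
    """
    total = 0.0
    elapsed = 0
    for reward, length in zip(segment_rewards, segment_lengths):
        total += (gamma ** elapsed) * reward
        elapsed += length
    return total


conditions = future_dependent_conditions(["a", "b", "c"])
value = multi_step_return([1.0, 1.0], [2, 3], gamma=0.5)
```

Here `conditions` lists the (current, next) pairs an option policy would be conditioned on, and `value` is the discounted return for two segments of lengths 2 and 3.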
Findings
Outperforms previous methods in LTL generalization tasks
Improves learning efficiency in complex control domains
Achieves better policy optimality through future-dependent options
Abstract
In many real-world applications of control systems and robotics, linear temporal logic (LTL) is a widely used task specification language whose compositional grammar naturally induces temporally extended behaviours across tasks, including conditionals and alternative realizations. An important problem in RL with LTL tasks is to learn task-conditioned policies that can generalize zero-shot to new LTL instructions not observed during training. However, because symbolic observations are often lossy and LTL tasks can have long time horizons, previous works can suffer from sample inefficiency during training and from infeasible or sub-optimal solutions. To tackle these issues, this paper proposes a novel multi-task RL algorithm with improved learning efficiency and optimality. To achieve the global optimality of task completion, we propose to learn…
Taxonomy
Topics: Formal Methods in Verification · Machine Learning and Algorithms · Receptor Mechanisms and Signaling
