Grounding LTL Tasks in Sub-Symbolic RL Environments for Zero-Shot Generalization
Matteo Pannacci, Andrea Fanti, Elena Umili, Roberto Capobianco

TL;DR
This paper presents a method for training reinforcement learning agents to follow complex temporal instructions in vision-based environments without prior knowledge of observation-symbol mappings, achieving effective zero-shot generalization.
Contribution
The paper introduces a joint training approach for a multi-task policy and a symbol grounder using Neural Reward Machines, removing the need for a predefined mapping from observations to symbols.
Findings
Achieves performance comparable to using the true symbol grounding in vision-based environments.
Significantly outperforms state-of-the-art methods for sub-symbolic environments.
Enables zero-shot generalization to new instructions.
Abstract
In this work we address the problem of training a Reinforcement Learning agent to follow multiple temporally-extended instructions expressed in Linear Temporal Logic in sub-symbolic environments. Previous multi-task work has mostly relied on knowledge of the mapping between raw observations and symbols appearing in the formulae. We drop this unrealistic assumption by jointly training a multi-task policy and a symbol grounder with the same experience. The symbol grounder is trained only from raw observations and sparse rewards via Neural Reward Machines in a semi-supervised fashion. Experiments on vision-based environments show that our method achieves performance comparable to using the true symbol grounding and significantly outperforms state-of-the-art methods for sub-symbolic environments.
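To make the role of the symbol grounder more concrete, the following is a minimal sketch of how uncertain symbol predictions can be pushed through a task automaton. It is an illustration under stated assumptions, not the authors' implementation: the task ("eventually `a`, then eventually `b`"), the three-state automaton, the symbol set, and the function names `soft_step` and `acceptance_probability` are all hypothetical. The grounder is stood in for by hand-written probability vectors, where the method in the abstract would use a learned model over raw observations.

```python
# Hypothetical task: "eventually a, then eventually b", written as a
# 3-state automaton. State 2 is the accepting state (task satisfied).
SYMBOLS = ["a", "b", "none"]
DELTA = {
    0: {"a": 1, "b": 0, "none": 0},
    1: {"a": 1, "b": 2, "none": 1},
    2: {"a": 2, "b": 2, "none": 2},
}
ACCEPTING = 2


def soft_step(belief, symbol_probs):
    """Advance a distribution over automaton states by one time step.

    belief[s] is the probability of being in state s; symbol_probs[i] is the
    (assumed) grounder's probability that SYMBOLS[i] holds in the current
    observation.
    """
    new_belief = [0.0] * len(belief)
    for state, q in enumerate(belief):
        for symbol, p in zip(SYMBOLS, symbol_probs):
            new_belief[DELTA[state][symbol]] += q * p
    return new_belief


def acceptance_probability(symbol_prob_sequence):
    """Probability that the task is satisfied after the whole episode."""
    belief = [1.0, 0.0, 0.0]  # start in state 0 with certainty
    for symbol_probs in symbol_prob_sequence:
        belief = soft_step(belief, symbol_probs)
    return belief[ACCEPTING]


# A confident grounding (a, then b) and an uncertain one.
crisp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
noisy = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
print(acceptance_probability(crisp))
print(acceptance_probability(noisy))
```

Because the acceptance probability is a smooth function of the symbol probabilities, a mismatch between it and the sparse reward actually observed could in principle serve as a training signal for the grounder. How the paper sets up this objective is not specified in the abstract.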
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
