FORM: Learning Expressive and Transferable First-Order Logic Reward Machines
Leo Ardon, Daniel Furelos-Blanco, Roko Parać, Alessandra Russo

TL;DR
This paper introduces First-Order Reward Machines (FORMs), which use first-order logic for more expressive and transferable reward representations in reinforcement learning, enabling scalable learning and multi-agent collaboration.
Contribution
The paper proposes FORM, a novel first-order logic-based reward machine, along with a learning method and multi-agent framework to improve transferability and scalability in RL tasks.
Findings
FORMs are more compact than traditional RMs and scale better to complex tasks.
FORMs can be learned effectively on complex tasks where traditional RM learning approaches fail.
The multi-agent formulation improves both task transferability and learning speed.
Abstract
Reward machines (RMs) are an effective approach for addressing non-Markovian rewards in reinforcement learning (RL) through finite-state machines. Traditional RMs, which label edges with propositional logic formulae, inherit the limited expressivity of propositional logic. This limitation hinders the learnability and transferability of RMs since complex tasks will require numerous states and edges. To overcome these challenges, we propose First-Order Reward Machines (FORMs), which use first-order logic to label edges, resulting in more compact and transferable RMs. We introduce a novel method for learning FORMs and a multi-agent formulation for exploiting them and facilitating their transferability, where multiple agents collaboratively learn policies for a shared FORM. Our experimental results demonstrate the scalability of…
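To make the abstract's contrast between propositional and first-order edge labels concrete, here is a minimal Python sketch. It is not the paper's implementation or learning method: the names `exists_both`, `RewardMachine`, `FORM_SKETCH`, and `make_trace`, the key/door task, and the encoding of observations as sets of ground unary facts are all assumptions made for illustration. Each edge guard is a formula of the shape "there exists X such that p(X) and q(X)", so one edge covers every object of a type, whereas a propositional RM would need a separate proposition (and typically a separate edge) per object.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Tuple

# Hypothetical encoding: a label is the set of ground unary facts true at one step.
Atom = Tuple[str, str]            # e.g. ("key", "k1") stands for key(k1)
Label = FrozenSet[Atom]
Guard = Callable[[Label], bool]


def exists_both(p: str, q: str) -> Guard:
    """Guard for the first-order formula: exists X. p(X) and q(X)."""
    def guard(label: Label) -> bool:
        return any(name == p and (q, obj) in label for name, obj in label)
    return guard


@dataclass
class RewardMachine:
    initial: str
    edges: List[Tuple[str, Guard, str, float]]  # (source, guard, target, reward)

    def run(self, trace: Iterable[Label]) -> Tuple[str, float]:
        """Follow the first enabled edge at each step; return (final state, total reward)."""
        state, total = self.initial, 0.0
        for label in trace:
            for src, guard, dst, reward in self.edges:
                if src == state and guard(label):
                    state, total = dst, total + reward
                    break
        return state, total


# Illustrative task (an assumption, not from the paper): reach any key, then any door.
FORM_SKETCH = RewardMachine(
    initial="u0",
    edges=[
        ("u0", exists_both("key", "at"), "u1", 0.0),
        ("u1", exists_both("door", "at"), "accept", 1.0),
    ],
)


def make_trace(static: Iterable[Atom], visits: Iterable[str]) -> List[Label]:
    """Build one label per step: the static type facts plus at(obj) for the visited object."""
    static = frozenset(static)
    return [static | {("at", obj)} for obj in visits]


# The same two-edge machine is reused unchanged in environments with different objects.
env_a = [("key", "k1"), ("door", "d1")]
env_b = [("key", "k7"), ("key", "k8"), ("key", "k9"), ("door", "d3")]
print(FORM_SKETCH.run(make_trace(env_a, ["k1", "d1"])))
print(FORM_SKETCH.run(make_trace(env_b, ["k8", "d3"])))
```

In this sketch the machine has two edges no matter how many keys or doors an environment contains, which illustrates the kind of compactness and transferability the abstract attributes to first-order labels. A propositional encoding of the second environment would need distinct propositions for k7, k8, and k9.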
Taxonomy
Topics: Formal Methods in Verification
