Adaptive Reward Design for Reinforcement Learning
Minjae Kwon, Ingy ElSayed-Aly, and Lu Feng

TL;DR
This paper introduces an adaptive reward shaping method for reinforcement learning that uses Linear Temporal Logic to specify tasks, providing more informative feedback and improving learning efficiency in uncertain environments.
Contribution
It proposes a novel adaptive reward shaping approach that dynamically updates reward functions based on LTL specifications, enhancing RL performance over traditional sparse reward methods.
Findings
Outperforms baseline methods in benchmark environments
Achieves earlier convergence to higher-quality policies
Increases task completion rates and expected returns
Abstract
There is a surge of interest in using formal languages such as Linear Temporal Logic (LTL) to precisely and succinctly specify complex tasks and derive reward functions for Reinforcement Learning (RL). However, existing methods often assign sparse rewards (e.g., giving a reward of 1 only if a task is completed and 0 otherwise). By providing feedback solely upon task completion, these methods fail to encourage successful subtask completion. This is particularly problematic in environments with inherent uncertainty, where task completion may be unreliable despite progress on intermediate goals. To address this limitation, we propose a suite of reward functions that incentivize an RL agent to complete a task specified by an LTL formula as much as possible, and develop an adaptive reward shaping approach that dynamically updates reward functions during the learning process. Experimental…
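To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-step task ("reach a, then reach b") encoded by hand as a small deterministic automaton, and compares a sparse reward with an illustrative progress-based reward that pays out whenever the automaton moves closer to acceptance. All names (`TRANSITIONS`, `progress_reward`, `run`, and so on) are assumptions introduced for illustration. The adaptive updating of rewards during learning that the paper describes is not modeled here.

```python
from collections import deque

# Hypothetical automaton for the task "eventually a, and then eventually b".
# States: 0 (nothing done), 1 (a seen), 2 (a then b seen; accepting).
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
}
ACCEPTING = {2}


def step(state, label):
    """Advance the automaton; labels with no listed transition leave the state unchanged."""
    return TRANSITIONS.get((state, label), state)


def distance_to_accepting():
    """Backward BFS over automaton edges: fewest transitions from each state to acceptance."""
    dist = {q: 0 for q in ACCEPTING}
    queue = deque(ACCEPTING)
    while queue:
        target = queue.popleft()
        for (src, _label), dst in TRANSITIONS.items():
            if dst == target and src not in dist:
                dist[src] = dist[target] + 1
                queue.append(src)
    return dist


def sparse_reward(prev_state, next_state):
    """1 only on the transition that completes the task, 0 otherwise."""
    return 1.0 if next_state in ACCEPTING and prev_state not in ACCEPTING else 0.0


def progress_reward(prev_state, next_state, dist):
    """Illustrative assumption: reward each reduction in distance to acceptance."""
    max_dist = max(dist.values())
    return (dist[prev_state] - dist[next_state]) / max_dist


def run(labels, reward_fn):
    """Accumulate reward along a sequence of observed labels, starting in state 0."""
    state, total = 0, 0.0
    for label in labels:
        nxt = step(state, label)
        total += reward_fn(state, nxt)
        state = nxt
    return total


DIST = distance_to_accepting()
shaped = lambda prev, nxt: progress_reward(prev, nxt, DIST)

# An episode that completes only the first subtask:
partial_sparse = run(["a"], sparse_reward)   # no feedback under the sparse scheme
partial_shaped = run(["a"], shaped)          # partial credit under the progress scheme
```

In this sketch an agent that reaches a but never reaches b receives nothing under the sparse scheme and half of the total reward under the progress-based one, which is the kind of intermediate feedback the abstract argues is missing when task completion is unreliable.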
Taxonomy
Topics: Supply Chain and Inventory Management
