Test-driven Reinforcement Learning in Continuous Control
Zhao Yu, Xiuping Wu, Liangjun Ke

TL;DR
This paper introduces Test-driven Reinforcement Learning (TdRL), a novel framework that uses multiple test functions instead of a single reward to define tasks, simplifying reward design and improving policy learning in continuous control tasks.
Contribution
The paper proposes the TdRL framework, which defines tasks through test functions rather than a single reward function, and supports it with theoretical proofs and a learning algorithm aimed at easing reward design and multi-objective optimization in RL.
Findings
TdRL matches or outperforms handcrafted reward methods in benchmarks.
TdRL simplifies task specification and supports multi-objective optimization.
Theoretical proof links trajectory returns to policy optimality.
Abstract
Reinforcement learning (RL) has been recognized as a powerful tool for robot control tasks. RL typically employs reward functions to define task objectives and guide agent learning. However, because the reward function serves the dual purpose of defining the optimal goal and guiding learning, it is challenging to design manually, which often results in a suboptimal task representation. To tackle the reward design challenge in RL, and inspired by satisficing theory, we propose a Test-driven Reinforcement Learning (TdRL) framework. In the TdRL framework, multiple test functions are used to represent the task objective rather than a single reward function. Test functions fall into two categories: pass-fail tests, which define the optimal objective, and indicative tests, which guide the learning process. This separation makes tasks easier to define. Building…
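To make the two kinds of test function concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's algorithm: the 1-D reaching task, the state layout (position, velocity), the thresholds, and every name (PassFailTest, IndicativeTest, evaluate) are assumptions introduced for this example. It shows pass-fail tests returning a boolean verdict on a trajectory and an indicative test returning a scalar that could inform learning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical state layout for illustration: (position, velocity).
State = Tuple[float, float]
Trajectory = List[State]


@dataclass
class PassFailTest:
    """A test that only says whether a trajectory meets a requirement."""
    name: str
    check: Callable[[Trajectory], bool]


@dataclass
class IndicativeTest:
    """A test that returns a scalar signal about a trajectory."""
    name: str
    measure: Callable[[Trajectory], float]


def evaluate(trajectory: Trajectory,
             pass_fail: List[PassFailTest],
             indicative: List[IndicativeTest]) -> Dict[str, object]:
    """Run every test on one trajectory and collect the results."""
    passed = {t.name: t.check(trajectory) for t in pass_fail}
    signals = {t.name: t.measure(trajectory) for t in indicative}
    return {
        "all_passed": all(passed.values()),
        "passed": passed,
        "signals": signals,
    }


GOAL = 1.0  # assumed target position

# Pass-fail tests: together they state what counts as solving the task.
reach_goal = PassFailTest(
    "reach_goal", lambda traj: abs(traj[-1][0] - GOAL) < 0.05)
speed_limit = PassFailTest(
    "speed_limit", lambda traj: all(abs(s[1]) <= 2.0 for s in traj))

# Indicative test: a graded signal, here the final distance to the goal.
final_distance = IndicativeTest(
    "final_distance", lambda traj: abs(traj[-1][0] - GOAL))

good = [(0.0, 0.0), (0.5, 1.0), (0.98, 0.2)]
too_fast = [(0.0, 0.0), (0.5, 3.0), (0.98, 0.2)]

good_report = evaluate(good, [reach_goal, speed_limit], [final_distance])
fast_report = evaluate(too_fast, [reach_goal, speed_limit], [final_distance])
```

In this sketch the first trajectory passes both pass-fail tests, while the second reaches the goal but violates the assumed speed limit, so its overall verdict is a failure even though its indicative signal is identical. How TdRL turns such test results into a learning signal is not described in the portion of the abstract shown here.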
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Autonomous Vehicle Technology and Safety
