Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning
Ziyang Tang, Yihao Feng, Qiang Liu

TL;DR
This paper introduces an operator neural network framework for reinforcement learning that enables zero-shot transfer of value functions to unseen reward functions, significantly improving adaptability in RL tasks.
Contribution
It proposes a novel operator neural network architecture for RL that allows direct approximation of reward-to-value operators, facilitating zero-shot reward transfer.
Findings
The proposed operator networks outperform existing methods and the standard general-purpose operator network design.
Framework enables zero-shot reward transfer.
Improves offline policy evaluation and optimization.
Abstract
Reinforcement learning (RL) has drawn increasing interest in recent years due to its tremendous success in various applications. However, standard RL algorithms can only be applied to a single reward function and cannot adapt quickly to an unseen reward function. In this paper, we advocate a general operator view of reinforcement learning, which enables us to directly approximate the operator that maps a reward function to its value function. The benefit of learning the operator is that we can take any new reward function as input and obtain its corresponding value function in a zero-shot manner. To approximate this special type of operator, we design a number of novel operator neural network architectures based on its theoretical properties. Our operator network designs outperform the existing methods and the standard design of a general-purpose operator network, and we…
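To make the reward-to-value operator concrete, here is a minimal tabular sketch in plain Python. It is not the paper's neural architecture. It assumes a fixed policy, for which the value function satisfies V = r + γPV and the map from reward to value is therefore linear. The 3-state transition matrix `P`, the discount `GAMMA`, and all function names are hypothetical and chosen only for illustration.

```python
# Illustrative tabular sketch only; the paper uses neural operator networks.
# Assumption: a fixed policy, so V = r + gamma * P V and the map r -> V is linear.

GAMMA = 0.9  # hypothetical discount factor

# Hypothetical 3-state Markov chain induced by some fixed policy.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]


def evaluate(reward, P=P, gamma=GAMMA, iters=2000):
    """Iterative policy evaluation: repeat V <- r + gamma * P V."""
    n = len(reward)
    v = [0.0] * n
    for _ in range(iters):
        v = [reward[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
             for s in range(n)]
    return v


def build_operator(P=P, gamma=GAMMA):
    """Build the reward-to-value matrix from the values of unit rewards."""
    n = len(P)
    # columns[j] is the value function of the reward that pays 1 in state j.
    columns = [evaluate([1.0 if s == j else 0.0 for s in range(n)], P, gamma)
               for j in range(n)]
    # Transpose so operator[s][j] is the value at state s of a unit reward at j.
    return [[columns[j][s] for j in range(n)] for s in range(n)]


def apply_operator(operator, reward):
    """Evaluate a new reward with one matrix-vector product, no re-solving."""
    return [sum(row[j] * reward[j] for j in range(len(reward)))
            for row in operator]


OPERATOR = build_operator()          # computed once
new_reward = [1.0, -2.0, 0.5]        # a reward not used to build the operator
v_zero_shot = apply_operator(OPERATOR, new_reward)
v_direct = evaluate(new_reward)      # reference: solve from scratch
```

Here `v_zero_shot` and `v_direct` agree up to floating-point error, which is the tabular analogue of zero-shot reward transfer: the operator is built once and then reused for any reward. The sketch covers only policy evaluation under a fixed policy; the map from a reward to the optimal value function involves a maximization over actions and is not linear, so it cannot be captured by a single matrix in this way.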
Taxonomy
Topics: Muscle activation and electromyography studies · Reinforcement Learning in Robotics · EEG and Brain-Computer Interfaces
Methods: Q-Learning
