Deep Reinforcement Learning with Surrogate Agent-Environment Interface
Song Wang, Yu Jing

TL;DR
This paper introduces the surrogate agent-environment interface (SAEI) for reinforcement learning and proposes a new algorithm, PSADPG, that enables continuous control of discrete actions and reaches DQN-level performance on certain tasks.
Contribution
It presents the SAEI concept and, built on it, the PSADPG algorithm, which enables continuous control of discrete actions.
Findings
PSADPG achieves DQN-level performance in certain tasks.
SAEI provides a new framework for probabilistic agent-environment interaction.
PSADPG shows the nature of a stochastic optimal policy in the initial training stage.
Abstract
In this paper, we propose the surrogate agent-environment interface (SAEI) in reinforcement learning. We also state that learning based on the probability surrogate agent-environment interface provides the optimal policy of the task agent-environment interface. We introduce the surrogate probability action and develop the probability surrogate action deterministic policy gradient (PSADPG) algorithm based on SAEI. This algorithm enables continuous control of discrete actions. Experiments show that PSADPG achieves the performance of DQN in certain tasks, with the nature of a stochastic optimal policy in the initial training stage.
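The phrase "continuous control of discrete action" can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract, that the surrogate interface takes a probability vector over the discrete actions as its continuous action and samples the discrete action before it reaches the task environment. The class name, the toy task, and the sampling step are hypothetical and are not taken from the paper.

```python
import random
from typing import Callable, Sequence, Tuple


class ProbabilitySurrogateEnv:
    """Hypothetical surrogate interface (illustration only, not the paper's code).

    The agent emits a continuous action: a probability vector over the
    task's discrete actions. The wrapper samples one discrete action from
    that vector and forwards it to the task environment.
    """

    def __init__(
        self,
        task_step: Callable[[int], Tuple[float, bool]],
        n_actions: int,
        seed: int = 0,
    ) -> None:
        self.task_step = task_step      # task environment: discrete action -> (reward, done)
        self.n_actions = n_actions
        self.rng = random.Random(seed)  # private RNG so runs are reproducible

    def step(self, probs: Sequence[float]) -> Tuple[int, float, bool]:
        # Validate that the surrogate action is a probability vector.
        if len(probs) != self.n_actions:
            raise ValueError("expected one probability per discrete action")
        if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError("probabilities must be non-negative and sum to 1")
        # Sample the discrete task action from the surrogate probability action.
        action = self.rng.choices(range(self.n_actions), weights=probs, k=1)[0]
        reward, done = self.task_step(action)
        return action, reward, done


def toy_task_step(action: int) -> Tuple[float, bool]:
    """Toy one-step task (assumption): action 1 pays reward 1, others pay 0."""
    return (1.0 if action == 1 else 0.0), False
```

Under this reading, an actor network with a softmax output could supply `probs`, so a deterministic policy gradient method would operate on a continuous vector while the task environment still receives a discrete action.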
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Evolutionary Algorithms and Applications
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
