Deep Reinforcement Learning with Surrogate Agent-Environment Interface

Song Wang; Yu Jing

arXiv:1709.03942 · cs.LG · November 13, 2017

TL;DR

This paper introduces the surrogate agent-environment interface (SAEI) in reinforcement learning, proposing a new algorithm PSADPG that enables continuous control of discrete actions, showing competitive performance in initial training stages.

Contribution

It presents the SAEI concept and the PSADPG algorithm, which enables continuous control of discrete actions.

Findings

01. PSADPG achieves DQN-level performance in certain tasks.

02. SAEI provides a new framework for probabilistic agent-environment interaction.

03. PSADPG demonstrates effectiveness in initial training stages.

Abstract

In this paper, we propose the surrogate agent-environment interface (SAEI) for reinforcement learning. We also state that learning based on the probability surrogate agent-environment interface yields an optimal policy for the task agent-environment interface. We introduce the surrogate probability action and, based on SAEI, develop the probability surrogate action deterministic policy gradient (PSADPG) algorithm, which enables continuous control of discrete actions. Experiments show that PSADPG achieves the performance of DQN in certain tasks and exhibits the nature of a stochastic optimal policy in the initial training stage.
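
The idea of a probability surrogate interface can be illustrated with a small sketch. This is not the paper's PSADPG implementation; it only shows, under stated assumptions, how a wrapper might let an agent emit a continuous probability vector while the underlying task still receives a discrete action. The names `ProbabilitySurrogateEnv`, `ToyBandit`, and `softmax`, and the `(observation, reward, done)` return convention, are hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(logits):
    """Map unconstrained actor outputs to a probability vector."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


class ToyBandit:
    """Hypothetical one-step task: action 1 pays 1.0, action 0 pays 0.0."""

    def reset(self):
        return 0

    def step(self, action):
        reward = 1.0 if action == 1 else 0.0
        return 0, reward, True  # (observation, reward, done)


class ProbabilitySurrogateEnv:
    """Assumed wrapper: the agent's action is a probability vector over the
    task's discrete actions; the wrapper samples the discrete action."""

    def __init__(self, env, n_actions, seed=0):
        self.env = env
        self.n_actions = n_actions
        self.rng = np.random.default_rng(seed)
        self.last_action = None

    def reset(self):
        return self.env.reset()

    def step(self, probs):
        p = np.clip(np.asarray(probs, dtype=float), 0.0, None)
        total = p.sum()
        # Fall back to a uniform distribution if the vector is all zeros.
        p = p / total if total > 0 else np.full(self.n_actions, 1.0 / self.n_actions)
        action = int(self.rng.choice(self.n_actions, p=p))
        self.last_action = action
        return self.env.step(action)


env = ProbabilitySurrogateEnv(ToyBandit(), n_actions=2)
env.reset()
obs, reward, done = env.step(softmax([0.0, 2.0]))
```

From the agent's side the action space is now continuous (the probability simplex), so a deterministic policy gradient method can in principle be applied to it, while the task itself still sees one discrete action per step.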

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Evolutionary Algorithms and Applications

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network