POAR: Efficient Policy Optimization via Online Abstract State Representation Learning

Zhaorun Chen; Siqi Fan; Yuan Tan; Liang Gong; Binhao Chen; Te Sun; David Filliat; Natalia Díaz-Rodríguez; Chengliang Liu

arXiv:2109.08642·cs.RO·December 12, 2023

TL;DR

POAR introduces an integrated approach combining RL and SRL with dynamic loss weighting and expert demonstrations, significantly improving sample efficiency and performance in high-dimensional robotic tasks.

Contribution

The paper presents a novel algorithm that seamlessly integrates SRL into policy optimization, addressing overfitting and enhancing learning efficiency in complex environments.

Findings

01. POAR outperforms state-of-the-art RL algorithms in sample efficiency and rewards.

02. It effectively handles high-dimensional tasks and enables training real robots from scratch.

03. The method leverages expert demonstrations and real-time state graph monitoring.

Abstract

While the rapid progress of deep learning fuels end-to-end reinforcement learning (RL), its direct application, especially in high-dimensional spaces such as robotic scenarios, still suffers from low sample efficiency. State Representation Learning (SRL) has therefore been proposed to learn to encode task-relevant features from complex sensory data into low-dimensional states. However, SRL is usually implemented with a decoupling strategy in which the observation-state mapping is learned separately, which is prone to overfitting. To handle this problem, we summarize the state-of-the-art (SOTA) SRL sub-tasks in previous works and present a new algorithm called Policy Optimization via Abstract Representation, which integrates SRL into the policy optimization phase. First, we use the RL loss to assist in updating the SRL model so that the states can evolve to meet the…
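The idea of folding SRL into the policy optimization phase can be illustrated with a minimal sketch: a PPO-style clipped surrogate loss is added to a weighted sum of SRL auxiliary losses, so that one gradient step would update both the policy and the state encoder. This is an assumption-laden illustration, not the paper's implementation. The sub-task names (`inverse`, `forward`, `reconstruction`), the fixed weights, and all numbers are hypothetical, and the dynamic weighting scheme mentioned in the TL;DR is not reproduced because it is not described here.

```python
from typing import Dict, Sequence


def ppo_clipped_loss(ratios: Sequence[float],
                     advantages: Sequence[float],
                     clip_eps: float = 0.2) -> float:
    """Negative PPO clipped surrogate objective, averaged over samples."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the probability ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(r * a, clipped_r * a)
    return -total / len(ratios)


def joint_loss(rl_loss: float,
               srl_losses: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """RL loss plus a weighted sum of SRL auxiliary losses.

    The weights are fixed here for illustration only (an assumption).
    """
    return rl_loss + sum(weights[name] * value
                         for name, value in srl_losses.items())


# Hypothetical batch statistics, for illustration only.
ratios = [1.5, 0.5, 1.0]        # new-policy / old-policy probability ratios
advantages = [1.0, -1.0, 2.0]   # advantage estimates

rl = ppo_clipped_loss(ratios, advantages)

# Hypothetical SRL sub-task losses and weights (names are assumptions).
srl = {"inverse": 0.4, "forward": 0.2, "reconstruction": 1.0}
w = {"inverse": 1.0, "forward": 0.5, "reconstruction": 0.1}

total = joint_loss(rl, srl, w)
print(f"RL loss: {rl:.3f}, joint loss: {total:.3f}")
```

In a real training loop the joint scalar would be computed on tensors and backpropagated through both the policy head and the encoder; the sketch only shows how the terms combine.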

Code & Models

Repositories

billchan226/poar-srl-4-robot · tf · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Entropy Regularization · Proximal Policy Optimization