POAR: Efficient Policy Optimization via Online Abstract State Representation Learning
Zhaorun Chen, Siqi Fan, Yuan Tan, Liang Gong, Binhao Chen, Te Sun, David Filliat, Natalia Díaz-Rodríguez, Chengliang Liu

TL;DR
POAR introduces an integrated approach combining RL and SRL with dynamic loss weighting and expert demonstrations, significantly improving sample efficiency and performance in high-dimensional robotic tasks.
Contribution
The paper presents a novel algorithm that seamlessly integrates SRL into policy optimization, addressing overfitting and enhancing learning efficiency in complex environments.
Findings
POAR outperforms state-of-the-art RL algorithms in sample efficiency and rewards.
It effectively handles high-dimensional tasks and enables training real robots from scratch.
The method leverages expert demonstrations and real-time state graph monitoring.
Abstract
While the rapid progress of deep learning fuels end-to-end reinforcement learning (RL), its direct application, especially in high-dimensional spaces such as robotic scenarios, still suffers from low sample efficiency. State Representation Learning (SRL) was therefore proposed to learn to encode task-relevant features from complex sensory data into low-dimensional states. However, SRL is usually implemented with a decoupling strategy in which the observation-state mapping is learned separately, which is prone to overfitting. To handle this problem, we summarize the state-of-the-art (SOTA) SRL sub-tasks in previous works and present a new algorithm called Policy Optimization via Abstract Representation, which integrates SRL into the policy optimization phase. Firstly, we engage the RL loss to assist in updating the SRL model so that the states can evolve to meet the…
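The abstract contrasts learning the observation-state mapping separately with letting the RL loss take part in updating the SRL model. A minimal sketch of what such a joint objective could look like is given below, in plain Python on scalar loss values. It is an illustration under stated assumptions, not the paper's implementation: the function names, the sub-task names `inverse` and `forward`, and the equal-contribution rebalancing rule are all hypothetical, since the summary does not specify how the loss weights are adjusted.

```python
from typing import Dict


def rebalance_weights(srl_losses: Dict[str, float],
                      budget: float = 1.0,
                      eps: float = 1e-8) -> Dict[str, float]:
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    Chooses one weight per SRL sub-task so that every weighted term
    contributes the same share, budget / n, to the joint loss.
    """
    n = len(srl_losses)
    if n == 0:
        return {}
    share = budget / n
    # eps guards against division by zero when a sub-task loss vanishes.
    return {name: share / max(abs(value), eps)
            for name, value in srl_losses.items()}


def joint_loss(rl_loss: float,
               srl_losses: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Weighted sum of the RL loss and the SRL sub-task losses.

    In an integrated setup this single scalar would be back-propagated
    through both the policy and the state encoder.
    """
    missing = set(srl_losses) - set(weights)
    if missing:
        raise KeyError(f"no weight for SRL losses: {sorted(missing)}")
    return rl_loss + sum(weights[name] * value
                         for name, value in srl_losses.items())


if __name__ == "__main__":
    # Illustrative numbers only; "inverse" and "forward" are placeholder names.
    srl = {"inverse": 2.0, "forward": 0.5}
    w = rebalance_weights(srl, budget=1.0)
    print(w)
    print(joint_loss(3.0, srl, w))
```

Recomputing the weights at each update keeps any single sub-task from dominating the gradient, which is one simple way to read "dynamic loss weighting"; other schemes would fit the same `joint_loss` interface.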
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Entropy Regularization · Proximal Policy Optimization
