Evolving Pareto-Optimal Actor-Critic Algorithms for Generalizability and Stability

Juan Jose Garau-Luis; Yingjie Miao; John D. Co-Reyes; Aaron Parisi; Jie Tan; Esteban Real; Aleksandra Faust

arXiv:2204.04292 · cs.LG · April 26, 2023

TL;DR

MetaPG is an evolutionary approach that automatically designs actor-critic algorithms, significantly improving their generalizability and stability for real-world reinforcement learning tasks.

Contribution

The paper introduces MetaPG, a method that evolves actor-critic loss functions for generalizability, stability, and performance. The evolved algorithms outperform SAC in various environments.

Findings

01

MetaPG improves generalizability by 20% over SAC.

02

MetaPG reduces instability by up to 67%.

03

Evolved algorithms perform well across different environments and conditions.

Abstract

Generalizability and stability are two key objectives for operating reinforcement learning (RL) agents in the real world. Designing RL algorithms that optimize these objectives can be a costly and painstaking process. This paper presents MetaPG, an evolutionary method for automated design of actor-critic loss functions. MetaPG explicitly optimizes for generalizability and performance, and implicitly optimizes the stability of both metrics. We initialize our loss function population with Soft Actor-Critic (SAC) and perform multi-objective optimization using fitness metrics encoding single-task performance, zero-shot generalizability to unseen environment configurations, and stability across independent runs with different random seeds. On a set of continuous control tasks from the Real-World RL Benchmark Suite, we find that our method, using a single environment during evolution, evolves…
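
The multi-objective selection described in the abstract can be illustrated with a minimal sketch of Pareto-front filtering. This is an assumption-laden illustration, not necessarily the selection procedure MetaPG uses: the two-objective fitness tuple, the function names `dominates` and `pareto_front`, and the score values are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical fitness: (performance, generalizability); higher is better for both.
Fitness = Tuple[float, float]


def dominates(a: Fitness, b: Fitness) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Fitness]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Made-up fitness values for five candidate loss functions (illustration only).
scores = [(0.90, 0.50), (0.85, 0.70), (0.80, 0.65), (0.60, 0.90), (0.90, 0.40)]
front = pareto_front(scores)
print(front)  # [0, 1, 3]
```

Candidates 2 and 4 are dropped because another candidate matches or beats them on both objectives. The remaining three represent different trade-offs between performance and generalizability, which is what a Pareto-optimal set is meant to capture.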

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research