Improving Policy Optimization with Generalist-Specialist Learning

Zhiwei Jia; Xuanlin Li; Zhan Ling; Shuang Liu; Yiran Wu; Hao Su

arXiv:2206.12984·cs.LG·June 28, 2022·5 cites

TL;DR

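This paper introduces a generalist-specialist training framework for deep reinforcement learning. A generalist is first trained on all environment variations; once it stops improving, specialists cloned from it are each trained on a small subset of variations, and the generalist's training is then resumed. The framework improves policy learning on the Procgen, Meta-World, and ManiSkill benchmarks.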

Contribution

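The paper proposes a training approach that combines a generalist policy, trained on all environment variations, with a population of specialist policies, each trained on a small subset of variations, to improve generalization and performance in complex environments.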

Findings

01. The framework improves policy learning on the Procgen, Meta-World, and ManiSkill benchmarks.

02. Specialist training accelerates learning and boosts final performance.

03. Auxiliary rewards from specialists enhance generalist training.

Abstract

Generalization in deep reinforcement learning over unseen environment variations usually requires policy learning over a large set of diverse training variations. We empirically observe that an agent trained on many variations (a generalist) tends to learn faster at the beginning, yet its performance plateaus at a less optimal level for a long time. In contrast, an agent trained only on a few variations (a specialist) can often achieve high returns under a limited computational budget. To have the best of both worlds, we propose a novel generalist-specialist training framework. Specifically, we first train a generalist on all environment variations; when it fails to improve, we launch a large population of specialists with weights cloned from the generalist, each trained to master a selected small subset of variations. We finally resume the training of the generalist with auxiliary…
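The first two stages described in the abstract (detect that the generalist has stopped improving, then clone its weights into specialists that each own a small subset of variations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the plateau test, the contiguous partition of variations, the dictionary-based weights, and every function name here are assumptions made for the example. The final stage, resuming generalist training, is not sketched because the abstract is truncated at that point.

```python
import copy
from typing import Dict, List, Sequence


def has_plateaued(returns: Sequence[float], window: int = 3, tol: float = 0.01) -> bool:
    """Hypothetical plateau test (an assumption, not the paper's criterion):
    the mean return over the last `window` evaluations improved by less
    than `tol` compared with the window before it."""
    if len(returns) < 2 * window:
        return False
    recent = sum(returns[-window:]) / window
    previous = sum(returns[-2 * window:-window]) / window
    return recent - previous < tol


def split_variations(variation_ids: Sequence[int], per_specialist: int) -> List[List[int]]:
    """Partition the variations into small contiguous subsets, one per specialist.
    The paper may select subsets differently; this is the simplest choice."""
    return [list(variation_ids[i:i + per_specialist])
            for i in range(0, len(variation_ids), per_specialist)]


def launch_specialists(generalist_weights: Dict[str, list],
                       variation_ids: Sequence[int],
                       per_specialist: int) -> List[dict]:
    """Clone the generalist once per subset. Each clone gets an independent
    copy of the weights and the list of variations it should train on."""
    return [{"weights": copy.deepcopy(generalist_weights), "variations": subset}
            for subset in split_variations(variation_ids, per_specialist)]


# Usage with made-up numbers: the generalist's return curve flattens,
# so specialists are launched over 10 variations, 4 per specialist.
history = [0.10, 0.30, 0.50, 0.60, 0.61, 0.60, 0.61, 0.60, 0.61]
generalist = {"w": [1.0, 2.0]}
if has_plateaued(history):
    specialists = launch_specialists(generalist, range(10), per_specialist=4)
    print(len(specialists), [s["variations"] for s in specialists])
```

The deep copy matters in this sketch: each specialist must start from the generalist's weights but update them independently, so that training one specialist does not alter the generalist or its siblings.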

Code & Models

Repositories

seanjia/gsl (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Evolutionary Algorithms and Applications