Random Actions vs Random Policies: Bootstrapping Model-Based Direct Policy Search

Elias Hanna; Alex Coninx; Stéphane Doncieux

arXiv:2210.11801 · cs.LG · October 24, 2022

Open Access

TL;DR

This paper investigates how different initial data collection methods affect the efficiency of model-based policy search, highlighting the importance of bootstrap strategies and potential hybrid approaches for improved learning.

Contribution

It compares initialization methods across two policy search frameworks, providing insights into their impact on model performance and suggesting avenues for hybrid method development.

Findings

01. Task-dependent factors can negatively affect each method

02. Probabilistic ensembles are used for dynamics modeling

03. Hybrid approaches may improve bootstrap efficiency

Abstract

This paper studies the impact of the initial data gathering method on the subsequent learning of a dynamics model. Dynamics models approximate the true transition function of a given task, so that policy search can be performed directly on the model rather than on the costly real system. This study aims to determine how to bootstrap a model as efficiently as possible, by comparing initialization methods employed in two different policy search frameworks in the literature. The study focuses on model performance under the episode-based framework of evolutionary methods using probabilistic ensembles. Experimental results show that various task-dependent factors can be detrimental to each method, which suggests that hybrid approaches are worth exploring.
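As an illustration of the two bootstrap strategies named in the title, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical 1-D point-mass system (`toy_step`), a uniform action range of [-1, 1], and a randomly parameterized linear policy with a tanh output; none of these choices come from the paper. "Random actions" draws a fresh action at every step, while "random policies" draws one set of policy parameters per episode and then follows that policy. Either dataset of (state, action, next state) transitions could then be used to fit a dynamics model.

```python
import numpy as np

def toy_step(state, action):
    """Hypothetical 1-D point-mass dynamics; state = (position, velocity)."""
    pos, vel = state
    vel = vel + 0.1 * float(action)
    pos = pos + 0.1 * vel
    return np.array([pos, vel])

def collect_random_actions(n_episodes, horizon, rng):
    """Bootstrap data by sampling an independent action at every time step."""
    transitions = []
    for _ in range(n_episodes):
        state = np.zeros(2)
        for _ in range(horizon):
            action = rng.uniform(-1.0, 1.0)  # assumed action bounds
            next_state = toy_step(state, action)
            transitions.append((state, action, next_state))
            state = next_state
    return transitions

def collect_random_policies(n_episodes, horizon, rng):
    """Bootstrap data by sampling one random policy per episode and running it."""
    transitions = []
    for _ in range(n_episodes):
        # Assumed policy class: linear map of the state, squashed by tanh.
        weights = rng.normal(size=2)
        bias = rng.normal()
        state = np.zeros(2)
        for _ in range(horizon):
            action = float(np.tanh(weights @ state + bias))
            next_state = toy_step(state, action)
            transitions.append((state, action, next_state))
            state = next_state
    return transitions

rng = np.random.default_rng(0)
data_actions = collect_random_actions(n_episodes=3, horizon=5, rng=rng)
data_policies = collect_random_policies(n_episodes=3, horizon=5, rng=rng)
```

In this sketch, actions within a random-policy episode are correlated through the shared parameters, whereas random actions are independent from step to step; which of the two covers the state-action space better is task-dependent, which is the question the paper examines.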

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Multi-Objective Optimization Algorithms