Efficient Reuse of Previous Experiences to Improve Policies in Real Environment

Norikazu Sugimoto; Voot Tangkaratt; Thijs Wensveen; Tingting Zhao; Masashi Sugiyama; Jun Morimoto

arXiv:1405.2406·cs.RO·May 13, 2014

Open Access

TL;DR

This paper presents a method for efficiently improving robot movement policies in real environments by reusing previous experiences with importance-weighted PGPE, reducing the need for extensive trial-and-error learning.

Contribution

The paper applies importance-weighted PGPE to a real humanoid robot, enabling policy learning without prior task knowledge and without a large number of learning trials.

Findings

01 Learned a target-reaching movement on a real robot

02 Achieved cart-pole swing-up without prior task knowledge

03 Demonstrated efficient policy improvement through experience reuse

Abstract

In this study, we show that a movement policy can be improved efficiently using the previous experiences of a real robot. Reinforcement Learning (RL) is becoming a popular approach to acquire a nonlinear optimal policy through trial and error. However, it is considered very difficult to apply RL to real robot control since it usually requires many learning trials. Such trials cannot be executed in real environments because they would take an unrealistic amount of time and the real system's durability is limited. Therefore, in this study, instead of executing many learning trials, we propose to use a recently developed RL algorithm, importance-weighted PGPE, by which the robot can efficiently reuse previously sampled data to improve its policy parameters. We apply importance-weighted PGPE to CB-i, our real humanoid robot, and show that it can learn a target reaching movement and a cart-pole swing up…
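
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea behind reusing old samples in PGPE (policy gradients with parameter-based exploration), not the authors' implementation. It assumes policy parameters are drawn from an independent Gaussian with per-dimension mean and standard deviation, and that samples collected under an older Gaussian are reweighted by the density ratio before entering the gradient estimate. The function names (`gaussian_logpdf`, `iw_pgpe_gradient`) are hypothetical, and baseline subtraction and other variance-reduction steps are omitted for brevity.

```python
import math


def gaussian_logpdf(theta, mean, std):
    """Log-density of an independent (diagonal) Gaussian at theta."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((t - m) / s) ** 2
        for t, m, s in zip(theta, mean, std)
    )


def iw_pgpe_gradient(samples, returns, mean, std, old_mean, old_std):
    """Importance-weighted estimate of the gradient of the expected return.

    samples  : policy-parameter vectors drawn from N(old_mean, old_std^2)
    returns  : one scalar return per sample, measured on the system
    mean/std : the current Gaussian whose parameters are being updated
    Illustrative sketch only: no baseline, no weight flattening.
    """
    n = len(samples)
    dim = len(mean)
    # Density ratio p(theta | current) / p(theta | old) for each old sample.
    weights = [
        math.exp(gaussian_logpdf(th, mean, std) - gaussian_logpdf(th, old_mean, old_std))
        for th in samples
    ]
    grad_mean = [0.0] * dim
    grad_std = [0.0] * dim
    for th, ret, w in zip(samples, returns, weights):
        for i in range(dim):
            d = th[i] - mean[i]
            # Derivatives of log N(theta_i | mean_i, std_i^2) w.r.t. mean_i and std_i.
            grad_mean[i] += w * ret * d / std[i] ** 2 / n
            grad_std[i] += w * ret * (d * d - std[i] ** 2) / std[i] ** 3 / n
    return grad_mean, grad_std, weights
```

When the old and current distributions coincide, every weight equals 1 and the estimate reduces to the plain PGPE gradient. As the two distributions drift apart the weights become uneven, which is why old data tends to be most useful when the policy distribution has changed only moderately.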

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Evolutionary Algorithms and Applications