Efficient Reuse of Previous Experiences to Improve Policies in Real Environment
Norikazu Sugimoto, Voot Tangkaratt, Thijs Wensveen, Tingting Zhao, Masashi Sugiyama, Jun Morimoto

TL;DR
This paper presents a method for efficiently improving robot movement policies in real environments by reusing previous experiences with importance-weighted PGPE, reducing the need for extensive trial-and-error learning.
Contribution
It introduces the application of importance-weighted PGPE to real humanoid robots, enabling effective policy learning without prior knowledge or extensive trials.
Findings
Learned a target-reaching movement on a real humanoid robot
Achieved cart-pole swing-up without prior task knowledge
Demonstrated efficient policy improvement by reusing previously sampled data
Abstract
In this study, we show that a movement policy can be improved efficiently using the previous experiences of a real robot. Reinforcement Learning (RL) is becoming a popular approach to acquiring a nonlinear optimal policy through trial and error. However, applying RL to real robot control is considered very difficult, since it usually requires many learning trials. Such trials cannot be executed in real environments because they would take an impractically long time and the real system's durability is limited. Therefore, instead of executing many learning trials, we propose to use a recently developed RL algorithm, importance-weighted PGPE, with which the robot can efficiently reuse previously sampled data to improve its policy parameters. We apply importance-weighted PGPE to CB-i, our real humanoid robot, and show that it can learn a target reaching movement and a cart-pole swing up…
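The abstract only names importance-weighted PGPE, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes the usual PGPE setup in which policy parameters are drawn from a diagonal Gaussian hyper-distribution with mean `eta` and standard deviation `tau`, and it reweights samples collected under older hyper-parameters (`eta_old`, `tau_old`) by the density ratio so they can be reused. The function names, the constant `baseline` argument, and the plain (unnormalized) importance weights are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def log_gaussian(theta, eta, tau):
    """Log-density of a diagonal Gaussian, summed over parameter dimensions."""
    theta, eta, tau = (np.asarray(a, dtype=float) for a in (theta, eta, tau))
    per_dim = -0.5 * ((theta - eta) / tau) ** 2 - np.log(tau) - 0.5 * np.log(2 * np.pi)
    return np.sum(per_dim, axis=-1)


def iw_pgpe_gradient(thetas, returns, eta, tau, eta_old, tau_old, baseline=0.0):
    """Estimate the gradient of expected return w.r.t. (eta, tau) from old samples.

    thetas  : (N, D) policy parameters sampled under (eta_old, tau_old)
    returns : (N,)   return observed when each sampled policy was executed
    baseline: scalar subtracted from returns (hypothetical constant baseline)
    """
    thetas = np.asarray(thetas, dtype=float)
    returns = np.asarray(returns, dtype=float)
    eta = np.asarray(eta, dtype=float)
    tau = np.asarray(tau, dtype=float)

    # Importance weight: density under current hyper-parameters over
    # density under the hyper-parameters that generated the sample.
    log_w = log_gaussian(thetas, eta, tau) - log_gaussian(thetas, eta_old, tau_old)
    weights = np.exp(log_w)

    # Weighted, baseline-subtracted returns, one per sample.
    scaled = (weights * (returns - baseline))[:, None]

    # Derivatives of log N(theta | eta, tau) w.r.t. eta and tau.
    diff = thetas - eta
    grad_eta = np.mean(scaled * diff / tau ** 2, axis=0)
    grad_tau = np.mean(scaled * (diff ** 2 - tau ** 2) / tau ** 3, axis=0)
    return grad_eta, grad_tau, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eta_old, tau_old = np.zeros(3), np.ones(3)
    thetas = rng.normal(eta_old, tau_old, size=(50, 3))
    # Toy return standing in for a robot rollout: higher when theta is near 1.
    returns = -np.sum((thetas - 1.0) ** 2, axis=1)
    eta_new = eta_old + 0.1  # hyper-parameters after one earlier update
    g_eta, g_tau, w = iw_pgpe_gradient(thetas, returns, eta_new, tau_old, eta_old, tau_old)
    print("grad_eta:", g_eta, "grad_tau:", g_tau)
```

When the current and old hyper-parameters coincide, every weight equals 1 and the estimate reduces to the ordinary PGPE gradient. In practice the weights can become very large when the two distributions differ a lot, which is why a variance-reducing baseline matters for this kind of estimator.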
Taxonomy
Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Evolutionary Algorithms and Applications
