Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

Steve Heim; Felix Ruppert; Alborz A. Sarvestani; Alexander Spröwitz

arXiv:1709.10273·cs.RO·March 7, 2022

TL;DR

This paper introduces training wheels, temporary modifications of the physical hardware, that make it easier for a robot leg to learn fast hopping directly in hardware despite challenges such as instability and contact.

Contribution

It shapes the reward landscape with training wheels, empirically maps that landscape on a boom-mounted robot leg, and proposes three criteria for designing effective training wheels for learning in robotics.

Findings

01 Training wheels facilitate learning directly in robot hardware.

02 Empirically mapping the reward landscape aids in designing training wheels.

03 Three proposed criteria guide the design of effective training wheels.

Abstract

Learning instead of designing robot controllers can greatly reduce the engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part due to the necessity to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results, we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8.
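
The abstract mentions empirically mapping out the reward landscape. As a rough illustration of what such a mapping could look like in code, the sketch below sweeps a grid over two controller parameters and records the reward of each trial. Everything here is an assumption made for illustration: the function names, the two-parameter grid, and especially `toy_trial`, which is a made-up stand-in for a hardware rollout and not the reward or controller used in the paper.

```python
from typing import Callable, List, Optional, Tuple

Reward = Optional[float]  # None marks a failed (e.g. unstable) trial


def map_reward_landscape(
    run_trial: Callable[[float, float], Reward],
    xs: List[float],
    ys: List[float],
) -> List[List[Reward]]:
    """Run one trial per grid point; rows are indexed by y, columns by x."""
    return [[run_trial(x, y) for x in xs] for y in ys]


def toy_trial(x: float, y: float) -> Reward:
    # Hypothetical stand-in for one hardware rollout (NOT the paper's reward).
    # The made-up rule x + y > 1.5 mimics an unstable parameter region.
    if x + y > 1.5:
        return None
    return x * y


def best_point(
    grid: List[List[Reward]], xs: List[float], ys: List[float]
) -> Optional[Tuple[float, float, float]]:
    """Return (reward, x, y) of the best successful trial, or None if all failed."""
    best = None
    for j, row in enumerate(grid):
        for i, reward in enumerate(row):
            if reward is not None and (best is None or reward > best[0]):
                best = (reward, xs[i], ys[j])
    return best


xs = [0.0, 0.5, 1.0]
ys = [0.0, 0.5, 1.0]
grid = map_reward_landscape(toy_trial, xs, ys)
best = best_point(grid, xs, ys)
```

Keeping failed trials as `None` rather than as a low numeric reward keeps the unstable region visible as a separate feature of the map, which is one possible design choice among several.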
