Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware
Steve Heim, Felix Ruppert, Alborz A. Sarvestani, Alexander Spröwitz

TL;DR
This paper introduces training wheels, temporary hardware modifications that facilitate fast learning of a robotic hopping task directly in hardware, where instability and contact make exploration challenging.
Contribution
It presents an approach to shaping the reward landscape with training wheels, demonstrates it on a boom-mounted robot leg learning to hop fast, and proposes three criteria for designing effective training wheels for learning in robotics.
Findings
Training wheels improve learning stability in hardware robotics.
Empirically mapping the reward landscape aids in designing training wheels.
Three proposed criteria guide the design of effective training wheels for learning.
Abstract
Learning instead of designing robot controllers can greatly reduce the engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part due to the necessity to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results, we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8.
